I am trying to scrape some tables from a website. The URL has two parameters that keep changing with every table: an id value and an alpha value. An example URL is as follows:
http://resources.afaqs.com/index.html?id=123&category=AD+Agencies&alpha=A
I want to iterate through the id and alpha values. My code so far is as follows:
import csv
import bs4 as bs
import requests

data = ['1','2','3','7','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','W','X','Y','Z']
number = None
while number < 500:
    for i in data:
        url = "http://resources.afaqs.com/index.html?id="
        if number is not None:
            url += str(number) + "&category=AD+Agencies&alpha={}".format(i)
        print(url)
        if number is None:
            number = 1
        else:
            number += 1
This iterates the id from 1 to 499 and the alpha value from A to Z in step with each other. What I want instead is: for every id, the alpha values should be iterated from A to Z.
I tried moving the for loop before the while loop, moving it before the print(url) call, and so on. Each of these combinations gives odd results, not the ones I want.
Can someone help please?
None and then doing a check, why not just set number to 0 and then you can do a simple if not number check and do your number += 1 without any extra if statements?
1, 2, 3, 7 in your data list if you only want A to Z?
number < 500 where number is None object <- it will raise TypeError in Python 3, so it looks like you are using 2.*
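For what it's worth, here is a minimal sketch of the ordering asked for in the question: an outer loop over the id and an inner loop over the alpha values, so every id is paired with each letter before the id advances. The helper name build_urls is hypothetical, and the sketch assumes the ids run from 1 to 499 and the alpha values are exactly A to Z. If the site really needs the entries from the original data list, pass that list in instead. It only builds the URLs and does not fetch or parse anything.

```python
import string

BASE_URL = "http://resources.afaqs.com/index.html"


def build_urls(ids, alphas, category="AD+Agencies"):
    """Yield one URL per (id, alpha) pair, going through all alphas for each id.

    build_urls is a hypothetical helper name used only for this illustration.
    """
    for number in ids:        # outer loop: one pass per id
        for alpha in alphas:  # inner loop: every alpha value for that id
            yield "{}?id={}&category={}&alpha={}".format(
                BASE_URL, number, category, alpha
            )


# Assumption: ids 1..499 and alpha values A..Z, as described in the question.
urls = list(build_urls(range(1, 500), string.ascii_uppercase))

# Show the first few URLs to check the ordering.
for url in urls[:3]:
    print(url)
```

Starting the id from range(1, 500) also avoids comparing None with 500, which is the TypeError in Python 3 that the comment mentions.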