Using While with For loop in url in python

Question

I am trying to scrape some tables from a website. The URL has two parameters that change with every table: an id value and an alpha value. An example URL is as follows:

http://resources.afaqs.com/index.html?id=123&category=AD+Agencies&alpha=A

I want to iterate through id and alpha value. My code so far is as follows:

import csv
import bs4 as bs
import requests


data = ['1','2','3','7','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','W','X','Y','Z']
number = None


while number < 500:
    for i in data:
        url = "http://resources.afaqs.com/index.html?id="
        if number is not None:
            url += str(number) + "&category=AD+Agencies&alpha={}".format(i)
        print(url)

        if number is None:
            number = 1
        else:
            number += 1

This advances the id number (1 to 499) and the alpha value (A to Z) together, so each id is paired with only one alpha value. What I want instead is: for every id, the alpha values should be iterated from A to Z.

I tried rearranging the loops (putting the for loop before the while loop, putting the for loop just before printing the url, and so on), but each of these combinations gives odd results and not the one I want.

Can someone help please?

Why are you setting number to None and then checking for it? Why not just set number to 0? Then you can do a simple if not number check and do your number += 1 without any extra if statements.
– Peter Featherstone, Commented Jun 14, 2017 at 12:29
Why do you have 1, 2, 3, 7 in your data list if you only want A to Z?
– Peter Featherstone, Commented Jun 14, 2017 at 12:32
Evaluating number < 500 when number is None raises a TypeError in Python 3, so it looks like you are using Python 2.
– Azat Ibrakov, Commented Jun 14, 2017 at 12:32
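
To illustrate the last comment, here is a minimal sketch, assuming Python 3, of what happens when None is compared with an integer. The variable names result and message are only for illustration.

```python
number = None
message = ""

try:
    # In Python 3, ordering comparisons between None and int are not defined.
    result = number < 500
except TypeError as exc:
    result = None
    message = str(exc)
    print("TypeError:", message)
```

Under Python 2 the same comparison would not raise, which is why the loop in the question runs at all.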

Błotosmętek · Accepted Answer · 2017-06-14 12:34:38Z

3

Don't use the while loop at all, use nested for:

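url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
# data is the list defined in the question
for number in range(1, 500):
    for i in data:
        # print(...) with parentheses works in both Python 2 and Python 3
        print(url.format(number, i))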
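
For reference, here is a self-contained sketch of the same nested loop that collects the URLs in a list instead of printing them directly. The helper name build_urls, the constant URL_TEMPLATE, and the shortened inputs are assumptions made for illustration and are not part of the original answer.

```python
# Hypothetical helper illustrating the nested-loop approach.
URL_TEMPLATE = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"


def build_urls(ids, alphas):
    """Return one URL per (id, alpha) pair, with alphas cycling fastest."""
    urls = []
    for number in ids:          # outer loop: one pass per id
        for alpha in alphas:    # inner loop: every alpha value for that id
            urls.append(URL_TEMPLATE.format(number, alpha))
    return urls


# Shortened inputs (an assumption) so the output is easy to inspect.
urls = build_urls(range(1, 3), ["A", "B", "C"])
for u in urls:
    print(u)
```

With these inputs, every alpha value is paired with id 1 before id 2 appears, which is the ordering the question asks for.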

2 Comments

Błotosmętek Over a year ago

Oh, and if data contains only one-character entries, you could write it as a string instead: data = '1237ABCDEFGHIJKLMNOPQRSTUVWXYZ'

Peter Featherstone Over a year ago

This feels much more Pythonic, I like

Azat Ibrakov · Accepted Answer · 2017-06-14 12:49:43Z

Assuming we need to iterate through the ids and, for each id, through the uppercase Latin letters, we can write:

from string import ascii_uppercase


def get_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    urls = []
    for number in range(1, number_stop):
        for letter in ascii_uppercase:
            urls.append(url.format(number, letter))
    return urls

Or, using a generator:

from string import ascii_uppercase


def generate_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    for number in range(1, number_stop):
        for letter in ascii_uppercase:
            yield url.format(number, letter)

Or, finally, using a generator with itertools.product to get rid of the nested loop:

from itertools import product
from string import ascii_uppercase


def generate_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    for number, letter in product(range(1, number_stop),
                                  ascii_uppercase):
        yield url.format(number, letter)
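
One possible way to consume such a generator is sketched below. Because generate_urls yields lazily, itertools.islice can take just the first few URLs without building the full list. The generator is repeated so the example runs on its own; the stop value of 500 and the slice size of 3 are assumptions chosen for illustration.

```python
from itertools import islice, product
from string import ascii_uppercase


def generate_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    # product varies the last iterable fastest: (1, 'A'), (1, 'B'), ...
    for number, letter in product(range(1, number_stop), ascii_uppercase):
        yield url.format(number, letter)


# Take only the first three URLs; the remaining ones are never generated.
first_three = list(islice(generate_urls(500), 3))
for u in first_three:
    print(u)

# The total count is (number of ids) * (number of letters).
total = sum(1 for _ in generate_urls(500))
```

In a real scraper, each yielded URL would be passed to requests.get one at a time, so memory use stays flat no matter how many ids are covered.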
