4

I am trying to insert a string into the middle of a URL, but my output looks like this:

http://www.Holiday.com/('Woman',)/Beach
http://www.Holiday.com/('Men',)/Beach

It should look like this:

http://www.Holiday.com/Woman/Beach
http://www.Holiday.com/Men/Beach

The code which I am using looks like the following:

list = {'Woman','Men'}
url_test = 'http://www.Holiday.com/{}/Beach'
for i in zip(list):
    url = url_test.format(str(i))
    print(url)
  • And what do you see printed? Commented Feb 8, 2017 at 19:06
  • And please don't use list as a variable name. Commented Feb 8, 2017 at 19:07

3 Answers

6

Almost there. Just no need for zip:

items = {'Woman','Men'} # notice that this is a `set` and not a list
url_test = 'http://www.Holiday.com/{}/Beach'
for i in items:
    url = url_test.format(i)
    print(url)

The purpose of the zip function is to join several collections by the index of each item. When zip joins the values from each collection, it places them in a tuple, whose __str__ representation is exactly what you got. Here you just want to iterate over the items in the collection.
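To make the tuple behaviour concrete, here is a minimal sketch. It uses a list instead of a set so the order is predictable, and the `places` collection is a made-up second input that is not part of the question.

```python
items = ['Woman', 'Men']  # a list here, so the iteration order is predictable
url_test = 'http://www.Holiday.com/{}/Beach'

# zip over a single collection wraps every item in a 1-tuple
singles = list(zip(items))

# formatting the str() of a 1-tuple reproduces the unwanted output
broken = [url_test.format(str(t)) for t in singles]

# iterating the collection directly yields the plain strings
fixed = [url_test.format(item) for item in items]

# zip is meant for pairing items from several collections by position
places = ['Beach', 'City']  # hypothetical second collection
paired = list(zip(items, places))

print(singles)
print(broken)
print(fixed)
print(paired)
```

The `broken` list contains the `('Woman',)` style URLs from the question, while `fixed` contains the intended ones.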


3 Comments

Worth mentioning that list isn't a good variable name and that the str call (which you removed) was unnecessary.
@brianpck - true :) especially since it is actually a set
@GiladGreen, thanks a lot, this is what I was looking for. Somehow I misunderstood the use of zip.
2

You can also try this. And please don't use list as a variable name.

lst = {'Woman', 'Men'}
url_test = 'http://www.Holiday.com/%s/Beach'
for i in lst:
    url = url_test % i  # old-style % formatting inserts i at %s
    print(url)


0
import re  # needed for re.compile below
from urllib.request import urlopen
from bs4 import BeautifulSoup as BS
url = "https://www.imdb.com/chart/top?ref_=nv_mv_250"
html = urlopen(url)

url_list = BS(html, 'lxml')
type(url_list)

all_links = url_list.find_all('a', href=re.compile("/title/tt"))
for link in all_links:
  print(link.get("href"))
all_urls = link.get("href")

url_test = 'http://www.imdb.com/{}/'
for i in all_urls:
    urls = url_test.format(i)
    print(urls)

This is the code to scrape the URLs of all 250 movies from the main URL, but it gives the following result:

http://www.imdb.com///
http://www.imdb.com/t/
http://www.imdb.com/i/
http://www.imdb.com/t/
http://www.imdb.com/l/
http://www.imdb.com/e/
http://www.imdb.com///
and so on ...

How can I split 'all_urls' using a comma, or how can I make a list of URLs in 'all_urls'?
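The output above is consistent with `all_urls` holding a single string (the last `href` seen in the loop), so `for i in all_urls` walks over its characters. A minimal sketch of one way to collect a list instead, with no network access: the `hrefs` values are illustrative stand-ins for what `link.get("href")` might return, and the template drops the extra slashes on the assumption that each href already starts and ends with `/`.

```python
# hypothetical hrefs standing in for the link.get("href") results
hrefs = ['/title/tt0111161/', '/title/tt0068646/']

# what the original code effectively does: keep only the last string,
# then iterate over it character by character
last_href = hrefs[-1]
chars = list(last_href)[:3]

# collect every href in a list instead of overwriting one variable
all_urls = []
for href in hrefs:
    all_urls.append(href)

# assumed template: the hrefs already carry the leading and trailing slash
base = 'http://www.imdb.com{}'
full_urls = [base.format(href) for href in all_urls]

print(chars)
for url in full_urls:
    print(url)
```

In the scraping code, the same idea would mean appending `link.get("href")` to a list inside the `for link in all_links` loop rather than assigning it once after the loop.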
