I am trying to scrape a date from each of a series of URLs listed in a CSV file and then write the dates to a new CSV.
I have the basic Python code working, but I can't figure out how to load the URLs from the CSV (instead of pulling them from a hard-coded list), scrape each URL, and then write the results to a new CSV. From reading a couple of posts, I think I want to use Python's csv module, but I can't get it working.
Here is my code for the scraping part:
import urllib
import re

exampleurls = ["http://www.domain1.com", "http://www.domain2.com", "http://www.domain3.com"]

i = 0
while i < len(exampleurls):
    url = exampleurls[i]
    # download the page
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    # match text such as "on 01.02.13"
    regex = r'on [0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]'
    pattern = re.compile(regex)
    date = re.findall(pattern, htmltext)
    print date
    i += 1
Any help is much appreciated!
import csv
Then you can try writing some code and post it here.
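To give you something to start from, here is a rough sketch of how the csv module could tie the pieces together. It is written for Python 3, so it uses urllib.request.urlopen where your code uses urllib.urlopen. I am assuming the URLs sit in the first column of the input file with no header row. The file names urls.csv and dates.csv and the helper names fetch_html, extract_dates and scrape_urls are made up for illustration, so adjust them to your data.

```python
import csv
import re
import urllib.request

# Same pattern as in the question: "on " followed by NN.NN.NN
DATE_PATTERN = re.compile(r'on [0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]')


def fetch_html(url):
    """Download a page and return its text (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        # Assumption: pages are roughly UTF-8; undecodable bytes are replaced
        return response.read().decode('utf-8', errors='replace')


def extract_dates(html):
    """Return every substring of html that matches DATE_PATTERN."""
    return DATE_PATTERN.findall(html)


def scrape_urls(in_path, out_path, fetch=fetch_html):
    """Read URLs from the first column of in_path and write
    one row per URL (url, dates found) to out_path."""
    with open(in_path, newline='') as infile, \
            open(out_path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        for row in csv.reader(infile):
            if not row or not row[0].strip():
                continue  # skip blank lines
            url = row[0].strip()
            dates = extract_dates(fetch(url))
            # All dates found on a page are joined into a single cell
            writer.writerow([url, '; '.join(dates)])
```

Calling scrape_urls('urls.csv', 'dates.csv') would then produce one output row per URL. The fetch parameter is only there so you can swap in a fake downloader while testing the CSV handling without hitting the network. There is no error handling for pages that fail to load, so you may want to wrap the fetch call in a try/except.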