I am new to Selenium and am trying to scrape data (just names for now) from the bourbon product cards on thewhiskyexchange.com. I have tested all of my CSS (and XPath) selectors in scrapy shell, so I know they are correct, but the output is coded information about the "session" and the element that I do not understand. The number of items in the list seems to be correct, so maybe Selenium is doing exactly what it is supposed to do and I just don't know how to convert the output into something I can use. How do I get just the names from the product cards?

I have tried both the driver-specific and the generic locator functions Selenium offers, with the same results. Beautiful Soup functions return the data I need, but that method is too inefficient for the scope of the project I am working on. Any insight into how I can fix this would be greatly appreciated.

IN[]:
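import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager

chrome_options = Options()
chrome_options.add_argument("--incognito")
chrome_options.add_argument("--window-size=1920,1080")  # Chrome expects width,height
chrome_options.binary_location = r"C:\Program Files\Google\Chrome\Application\chrome.exe"  # raw string keeps the backslashes literal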

IN[]:
driver = webdriver.Chrome(ChromeDriverManager().install(), options=chrome_options)  # pass the options defined above

IN[]:
url = "https://www.thewhiskyexchange.com/c/639/bourbon-whiskey"
driver.get(url)
time.sleep(5) # 5-second delay so the page can finish rendering
html = driver.page_source
html # page source (an HTML string) is as expected

IN[]:
els = driver.find_elements_by_css_selector('p.product-card__name')
# equivalent By-locator form: els = driver.find_elements(By.CSS_SELECTOR, 'p.product-card__name')
els

OUT[]:
[<selenium.webdriver.remote.webelement.WebElement (session="e521768d8df1dd788b1fda816299b0b5", element="b9384a19-f8c9-46b2-be99-780200dcba99")>,
 <selenium.webdriver.remote.webelement.WebElement (session="e521768d8df1dd788b1fda816299b0b5", element="af76dfa8-b86c-426a-8ad8-30ea904ed11b")>,
 <selenium.webdriver.remote.webelement.WebElement (session="e521768d8df1dd788b1fda816299b0b5", element="58b14e5a-6bc3-443a-807f-ec696e83b096")>, ...

1 Answer

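find_elements returns a list of web elements, whereas find_element returns a single web element. The output you are seeing is the default representation of those WebElement objects, not an error; the name itself is available through each element's text attribute.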

You can iterate over the list and extract the text like this:

IN[]:
els = driver.find_elements(By.CSS_SELECTOR, 'p.product-card__name')
for e in els:
    print(e.text)

Also note that find_elements_by_css_selector is deprecated in Selenium 4 and was removed entirely in Selenium 4.3.0, so use find_elements(By.CSS_SELECTOR, "") instead, with By imported from selenium.webdriver.common.by.
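If you want the names in a list rather than printed, something along these lines may help. It is only a sketch: extract_names and scrape_product_names are hypothetical helper names (not part of Selenium), the 10-second timeout is an arbitrary assumption, and the explicit wait is swapped in for the time.sleep(5) from the question rather than being something the question used.

```python
from typing import Iterable, List


def extract_names(elements: Iterable) -> List[str]:
    """Return the stripped, non-empty .text of each element (hypothetical helper)."""
    names = []
    for element in elements:
        text = element.text.strip()
        if text:  # skip elements that have no visible text
            names.append(text)
    return names


def scrape_product_names(driver, timeout: float = 10) -> List[str]:
    """Wait for the product-name elements, then return their text (hypothetical helper)."""
    # Imported here so extract_names stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Explicit wait: returns as soon as at least one matching element is present,
    # and raises TimeoutException if none appears within `timeout` seconds.
    els = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "p.product-card__name"))
    )
    return extract_names(els)
```

After driver.get(url), calling names = scrape_product_names(driver) should give a plain list of strings that can be written to a file or loaded into a DataFrame.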

1 Comment

Thanks for explaining this. Saved me some pain when updating an old project.
