I want to scrape https://www.narendramodi.in/category/text-speeches. Because the page loads its content dynamically, I need to scroll to the bottom of the page before getting the HTML to scrape. But when the website is opened through the Selenium Chrome WebDriver, no new content loads as I scroll, whether I scroll manually or from the script. When the website is opened in normal Chrome, it works just fine. I also tried the Firefox driver, with the same result. Here's the code I have tried:

import time

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome(executable_path=r'C:/tools/drivers/chromedriver.exe')
driver.get('https://www.narendramodi.in/news')
# Scroll loop adapted from https://stackoverflow.com/a/27760083

SCROLL_PAUSE_TIME = 2.0
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
print(last_height)

while True:
    # Give the page time to settle before scrolling
    time.sleep(SCROLL_PAUSE_TIME)

    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    print(new_height)
    if new_height == last_height:
        break
    last_height = new_height


res = driver.execute_script("return document.documentElement.outerHTML")
driver.quit()

soup = BeautifulSoup(res, 'lxml')

How can I scrape this entire page?

  • Can you not just scrape the API that populates the page rather than using selenium? Commented Feb 2, 2020 at 15:43
  • It looks like infinite scroll; you can refer to the following link: stackoverflow.com/questions/59838948/… Commented Feb 6, 2020 at 10:58

1 Answer

Some websites detect the use of Selenium and stop loading their content. You can try tuning the Selenium settings or using a package like selenium-stealth (PyPI link: https://pypi.org/project/selenium-stealth/).
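As a rough illustration of what "tuning Selenium settings" could look like, here is a minimal sketch that starts Chrome with a few commonly used switches that hide the most obvious automation markers. The names `CHROME_ARGUMENTS`, `EXPERIMENTAL_OPTIONS`, and `build_driver` are made up for this example, and it is an assumption that these particular settings are what this site checks for; they may not be enough on their own.

```python
# Settings often used to make a Selenium-driven Chrome look less automated.
# Assumption: the site reacts to these markers; this is not verified.
CHROME_ARGUMENTS = [
    # Stops Blink from exposing the "AutomationControlled" feature,
    # which is what sets navigator.webdriver to true.
    "--disable-blink-features=AutomationControlled",
]

EXPERIMENTAL_OPTIONS = {
    # Removes the "enable-automation" switch Chrome is normally started with.
    "excludeSwitches": ["enable-automation"],
    # Prevents ChromeDriver from loading its automation extension.
    "useAutomationExtension": False,
}


def build_driver():
    """Return a Chrome WebDriver configured with the settings above.

    Hypothetical helper for illustration only.
    """
    # Imported here so the settings can be inspected without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in CHROME_ARGUMENTS:
        options.add_argument(argument)
    for name, value in EXPERIMENTAL_OPTIONS.items():
        options.add_experimental_option(name, value)
    return webdriver.Chrome(options=options)
```

To try it, replace the `webdriver.Chrome(...)` line in the question with `driver = build_driver()` and leave the scroll loop unchanged. If the page still does not load more content, selenium-stealth is the next thing to try.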
