In this video, we go through how to scrape data from JavaScript-rendered websites using Scrapy Playwright.
We cover:
How To Install Scrapy Playwright
How To Use Scrapy Playwright In Your Spiders
How To Wait For The Page To Load
How To Scrape Multiple Pages
How To Scroll The Page Elements With Scrapy Playwright
How To Take Screenshots With Scrapy Playwright
The codebase we use in this video can be cloned from here:
https://github.com/python-scrapy-play...
The article you can read while following this video can be found here: https://scrapeops.io/python-scrapy-pl...
***** RUNNING ON WINDOWS! ******
As of writing this guide, Scrapy Playwright doesn't work with Windows. However, it is possible to run it with WSL (Windows Subsystem for Linux).
This is a good video tutorial to check out if you need to install WSL on your Windows machine:
• Linux Terminal & GUI Inside of Window...
00:00 - Intro
01:10 - Cloning & Installing the Scrapy Project
02:18 - Adding the settings needed for Scrapy Playwright
02:52 - Adding the scraping code to our Spider
08:50 - Waiting for specific page elements (CSS selectors) to finish loading
11:54 - Scraping multiple pages by clicking through to the next page
15:47 - Scraping a page with infinite scroll using Playwright
18:11 - Taking a screenshot while scraping with Playwright
20:13 - Outro
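For reference, the install and settings steps from the chapters above usually come down to something like the sketch below. The two settings are the ones documented by the scrapy-playwright project; the playwright_meta helper is hypothetical, added here only to show the meta keys a request needs, and the exact code used in the video may differ.

```python
# settings.py (sketch). Install first, from a shell:
#   pip install scrapy-playwright
#   playwright install

# Route HTTP and HTTPS downloads through the Playwright download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright needs Twisted's asyncio-based reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"


def playwright_meta(include_page=False):
    """Build the Request.meta dict that opts one request into Playwright.

    Hypothetical helper for illustration; not part of scrapy-playwright.
    """
    meta = {"playwright": True}
    if include_page:
        # Makes the Playwright page available to the callback as
        # response.meta["playwright_page"] (needed for clicks, scrolling
        # and screenshots).
        meta["playwright_include_page"] = True
    return meta
```

In a spider, this would be passed as scrapy.Request(url, meta=playwright_meta()), with include_page=True for the pagination, infinite scroll and screenshot steps.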
Video: Scrapy-Playwright: How To Scrape Dynamic JS Websites (2022). Added by ScrapeOps on 9 September 2022; viewed 22,936 times and liked by 309 people on this site.