Control a web browser from R to scrape static and dynamic websites using {chromote}

Published: 29 October 2024
on channel: TheCoatlessProfessor
Views: 285
Likes: 20

Learn to scrape both static websites and dynamic, JavaScript-driven websites using R and Chromote! Perfect for data scientists working with modern web applications.

In this tutorial, you'll learn:
How to use the Chromote R package with the Chrome DevTools Protocol
Scraping JavaScript-generated content
Handling dynamic page elements
Converting web data to R dataframes
Cleaning and analyzing scraped data
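
The steps listed above can be sketched roughly as follows. This is a minimal sketch rather than the code from the video: it assumes the chromote package is installed and a Chrome or Chromium-based browser is available. The helper name `fetch_rendered_html`, the fixed pause, and the selector passed to it are illustrative assumptions.

```r
# Hypothetical helper (not from the video): return the rendered HTML of the
# first element matching a CSS selector, after the page has had time to load.
fetch_rendered_html <- function(url, selector, wait_seconds = 2) {
  session <- chromote::ChromoteSession$new()  # opens a new tab in headless Chrome
  on.exit(session$close(), add = TRUE)        # close the tab even if an error occurs

  session$Page$navigate(url)
  Sys.sleep(wait_seconds)  # crude pause so JavaScript-generated content can render

  # Resolve the selector through the DevTools Protocol DOM domain
  doc <- session$DOM$getDocument()
  node <- session$DOM$querySelector(
    nodeId = doc$root$nodeId,
    selector = selector
  )
  if (node$nodeId == 0) {
    stop("No element matches selector: ", selector)
  }

  session$DOM$getOuterHTML(nodeId = node$nodeId)$outerHTML
}
```

Under the same assumptions, a call such as `fetch_rendered_html("https://www.r-project.org/", "h1")` would return an HTML string that can then be handed to rvest for parsing. A fixed pause is the simplest way to wait for a page; it is not necessarily the most robust one.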

Projects Covered:
1. Basic scraping with R-project.org
2. Advanced weather data scraping from Windy.com

Required R Packages 📦 :
chromote
rvest
lubridate
dplyr

Required Software:
Chrome or a Chromium-based browser

Timeline:
0:00 - Introduction to Chromote and dynamic web scraping
0:42 - Overview of the tutorial structure
1:31 - Initial setup and browser demonstration
2:16 - Installing required R packages (chromote, rvest, lubridate)
2:47 - Launching a Chrome browser session from R
3:37 - Getting Chrome version information
4:15 - Navigating to the r-project.org website programmatically
4:48 - Using Sys.sleep() to wait for pages to load
5:28 - Introduction to Chrome DevTools for element inspection
6:22 - Using CSS selectors to identify page elements
7:35 - Highlighting and capturing page elements
8:42 - Extracting HTML content with Chromote
10:04 - Processing HTML with rvest
11:22 - Advanced example: Scraping Windy.com
12:27 - Handling dynamic search functionality
13:44 - Programmatically entering search queries
14:35 - Interacting with search results
15:23 - Extracting weather forecast data
16:28 - Converting HTML tables to R dataframes
19:13 - Cleaning and processing weather data
20:29 - Analyzing the extracted weather data
21:19 - Cleaning up browser sessions
21:49 - Conclusion
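
The table-conversion and cleaning steps in the timeline might look something like the sketch below. It assumes rvest is installed. The HTML string is a made-up stand-in for markup extracted with Chromote; it is not real Windy.com markup, and the column names and values are purely illustrative.

```r
library(rvest)

# Made-up HTML standing in for content extracted from the browser session.
forecast_html <- "
<table>
  <tr><th>hour</th><th>temp</th><th>wind</th></tr>
  <tr><td>09:00</td><td>12 C</td><td>8 kt</td></tr>
  <tr><td>12:00</td><td>15 C</td><td>11 kt</td></tr>
</table>"

# Parse the HTML string and convert the first <table> into a data frame (tibble)
page <- minimal_html(forecast_html)
forecast <- html_table(html_element(page, "table"))

# Strip unit suffixes so the measurement columns become numeric
forecast$temp <- as.numeric(gsub("[^0-9.-]", "", forecast$temp))
forecast$wind <- as.numeric(gsub("[^0-9.-]", "", forecast$wind))

print(forecast)
```

In a real session the HTML would come from the browser rather than a literal string, and packages such as dplyr and lubridate could take over the cleaning and date handling that base R does here.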

💻 Code & Resources:
Blog post: https://blog.thecoatlessprofessor.com...
GitHub Repository: https://github.com/coatless-videos/ch...

🔗 Connect with me:
GitHub: https://github.com/coatless
Website: https://thecoatlessprofessor.com
LinkedIn: / jamesbalamuta
BlueSky: https://bsky.app/profile/coatless.bsk...
Mastodon: https://mastodon.social/@coatless
Twitter/X: / axiomsofxyz

#Rstats #DataScience #WebScraping #Programming #DataAnalysis #Tutorial

❓ Have questions? Leave them in the comments below!

