Polars is one of the fastest-rising Python frameworks, and it beats Pandas in many performance tests. This video presents 8 distinct tests that show the difference in run time, in seconds, between Pandas and Polars when running specific functions on data.
I timed the following operations in both frameworks:
Test 1: Read a single CSV file.
Test 2 and 3: Select columns from a loaded dataframe (two approaches).
Test 4: Filter data in a dataframe.
Test 5 and 6: Create a new column (two approaches).
Test 7: Group and aggregate data.
Test 8: Fill missing data.
I ran the comparison in two groups:
1. A group where I did not use Lazy evaluation in Polars.
2. A group where I used Lazy evaluation in Polars.
From a high-level perspective, Polars represents data in memory as Arrow arrays, while Pandas represents data in memory as NumPy arrays. On top of that, Polars offers a Lazy API, which can make it much faster. I mention this multiple times in the video: Polars has both Eager and Lazy APIs, while Pandas offers only Eager.
Video chapters:
0:00 - Intro
1:08 - Introducing experiment Python code
12:05 - Run the experiment
18:04 - Experiment results (summary)
21:22 - Final test results
GitHub repo with the Python code used in this experiment: https://github.com/vb100/polars_vs_pa...
Additional material:
Official Polars documentation: https://www.pola.rs/
Lazy functionality in Polars: https://towardsdatascience.com/unders...
#polars #pandas #experiment
Video: Polars vs Pandas | detailed test with explained results. Added by Data Science Garage on 26 April 2023.