Automated Data Profiling using ydata-profiling on Pandas DataFrame

Published: 12 December 2023
on channel: Knowledge Share

#pandasprofiling #python #pandas #dataquality #azuredatabricks #azuredatafactory #azuredataengineer #databricks #dataanalysis

This session covers how to perform data profiling with the ydata-profiling library. The demo uses Jupyter, but the same approach can also be applied in Databricks to data stored in an Azure Storage location.

Link for ydata-profiling page : https://pypi.org/project/ydata-profil...
Link for csv data set : https://www.kaggle.com/datasets/matto...


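Sample Code :
# Run in a terminal, or prefix with % inside a Jupyter cell (%pip install ydata-profiling)
pip install ydata-profiling

import pandas as pd
from ydata_profiling import ProfileReport

# Raw string (r"...") so the backslashes in the Windows path are not treated as escapes
df1 = pd.read_csv(r"D:\Data_Quality\Selected_Online_Sport_Wagering_Data.csv")

# explorative=True enables the more detailed analysis settings
report = ProfileReport(df1, title="Quality_Test", explorative=True)

# Write the profiling report to an HTML file (raw string here as well)
report.to_file(r"D:\Data_Quality\Data_results.html")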
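For larger files, a full explorative report can take a while to build. The sketch below is one possible variation on the sample code, not something shown in the session: it profiles a random sample of rows and passes minimal=True to ProfileReport, which switches off the more expensive computations. The helper names sample_for_profiling and build_minimal_report, the 10,000-row default and the seed are all assumptions made for illustration.

```python
import pandas as pd


def sample_for_profiling(df: pd.DataFrame, max_rows: int = 10_000, seed: int = 0) -> pd.DataFrame:
    """Return df unchanged if it is small enough, else a reproducible random sample.

    Hypothetical helper: max_rows and seed are illustrative defaults.
    """
    if len(df) <= max_rows:
        return df
    # random_state makes the same rows come back on every run
    return df.sample(n=max_rows, random_state=seed)


def build_minimal_report(df: pd.DataFrame, title: str):
    """Build a ProfileReport in minimal mode (hypothetical wrapper)."""
    # Imported here so the sampling helper above works without ydata-profiling installed
    from ydata_profiling import ProfileReport

    # minimal=True disables the costlier parts of the analysis
    return ProfileReport(df, title=title, minimal=True)
```

With the df1 from the sample code, build_minimal_report(sample_for_profiling(df1), title="Quality_Test").to_file(r"D:\Data_Quality\Data_results_minimal.html") would write the lighter report next to the full one; the output file name here is only an example.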



#dataprofiling
#dataengineeringessentials
#dataengineering
#dataengineer
#pandas #pyspark
#KnowledgeShare
#ydata-quality
#dataquality
#python
#automateddataprofiling

