In this video, we discuss exploratory data analysis (EDA) with the pandas-profiling package and run a quick-and-dirty exploration of the 'North Carolina Birth' data set using just one line of code.
This package generates profile reports from a pandas DataFrame. pandas_profiling extends the DataFrame with df.profile_report() for quick data analysis.
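As a rough illustration of the one-line workflow, here is a minimal sketch. It assumes pandas and pandas_profiling are installed; the function name build_profile, the report title, and the output file name are hypothetical and not taken from the video.

```python
def build_profile(csv_path, html_path="profile_report.html"):
    """Read a CSV file and write a pandas-profiling HTML report (sketch)."""
    # Imported lazily because both packages are optional third-party installs.
    import pandas as pd
    import pandas_profiling  # noqa: F401  (importing adds df.profile_report())

    df = pd.read_csv(csv_path)

    # The "one line": build the profile report from the DataFrame.
    report = df.profile_report(title="Birth data profile")  # title is an assumption

    # Save the interactive report as a standalone HTML file.
    report.to_file(html_path)
    return html_path


# Example call (the file name is hypothetical):
# build_profile("ncbirths.csv")
```

Opening the resulting HTML file in a browser shows the per-column statistics described in this description.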
👉 For each column the following statistics - if relevant for the column type - are presented in an interactive HTML report:
• Type inference: detect the types of columns in a data frame.
• Essentials: type, unique values, missing values
• Descriptive statistics: mean, mode, standard deviation, sum, median absolute deviation, minimum, Q1, median, Q3, maximum, range, interquartile range, coefficient of variation, kurtosis, skewness
• Most frequent values
• Histogram
• Correlations: highlighting of highly correlated variables; Spearman, Pearson, and Kendall matrices
• Missing values: matrix, count, heatmap, and dendrogram
• Text analysis: learn about categories (Uppercase, Space), scripts (Latin, Cyrillic), and blocks (ASCII) of text data
• File and image analysis: extract file sizes, creation dates, and dimensions, and scan for truncated images or those containing EXIF information
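To make a few of the descriptive statistics in the list concrete, here is a small sketch that computes them by hand with Python's standard statistics module. It is not how the package computes them; the helper name describe and the sample values are made up for illustration, and the package's own conventions (for example, quantile interpolation) may differ.

```python
import statistics


def describe(values):
    """Return several summary statistics for a numeric column (sketch)."""
    data = sorted(values)
    mean = statistics.mean(data)
    median = statistics.median(data)
    std = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    # Quartiles, with the minimum and maximum treated as the 0th and 100th percentiles.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "iqr": q3 - q1,  # interquartile range
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation; assumes a non-zero mean
        # Median absolute deviation: median distance from the median.
        "mad": statistics.median(abs(x - median) for x in data),
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up values
stats = describe(sample)
print(stats)
```

Statistics such as skewness, kurtosis, and the correlation matrices are left out here to keep the sketch short.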
Exploratory data analysis is an important, iterative, and time-consuming process. Some auto-EDA libraries can save you time by quickly generating the necessary plots with just one or two lines of code. This speeds up the work of understanding missing values, distributions, and the relationships between variables.
Other popular libraries for automating data visualization tasks are sweetviz and autoviz. For R users, the DataExplorer package is a good option.
👉 autoviz Tutorial: [ Link ]
👉 Sweetviz Tutorial: [ Link ]
👉 More on Data Visualization: [ Link ]
👉 Subscribe to my channel for more videos on statistics, data science, machine learning, exploratory data analysis, data preparation, and tools like Excel, R, and Python at [ Link ]
👉 For quick posts and tips, follow me on [ Link ]
Follow me on SlideShare: [ Link ]
#dataviz #datastorytelling #visualization