A Lightning Fast Look at Single Line Exploratory Data Analysis


Here's a very quick look at how you can perform EDA with a single line of code using D-Tale.



By Harsha Mandala, Student at JNTUH College of Engineering Jagityala

Exploratory Data Analysis, or EDA, is an important step in inspecting data methodically. It is the process of investigating datasets to discover patterns or to find their main characteristics, often with visual methods.

 

Performing EDA using a single line of code

 
Here's how you can perform EDA with a single line of code using D-Tale.

  • Find the D-Tale package, dtale, on PyPI.
  • Copy the command 'pip install dtale' and make sure you get the latest version.
  • Paste the command into an Anaconda prompt or any command prompt where Python is installed, and press Enter.

In Jupyter Notebook (or any Python notebook), import Seaborn, load a dataset, and pass the resulting DataFrame to dtale.show(). Datasets like 'iris', 'titanic', and 'Sample-Superstore' can be tedious to analyse by hand. With D-Tale, actions like plotting different charts, opening the Network Viewer, computing the Predictive Power Score, and viewing correlations can all be done without any coding.
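
As a minimal sketch of that notebook step, assuming pandas and D-Tale are installed: the helper name open_in_dtale and the small hand-made DataFrame are illustrative and not part of the original article, while dtale.show is the call that opens the interactive grid.

```python
import pandas as pd


def open_in_dtale(df: pd.DataFrame):
    """Start a D-Tale session for df and return the handle dtale.show gives back."""
    # Imported lazily so the rest of the snippet runs even without D-Tale installed.
    import dtale

    return dtale.show(df)  # the single line that launches the interactive grid


# A tiny hand-made frame (illustrative values, not a real dataset) so the
# snippet needs no download. In a notebook you could instead call
# seaborn.load_dataset("iris"), which fetches the data over the network.
df = pd.DataFrame(
    {
        "species": ["a", "a", "b", "b"],
        "length": [5.1, 4.9, 6.3, 6.5],
        "width": [3.5, 3.0, 3.3, 2.8],
    }
)

# In a notebook cell, run:
# session = open_in_dtale(df)
# session  # displaying the returned object should show the grid in the output cell
```

The launch call is left commented out so the cell can be run before D-Tale is installed; uncomment the last two lines once 'pip install dtale' has finished.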


 

Describing data

 
All the basic statistical details, such as minimum value, maximum value, standard deviation (std), mean (average) value, and median, are available for each column and can be viewed category-wise.
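
For comparison, here is a rough pandas equivalent of a category-wise description. It is a sketch on made-up values, not the code D-Tale itself generates, and the column names species and length are hypothetical.

```python
import pandas as pd

# Made-up values chosen so the statistics are easy to check by hand.
df = pd.DataFrame(
    {
        "species": ["a", "a", "b", "b"],
        "length": [5.0, 7.0, 6.0, 10.0],
    }
)

# One row per category, one column per statistic.
summary = df.groupby("species")["length"].agg(["min", "max", "mean", "median", "std"])
print(summary)
```

Note that pandas reports the sample standard deviation (dividing by n - 1) by default.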

 

Click on 'Code Export' to copy and paste the code

 
For each analysis you run, D-Tale generates the corresponding code and displays it on the page according to the method used. The code can be copied and pasted into the notebook, and it even contains comments, which makes it more user friendly.

 

Plotting charts

 
Depending on the rows and columns (variables) you select, every type of chart is available with just a couple of clicks.

 

Correlations

 
You can display the correlation values of different variables, and plot charts of those values, directly after selecting the variables.
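
The same kind of correlation matrix can be reproduced by hand in pandas. This is a sketch on made-up columns (x, y, z and label are hypothetical names), not the code D-Tale exports.

```python
import pandas as pd

# y rises in step with x and z falls in step with x, so the expected
# correlations are +1 and -1 respectively.
df = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [2.0, 4.0, 6.0, 8.0],
        "z": [4.0, 3.0, 2.0, 1.0],
        "label": ["a", "b", "a", "b"],
    }
)

# Keep only numeric columns before computing pairwise Pearson correlations.
corr = df.select_dtypes(include="number").corr()
print(corr)
```

Dropping non-numeric columns first keeps the call portable across pandas versions, which differ in how corr() treats text columns.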

Packages like dataprep, dataproc, and dtale can be used to speed up data science work. They reduce the entire EDA process to a minimal number of lines of code.

 
Bio: Harsha Mandala is a student at JNTUH College of Engineering Jagityala.

Original. Reposted with permission.
