- How to frame the right questions to be answered using data - Mar 18, 2021.
Understanding your data first is a key step before going too far into any data science project. But, you can't fully understand your data until you know the right questions to ask of it.
- Know your data much faster with the new Sweetviz Python library - Mar 12, 2021.
Sweetviz is a new open-source Python library for exploratory data analysis, built for finding out data types, missing information, distributions of values, correlations, and more. Find out more about the library and how to use it here.
- 11 Essential Code Blocks for Complete EDA (Exploratory Data Analysis) - Mar 5, 2021.
This article is a practical guide to exploring any data science project and gaining valuable insights.
- Pandas Profiling: One-Line Magical Code for EDA - Feb 24, 2021.
EDA can be automated using a Python library called Pandas Profiling. Let’s explore Pandas Profiling to do EDA in a very short time and with just a single line of code.
- Powerful Exploratory Data Analysis in just two lines of code - Feb 22, 2021.
EDA is a fundamental early process for any Data Science investigation. Typical approaches for visualization and exploration are powerful, but can be cumbersome for getting to the heart of your data. Now, you can get to know your data much faster with only a few lines of code... and it might even be fun!
- The Best Tool for Data Blending is KNIME - Jan 13, 2021.
These are the lessons and best practices I learned in many years of experience in data blending, and the software that became my most important tool in my day-to-day work.
- 14 Data Science projects to improve your skills - Dec 1, 2020.
There's a lot of data out there and so many data science techniques to master or review. Check out these great project ideas from easy to advanced difficulty levels to develop new skills and strengthen your portfolio.
- Top Python Libraries for Data Science, Data Visualization & Machine Learning - Nov 2, 2020.
This article compiles the 38 top Python libraries for data science, data visualization & machine learning, as best determined by KDnuggets staff.
- Statistical and Visual Exploratory Data Analysis with One Line of Code - Sep 21, 2020.
If EDA is not executed correctly, it can cause us to start modeling with “unclean” data. See how to use Pandas Profiling to perform EDA with a single line of code.
- Bring your Pandas Dataframes to life with D-Tale - Aug 13, 2020.
Bring your Pandas dataframes to life with D-Tale. D-Tale is an open-source solution with which you can visualize, analyze and learn how to code Pandas data structures. In this tutorial you'll learn how to open the grid, build columns, create charts and view code exports.
- First Steps of a Data Science Project - Jul 29, 2020.
Many data science projects are launched with good intentions, but fail to deliver because the correct process is not understood. To achieve good performance and results in this work, the first steps must include clearly defining goals and outcomes, collecting data, and preparing and exploring the data. This is all about solving problems, which requires a systematic process.
- Exploratory Data Analysis on Steroids - Jul 6, 2020.
This is a central aspect of Data Science, which sometimes gets overlooked. The first step of anything you do should be to know your data: understand it, get familiar with it. This concept gets even more important as you increase your data volume: imagine trying to parse through thousands or millions of records and make sense out of them.
- Why Python is One of the Most Preferred Languages for Data Science? - Jan 3, 2020.
Why do most data scientists love Python? Learn more about how so many well-developed Python packages can help you accomplish your crucial data science tasks.
- Exploratory Data Analysis Using Python - Aug 7, 2019.
In this tutorial, you’ll use Python and Pandas to explore a dataset and create visual distributions, identify and eliminate outliers, and uncover correlations between two datasets.
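The steps named in this blurb (dropping outliers, then checking correlation) can be sketched as follows. This is a minimal illustration, not the tutorial's code: it uses only the Python standard library rather than Pandas, the 1.5 × IQR cutoff is an assumed convention, and the paired values are invented toy data.

```python
import math
import statistics

# Hypothetical paired observations; 100 is a deliberate outlier in x.
x = [1, 2, 3, 4, 5, 6, 7, 100]
y = [2, 4, 6, 8, 10, 12, 14, 16]

# Quartiles of x, then the conventional 1.5 * IQR fences (an assumed rule).
q1, _, q3 = statistics.quantiles(x, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the pairs whose x value falls inside the fences.
kept = [(a, b) for a, b in zip(x, y) if low <= a <= high]
kept_x = [a for a, _ in kept]
kept_y = [b for _, b in kept]

# Pearson correlation computed by hand on the remaining pairs.
mean_x = statistics.fmean(kept_x)
mean_y = statistics.fmean(kept_y)
cov = sum((a - mean_x) * (b - mean_y) for a, b in kept)
sxx = sum((a - mean_x) ** 2 for a in kept_x)
syy = sum((b - mean_y) ** 2 for b in kept_y)
r = cov / math.sqrt(sxx * syy)

print(len(kept), r)
```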
- Five Command Line Tools for Data Science - Jul 31, 2019.
You can do more data science than you think from the terminal.
- Fantastic Four of Data Science Project Preparation - Jul 26, 2019.
This article takes a closer look at the four fantastic things we should keep in mind when approaching every new data science project.
- KDnuggets™ News 19:n19, May 15: Data Scientist – Best Job of the Year!; How (not) to use Machine Learning for time series forecasting - May 15, 2019.
"Please, explain." Interpretability of machine learning models; How to fix an Unbalanced Dataset; Data Science Poem; Customer Churn Prediction Using Machine Learning; A Complete Exploratory Data Analysis and Visualization for Text
- Most impactful AI trends of 2018: The rise of ML Engineering - Mar 1, 2019.
As both research and applied teams are doubling down on their engineering and infrastructure needs, the nascent field of ML Engineering will build upon 2018’s foundation and truly blossom in 2019.
- Airbnb Rental Listings Dataset Mining - Jan 28, 2019.
An Exploratory Analysis of Airbnb’s Data to understand the rental landscape in New York City.
- Beginner Data Visualization & Exploration Using Pandas - Oct 22, 2018.
This tutorial will offer a beginner guide into how to get around with Pandas for data wrangling and visualization.
- Top 12 Essential Command Line Tools for Data Scientists - Mar 21, 2018.
This post is a short introductory overview of 12 Unix-like operating system command line tools of value to data science tasks, and the data scientists who perform them.
- Applied Data Science: Solving a Predictive Maintenance Business Problem Part 2 - Feb 20, 2018.
In this post, we further discuss how exploratory analysis can be used to gain insights for feature engineering.
- Data Science at the Command Line: Exploring Data - Feb 14, 2018.
See what's available in the freely available book "Data Science at the Command Line" by digging into data exploration in the terminal.
- Next Generation Data Manipulation with R and dplyr - Aug 31, 2017.
The idea behind the dplyr package is to do one thing at a time. dplyr has a separate function for every task, which makes its implementation crisp and easy to understand.
- Exploratory Data Analysis in Python - Jul 7, 2017.
We view EDA very much like a tree: there is a basic series of steps you perform every time you perform EDA (the main trunk of the tree) but at each step, observations will lead you down other avenues (branches) of exploration by raising questions you want to answer or hypotheses you want to test.
- 5 Machine Learning Projects You Can No Longer Overlook, May - May 10, 2017.
In this month's installment of Machine Learning Projects You Can No Longer Overlook, we find some data preparation and exploration tools, a (the?) reinforcement learning "framework," a new automated machine learning library, and yet another distributed deep learning library.
- The Value of Exploratory Data Analysis - Apr 20, 2017.
In this post, we will give a high-level overview of what exploratory data analysis (EDA) typically entails and then describe three of the major ways EDA is critical to successfully modeling data and interpreting the results.
- 5 Machine Learning Projects You Can No Longer Overlook, April - Apr 13, 2017.
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out. Find tools for data exploration, topic modeling, high-level APIs, and feature selection herein.
- Top Data Scientist Daniel Tunkelang on Data Recycling - Nov 22, 2016.
Respected Data Scientist Daniel Tunkelang shares some insight into data recycling, using data from other contexts to bootstrap your initial statistical models until you can collect live data.
- 5 Steps for Advanced Data Analysis using Visualization - Oct 28, 2016.
In most scientific research, because of the large amount of experimental data, statistical analysis is typically done by technical experts in computing and statistics. Unfortunately, these experts are not experts in the underlying research, which may cause gaps in the analysis. Giving the actual researchers easy-to-use tools and methods to handle and analyse data will enrich the research outcome.
- Emory University: Tenure-Track Faculty Position in Data Exploration - Sep 26, 2016.
We are particularly interested in applicants with expertise in interactive data exploration, broadly construed, which includes data mining, analytics, visualization, human-computer interaction, and summarization.
- Caravel: Airbnb’s data exploration platform - Apr 13, 2016.
For data exploration, discovery, and collaborative analytics, Airbnb has built and open-sourced a data exploration and dashboarding platform named Caravel. It allows data exploration through rich visualizations while performing fast and intuitive “slicing and dicing” of your dataset.
- Change in Perspective with Process Mining - Feb 9, 2016.
Process mining is focused on the analysis of processes, and is an excellent tool in particular for the exploratory analysis of process-related data. Understand how to use it effectively as an exploratory analysis tool, which can rapidly and flexibly take different perspectives on your processes.
- Improve your processes with statistical models - Jan 7, 2016.
Through real-world case studies, this technical primer will help you: find best practices to interactively explore the patterns in your data, build useful statistical models, and visually interact with these models.
- Beyond One-Hot: an exploration of categorical variables - Dec 8, 2015.
Categorical variables are often coded as numbers for machine learning algorithms by assigning an integer to each category (ordinal coding). Here, we explore different ways of converting a categorical variable and their effects on the dimensionality of data.
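To make the dimensionality point concrete, here is a minimal sketch contrasting ordinal coding with one-hot coding in plain Python. The `colors` column and its values are invented for illustration and are not taken from the article.

```python
# Hypothetical toy column; the category values are invented for illustration.
colors = ["red", "green", "blue", "green", "red"]

# Ordinal coding: one integer per category, so the data stays one column wide.
categories = sorted(set(colors))
ordinal_index = {category: i for i, category in enumerate(categories)}
ordinal_codes = [ordinal_index[c] for c in colors]

# One-hot coding: one 0/1 column per category, so width grows with the
# number of distinct categories.
one_hot_rows = [
    [1 if c == category else 0 for category in categories]
    for c in colors
]

print(ordinal_codes)         # a single column of integers
print(len(one_hot_rows[0]))  # number of columns after one-hot coding
```

Ordinal coding keeps the width at one column but imposes an arbitrary order on the categories, while one-hot coding avoids that order at the cost of one column per category.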
- Business Analytics Webinars: Practical Training, 7 Live Sessions, Dec 10 - Dec 3, 2015.
Stay ahead of the curve in business analytics with our Dec 10 Webinar Marathon, and watch industry experts deliver 7 back-to-back sessions on hot topics. Register now.
- Improve your processes with statistical models - Nov 3, 2015.
Get a technical primer with best practices to interactively explore the patterns in your data, build useful statistical models of these patterns, and visually interact with these models.
- INFORMS Courses: Essential Practice Skills, Data Exploration and Visualization, November, Baltimore - Oct 5, 2015.
Two INFORMS courses teach Essential Practice Skills for High-Impact Analytics Projects (Nov 18-19) and Data Exploration & Visualization (Nov 10-11). Both courses are given at Johns Hopkins University, Baltimore, MD.
- Webcast: Tech expert Phil Simon on exploring data - Jun 17, 2015.
Phil Simon, award-winning author, talks about how data visualization can help improve data quality, promoting the exploratory mindset, telling good stories with data, and more. On demand webcast.
- Statwing, Modern Data Analysis Software - Jan 30, 2014.
Every decision maker in the organization needs to be capable of analyzing data, but most tools require a lot of mundane and time-consuming data cleaning. Statwing solves that problem and lets you focus on data analysis.