- 20 Core Data Science Concepts for Beginners - Dec 8, 2020.
With so much to learn and so many advancements to follow in the field of data science, there is a core set of foundational concepts that remain essential. Twenty of these ideas are highlighted here; they are key to review when preparing for a job interview or just refreshing your appreciation of the basics.
- 7 Steps to Mastering Data Preparation for Machine Learning with Python — 2019 Edition - Jun 24, 2019.
Interested in mastering data preparation with Python? Follow these 7 steps, which cover the concepts, the individual tasks, and different approaches to tackling the entire process from within the Python ecosystem.
- Text Wrangling & Pre-processing: A Practitioner’s Guide to NLP - Aug 3, 2018.
I will highlight some of the most important steps used heavily in Natural Language Processing (NLP) pipelines, which I frequently use in my own NLP projects.
- Data Retrieval and Cleaning: Tracking Migratory Patterns - Jul 3, 2018.
In this post, we walk through investigating, retrieving, and cleaning a real-world data set. We will also describe the cost benefits and necessary tools involved in building your own data sets.
- 7 Steps to Mastering Data Preparation with Python - Jun 2, 2017.
Follow these 7 steps for mastering data preparation, covering the concepts, the individual tasks, and different approaches to tackling the entire process from within the Python ecosystem.
- Trifacta – Wrangling US Flight Data, part 2 - May 22, 2015.
This post shows how to use Trifacta to clean the data and enrich it with airport geo-locations and airline names, including filling missing values and doing a lookup from another dataset. We also learn which is the best airline at O’Hare airport.
- Trifacta – Wrangling US Flight Data - May 12, 2015.
A useful case study shows how Trifacta can clean and analyze US Flight data, including cleaning up markup, removing unrelated and redundant columns, cleaning geographic names and more.
- Baby Boom: Udemy Excel Tutorial on Analyzing Large Data Sets - Apr 15, 2015.
This tutorial not only shows how to use Excel Pivot Tables and Graphs, but also teaches the mindset needed in exploratory data analysis: look beneath the surface, consider the non-obvious interpretations, and question everything (including the data).
- Top KDnuggets tweets, Apr 2-5: The Data Science ecosystem: Data wrangling useful tools and tips - Apr 6, 2015.
The #datascience ecosystem part 2: Data wrangling useful tools and tips; 10 R Packages to Win Kaggle Competitions; Forrester Wave #BigData Predictive #Analytics Solutions 2015, gainers, losers; How Microsoft uses Big Data to predict traffic jams in advance.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 10 and beyond - Mar 9, 2015.
Data Wrangling and the Art of Big Data Discovery, Data Mining: Failure to Launch, The State of Hadoop Adoption, Addressing the Challenges of Data Variety, and more.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 3 and beyond - Mar 2, 2015.
Data Wrangling and the Art of Big Data Discovery, Hadoop - A Solution for Big Data, Fast Data Meets Open Source, Real-Time Data on Hadoop with Apache Kafka, and more.
- Top KDnuggets tweets, Oct 6-7: Great TED talk by @KnCukier “Big Data is better data”; Top 10 One-Person Startups - Oct 8, 2014.
Great TED talk by @KnCukier "Big Data is better data"; Top 10 One-Person Startups; 7 critical elements of effective dashboards and visualizations; Making Sense of Public Data - Wrangling Jeopardy.