- Data Validation in Machine Learning is Imperative, Not Optional - May 24, 2021.
Before we reach model training in the pipeline, there are various components like data ingestion, data versioning, data validation, and data pre-processing that need to be executed. In this article, we will discuss data validation, why it is important, its challenges, and more.
- How to get started managing data quality with SQL and scale - May 4, 2021.
Silent data quality issues are the biggest problem facing today's data teams, which are flying blind with no systems or processes in place to monitor and detect bad data before it has a downstream impact.
- Data Validation and Data Verification – From Dictionary to Machine Learning - Mar 16, 2021.
In this article, we explain the difference between data verification and data validation, two terms that are often used interchangeably when we talk about data quality but are in fact distinct.
- Data Observability, Part II: How to Build Your Own Data Quality Monitors Using SQL - Feb 23, 2021.
Using schema and lineage to understand the root cause of your data anomalies.
- Inside the Architecture Powering Data Quality Management at Uber - Feb 22, 2021.
Data Quality Monitor implements novel statistical methods for anomaly detection and quality management in large data infrastructures.
- Data Observability: Building Data Quality Monitors Using SQL - Feb 16, 2021.
To trigger an alert when data breaks, data teams can leverage a tried and true tactic from our friends in software engineering: monitoring and observability. In this article, we walk through how you can create your own data quality monitors for freshness and distribution from scratch using SQL.
- My machine learning model does not learn. What should I do? - Feb 10, 2021.
This article presents 7 hints on how to get out of the quicksand.
- 10 Principles of Practical Statistical Reasoning - Nov 3, 2020.
Practical Statistical Reasoning is a term that covers the nature and objective of applied statistics/data science, principles common to all applications, and practical steps/questions for better conclusions. The following principles have helped me become more efficient with my analyses and clearer in my conclusions.
- 6 Common Mistakes in Data Science and How To Avoid Them - Sep 10, 2020.
As a novice or seasoned Data Scientist, your work depends on the data, which is rarely perfect. Properly handling the typical issues with data quality and completeness is crucial, and we review how to avoid six of these common scenarios.
- How A Single Source of Truth Can Benefit Your Organization - Aug 7, 2020.
A single source of truth provides stakeholders with a clear picture of the enterprise assets and the potential complications that can disrupt the data strategy. Find out how you can implement this single source of truth in your enterprise ecosystem.
- How Bad Data is Affecting Your Organization’s Operational Efficiency - Mar 5, 2020.
Despite recognizing the importance of data quality, many companies still fail to implement a data quality framework that could protect them from making costly mistakes. Poor data does not just cause revenue loss – it’s the reason your company could lose employees, customers and reputation!
- Data Quality Assessment Is Not All Roses. What Challenges Should You Be Aware Of? - Sep 24, 2019.
Of all data quality characteristics, we consider consistency and accuracy to be the most difficult ones to measure. Here, we describe the challenges that you may encounter and the ways to overcome them.
- Sierra View Medical Center: Quality Analytics Engineer [Porterville, CA] - Jul 10, 2019.
Seeking a Quality Analytics Engineer to support desired patient outcomes and effectiveness of care through ongoing analysis of health care information, quality measurement, and identification of opportunities for improvement.
- Webinar: The Value-Based Return on Creating a High-Quality Data Pipeline, Sep 12 - Sep 7, 2018.
Learn why data quality and data integration are key to delivering meaningful, actionable results, and how to develop data and analytics strategies that offer visibility into healthcare cost and quality.
- YouTube videos on database management, SQL, Datawarehousing, Business Intelligence, OLAP, Big Data, NoSQL databases, data quality, data governance and Analytics – free - May 18, 2018.
Watch over 20 hours of YouTube videos on databases and database design, Physical Data Storage, Transaction Management and Database Access, and Data Warehousing, Data Governance and (Big) Data Analytics - all free.
- Must-Know: What are common data quality issues for Big Data and how to handle them? - May 16, 2017.
Let's have a look at common quality issues facing Big Data in terms of the key characteristics of Big Data – Volume, Velocity, Variety, Veracity, and Value.
- 17 More Must-Know Data Science Interview Questions and Answers, Part 3 - Mar 15, 2017.
The third and final part of 17 new must-know Data Science interview questions and answers covers A/B testing, data visualization, Twitter influence evaluation, and Big Data quality.
- Bad Data + Good Models = Bad Results - Jan 26, 2017.
No matter how advanced your Machine Learning algorithm is, the results will be bad if the input data is bad. We examine one popular IMDB dataset and discuss how an analyst can deal with such data.
- Ten Simple Rules for Effective Statistical Practice: An Overview - Jun 23, 2016.
An overview of 10 simple rules to follow to ensure proper, effective statistical data analysis.
- Hyundai: Quality Data Analytics Manager - Jun 1, 2016.
Seeking a Quality Data Analytics Manager to apply deep analytical skills to blend, process, and explore complex datasets for the Product Quality department to aid in knowledge discovery, and to assist with performing root-cause analysis, survival analysis, time series forecasting and other analytical activities.
- Webcast: Tech expert Phil Simon on exploring data - Jun 17, 2015.
Phil Simon, award-winning author, talks about how data visualization can help improve data quality, promoting the exploratory mindset, telling good stories with data, and more. On demand webcast.
- In Machine Learning, What is Better: More Data or better Algorithms - Jun 17, 2015.
The gross over-generalization that “more data gives better results” is misleading. Here we explain in which scenarios more data or more features are helpful and in which they are not, and how the choice of algorithm affects the end result.
- Interview: Michael Lurye, Time Warner Cable on Key Lessons from Shifting to Hadoop - Apr 14, 2015.
We discuss the key lessons from shifting to Hadoop, data management in today’s world, future of Data Science, advice and more.
- Interview: Josh Hemann, Activision on Why the Tolerance for Ambiguity is Vital - Mar 12, 2015.
We discuss handling bias in data, other data quality concerns, advice, desired qualities, and more.
- Top KDnuggets tweets, Sep 26-28: Any data scientist worth their salary will say you should start with a question - Sep 29, 2014.
CNN embarrassing lack of "Data Quality" - this #Scotland Independence poll adds; Statistical & Machine learning with R; Any data scientist worth their salary will say you should start with a question; Automotive Customer Churn Prediction using SVM and SOM.
- Interview: Pallas Horwitz, Blue Shell Games on Why Data Science is So Critical for Gaming Studios - Aug 14, 2014.
We discuss the role of data science at Blue Shell Games, the importance of "Lean Data", key metrics for online games, cross-product projects, and how best to meet data needs across an organization.
- Lavastorm Sun Seekers Caribbean Challenge 2 - Aug 5, 2014.
Use Lavastorm Analytics Engine Public Edition to overcome data quality issues and consolidate the lists. Step-by-step instructions make completing the task a snap! Submit your entry by August 31, 2014.
- Interview: Christophe Toum, Talend on Why Big Data Needs Big Governance - Aug 2, 2014.
We discuss the priority order of data governance for Big Data initiatives, impact of increasing shift towards Hadoop and NoSQL, data quality, current trends, talent crunch, advice and more.
- Interview: Aparna Pujar, eBay on Evolution of Behavior Analytics for User Engagement - Jul 25, 2014.
We discuss Behavior Analytics vs. Web Analytics, important metrics for user engagement, challenges of behavior insights domain, future of multi-screen analytics, key soft skill and more.
- Lynn Goldstein, Chief Data Officer, NYU on the Need for Data Governance - Jun 3, 2014.
We discuss the role of Data Governance, establishing Big Data accountability, impact of Data Governance on Data Quality, and assessing the education available for Data Governance.
- Forrester Research: Build Trusted Data with Data Quality - Apr 1, 2014.
Key takeaways of the report include: How managing data quality brings IT and the business closer together, Different data quality definitions, and advantages of transparency in data quality.