- How to solve machine learning problems in the real world - Sep 2, 2021.
Is your goal to become a pro machine learning engineer? Online ML courses and Kaggle-style competitions are great resources for learning the basics. However, the daily job of an ML engineer requires an additional layer of skills that you won’t master through these approaches.
Advice, Business, Data Quality, Machine Learning, SQL, Tips, XGBoost
- Data Validation in Machine Learning is Imperative, Not Optional - May 24, 2021.
Before we reach model training in the pipeline, there are various components like data ingestion, data versioning, data validation, and data pre-processing that need to be executed. In this article, we will discuss data validation, why it is important, its challenges, and more.
Data Quality, Machine Learning, Production, Validation
- How to get started managing data quality with SQL and scale - May 4, 2021.
Silent data quality issues are the biggest problem facing data teams today, who are flying blind with no systems or processes in place to monitor and detect bad data before it has a downstream impact.
Data Preparation, Data Quality, Scalability, SQL
- Data Validation and Data Verification – From Dictionary to Machine Learning - Mar 16, 2021.
In this article, we will understand the difference between data verification and data validation, two terms which are often used interchangeably when we talk about data quality. However, these two terms are distinct.
Data Quality, Machine Learning, Validation
- Data Observability, Part II: How to Build Your Own Data Quality Monitors Using SQL - Feb 23, 2021.
Using schema and lineage to understand the root cause of your data anomalies.
Data Engineering, Data Quality, Data Science, Data Science Platform, SQL
- Inside the Architecture Powering Data Quality Management at Uber - Feb 22, 2021.
Data Quality Monitor implements novel statistical methods for anomaly detection and quality management in large data infrastructures.
Architecture, Data Quality, Uber
- Data Observability: Building Data Quality Monitors Using SQL - Feb 16, 2021.
To trigger an alert when data breaks, data teams can leverage a tried and true tactic from our friends in software engineering: monitoring and observability. In this article, we walk through how you can create your own data quality monitors for freshness and distribution from scratch using SQL.
Data Engineering, Data Quality, Data Science, Data Science Platform, SQL
- My machine learning model does not learn. What should I do? - Feb 10, 2021.
This article presents 7 hints on how to get out of the quicksand.
Algorithms, Business Context, Data Quality, Hyperparameter, Machine Learning, Modeling, Tips
- 10 Principles of Practical Statistical Reasoning - Nov 3, 2020.
Practical Statistical Reasoning is a term that covers the nature and objective of applied statistics/data science, principles common to all applications, and practical steps/questions for better conclusions. The following principles have helped me become more efficient with my analyses and clearer in my conclusions.
Data Analysis, Data Quality, Data Science, Statistical Analysis, Statistics
- 6 Common Mistakes in Data Science and How To Avoid Them - Sep 10, 2020.
As a novice or seasoned Data Scientist, your work depends on the data, which is rarely perfect. Properly handling the typical issues with data quality and completeness is crucial, and we review how to avoid six of these common scenarios.
Advice, Data Quality, Data Science, Hyperparameter, Mistakes, Overfitting
- How A Single Source of Truth Can Benefit Your Organization - Aug 7, 2020.
A single source of truth provides stakeholders with a clear picture of the enterprise assets and the potential complications that can disrupt the data strategy. Find out how you can implement this single source of truth in your enterprise ecosystem.
Business Intelligence, Data Management, Data Quality, Decision Making
- How Bad Data is Affecting Your Organization’s Operational Efficiency - Mar 5, 2020.
Despite recognizing the importance of data quality, many companies still fail to implement a data quality framework that could protect them from making costly mistakes. Poor data does not just cause revenue loss – it’s the reason your company could lose employees, customers and reputation!
Business, Data Management, Data Operations, Data Quality, Efficiency
- Data Quality Assessment Is Not All Roses. What Challenges Should You Be Aware Of? - Sep 24, 2019.
Of all data quality characteristics, we consider consistency and accuracy to be the most difficult ones to measure. Here, we describe the challenges that you may encounter and the ways to overcome them.
Challenges, Data Quality
- Sierra View Medical Center: Quality Analytics Engineer [Porterville, CA] - Jul 10, 2019.
Seeking a Quality Analytics Engineer to be responsible for supporting desired patient outcomes and effectiveness of care through ongoing analysis of health care information, quality measurement, and identification of opportunities for improvement.
Analytics, CA, Data Quality, Porterville, Sierra View Medical
- Webinar: The Value-Based Return on Creating a High-Quality Data Pipeline, Sep 12 - Sep 7, 2018.
Learn why data quality and data integration are key to delivering meaningful, actionable results, and how to develop data and analytics strategies that offer visibility into healthcare cost and quality.
Data Quality, Healthcare, Looker, Pipeline
- YouTube videos on database management, SQL, Datawarehousing, Business Intelligence, OLAP, Big Data, NoSQL databases, data quality, data governance and Analytics – free - May 18, 2018.
Watch over 20 hours of YouTube videos on databases and database design, Physical Data Storage, Transaction Management and Database Access, and Data Warehousing, Data Governance and (Big) Data Analytics - all free.
Analytics, Bart Baesens, Big Data, Business Intelligence, Data Governance, Data Quality, Data Warehousing, Databases, NoSQL, SQL, Youtube
- Must-Know: What are common data quality issues for Big Data and how to handle them? - May 16, 2017.
Let's have a look at common quality issues facing Big Data in terms of the key characteristics of Big Data – Volume, Velocity, Variety, Veracity, and Value.
3Vs of Big Data, Big Data, Data Quality, Interview Questions
- 17 More Must-Know Data Science Interview Questions and Answers, Part 3 - Mar 15, 2017.
The third and final part of 17 new must-know Data Science interview questions and answers covers A/B testing, data visualization, Twitter influence evaluation, and Big Data quality.
3Vs of Big Data, A/B Testing, Big Data, Data Quality, Data Science, Data Visualization, Influencers, Interview Questions, Statistics, Twitter
- Bad Data + Good Models = Bad Results - Jan 26, 2017.
No matter how advanced your Machine Learning algorithm is, the results will be bad if the input data is bad. We examine one popular IMDB dataset and discuss how an analyst can deal with such data.
Data Quality, Face Recognition, IMDb, Kaggle, Movies
- Ten Simple Rules for Effective Statistical Practice: An Overview - Jun 23, 2016.
An overview of 10 simple rules to follow to ensure proper, effective statistical data analysis.
Advice, Data Quality, Noise, Replication, Reproducibility, Statistical Analysis
- Hyundai: Quality Data Analytics Manager - Jun 1, 2016.
Seeking a Quality Data Analytics Manager to apply deep analytical skills to blend, process, and explore complex datasets for the Product Quality department to aid in knowledge discovery, and to assist with performing root-cause analysis, survival analysis, time series forecasting and other analytical activities.
CA, Data Analytics, Data Quality, Fountain Valley, Hyundai, Manager
- Webcast: Tech expert Phil Simon on exploring data - Jun 17, 2015.
Phil Simon, award-winning author, talks about how data visualization can help improve data quality, promoting the exploratory mindset, telling good stories with data, and more. On demand webcast.
Data Exploration, Data Quality, Data Visualization, JMP
- In Machine Learning, What is Better: More Data or better Algorithms - Jun 17, 2015.
The gross over-generalization that “more data gives better results” is misleading. Here we explain in which scenarios more data or more features are helpful and in which they are not, and how the choice of algorithm affects the end result.
Big Data Hype, Data Quality, IMDb, Machine Learning, Quora, Xavier Amatriain
- Interview: Michael Lurye, Time Warner Cable on Key Lessons from Shifting to Hadoop - Apr 14, 2015.
We discuss the key lessons from shifting to Hadoop, data management in today’s world, future of Data Science, advice and more.
Data Quality, Data Warehousing, Hadoop, Interview, Mike Lurye, Time Warner Cable, Trends
- Interview: Josh Hemann, Activision on Why the Tolerance for Ambiguity is Vital - Mar 12, 2015.
We discuss handling bias in data, other data quality concerns, advice, desired qualities, and more.
Activision, Advice, Bias, Career, Data Quality, Data Science, Data Visualization, Graphics, Interview, Josh Hemann, Junk Charts
- Top KDnuggets tweets, Sep 26-28: Any data scientist worth their salary will say you should start with a question - Sep 29, 2014.
CNN embarrassing lack of "Data Quality" - this #Scotland Independence poll adds; Statistical & Machine learning with R; Any data scientist worth their salary will say you should start with a question; Automotive Customer Churn Prediction using SVM and SOM.
Churn, Data Quality, Jake Porway, SOM, SVM
- Interview: Pallas Horwitz, Blue Shell Games on Why Data Science is So Critical for Gaming Studios - Aug 14, 2014.
We discuss the role of data science at Blue Shell Games, the importance of "Lean Data", key metrics for online games, cross-product projects and optimizing meeting the data needs across an organization.
Blue Shell Games, Data Quality, Data Science, Lean Data, Metrics, Optimization, Pallas Horwitz, Video Games
- Lavastorm Sun Seekers Caribbean Challenge 2 - Aug 5, 2014.
Use Lavastorm Analytics Engine Public Edition to overcome data quality issues and consolidate the lists. Step-by-step instructions make completing the task a snap! Submit your entry by August 31, 2014.
Challenge, Data Quality, Lavastorm
- Interview: Christophe Toum, Talend on Why Big Data Needs Big Governance - Aug 2, 2014.
We discuss the priority order of data governance for Big Data initiatives, impact of increasing shift towards Hadoop and NoSQL, data quality, current trends, talent crunch, advice and more.
Christophe Toum, Data Governance, Data Management, Data Quality, Hiring, Talend, Trends
- Interview: Aparna Pujar, eBay on Evolution of Behavior Analytics for User Engagement - Jul 25, 2014.
We discuss Behavior Analytics vs. Web Analytics, important metrics for user engagement, challenges of behavior insights domain, future of multi-screen analytics, key soft skill and more.
Analytics, Aparna Pujar, Customer Engagement, Data Governance, Data Quality, eBay, Marketing, Metrics, Trends
- Lynn Goldstein, Chief Data Officer, NYU on the Need for Data Governance - Jun 3, 2014.
We discuss the role of Data Governance, establishing Big Data accountability, impact of Data Governance on Data Quality, and assessing the education available for Data Governance.
Data Governance, Data Quality, Data Science, Lynn Goldstein, NYU
- Forrester Research: Build Trusted Data with Data Quality - Apr 1, 2014.
Key takeaways of the report include: How managing data quality brings IT and the business closer together, Different data quality definitions, and advantages of transparency in data quality.
Data Quality, Forrester, Lavastorm, Report