
KDnuggets Home » News » 2020 » Mar » Opinions » How Bad Data is Affecting Your Organization’s Operational Efficiency ( 20:n10 )

How Bad Data is Affecting Your Organization’s Operational Efficiency


Despite recognizing the importance of data quality, many companies still fail to implement a data quality framework that could protect them from making costly mistakes. Poor data does not just cause revenue loss – it’s the reason your company could lose employees, customers and reputation!



By Farah Kim, Product Marketing Specialist

Most organizations today understand the importance of data and are ramping up efforts to collect more data. The problem? Organizations are finding it difficult to ensure the quality of their data.

In fact, according to KPMG’s 2016 Global CEO Outlook, 84% of CEOs are concerned about the quality of the data they’re basing decisions on.

Despite ‘worrying’ about data quality, most organizations still overlook the role of data quality within operational and business intelligence applications. It is only when a major initiative such as a migration or transformation project fails that a company truly recognizes the problem and makes an effort to take data quality seriously.


 

An Example of Bad Data Affecting Operational Efficiency

 
Here’s an example of how bad data quality can start a vicious chain of events throughout an organization.

An insurance company with nearly 54,000 employees around the globe only realized they had a problem with conflicting data in their mainframe database when they were confronted with the consequences of incorrect payments and mismatched vendor data. They realized that their current system did not have an option for standardizing payee names, meaning every time they ran a query against the main database, they would have to sort through a long list of duplicates.

Not only did this cause significant trouble with customers, it also made their operations far less efficient. Employees were tasked with manually sorting, matching and removing redundant data. The outcome? Disgruntled customers, demotivated employees, lagging processes and a loss of yearly revenue.

Examples like these are plentiful. Be it the banking, technology, healthcare, real estate, industrial or retail sector, data quality, or the lack thereof, can cause significant challenges in business operations.


 

So What Exactly is Data Quality & Why Does it Happen?

 
In simple terms, data quality refers to the accuracy, completeness, timeliness and consistency of the data used in your organization. It’s important to note that the scope of data quality is not limited to just customer data – it includes product data, company data, finance data, vendor and external stakeholder data, internal operations data, just to name a few. For an organization to function optimally, it needs to keep data quality as its foundation.

While data quality can refer to multiple problems, we’ll limit the discussion to just three major types of data quality issues that directly impact the operational efficiency of an organization.

  • Poor Data Entry: Data errors often begin at the source. When there is no standardization at the data entry point, errors are more likely. This is quite common in companies that have multiple people in multiple locations entering data. Someone working in the UK may be entering prices in pounds while someone working in the US may be entering them in dollars, without either realizing the mismatch. Once this error goes into the system, it becomes a bottleneck, one that can cost companies $611 billion each year!
    Poor data entry also includes misspellings, typos, variations in spelling, nicknames and middle names used in place of full names, and abbreviations instead of full spellings. These are some of the most common data quality problems.

  • Duplicate Data: There is no escaping duplicate data. For every entity in a database, there are bound to be at least 3 versions of that same entity, created for a variety of reasons. Duplicate data is one of the leading causes of the flawed data analysis that organizations are wary of. You might assume you have thousands of qualifying leads, but on closer inspection, the number could be just in the hundreds.
  • Invalid Data: How many email addresses or physical addresses in a data source are bound to be invalid? Likely, most of them. Unless there’s a mechanism for enforcing complete data entry, most records will lack a ZIP code, a valid phone number, or even a valid email address. The problem is so severe that up to 94% of businesses suspect that their customer and prospect data is inaccurate.
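
To make the duplicate-lead problem concrete, here is a minimal Python sketch. The lead records are invented for illustration, and the sketch assumes that two leads are the same person when their email addresses match after trimming whitespace and lowercasing. Real deduplication usually needs more than one field.

```python
# Hypothetical lead list: every name and address here is made up.
leads = [
    {"name": "Jon Smith", "email": "Jon.Smith@Example.com "},
    {"name": "Jonathan Smith", "email": "jon.smith@example.com"},
    {"name": "J. Smith", "email": " JON.SMITH@example.com"},
    {"name": "Ana Lopez", "email": "ana.lopez@example.com"},
]


def normalize_email(email: str) -> str:
    """Strip surrounding whitespace and lowercase the address."""
    return email.strip().lower()


# Count leads before and after collapsing entries that share an email.
raw_count = len(leads)
unique_emails = {normalize_email(lead["email"]) for lead in leads}
unique_count = len(unique_emails)

print(f"raw leads: {raw_count}, unique leads: {unique_count}")
# prints "raw leads: 4, unique leads: 2"
```

Three of the four entries are the same contact typed three different ways, which is how a list of apparent thousands can shrink to hundreds.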

 

Additional Effects of Bad Data on Operational Efficiency

 
As if revenue loss were not devastating enough, bad data can sap your employees’ morale and decrease their efficiency, all while generating a negative perception of your company.

After customers, employees are the most affected by bad data. Across departments, be it marketing or sales, customer relations or lead generation, employees spend a significant amount of their time cleaning data and resolving the consequences of bad data.

Organizations that hire data scientists to make sense of their data find that those data scientists end up spending 60% of their time cleaning and normalizing data. Meanwhile, frontline employees such as customer service reps and sales reps are working with corrupt, inaccurate and flawed data, dodging bullets as they come.

Dirty data can pull an organization down. In an age when governments and state regulators are constantly imposing data compliance regulations, companies are under even more pressure to implement data security measures. Dirty data is also a data security risk!

Take the example of PayPal, which was fined millions of dollars for violating sanctions simply because it had not screened its data against the government sanctions list.

In the real world, data quality is about far more than typos and invalid addresses. In healthcare, flawed data could mean the difference between life and death. In banking, it could mean huge financial losses. In security, it could mean national risk.

Data is the foundation of every business today and it needs to be managed more than ever.

 

Solutions and Best Practices

 
Data management solution providers have developed data quality frameworks that include the following core operations:

  • Data Integration
  • Data Profiling
  • Data Cleansing
  • List Matching

Together, these functions form a full-fledged data quality management process that most third-party tools offer as a holistic solution.
Here’s what each of these does to help your organization get data you can use.


 

Data Integration:
An average enterprise organization today is connected to 464 apps! That means you have terabytes of data flowing into the organization from multiple sources. Many of these data sources have their own data format: call logs, email entries, web forms, etc. all create different types of data that are stored on different platforms. Data integration combines data from these multiple sources, and the user can choose to clean, dedupe or restructure the data according to their set standards. Most businesses find third-party data integration software much more beneficial than creating an in-house solution.
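
As a rough illustration of what integration means in practice, the Python sketch below maps two sources onto one shared record layout. The source names, field names and sample rows are all hypothetical, and a real integration tool does far more (connectors, scheduling, error handling).

```python
# Two hypothetical sources that describe the same contact differently.
web_form_rows = [
    {"full_name": "Ana Lopez", "email": "ana.lopez@example.com",
     "phone": "(555) 010-2000"},
]
call_log_rows = [
    {"caller": "LOPEZ, ANA", "contact_email": "ANA.LOPEZ@EXAMPLE.COM",
     "number": "555.010.2000"},
]


def from_web_form(row: dict) -> dict:
    """Map a web form row onto the shared layout."""
    return {
        "name": row["full_name"],
        "email": row["email"],
        "phone": row["phone"],
        "source": "web_form",
    }


def from_call_log(row: dict) -> dict:
    """Map a call log row onto the shared layout."""
    # This source stores names as "LAST, FIRST"; reorder to "First Last".
    last, first = [part.strip() for part in row["caller"].split(",", 1)]
    return {
        "name": f"{first} {last}".title(),
        "email": row["contact_email"].lower(),
        "phone": row["number"],
        "source": "call_log",
    }


# One combined list with a single, predictable set of fields.
combined = ([from_web_form(r) for r in web_form_rows]
            + [from_call_log(r) for r in call_log_rows])

for record in combined:
    print(record)
```

The phone numbers are deliberately left in their original formats here, since standardizing values is a separate step from bringing the sources together.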

Data Profiling:
Data profiling tools help you examine record fields to determine whether they match the expected data types and fall within allowed values and ranges. For example, is the gender correctly marked against each name? Is there numeric data in a name field? Are there alphabetic values in the phone or ZIP code field? All these issues are identified in the data profiling process.
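
The checks described above can be sketched in a few lines of Python. The rules below are illustrative assumptions rather than a standard: names may not contain digits, phone numbers may not contain letters, and ZIP codes are assumed to be five-digit US codes. The sample records are invented.

```python
import re

# Hypothetical records; the second one breaks all three rules.
records = [
    {"name": "Ana Lopez", "phone": "5550102000", "zip": "02139"},
    {"name": "J0hn Sm1th", "phone": "555-CALL-NOW", "zip": "2139A"},
]

# One validity check per field. These rules are assumptions for the sketch.
RULES = {
    "name": lambda v: bool(v) and not any(ch.isdigit() for ch in v),
    "phone": lambda v: bool(v) and not any(ch.isalpha() for ch in v),
    "zip": lambda v: re.fullmatch(r"\d{5}", v) is not None,
}


def profile(rows: list) -> list:
    """Return (row_index, field, value) for every rule violation."""
    issues = []
    for index, row in enumerate(rows):
        for field, is_valid in RULES.items():
            value = row.get(field, "")
            if not is_valid(value):
                issues.append((index, field, value))
    return issues


issues = profile(records)
for issue in issues:
    print(issue)
```

Profiling only reports problems. It does not change the data, which keeps the step safe to run against a production source.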

Data Cleansing:
Once the data is collected, merged and profiled, the inconsistencies must be corrected. Data cleansing is the first step toward record linkage: if you want to merge your data or create a report from multiple data sets, you first need to verify and validate that data. Data cleansing involves converting data fields to a standard format, correcting typos, removing inconsistencies, filling in missing values, and verifying and validating contact information.
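
Here is a minimal Python sketch of the "convert to a standard format" and "fill in missing values" parts of cleansing. The field names, the sample record and the "UNKNOWN" placeholder are assumptions made for illustration. Correcting typos and verifying contact details against an external source are not covered.

```python
import re


def clean_record(rec: dict) -> dict:
    """Return a copy of rec with each field in one standard format."""
    # Collapse repeated spaces and use title case for names.
    # Note: str.title() mishandles some names (e.g. "McDonald").
    name = " ".join((rec.get("name") or "").split()).title()
    # Keep only the digits of the phone number.
    phone = re.sub(r"\D", "", rec.get("phone") or "")
    # Trim and lowercase the email address.
    email = (rec.get("email") or "").strip().lower()
    # Fill a missing country with an explicit placeholder.
    country = (rec.get("country") or "").strip().upper() or "UNKNOWN"
    return {"name": name, "phone": phone, "email": email, "country": country}


# Hypothetical messy input record.
raw = {
    "name": "  ana   LOPEZ ",
    "phone": "(555) 010-2000",
    "email": " Ana.Lopez@Example.com",
    "country": None,
}

cleaned = clean_record(raw)
print(cleaned)
```

Using an explicit placeholder for missing values, rather than an empty string, makes the gaps easy to count and fix later.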

List Matching:
Simply stated, list matching lets you match data within a single data source and across multiple data sources. Using a powerful list matching tool, you can match data to identify duplicates and remove redundancy. The scope of list matching is not limited to removing duplicates: many organizations, especially government institutions, need list matching to generate reports, meet grant requirements and much more.
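
The idea can be sketched in Python with the standard library's difflib module, here applied to payee names like those in the insurance example earlier. The company names and the 0.85 threshold are assumptions chosen for illustration, and commercial tools use more sophisticated matching than this character-based similarity score.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_lists(left: list, right: list, threshold: float = 0.85) -> list:
    """Return (left_item, right_item, score) for pairs at or above threshold."""
    matches = []
    # Compares every pair, so this is only suitable for small lists.
    for a in left:
        for b in right:
            score = similarity(a, b)
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return matches


# Hypothetical payee and vendor lists.
payees = ["Acme Insurance Ltd", "Globex Corporation"]
vendors = ["ACME Insurance Ltd.", "Initech LLC"]

matches = match_lists(payees, vendors)
for match in matches:
    print(match)
```

The threshold is the main design choice: set it too low and unrelated records are merged, set it too high and spelling variations slip through as duplicates.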

Unlike in the past, companies today recognize the importance of data quality, but many still fail to implement any solution until a high-stakes company initiative fails due to poor data quality. A forward-thinking organization is one that recognizes data quality as a foundation that affects every business area. While data quality cannot be achieved overnight, it is achievable with a plan and a robust data quality tool that covers the data quality lifecycle.

 
Bio: Farah Kim (LinkedIn) is an ambitious content specialist, known for her human-centric content approach that bridges the gap between businesses and their audience. At Data Ladder, she works as our Product Marketing Specialist, creating high-quality, high-impact content for our niche target audience of technical experts and business executives.
