Don’t Let Data Silos and Dark Data Clog Your Data Supply Chain

With the rise of big data, it has become cheaper and easier to accumulate huge amounts of data. But make sure you are getting value out of that data; otherwise it can create a bottleneck and become redundant for the business.

By Stephen Baker, CEO of Attivio.

Today as organizations attempt to leverage Big Data and compete on analytics, there are “kinks” in the data supply chain—slowing the process that collects, stores, analyzes, and transforms data into insights. The kinks are information silos, and they stand between analysis and actionable insight.

Unfortunately, the task of unclogging the data supply chain often falls to various subject matter experts (SMEs), each of whom may know only what a small subset of an organization’s data sources contain. A recent Forrester study, commissioned by Attivio, of 50 US-based IT and business decision-makers at large firms (1,000 or more employees) found that 58 percent relied on subject matter experts while 48 percent reviewed source system documentation (1). Only 34 percent used automated data profiling tools, and these were early-generation tools that required trained technicians and relied on human-created metadata (1).


SMEs are great. But to rely on them for data profiling and identification creates a manual bottleneck that can slow analytics initiatives to a crawl. To realize the potential of data as a strategic asset demands a more comprehensive plan of attack. One that’s automated and fast.

First, forget about trying to merge or eliminate silos. It’s never going to happen. Let your data rest where it is and focus instead on bridging the chasm between the silo and the business analyst as quickly as possible.


The same goes for dark data, which Gartner describes as information assets that organizations collect but rarely use: any effort to merge it with other data stores is wasted. Why? Because not all the data that organizations amass is worth saving.

And remember, the value of data decreases over time. Nucleus Research borrows the half-life concept from physics to “measure both [data’s] initial and diminishing value, and to plan a data management strategy reflective of that value.” Nucleus suggests that the value of data is directly linked to the “tempo of the company’s decision-making processes and the time horizons over which those decisions are implemented.” So much of what’s in the silos that choke the enterprise data supply chain may be useless anyway.
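To make the half-life idea concrete, here is a minimal sketch. It assumes the standard exponential-decay formula from physics; the article does not spell out Nucleus Research's actual model, and the function name, parameters, and sample numbers are hypothetical.

```python
def data_value(initial_value: float, age_days: float, half_life_days: float) -> float:
    """Estimate remaining value, assuming it halves every `half_life_days`.

    Assumption: plain exponential decay, value = initial * 0.5 ** (age / half_life).
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return initial_value * 0.5 ** (age_days / half_life_days)


# Hypothetical example: a data set worth 100 units with a 30-day half-life.
fresh = data_value(100.0, 0, 30)            # no decay yet
one_half_life = data_value(100.0, 30, 30)   # half the initial value
two_half_lives = data_value(100.0, 60, 30)  # a quarter of the initial value
```

A fast decision tempo corresponds to a short half-life, so data that sits untouched in a silo for several half-lives retains only a small fraction of its initial value.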

To speed Big Data and BI initiatives and streamline your data supply chain, you don’t need to go on a long, expensive, and futile crusade to get rid of data silos. Automated software can crawl your published network, profile the data in every back alley silo in the enterprise, and produce a master index of ALL your data. It doesn’t eliminate the silos; it makes them irrelevant.
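As a rough illustration of what "profile and index" can mean, the sketch below profiles a couple of in-memory stand-ins for silos and builds a single index keyed by column name. It is not Attivio's implementation: the silo names, data sets, and function names are all hypothetical, and a real tool would connect to live source systems rather than Python dictionaries.

```python
from collections import defaultdict

# Hypothetical silos: each maps a data set name to a list of row dictionaries.
SILOS = {
    "crm": {
        "customers": [
            {"customer_id": 1, "region": "EMEA"},
            {"customer_id": 2, "region": None},
        ]
    },
    "billing": {
        "invoices": [{"invoice_id": 10, "customer_id": 1, "amount": 99.5}]
    },
}


def profile_dataset(rows):
    """Collect per-column statistics: observed value types and null counts."""
    profile = {}
    for row in rows:
        for column, value in row.items():
            stats = profile.setdefault(column, {"types": set(), "nulls": 0})
            if value is None:
                stats["nulls"] += 1
            else:
                stats["types"].add(type(value).__name__)
    return profile


def build_master_index(silos):
    """Map each column name to every silo and data set where it appears."""
    index = defaultdict(list)
    for silo, datasets in silos.items():
        for name, rows in datasets.items():
            for column, stats in profile_dataset(rows).items():
                index[column].append({
                    "silo": silo,
                    "dataset": name,
                    "rows": len(rows),
                    "types": sorted(stats["types"]),
                    "nulls": stats["nulls"],
                })
    return dict(index)


def find_datasets(index, column):
    """Return the (silo, data set) pairs that contain the given column."""
    return sorted((entry["silo"], entry["dataset"]) for entry in index.get(column, []))


master_index = build_master_index(SILOS)
matches = find_datasets(master_index, "customer_id")
```

Here the data stays in its silos; only the profile metadata is gathered centrally, which is what lets an analyst discover that `customer_id` links the hypothetical CRM and billing data sets.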

“Will all of the data be valuable?” No. But here’s the best part. With that index, you can offer self-service data discovery to your business analysts and data scientists. They can shop for data sets and query the data to their hearts’ content. The nature of the query automatically determines what data is relevant and what isn’t.

So, keep your eyes on the prize—analytic insight. Not on the distractions of data silos.

(1) Taylor, C., Accelerate BI Initiatives With Self-Service Data Discovery and Integration. A Custom Technology Adoption Profile, p. 4, Forrester Consulting, June 2015.

Bio: Stephen Baker is the Chief Executive Officer of Attivio, the Data Dexterity Company. In leading Attivio, Baker brings more than fifteen years of experience as a top executive within the enterprise software industry.