Big Data Is Not Big Context

Learn about common misconceptions when approaching big data problems, and how the ambiguity of human language requires more sophisticated techniques for more accurate understanding.

By Ron Ezsak (Expert System), Oct 2014.

Technology (think big data) presumes that everything is some form of indecipherable, encrypted code that means nothing by itself until technology magically transforms it. From a technology perspective, that statement is true in a sense. It is true not because everything really is indecipherable code, but because technology does not truly understand the methods by which people communicate, so communication in its native form is indecipherable to technology.

What this means is that technology (read: technologists) must manipulate content into a form suitable for technology to process. This abstraction is intended not so much to achieve a greater understanding of the subtlety and nuance of the content as to make it manageable for processing. This is what I describe as the big data version of the scientific method, where you define the result and then configure the experiment to produce it. The desired product defines the process, and the process defines the outcome.

The missing element in all of this algorithmic and statistical logic is the very essence of how we communicate: context. Big data produces great insight when data is involved, but data does not exist in nature; it is the output of a prior activity or operation. Human communication does exist in nature, as does human emotion, and as humanity has evolved, so have the form, style, and locution of how we express ourselves. Big data is perceived as the next Esperanto, a modern lingua franca, the Rosetta Stone of this phase of our continual sprint toward the next big innovation. But despite the illumination cast by this shiniest of objects, when it comes to language, big data's capability remains immature.

Three Mona Lisas

It seems completely incongruous that something with which virtually everyone is so familiar is absent from the contemporary strategy and dialog. Soon after we can lift our heads as infants, we learn to discern the emotive expressions of those around us. These expressions define the initial context of our immediate environment. Later, when we begin to expand our communication skills and learn to read, we are offered pictures and illustrations so that we may better understand the content. As our vocabulary expands, the amount of visual support diminishes. Why? Because we can now discern context from language. It is all about context: news, opinion, education, and literature all function because of context.

Context is the basis of comprehension and understanding, which is perhaps why I find the application of big data to human communication and expression so perplexing. Sadly, virtually all technology is illiterate and lacks the capacity to comprehend; the same condition encumbers efforts in AI and machine learning. In language, a great deal of information can be conveyed in one or two words, and we disambiguate them to be consistent with context. It's what we think of when we consider the meaning of a word.

Take "school," for example. Is the context related to fish or education? A school of fish connotes a body of water, perhaps with multiple fish moving in unison, whereas school in an educational context might include a building, desks, students, teachers, and the like. This semantic reasoning is something we do throughout our waking lives, but systems have not yet achieved the ability to understand context.
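To make the "school" example concrete, here is a minimal sketch of how a system might attempt this kind of disambiguation by counting overlapping context words. It is only an illustration, not a description of any particular product: the `SENSES` inventory, its cue words, and the `disambiguate` function are all hypothetical and hand-written for this example.

```python
import re

# Hypothetical, hand-written sense inventory (an assumption for illustration).
# Each sense of a word is described by a small set of cue words.
SENSES = {
    "school": {
        "fish": {"fish", "water", "swim", "swimming", "unison", "ocean", "reef"},
        "education": {"education", "building", "desks", "students", "teachers", "class"},
    }
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. when there is no
    usable context for this simple approach.
    """
    context = tokenize(sentence) - {word}
    scores = {sense: len(cues & context) for sense, cues in SENSES[word].items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("school", "A school of fish moved in unison through the water"))
print(disambiguate("school", "The teachers and students walked into the school building"))
print(disambiguate("school", "I saw a school yesterday"))
```

The first two calls resolve to the fish and education senses respectively, while the third returns `None` because the sentence offers no cue words. That last case is the point: a word-counting approach like this only works when the context happens to be spelled out explicitly, which is far short of the understanding people bring to language.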

Finally, when discussing the importance of context, I often ask the people I encounter: have you ever received an email from someone who was obviously upset but did not explicitly say so? Of course you have. That's context.

Ron Ezsak Bio: Ron Ezsak is Director for Expert System. A software industry veteran, Mr. Ezsak has more than 30 years of experience, including deep domain expertise in global industrial, financial, and commercial sectors. He is based at the company's corporate headquarters in the Chicago area.

