Machine Learning and Misinformation
The creative aspects of machine learning are overshadowed by visions of an autonomous future, but machine learning is a powerful tool for communication. Most machine learning in today’s products is related to understanding.
By Paul Soulos, Engineer and Interaction Designer.
Communication is an essential pillar of society. Humanity’s progression over the past millennium was largely driven by the development and evolution of communication as a tool for distributing siloed thoughts from one individual to others. Communication can be naively defined as content plus its mode of transmission: symbols manifested as images, language transmitted through speech and writing, digital files sent through the internet. These are methods through which we communicate thoughts, ideas, facts, and opinions. New forms of communication emerge to expand the lexicon of thought and reduce the friction involved in creating and transmitting content.
Computers are unique as communication tools because they are used for the creation, transmission, and consumption of content. Most tools before the computer aided in only one of these categories, but the digital computer quickly became the de facto platform for media. This unified tool simplifies the process of communicating; you know that what you create on your screen can be instantly reproduced on another’s screen. But the goal is not simply to transmit bits from device to device; the goal is to transmit ideas from one person to another.
Early practitioners in the field of Interaction Design espoused human-computer symbiosis [1], a tight bond between human and machine that transcends the capabilities of either individually. As communication devices, computers would facilitate a level of understanding between people that was previously only accessible to skilled writers, speakers, and artists. Computational creation reduces the skill needed to craft content that resembles the ideal form as it exists in your head; colors can be selected from a color picker instead of requiring an individual to understand the complex nature of mixing different paints to achieve a certain palette. The trend is towards uninhibited creation of a sort that only exists in the mind.
A standard color picker.
The creative aspects of machine learning are overshadowed by visions of an autonomous future, but machine learning is a powerful tool for communication. Most machine learning in today’s products is related to understanding: your phone can translate your voice into text, and you can search photos for certain objects or people, because of machine understanding. To accomplish this, machine learning compresses raw data into representations that it uses to find similarities and make other judgements. Representations are a cognitive concept that signify properties [2]. For example, a person’s mood can be compressed from an image of their face into a mood representation variable: happy, neutral, or sad.
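As a rough illustration of compressing raw data into a representation, the sketch below uses principal component analysis (PCA) as a simple linear stand-in for the learned encoders found in real products. The data is synthetic, and the names `fit_encoder` and `encode` are hypothetical helpers invented for this example; it only assumes NumPy is available.

```python
import numpy as np


def fit_encoder(data, n_components):
    """Learn a linear encoder (PCA) from rows of raw data."""
    mean = data.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def encode(x, mean, components):
    """Compress raw rows of x into a few representation variables."""
    return (x - mean) @ components.T


rng = np.random.default_rng(0)

# Synthetic "raw data": 200 samples of 64 values, driven by 2 hidden factors.
hidden = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 64))
raw = hidden @ mixing + 0.01 * rng.normal(size=(200, 64))

mean, components = fit_encoder(raw, n_components=2)
codes = encode(raw, mean, components)  # 64 raw values -> 2 variables per sample

# Similarity is judged in representation space rather than on raw values:
# find the sample whose representation is closest to that of sample 0.
distances = np.linalg.norm(codes - codes[0], axis=1)
nearest = int(np.argsort(distances)[1])  # index 0 of the sort is sample 0 itself

# Decoding the representation recovers the raw data up to the added noise.
reconstruction = codes @ components + mean
```

Each 64-value sample is summarized by two numbers, and comparisons between samples happen on those two numbers. A mood classifier works on the same principle, with a far richer encoder and a categorical output.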
There is another side to machine learning that moves in the opposite direction, from representations to raw data. Generative modeling is a machine learning technique that creates new data mimicking the data that the machine was trained on. When trained on camera images, a generative model will create new images that look photorealistic. This makes it easier to create incredibly detailed content while only manipulating the underlying representation variables.
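To make the direction from representation to raw data concrete, here is a minimal sketch of a linear generative model. It is a toy stand-in under stated assumptions, not how photorealistic image generators are built: the training data is synthetic, the model is PCA rather than a neural network, and `generate` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: 500 samples of 16 values, driven by 2 hidden factors.
factors = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 16))
train = factors @ mixing + 0.05 * rng.normal(size=(500, 16))

# Fit the model: keep the mean and the top two principal directions.
mean = train.mean(axis=0)
_, singular_values, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]
# Standard deviation of each representation variable over the training set.
scales = singular_values[:2] / np.sqrt(len(train) - 1)


def generate(z):
    """Map representation variables z, shape (n, 2), to raw data, shape (n, 16)."""
    return mean + (z * scales) @ components


# New data: sample fresh representation values and decode them.
new_samples = generate(rng.normal(size=(5, 2)))

# Editing: sweep the first representation variable while holding the second fixed.
sweep = generate(np.array([[-2.0, 0.0], [0.0, 0.0], [2.0, 0.0]]))
```

The rows of `new_samples` are not copies of any training row, yet they follow the same statistical pattern. The `sweep` rows show the editing workflow described above: change one representation variable and all 16 raw values update together.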
As the requirements to create and transmit media are reduced, we approach a scenario where you can realize any thought in a shareable manifestation. Today, if you imagine an object, you need skill as a visual artist to move that image from your mind to the physical world. In the future, computers will reduce the training that is required to realize ideas in the physical world to the point where the inception of an idea is on par with the realization and communication of that idea. Generative modeling will bring huge advances to our ability to communicate with each other, but it also poses an enormous threat through the creation and dissemination of disinformation and misinformation. The difference between disinformation and misinformation is intent; disinformation is created with malicious intent, while misinformation is communicated without knowing the extent of the falsehood.
In the social media age, information becomes a weapon through networks, and what we generally encounter is misinformation. Propaganda pushed through state-sponsored channels is disinformation, but the content in your social media feed shared by friends is misinformation. While new technologies accelerate our ability to communicate with each other, they also accelerate the spread of misinformation and disinformation. Whether we are ready for it or not, generative modeling is approaching. Will it bring progress or a misinformation nightmare that erodes the foundations of society?
Generative modeling may not be mainstream yet, but computers already aid us in frictionless communication. Consider image search: this task can be exploratory when you want to know what something looks like, but you also use image search when you already know what something looks like and want to embed the image in a document, presentation, or conversation. Going through image results is a process of finding the image that most accurately approximates the image you see in your head.
Phones have made it just as easy to create and consume images as text. The rise of social media apps dedicated to images reflects the changing habits of people. Rather than attempt to describe a scene to a friend, you can simply snap a photo of it and send the image. Unfortunately, our reliance on images creates a convenient opening for the spread of misinformation. We all learn to read and write in school, and while it can be difficult to craft a convincing statement, anyone can write a sentence that is false. We consume text skeptically when it strays far from reality because we know how easy it is to generate a false narrative. Cameras capture reality, so we generally accept images as closely tied to the truth.
There is still a barrier to creating believable disinformation. While people shamelessly endorse and share disinformation produced by organizations with an agenda on social networks, we have not yet reached the point where the average person can easily create any piece of information they desire. Beyond words, images and visualizations help convince us that the underlying narrative is truthful. At the moment, creating fake images that maintain a sense of reality requires you to be a skilled photo editor. Generative modeling is the tipping point where any individual can manifest the reality that exists in their head. One of the most interesting developments surrounding these techniques is the interfaces that we will use.
Images generated from a text query.
Earlier, image search served as an example of computers aiding communication by helping you find an image that approximates what your mind’s eye sees. It is an intuitive interface for quickly scanning a large number of images to help find an appropriate sample. There are limitations to this approach, the largest of which is that the image needs to already exist for the search engine to index it. Beyond that, image search offers no control over tweaking an image, and attempts to do so by someone who is not well trained in photo editing will quickly ruin the sense of reality. The image above shows a technique that generates a brand new image from a query [3]. Instead of returning an image that already exists, the generative system creates an entirely new image based on the text.
Computers can help us draw, even if we can’t.
The cartoon above is from a seminal paper written by J.C.R. Licklider, the father of interaction design [4]. Already in 1968, he was able to see the computer’s potential to serve as the ultimate medium. A group at UC Berkeley recently published pix2pix [5], a machine learning system that effectively realizes the cartoon in Licklider’s paper. Instead of needing skill as an illustrator, you can sketch a rough version of the image you want to send, and the computer can render a high-resolution image. There is still work to be done before a pix2pix-like system makes its way into a consumer product, but generative modeling is already beginning to go mainstream in smaller ways.