Reflections on the State of AI: 2018

We provide a detailed overview of the key developments in the AI space, focusing on key players, applications, opportunities, and challenges.

By Alex D. Stern & Eugene Sidorin, Evolution One.

Every day, multiple news items discussing various things “related to AI” pile up in our mailboxes, and there are dozens, if not hundreds, articles, opinions and overviews being published every week that have the term “AI” in their titles. However, not everything claimed as related to artificial intelligence is actually meaningful or relevant, and instead can often have nothing to do with the field altogether.

Aiming to democratize the knowledge of machine learning, neural networks and other segments of AI, we’ve decided to launch our efforts by creating a set of focused articles that would cover the latest advancements in the field, taking a look at the key players, and providing some insights into the most promising technology, as well as both the opportunities and dilemmas the industry is facing today.

In this first article, we provide a concise overview of the key developments we saw in 2018, segmented by key contributors, applications and challenges.

Today, with hundreds of companies deeply engaged in the AI space, and even more working to figure out their strategy as related to the field, it might be a bit hard to pinpoint specific players that are best positioned to lead the way in the future. Still, if we look at any of the many lists outlining the key players (see here and here, for example), 5 companies — Google, Facebook, Amazon, Microsoft & IBM — inevitably end up on all of them (other companies mentioned almost as often are Apple, Tencent & Baidu). In this article, we’ll focus on Google & Microsoft, as in 2018 those two appeared in the news for the AI space most often; still, the rest of the tech giants were by no means less prolific, and we plan to cover some of them in more detail in our next article focusing on the latest advancements in technology.


Google Pixel: Night Sight capabilities demo from Google blog

It’s been a fruitful year for Google’s efforts in the AI space, as witnessed by the number of new products the company introduced, as well as some critical improvements made to the existing services.

The largest number of announcements came out of Google I/O, the company’s annual developer conference held in May. Among other things, Google introduced Smart Compose for Gmail, made some really impressive updates to Google Maps and, perhaps most importantly, announced its new artificial intelligence-powered virtual assistant, dubbed Google Duplex (see a good summary of all new products and features introduced at Google I/O in 2018 here).

In the company’s own words:

“[Google Duplex is] a new technology for conducting natural conversations to carry out “real world” tasks over the phone. The technology is directed toward completing specific tasks, such as scheduling certain types of appointments. For such tasks, the system makes the conversational experience as natural as possible, allowing people to speak normally, as they would to another person, without having to adapt to a machine.”

The recording of a phone call to a hair salon made by Duplex was so impressive that it even led some to question whether Google Duplex has passed the Turing test (hint: it hasn’t; at the very least, the person making the judgment has to be aware that she might be talking to a machine). It also sparked a heated conversation about whether it’s appropriate to use technology in such fashion without making people on the receiving end aware that they are not interacting with an actual human being, but rather are talking to a bot. While it might be hard to answer such questions definitively, we’ll probably see more of this discussion soon enough, since Google started rolling out Duplex to some of its smartphones in December.

Another interesting update from Google arrived with the latest update to their Pixel line of smartphones (Pixel 3 & 3 XL), which came with some really impressive new camera capabilities enabled by the use of AI (we’ll touch on it again later in this post in a section dedicated to the advancements in computational photography).

Finally, DeepMind Technologies, fully owned by Alphabet Inc, managed to achieve a major milestone with the latest iteration of its AlphaZero program.

We’ve already seen the impressive achievements of AlphaGo and AlphaGo Zero in the game of Go in 2015 & 2016, when it handily won most games when competing against two of the strongest Go champions; in 2018, however, DeepMind’s team managed to achieve something even more interesting — the newest AlphaZero engine demonstrated its clear superiority over all of the strongest existing engines in chess, shogi and Go.

What is particularly interesting about AlphaZero is that it managed to achieve this feat without having to study any of the logs of the games played by humans; instead, the program self-taught itself how to play all three games, being provided with only the basic rules to start with. As it turns out, by operating without the limitations that come from learning from the games that were previously played has resulted in AlphaZero adopting “a ground-breaking, highly dynamic and “unconventional” style of play” that differed from anything seen before. That, in turn, makes the engine more useful to the community who might then learn new tactics by observing machine-developed strategies. It also creates a promise of the real-world applications for this technology in the future, given AlphaZero’s ability to successfully learn from scratch and tackle perfect information problems.

DeepMind is also working towards creating systems that can deal with imperfect information problems, as it demonstrated with its recent success with AlphaStar that beat a few professional players in StarCraft II (whereas in the past, AI has struggled to successfully play StarCraft due to the game’s complexity).


Julia White, Corporate Vice President, Microsoft Azure Marketing, speaks at the Conversations on AI event in San Francisco. Photo by John Brecher for Microsoft

Playing the AI game at full throttle, similarly to Google, Microsoft had a blast launching new AI-related products and services in 2018, as well as improving some of the underlying technologies. A significant part of this work was community-centric, focusing on providing better tools and functionality for developers to build AI-powered solutions on Microsoft’s Cloud Platform.

Interestingly enough, Microsoft’s key developer conference called “Build” is also happening in May, just as Google’s does. In 2018, it’s been a packed event for Microsoft, with the company making a significant number of new announcements, and in particular, announcing Project Brainwave’s integration into Azure Machine Learning.

Project Brainwave (initially dubbed Catapult) was a result of several years of research that started at Bing back in 2010. Brainwave was announced to the community in August 2017 at Hot Chips, one of the top semiconductor conferences. In short, Brainwave is a hardware platform on FPGA chips, designed to accelerate real-time AI calculations, a critical area for services like search engines (which also explains why this project grew out of Bing). Now, with the integration of Brainwave into Azure’s Machine Learning, Microsoft claims Azure to be the most efficient cloud platform for AI.

At Ignite, another big conference held in Orlando last September, Microsoft released Cortana Skills Kit for Enterprise, which represents an exciting attempt to test AI-based assistants in the office space — think of the cases where you can program a bot to be able to schedule cleaning service for the office, or automatically submit a ticket to the help desk guided by your brief voice command.

A few days later, Microsoft also announced the integration of real-time translation feature into SwiftKey, an Android keyboard app acquired by Microsoft back in 2016. Finally, at the end of September, following Google Duplex lead, Microsoft released its Speech Services tool, introducing improved text-to speech synthesis capabilities.

Later in November came another series of interesting announcements, such as Cognitive Services Containers. Cognitive Services allow developers to leverage AI in their apps without requiring them to be experts in data science, or possessing extensive AI-related knowledge. The container story, in turn, is focused around the cases for Edge Computing — a concept when there is no need to send the data to the Cloud to perform calculations and rather process it locally, allowing to reduce latency and in many cases optimize costs. With Cognitive Services in Containers, Microsoft’s customers can now build applications that will run AI at the Edge locations.


Top 100 startups in AI, from CB Insights

Investments in AI space have been booming lately, although as was reasonably called out by Crunchbase, it could hard to estimate by how much. CB Insights has built a good infographic of AI space, and sliced and diced top startups by categories in this article. We see two major takeaways here — first, the largest rounds in AI industry in 2018 were raised by Chinese companies, such as SenseTime and Face++ ($1.6 billion and $0.6 billion, respectively). Second, of 11 unicorns existing today with $20 billion+ estimated valuation, 5 companies are from China and contribute up to 50% of total valuation, with SenseTime leading the group with a stunning $4.5 billion valuation. This underscores a critical point: China seems to be moving at a faster pace compared to other countries. Moreover, with its increasingly large footprint, China is now emerging as the powerhouse in the AI field. (For additional details, check out this summary outlining various national AI strategies that countries around the world are pursuing today).

Ethics, regulation & education

Deep fakes controversy

AI-generated fake clips of President Obama’s speeches, from The Verge

In December 2017, Motherboard published a story about a Reddit user going by the name ‘deepfakes’ who’s been posting hardcore porn videos featuring the faces of celebrities mapped onto the bodies of porn stars. While not perfect, those videos were quite believable, especially considering those were made by a single person. Although Reddit soon banned the videos, the discussion about the legality and potential misuses of this technology has only been heating up ever since.

The tech behind creating fake videos through swapping the actors’ faces has been around for a while, yet the ease of creation and the quality has definitely reached a new level in 2018 (see another example of what a single tech-savvy user could achieve here). Making fake porn videos, while disconcerting, might still be relatively harmless, but as we recently saw, the same technology could be used to generate fake news or create false propaganda materials (or make the president Barack Obama say things he’d never say, at least not in public), which could have serious repercussions for us as a society. Could anything be done about it? That remains to be seen, but the fact is, the deep fakes are here to stay and are likely to only get more difficult to distinguish from the real thing.

AI biases

Photo from Microsoft’s blog post “Facial recognition: It’s time for action

In the last few years, both supervised and unsupervised learning approaches have been producing some exciting results (DeepMind’s AlphaZero is one example of what could be achieved through unsupervised learning). Still, a large number of real-world applications require training models on labeled data (which, incidentally, is one of the key issues often holding back further progress).

However, having a large dataset of labeled data to train the model isn’t quite the same as having a good dataset. You see, the neural networks relying on supervised learning are only as good as the data they are initially trained on, so if the underlying dataset has any flaws (such as focusing on one characteristic at the expense of others), chances are that the neural network would pick up those biases and further amplify them. This might not sound too bad in theory, but only until we consider the possible issues stemming from it in real-world applications — and as we’ve seen in 2018, those could be quite profound.

For instance, in the study done by Joy Buolamwini, a researcher at the M.I.T. Media Lab, she demonstrated that the leading face recognition systems from Microsoft, IBM and Megvii misclassified gender of only 1% of white males, but made mistakes in up to 35% of darker-skinned females. The reason? Those models were trained on a biased dataset that contained a larger proportion of white males’ photos and thus got progressively better at correctly recognizing their gender. Considering that face recognition tech is now increasingly being used by law enforcement, and that African Americans have the highest chance to be singled out because they are disproportionately represented in mug-shot databases, such discrepancies in performance could have a very significant negative impact.

Another famous example of the same issue that was made public in 2018 was the case associated with Amazon’s internal AI-powered recruiting tool. Amazon intended to leverage machine learning capabilities to allow for more efficient recruiting processes, and, potentially, to altogether automate some of the steps. Unfortunately, as it turned out, the aforementioned tool was trained on the resumes of people who had previously applied to the company, and the majority of those were males. As a result, the model picked up these biases and in turn trained itself to downgrade female candidates, prioritizing things like “masculine language” to promote males’ applications instead. Amazon eventually scrapped the tool, but there are plenty of other companies trying to leverage AI to help with the recruiting processes, whose models might have similar flaws.

Today, an increasing number of people (and companies) are calling for the authorities to devise regulatory frameworks to govern the usage of face recognition. Will it happen anytime soon? That remains to be seen, but chances are that at least some level of oversight is coming.

Uber’s self-driving car kills pedestrian in Arizona

Uber Volvo XC90 autonomous vehicle, image from MIT Technology Review article

Even the greatest technology is unfortunately bound to occasionally make mistakes when operating in complex, nondeterministic environments. And thus, on March 18, 2018, the thing that was eventually bound to happen happened, when an autonomous vehicle belonging to Uber hit and killed pedestrian in Tempe, Arizona. This accident forced the company to suspend all tests for its driverless cars and re-examine both its processes and its tech; it also sparked a heated discussion about the current state of technology behind self-driving cars, as well as the ethical and regulatory challenges that needed to be addressed if the autonomous vehicles were to gain wider acceptance from general public anytime soon.

Nine months later, Uber was allowed to resume its tests of autonomous cars in Pittsburgh, followed by San Francisco and Toronto in December, although in those cases, Uber's self-driving vehicles remained restricted to “manual mode” (which meant the company would be focusing on exposing the cars’ software to new circumstances, rather than running active tests). To get in good graces of the authorities once again, Uber had, among other things, to agree to additional restrictions on the types of roads and conditions where it was allowed to operate its autonomous vehicles. Moreover, Uber had to switch to a system providing more rigorous training to the drivers (a critical piece, as the investigation of the fatal accident that occurred in March demonstrated that the driver was distracted and thus wasn’t paying attention to the road, as he was supposed to), now called “mission specialists”. Finally, the company introduced a third-party driver monitoring system, and also made additional improvements to its tech.

Still, it seems very unlikely that we’ve seen the end of the discussion about public safety and the necessary regulations when it comes to autonomous vehicles; rather, Uber's unfortunate accident has only fueled the ongoing debate. We’ll see what the year 2019 will bring us; one thing, however, we can be certain of is the next 2–3 years will likely prove to be critical in shaping the public opinion on the subject of self-driving cars.

For those who are curious about the history and the current state of the autonomous vehicles, we suggest checking out this in-depth guide to self-driving cars from Wired.

MIT invests $1 billion in new AI college

Photo: MIT Dome, by Christopher Harting

On October 15, 2018, MIT announced the creation of a new college, named MIT Stephen A. Schwarzman College of Computing after the co-founder and CEO of Blackstone who made the foundational gift of $350 million. The new college will be focused on addressing the global opportunities and challenges presented by the rise of artificial intelligence.

MIT already boasts a very strong reputation in the field (not to mention that its efforts in AI space could be traced back to the very beginnings of the field in the late 1950s). Still, it’s hard to overestimate the importance of this latest development — for instance, Schwarzman’s gift will allow for the creation of additional 50 faculty positions dedicated to AI research, effectively doubling the number of researchers focused on computing & AI at MIT.

The emphasis on cross-disciplinary collaboration, as well as the research on relevant policy and ethics to ensure responsible implementation of AI, is also noteworthy here — while we’ve seen a number of think tanks and research initiatives focused on these topics created in the last few years, it’s great to see MIT’s commitment here, as there’s still much and more work to do on the subject.

AI applications: computational photography

Image generated by Prisma app, from Business Insider

Computational photography, in the broadest sense, is an area where in the last few years AI has delivered perhaps the most noticeable advancements, at least from consumer’s perspective. Still, while there’s been a lot of progress in this field in the previous years (such as Google Photos learning how to automatically tag and categorize photos in 2015, or iPhone 7 getting the capability to automatically blur the background in the photos taken in portrait mode in 2016), in 2018 we’ve seen a number of particularly impressive technological feats making it to mass products.

Features like Google Pixel’s Night Sight mode, or Smart HDR capabilitiesmade available on iPhone XS and XS Max, are just a few examples of some of the things that have been made possible through the use of machine intelligence. What’s perhaps even more interesting here is that these new capabilities have now clearly demonstrated the ability of AI to enable improvements that extend beyond the physical limitations of the cameras, thus setting the entire field on a new exciting path. As a result, computational photography today has already proved its value to both those familiar with other advancements in AI space, and the users far removed from the field.

Another aspect of computational photography applications is when the neural network is being used to completely rework the image using an algorithm to adjust the output to look like the artwork of famous artists, like Van Gogh or Monet (e.g. see Prisma app). Similar concepts are used in various areas of machine vision and benefit, for example, driverless cars.

We will cover more of the specific technologies, such as large scale GANs and video-to-video synthesis, that have recently seen significant advancements, in Evolution One’s next article called “Key recent developments in machine intelligence”, a focused deep-dive into some of today’s hottest areas of artificial intelligence, like natural language processing and computer vision.

Bio: Alex D. Stern is a product strategist, with stints in venture capital and startups, currently working for Microsoft's Cloud & Enterprise.

Eugene Sidorin is an entrepreneur, physicist, and tech geek, currently working at Microsoft, focusing on Azure finance & strategy

Original. Reposted with permission.