How The New World of AI is Driving a New World of Processor Development
Blaize’s novel stream processor for Edge AI offers a case study of new opportunities for smaller companies to leverage semiconductor industry resources in pursuit of their goals.
Until now, most processor chips used for AI applications have been adaptations of devices, like GPUs, that were initially developed for other purposes. This is partly because those existing devices have proven effective enough to be useful, and partly because most companies have seen development of AI-specific processors as prohibitively costly, complex, and/or risky.
But, with the AI market poised for intense expansion, diverse new applications at the edge and in endpoints will require silicon that’s more closely tailored for specific tasks and situations, removing the need to go to the cloud for AI processing. We are now starting to see indications of how the semiconductor sector is evolving to meet those needs, and what it will mean for the evolution of the AI marketplace and those who seek to enter it.
A useful case study is provided by Edge AI startup Blaize, which is pursuing customers in industrial, smart city, automotive sensor fusion, last-mile delivery, and retail applications with its recently launched Pathfinder and Xplorer system-on-module platforms and accompanying software tools.
The company notes that existing Edge solutions for these sectors are “either too small to compute the load or too costly and too hard to productize.” Blaize aims to remedy this with a combination of multi-sensor processing power, compact size, extremely low power consumption, and easy programmability, powered by a novel Graph Streaming Processor (GSP) developed in collaboration with Samsung Foundry and design/IP firm VeriSilicon.
Why a Stream Processor?
Stream processors offer several advantages in real-time analysis of multiple sources of continuous graphical data, like that produced by arrays of cameras on factory floors or in retail stores, or by coordinated combinations of sensors. Unlike traditional batch-oriented processors, stream processors do not have to temporarily store data before processing it, or re-aggregate batches of data afterward. This greatly reduces memory requirements and latency while boosting processing efficiency for tasks often needed in Edge AI situations.
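As a rough software analogy only (it does not describe the GSP hardware or any Blaize API), the contrast between batch and stream handling can be sketched in Python. The frame source `camera_frames`, the workload `detect`, and the frame count are hypothetical stand-ins chosen for illustration.

```python
from typing import Iterable, Iterator, List


def camera_frames(n: int) -> Iterator[int]:
    """Hypothetical frame source: yields frame ids one at a time."""
    for frame_id in range(n):
        yield frame_id


def detect(frame: int) -> int:
    """Hypothetical per-frame workload (a stand-in for inference)."""
    return frame * 2


def process_batch(frames: Iterable[int]) -> List[int]:
    """Batch style: buffer every frame first, then process the buffer."""
    buffered = list(frames)  # all frames are held in memory at once
    return [detect(f) for f in buffered]


def process_stream(frames: Iterable[int]) -> Iterator[int]:
    """Stream style: handle each frame as it arrives, with no buffer."""
    for f in frames:
        yield detect(f)


batch_results = process_batch(camera_frames(5))
stream_results = list(process_stream(camera_frames(5)))
```

Both paths produce the same results. The difference is that the batch version must hold every frame before it emits anything, while the stream version emits each result as soon as its frame arrives, which is where the memory and latency savings come from.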
By way of example, Blaize’s GSP implementation allows the Pathfinder module to run five independent neural networks on video streams at 10 frames per second each (50 fps total), with less than 100 ms latency, while also supporting multiple simultaneous workloads across different streams. This enables sensor fusion, object detection, image noise reduction, fusing of video and lidar/radar data, and other functionality depending on the use case, and the platform is fully programmable. Remarkably, despite providing 16 TOPS of processing power, the GSP draws just 7 watts.
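The figures quoted above can be combined with some simple arithmetic. The sketch below uses only the numbers stated in the text. The derived values rest on assumptions: that the 16 TOPS and 7 watt figures describe the same operating point, and that latency is measured per frame.

```python
# Figures quoted in the article.
streams = 5              # independent neural networks / video streams
fps_per_stream = 10      # frames per second on each stream
latency_limit_ms = 100   # stated upper bound on latency
tops = 16                # stated processing power, in TOPS
power_watts = 7          # stated power draw

# Aggregate frame rate across all streams.
total_fps = streams * fps_per_stream

# Time between consecutive frames on a single stream, in milliseconds.
frame_interval_ms = 1000 / fps_per_stream

# Efficiency, assuming both figures apply at the same operating point.
tops_per_watt = tops / power_watts

print(f"total frame rate: {total_fps} fps")
print(f"frame interval:   {frame_interval_ms:.0f} ms")
print(f"efficiency:       {tops_per_watt:.2f} TOPS/W")
```

Under those assumptions, a latency below 100 ms means each result is available before the next frame arrives on the same stream, and the efficiency works out to roughly 2.3 TOPS per watt.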
As appealing as that is from a strategic perspective, bringing those capabilities to market in reliable, affordable silicon is a formidable task that, until recently, few companies would have been willing or able to tackle. It is well beyond the scope of typical custom chip projects: it requires a vast range of advanced design and manufacturing capabilities, integration of diverse circuit IP, and management and coordination of large interdisciplinary development teams. That has traditionally been the province of big, specialized enterprises, not startups.
And in addition to fundamental development hurdles (timing, physical layout, memory management, etc.), both business and technical considerations require rapid and dependable time to market. This type of project generally offers only one chance to get it right, which raises the stakes considerably.
Strategic Choices for Successful Development
As a result, when Blaize began its initial engagement with Samsung Foundry and explained the project goals and requirements, the foundry team recommended bringing in VeriSilicon, a one-stop source for custom silicon solutions and semiconductor IP, with an extensive track record including other AI-related projects. As a member of the Samsung Advanced Foundry Ecosystem (SAFE™) partnership program, VeriSilicon had also seen many of its design projects through the production process with Samsung, and drew on that experience for both strategic planning and the tactics used during development.
One of the first and most important decisions was the choice of manufacturing process node. After all available options were considered, Samsung Foundry’s 14nm FinFET process emerged as the ideal choice, beating out several more advanced nodes.
Given the target applications for the initial chip, and the fact that much of the IP available at 14nm was not yet available at more advanced nodes, the mature process node made the most sense. Cost was another consideration. And given VeriSilicon’s experience and expertise in working with Samsung on 14nm over the years, the team was confident that this was the best path to first-time silicon success.
In addition, the partnered companies were able to refine and optimize Blaize’s initial GSP concept to meet the all-important goals for the Edge AI product platform. In the initial few months of the project, the Samsung Foundry team and VeriSilicon worked hand-in-hand, going through every possible design method that would give Blaize the combination of performance and power consumption that was required. Samsung Foundry was able to bring solutions to the table beyond the standard flows that allowed Blaize to fully meet its power/performance/area targets.
VeriSilicon also provided complete spec-to-production turnkey silicon services, in addition to coordinating the essential supply chain and reserving production capacity not only with Samsung but also with packaging, assembly, and test contractors.
On-Time Delivery, and Market Implications
The overall process required the tight coordination of multiple teams in China, India, the UK, and the US, but moved forward smoothly, even when the COVID-19 pandemic forced offices around the world to close just two weeks before the device taped out.
And the up-front decision to choose a low-risk path to first silicon paid off handsomely when initial samples arrived. It typically takes six to nine months to go from first silicon to customer samples, and 12 to 18 months to reach a full production run. Blaize received customer samples in two months. Because the silicon worked almost perfectly, the company was able to go into full production after six months.
The approach used by Blaize wouldn’t work for every emerging AI company. It requires a particular combination of marketplace opportunity, technology, funding, and organizational capability, as well as competent partners. But the success of the GSP project shows that, when the circumstances are right, existing semiconductor resources (exemplified by Samsung Foundry and its SAFE™ ecosystem, and the wide-ranging capabilities at VeriSilicon) can now provide a far more accessible development path for specialized AI-oriented processors than was previously available.
That would not have been the case only a few years ago, and it’s likely that these types of projects will be even more accessible and affordable going forward. As a result, a growing number of new AI-oriented organizations will no longer need to limit themselves to what can be done with repurposed silicon and will have greater freedom and scope to deliver effective solutions for their customers.
This article is co-authored by:
- Tim Dry, Director of Edge and Endpoint Segment Marketing, Samsung Foundry
- Santiago Fernandez Gomez, Vice President of Platform Engineering, Blaize
- Dr. Mahadev Kolluru, Corporate Vice President of North America Platform Solutions Sales, VeriSilicon