Jurassic-1 Language Models and AI21 Studio
AI21 Labs’ new developer platform offers instant access to our 178B-parameter language model, to help you build sophisticated text-based AI applications at scale.
By AI21 Labs
We are thrilled to announce the launch of AI21 Studio, our new developer platform where you can use our state-of-the-art Jurassic-1 language models to build your own applications and services. Jurassic-1 models come in two sizes. The Jumbo version, at 178B parameters, is the largest and most sophisticated language model ever released for general use by developers. AI21 Studio is currently in open beta, allowing anyone to sign up and immediately start querying Jurassic-1 using our API and interactive web environment.
From a technical perspective, Jurassic-1 Jumbo has a slight size advantage over GPT-3 (about 3B more parameters), and it also introduces several conceptual novelties into the arena of huge language models. The depth-to-width ratio of Jurassic-1 Jumbo’s core Transformer architecture was optimized for its size: it is shallower (76 vs. 96 layers) and wider (13,824 vs. 12,288 hidden dimension) than GPT-3, with more parameters per layer spread over fewer layers. This modification is aimed at maximizing the expressivity of the network, following theoretical insights published at the most recent NeurIPS. From a practical perspective, a shallower and wider network allows more parallelization between compute operations, reducing latency. Moreover, the Jurassic-1 models use a unique 250,000-token vocabulary that is not only much larger than most existing vocabularies (5x or more), but also the first to include multi-word tokens such as expressions, phrases, and named entities. Because of this, Jurassic-1 needs fewer tokens to represent a given amount of text, which improves computational efficiency and further reduces latency. Check out the white paper for more technical details, as well as a thorough evaluation of our models.
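As a rough sanity check on the layer and width figures above, the following back-of-the-envelope sketch estimates total parameter counts for both models. It rests on assumptions that are not stated in this post: the common approximation of about 12 · d² weights per Transformer layer (a feed-forward block four times the hidden dimension, with biases and layer norms ignored), a single embedding matrix of vocabulary size × hidden dimension, and GPT-3's 50,257-token vocabulary. The function name `estimate_params` is purely illustrative, and the white paper remains the authoritative source for the actual architecture.

```python
# Rough parameter-count estimate for a decoder-only Transformer.
# Assumption (not from the announcement): each layer holds about 12 * d^2
# weights (4 * d^2 for the attention projections, 8 * d^2 for a feed-forward
# block with inner width 4 * d); biases and layer norms are ignored.


def estimate_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Return an approximate total parameter count."""
    per_layer = 12 * d_model ** 2
    embedding = vocab_size * d_model  # assumes one shared embedding matrix
    return n_layers * per_layer + embedding


# Layer counts, hidden dimensions and the 250,000-token vocabulary come from
# the text; GPT-3's 50,257-token vocabulary is an outside figure.
MODELS = {
    "Jurassic-1 Jumbo": dict(n_layers=76, d_model=13824, vocab_size=250_000),
    "GPT-3": dict(n_layers=96, d_model=12288, vocab_size=50_257),
}

for name, cfg in MODELS.items():
    per_layer = 12 * cfg["d_model"] ** 2
    total = estimate_params(**cfg)
    print(f"{name}: {per_layer / 1e9:.2f}B per layer, {total / 1e9:.1f}B total")
```

Under these assumptions the estimates come out at roughly 177.7B for Jurassic-1 Jumbo and 174.6B for GPT-3, in line with the 178B and 175B headline figures and the gap of about 3B. The per-layer numbers (about 2.29B vs. 1.81B) illustrate the trade of depth for width described above.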
To help developers scale their applications beyond a proof of concept and efficiently serve production-scale traffic, AI21 Studio allows developers to train custom versions of Jurassic-1 models. Training a custom model is easy and requires as few as 50-100 training examples. Once trained, your custom model is served in AI21 Studio and is immediately available for your exclusive use.
We created AI21 Studio to democratize access to cutting-edge AI technology. Using Jurassic-1 within AI21 Studio, you can quickly build text-based applications that rival those being dreamed up in the world’s biggest labs, even if you have no prior experience. We’ve been using AI21 Studio internally to power our own applications, and it has propelled our product development immensely. Now it’s your turn.