Amazon Web Services (AWS) is diving deeper into generative artificial intelligence (AI) software development with the launch of several new products. The goal is to make generative AI more accessible and affordable to a broader audience by giving developers the choice to use pre-trained foundation models (FMs) rather than having to train models from scratch.
AWS introduced a new fully managed service called Amazon Bedrock, which provides access to FMs from AI21 Labs, Anthropic, Stability AI, and Amazon’s own Titan family. The service simplifies the building and scaling of generative AI-based apps and allows organizations to use the models any way they want within the AWS environment.
AI from the Ground Up
“Bedrock offers a range of text and image FMs as a managed service, so organizations can customize a model based on their needs and integrate it into their apps using AWS tools,” said Bratin Saha, vice president of ML and AI at AWS. Some of the FMs available through Bedrock include AI21 Labs’ multilingual Jurassic-2 family, Anthropic’s Claude for conversational and text processing tasks, and Stability AI’s text-to-image models like Stable Diffusion, which generates high-quality images, art, logos, and designs.
In addition to Bedrock, AWS introduced two new Titan FMs, which are being previewed with select customers before a broader release in the coming months. The first is a generative large language model (LLM) for tasks like summarization, text generation, classification, and information extraction. The second is an LLM that translates text inputs, such as words and phrases, into numerical representations (known as embeddings) that capture the meaning of the text, which helps apps return more relevant and contextual responses. Amazon.com’s product search uses such a model.
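To make the idea of embeddings more concrete, here is a minimal Python sketch of how a search feature might compare them. It is an illustration only, not AWS's implementation: the three-dimensional vectors, the `CATALOG` entries, and the `rank_by_similarity` helper are all hypothetical, and a real embeddings model would return vectors with hundreds or thousands of dimensions. Cosine similarity is a common way to compare embeddings and is assumed here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written embeddings for three product titles.
# A real embeddings model would generate these vectors from the text.
CATALOG = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}


def rank_by_similarity(query_vector, catalog):
    """Order catalog entries from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in catalog.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical embedding of a shopper's query such as "jogging footwear".
QUERY = [0.9, 0.0, 0.1]

if __name__ == "__main__":
    print(rank_by_similarity(QUERY, CATALOG))
```

Because the query vector points in nearly the same direction as the two footwear entries, those rank ahead of the coffee maker even though the query shares no words with any product title.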
Two Developer Approaches
Developers can take two approaches with generative AI. Those interested in building their own models can use SageMaker for training and hosting, since it remains a fast and efficient platform. However, for many customers, training foundation models can be expensive and time-consuming, involving data collection, curation, and cleaning, as well as ensuring responsible behavior.
For those who prefer a more turnkey offering, Bedrock simplifies the process by offering a serverless infrastructure, which allows developers to focus on building apps without worrying about allocating instances or managing resources. While some large companies, like Adobe, may have specific needs and choose SageMaker, most would benefit from the ease of use that Bedrock offers.
“For a vast majority of customers, building foundation models is expensive and takes a lot of effort. This involves not only building the models, but also gathering all the data, curating the data, cleaning the data, and ensuring responsible behavior,” said Saha. “AWS can take care of all this heavy lifting, so developers can just focus on building the apps.”
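For a sense of what "just building the app" might look like, below is a hedged Python sketch of a single call to a hosted model. It rests on several assumptions that are not from AWS's announcement: the client is expected to behave like the boto3 `bedrock-runtime` client and its `invoke_model` method, the model ID `amazon.titan-text-express-v1` and the JSON request and response fields follow the shape documented for Titan text models, and the helper names `build_titan_request` and `generate_text` are hypothetical.

```python
import json


def build_titan_request(prompt: str, max_tokens: int = 256) -> str:
    """Serialize a prompt into the JSON body assumed for Titan text models."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": 0.2,
        },
    })


def generate_text(client, prompt: str,
                  model_id: str = "amazon.titan-text-express-v1") -> str:
    """Send one prompt through a Bedrock-style client and return the text.

    `client` is assumed to expose invoke_model() the way the boto3
    "bedrock-runtime" client does; no instances are provisioned here.
    """
    response = client.invoke_model(
        modelId=model_id,
        body=build_titan_request(prompt),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream of JSON bytes; the field names below
    # are the assumed Titan text response shape.
    payload = json.loads(response["body"].read())
    return payload["results"][0]["outputText"]
```

With boto3 installed, AWS credentials configured, and access to the model enabled, the client would typically be created with `boto3.client("bedrock-runtime")` and passed to `generate_text`. Keeping the client as a parameter also makes the function easy to exercise with a stand-in object during testing.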
AWS also announced the general availability of its Trn1n and Inf2 instances, powered by AWS Trainium and AWS Inferentia2 chips, respectively. According to AWS, Trn1 instances provide up to 50% savings on training costs and are optimized for large-scale deep learning models. The new network-optimized Trn1n instances offer 1,600 Gbps of network bandwidth, which AWS claims delivers up to 20% higher performance than Trn1 instances. Helixon and Money Forward are among the AWS customers already using these instances.
AWS launched Inferentia in 2018 as its first purpose-built chip for inference, the part of the deep learning process in which a trained model is used to “infer” results from new data. Examples include image recognition and recommendation engines. With AI growing in complexity, AWS released Inf2 instances, which are designed for large-scale generative AI apps.
According to AWS, Inf2 instances provide up to four times higher throughput and up to ten times lower latency than the previous-generation Inf1 instances, along with ultra-high-speed connectivity. This is critical for large language models (LLMs), which rely on massive data sets that are constantly changing.
When it comes to generative AI, data quality is a major concern, particularly in customer-centric contexts like call centers. AI models typically come pre-trained and are sufficient out of the box, without additional training, for tasks like extracting sentiment from emails. When fine-tuning is needed, it usually requires only a few hundred or a few thousand samples of data. An example of fine-tuning would be teaching an LLM industry-specific jargon.
Models used in healthcare, financial services, and the tech industry all need to be fine-tuned to that vertical or even to a specific organization. Unlike training a base model, which requires millions of data points, fine-tuning can be done with a significantly smaller sample. The true potential of these models lies in more abstract tasks, which provide higher-level value. For instance, they can summarize emails or documents rather than merely produce a transcript that the user would then have to summarize manually.
“Since large language models have already been trained with terabytes of data, the amount of additional data needed for fine-tuning is typically small,” said Saha. “This makes it much easier to acquire high-quality data, unlike the current set of models where you are training everything from scratch.”
Several AWS customers are exploring use cases for generative AI models, including drafting emails, creating images from specific commands, and enhancing customer support experiences. Saha sees many other use cases for generative AI coming to fruition in the near future across different industries.
While AI has been in the tech spotlight for the better part of a decade, generative AI is still fairly new, but it will have a significant impact on society. It will enable faster, more accurate content creation, accelerate scientific discovery, automate many mundane tasks, improve customer experience, and create new tools and applications that make technology more accessible. AWS’s approach is to make the process as turnkey as possible by integrating its generative AI products with other AWS tools.