Vector search and storage key to AWS' database strategy
The tech giant is prioritizing vector search and storage, adding the capabilities to its data storage tools so customers can use them with language models to build AI applications.
Peanut butter and jelly. Batman and Robin. Generative AI and vector embeddings.
Each can exist apart from one another, but they're better together. As a result, as enterprises attempt to develop generative AI models that understand organizations' unique business needs, AWS is making vector search and storage a prominent part of its database strategy.
Generative AI capabilities took a major leap forward when OpenAI launched ChatGPT in November 2022. Suddenly, true natural language interactions between users and data were possible. So were time-saving measures such as code generation and process automation.
But those natural language interactions and code generation capabilities are only meaningful to a business if generative AI models understand their unique needs. Large language models (LLMs) such as ChatGPT and Google Gemini, however, don't have the data to understand an individual business.
They are trained on public data. They have the data to know who won the Civil War and why, and they can go through the Beatles' song catalog and produce a song that is written in a similar style.
But they have no clue whether a business's sales were up or down over a particular period and certainly can't predict what sales might be going forward. That is unless the language model is either developed from scratch by that individual enterprise and trained on its proprietary data or the public model is fine-tuned by the enterprise to understand its business.
Such development from scratch or retraining takes significant amounts of proprietary data. Finding the right proprietary data -- for example, sales figures in March in Michigan -- is onerous.
Retrieval-augmented generation (RAG) is an AI framework that retrieves relevant data and supplies it to a model alongside a user's prompt. Vectors are a critical part of RAG pipelines. They are numerical representations that give values to unstructured data such as text, images and audio files -- data that makes up the majority of all data and otherwise can't be accessed and used to inform models. Vectors also enable similarity searches, letting developers surface the most relevant data for a given model.
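To make the idea concrete, here is a minimal Python sketch of the similarity search at the heart of a RAG pipeline. The three-dimensional vectors and document names are invented for illustration; real embeddings produced by a language model typically run to hundreds or thousands of dimensions.

```python
# Minimal sketch: similarity search over hypothetical embeddings.
# The three-dimensional vectors below are invented for illustration;
# real embeddings usually have hundreds or thousands of dimensions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors (closer to 1.0 means more similar)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.9, 0.1, 0.3])             # e.g. "How were sales in Michigan in March?"
documents = {
    "march_michigan_sales_report": np.array([0.8, 0.2, 0.4]),
    "employee_handbook":           np.array([0.1, 0.9, 0.7]),
}

# A RAG pipeline retrieves the documents whose embeddings are most similar
# to the query embedding and passes them to the model for generation.
ranked = sorted(documents, key=lambda name: cosine_similarity(query, documents[name]), reverse=True)
print(ranked)  # the sales report ranks first
```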
AWS has had vector search and storage capabilities for years. For example, Amazon Music uses vectors to respond to queries and commands.
Now AWS is adding vector capabilities to all of its databases so customers can use the tools of their choice when retrieving data to inform generative AI models and applications. Google similarly has made vector search and storage a point of emphasis of late, developing AlloyDB AI, a database featuring vector capabilities.
With vector search and storage capabilities now integrated into most AWS databases, Ganapathy "G2" Krishnamoorthy, the tech giant's vice president of data lakes and analytics, recently took time to discuss AWS vector strategy.
He noted that OpenSearch Serverless, Amazon Aurora, Amazon RDS, Amazon Neptune Analytics, Amazon DocumentDB and Amazon DynamoDB all now have vector capabilities. In addition, vector capabilities for Amazon MemoryDB for Redis are in preview.
But beyond the current state of vector capabilities in AWS databases, Krishnamoorthy spoke about why vector capabilities are gaining popularity, what has driven AWS to make such tools widely available, and what AWS plans next regarding vector search and storage.
Vector search and storage has exploded in popularity over the past year, but to begin, what are vectors?
Ganapathy 'G2' Krishnamoorthy: Vectors are a way of representing information in a high-dimensional space. They can represent text, images and all these other [unstructured data] aspects. The information gets reduced to these strings of numbers that are called vectors or embeddings.
The way we make this connection between LLMs and vectors -- and we do it with things like Bedrock knowledge bases -- [is] we take all of your images or your product catalog and encode them with LLMs as vector embeddings. That way, when a user is looking to find a specific answer to a question, the user will be able to do the search, receive a subset of data, and go through the generation step.
That's the end-use case. It's how we can represent real-world information in a manner that is understandable and easily operable by LLMs so that they can retrieve information in response to a user question and go through the generation and summarization steps. Vectors become indexes; they're another type of index that you would store alongside your existing data.
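As a rough illustration of the encode-then-search flow Krishnamoorthy describes, the following Python sketch uses boto3's Bedrock runtime client to turn text into embeddings. The region, the embedding model ID and the response fields are assumptions for illustration and should be checked against the models enabled in a given account.

```python
# A rough sketch of the encode-then-search flow, assuming boto3 and Bedrock
# access. The region, model ID and response fields are assumptions; check the
# models enabled in your account before relying on them.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed region

def embed(text: str) -> list[float]:
    """Ask a Bedrock embedding model to turn text into a vector."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",      # assumed embedding model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

# Index time: encode each catalog item and store the vector alongside the item.
catalog = ["red running shoes", "wool winter coat", "insulated hiking boots"]
catalog_vectors = {item: embed(item) for item in catalog}

# Query time: encode the question the same way, find the nearest stored vectors,
# and hand the matching items to the LLM for the generation and summarization steps.
question_vector = embed("What do you sell for cold weather?")
```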
If vectors have existed for a while, why has their popularity skyrocketed of late?
Krishnamoorthy: Vector indexing is not entirely new. It's a capability that has existed in many AWS systems, such as our OpenSearch Service and as an extension in [PostgreSQL]. Some of these capabilities have been in [databases] for some time. But they have exploded in terms of people's consciousness and in use as large language models took off.
There are copilots, GPT and other examples that are about just working with the data that the LLMs have been trained on. But for model technology to be useful for a lot of business use cases, you need to have the models work with your information. This is where the vector capabilities become really important because the vector is representing your information -- your product catalog, your Salesforce data, your application models -- in the same way that the LLMs understand information.
With vector search and storage gaining popularity, what is AWS' approach to enabling customers to use vectors to train models?
Krishnamoorthy: From our point of view, if you think about it, vectors are an index alongside your data. Therefore, it makes sense to think about them as an extension of how you manage your data already.
That's why we're just adding these vector capabilities to the way people already manage their application data [rather than develop new vector-specific databases]. We are adding vector indexing capabilities in RDS, Aurora, OpenSearch, Neptune -- in all our databases. We want people to easily be able to use this capability without adding the complexity of a new system that they have to manage.
We're seeing strong adoption of these capabilities.
When did vector search and storage become a priority for AWS?
Krishnamoorthy: As we were building Bedrock as a capability for application developers to make it easy for them to build generative AI applications, vector indexing and vector database capabilities went hand in hand. Some of the underlying things, like the underlying vector engine in OpenSearch, have been in production for many years. When you go to Amazon Music and ask for a recommendation, that is using OpenSearch at a billion vector scale.
The underlying technology has been in production for a while. Now we're making it easily adoptable in the data systems. We took the same technology that was in OpenSearch and made it into a serverless vector engine, which customers can now easily use from Bedrock. Our approach has been simplification and integration, and that has made it easy for customers to adopt.
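For readers who want to see what the underlying OpenSearch vector capability looks like in practice, the sketch below creates a k-NN index and runs a nearest-neighbor query with the opensearch-py client. It targets a provisioned OpenSearch endpoint with placeholder host and credentials; the serverless vector engine used from Bedrock is connected to and authenticated against differently, and the tiny three-dimensional vectors are illustrative only.

```python
# Illustrative only: a k-NN index and nearest-neighbor query with opensearch-py
# against a provisioned OpenSearch endpoint. Host, credentials and the tiny
# 3-dimensional vectors are placeholders.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=["https://localhost:9200"], http_auth=("admin", "admin"), verify_certs=False)

client.indices.create(
    index="products",
    body={
        "settings": {"index": {"knn": True}},                       # enable k-NN for this index
        "mappings": {"properties": {
            "title": {"type": "text"},
            "embedding": {"type": "knn_vector", "dimension": 3},    # real embeddings are much wider
        }},
    },
)

client.index(index="products", body={"title": "wool winter coat", "embedding": [0.1, 0.9, 0.7]}, refresh=True)

# Retrieve the k documents whose embeddings are closest to the query vector.
results = client.search(index="products", body={
    "size": 1,
    "query": {"knn": {"embedding": {"vector": [0.2, 0.8, 0.6], "k": 1}}},
})
print(results["hits"]["hits"][0]["_source"]["title"])
```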
Are AWS' vector search and storage capabilities all now generally available, or are some still in preview?
Krishnamoorthy: Things like Pgvector, which is an open source vector capability for PostgreSQL, in RDS and Aurora, as well as the OpenSearch vector engine, are all generally available. MemoryDB for Redis is [the only one] still in preview. All of them are significantly along the way toward general availability.
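Because pgvector is generally available in RDS and Aurora PostgreSQL, a similarity search can be added to an existing relational schema with a few SQL statements. The Python sketch below shows the general shape; the connection string, table and three-dimensional vector width are placeholders, and production embeddings would be far wider.

```python
# A sketch of pgvector usage from Python against an RDS or Aurora PostgreSQL
# endpoint. The DSN, table and 3-dimensional vectors are placeholders.
import psycopg2

conn = psycopg2.connect("host=localhost dbname=appdb user=app password=secret")  # placeholder DSN
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS catalog (
        id serial PRIMARY KEY,
        description text,
        embedding vector(3)   -- production embeddings are far wider, e.g. 1,536 dimensions
    );
""")
cur.execute(
    "INSERT INTO catalog (description, embedding) VALUES (%s, %s::vector);",
    ("wool winter coat", "[0.1, 0.9, 0.7]"),
)

# '<->' is pgvector's Euclidean-distance operator; ordering by it returns the
# rows whose stored embeddings sit closest to the query embedding.
cur.execute(
    "SELECT description FROM catalog ORDER BY embedding <-> %s::vector LIMIT 5;",
    ("[0.9, 0.1, 0.3]",),
)
print(cur.fetchall())
conn.commit()
```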
What has been the response from AWS users to the emphasis on vector search and storage?
Krishnamoorthy: The response has been positive. We have customers like GoDaddy and Intuit that are standardizing on the vector capabilities in AWS and have used them for use cases such as making it easy for data engineers to discover what data is actually available to them. They have their metadata in Aurora, and it was easy to add Pgvector to that capability.
For any application that was providing search-based discovery, it now makes sense to extend it to enable natural language engagement. It's a natural step forward for them to use Bedrock and other foundational models plus the vector database.
Is there any plan to develop a vector-specific database?
Krishnamoorthy: While we are adding these capabilities to Aurora or MemoryDB, you don't need to store your data in those databases. You can create a vector collection and use the databases entirely for those vector collections. You can use the databases as standalone vector databases, so there is the flexibility to do so. We think about the ease of use and performance of these systems as standalone vector capabilities.
Is adding vector search and storage built originally for something like Amazon Music to AWS databases a simple process?
Krishnamoorthy: One of the advantages we have is that we had vector engine capabilities in OpenSearch and PostgreSQL that have been proven out in large-scale use cases across Amazon. This core technology is built into the underlying engine, for example, of OpenSearch or Neptune. The places where we still needed to innovate were -- in the case of OpenSearch, we needed to make it serverless. In many cases, large foundational models require wider vectors -- the dimensions of the vectors are getting bigger -- so we needed to make sure that we can match the underlying scale. We also needed to add algorithms for what data is considered to be similar.
We are leveraging, essentially, the scale and validation we already had. But we had to go and improve the technology because the foundational LLMs place new demands on these systems.
How will AWS expand or improve vector search and storage capabilities going forward?
Krishnamoorthy: We work backward from customers. We are happy we are able to put these capabilities together for customers to build on. They are enabling a set of scenarios that we're excited about. Now we're looking [at], as customers are developing experiences, what will be other experiences that will evolve? We will take our cues from that.
A few things are clear. We're asking how we can make [vector search and storage] simpler. That's one reason we took the approach of not building separate systems but integrating [vector search and storage] into the databases they're already working with. A second thing is asking how we make it faster. In some ways, these are complex calculations, so we're trying to keep the system really responsive. The last one is how we can reduce the cost of adding these capabilities.
Regarding cost, cloud spending has become an issue for many enterprises, and training an LLM to understand a business without hallucinating requires significant compute power. How can AWS help customers better manage spending?
Krishnamoorthy: Retrieval-augmented generation enables you to address concerns around hallucinations and make sure responses from LLMs are grounded. In any use case, a customer should build a RAG solution using vector capabilities. Fine-tuning LLMs is an advanced next step from there. You can use Bedrock and vector capabilities to build a RAG solution first. If I wanted to fine-tune a model, I'd then probably set up a separate conversation [between data and an LLM].
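A hedged sketch of what "build a RAG solution first" can look like with Bedrock knowledge bases follows: the bedrock-agent-runtime client's retrieve_and_generate call handles retrieval from the vector store and the generation step in one request. The knowledge base ID, model ARN and exact parameter shapes here are assumptions for illustration and should be verified against the current Bedrock API reference.

```python
# A hedged sketch of a Bedrock knowledge base RAG call, assuming boto3 access.
# The knowledge base ID, model ARN and parameter shapes are illustrative
# assumptions; verify them against the current Bedrock API reference.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "Were sales in Michigan up or down in March?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB_ID_PLACEHOLDER",                                   # hypothetical ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/MODEL_ID_HERE",  # hypothetical ARN
        },
    },
)

# The service retrieves the most relevant chunks from the vector store and
# grounds the model's answer in them before returning the generated text.
print(response["output"]["text"])
```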
But even that RAG pipeline is constantly working and costing money. So are there any cost-control measures AWS can provide?
Krishnamoorthy: We enable our customers to have a broad range of foundational models they can apply to their problem. One of the advantages we are providing is that through Bedrock, customers building solutions have a range of models that are available to them. They can then pick the right model that strikes the right balance of accuracy and cost efficiency.
Since they're building to the same API, they can run experiments and pick the model that has the highest performance [to] lowest cost ratio. Once they make that model selection, they can use that model to create the vector index so that they're using the same model to retrieve the vectors and generate them. In terms of helping our customers, Bedrock, which gives them the same API and a range of models, is a big step forward for customers.
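Because Bedrock exposes different foundation models behind the same API, an experiment like the one Krishnamoorthy describes can be a short loop. The sketch below uses Bedrock's Converse API to send the same prompt to two example model IDs and print the token usage that feeds a cost comparison; the model IDs are examples only, and availability and pricing vary by account and region.

```python
# Sketch of a model bake-off behind Bedrock's single API, assuming boto3 access.
# The two model IDs are examples only; the models enabled for an account, and
# their prices, vary and must be checked separately.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
prompt = "Summarize last quarter's sales trends in two sentences."

for model_id in ["anthropic.claude-3-haiku-20240307-v1:0", "amazon.titan-text-express-v1"]:
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    answer = response["output"]["message"]["content"][0]["text"]
    usage = response["usage"]   # input and output token counts feed the cost comparison
    print(model_id, usage, answer[:80])
```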
Do you have any examples of users trying out different foundational models that enable them to strike the right balance between performance and cost?
Krishnamoorthy: I, myself, am a customer of these capabilities. In QuickSight we are enabling generative BI. As we were building our capabilities, we needed to experiment with available models and see which ones work best for the different experiences in QuickSight, such as natural language processing. It was easy for us to experiment with Bedrock and a range of models.
Bedrock is one of the core capabilities users have to find the right experience. The space is moving so fast that the models that were state of the art in terms of core capabilities or best price performance have evolved. Beyond one model being best for use cases right now, [Bedrock] enables users to tap into advances that have happened in the last six months and that will happen in the next six months. [Users] can just focus on building a great experience for applications.
When AWS took on the task of adding vector search and storage to its existing suite of databases, did the impetus come from customers requesting such capabilities or from observations about where data management and analytics were headed?
Krishnamoorthy: In some ways, we work back from customers. The big change was that there were rapid advances in foundational models, and there was a desire to introduce chat capabilities as a great user experience. Previously, vector search and storage were available for advanced developers. Now it is becoming democratized and mainstream. The big change I think about is that there have been big advances in foundational model capabilities that led to a big wave of mainstream adoption. We are responding to that.
The fact that we had all this work we had already done [that] was able to lead to all these capabilities [in databases] -- I'm excited about combining Bedrock with vector capabilities across our data stack. It makes it easy for customers to bring [generative AI] experiences into every application.
Editor's note: This Q&A has been edited for clarity and conciseness.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.