What is the difference between Milvus and Weaviate? Using vector databases to power AI

Updated 11 months ago on July 18, 2023

Table of Contents

Inquiry

No significant differences between the technologies were found. The Milvus Python client implements a search method that retrieves a list of vectors, which allows a multi-vector query. The Weaviate Python client only allows vector search for a single vector.

As in the case of the indexing time analysis, both engines show similar behavior when executing queries. Note that before explicitly loading script collections for a query, Milvus experienced a warm-up effect that significantly affected the average query execution time for this system. Even after eliminating this drawback, the system is still prone to uneven query execution times. Nevertheless, it has a clear advantage over Weaviate: the average query execution time is shorter for all scenarios considered. Below are histograms of the average query execution times for each technology.

Conclusions

In this study, we set out to compare popular open source Vector Similarity Engine (VSE) solutions that facilitate embedding search through approximated nearest high-dimensional vectors. These new frameworks provide a more efficient approach to storing vector data. However, they are relatively recent and have yet to finalize fundamental features such as horizontal scaling, pagination, interleaving, and GPU support.

We hope to host a product catalog of 300k to 5 million products and latent relationships between products, queries and items, as well as other customer experience data. To realize our plans, we need the VSE to support a large dataset with multiple vector representations and provide multi-access efficiency. Therefore, as part of this study, it is critical for us to ensure that the following criteria are met:

  • VSE provides high quality (accuracy) results;
  • This is made possible by low-level implementations of executive index types.
  • VSE indexes the embeddings with satisfactory responsiveness;
  • This is made possible by low-level implementations of executive index types.
  • VSE fulfills requests at a high rate of speed;
  • This is made possible by low-level implementations of executive index types.
  • VSE enables horizontal scaling with load balancing and data redundancy to protect against hardware failures while increasing service capacity;
  • VSE provides a collection of K best results, but the system that integrates it can iterate the K best results over a window of size N.

This experiment focused mainly on two engines: Milvus and Weaviate. An analysis of the quality of the results obtained was not performed within the scope of this study, as this would have required an additional study of the configuration:

  • The embedding model(s) used to encode the information;
  • index type used.

Given that the experimental setup depended on the same configuration of all engines under study, we fixed the indexing algorithm on HNSW. Currently, to understand the impact of the indexing algorithm on speed, search quality and memory, you can refer to this blog article.

When analyzing indexing and query times, Milvus consistently outperforms Weaviate, with indexing times being particularly notable for the S9 scenario.

These technologies imposed technical limitations, such as the lack of support for multiple encodings (which actually required modifications to adapt the S4 - S9 scenarios). However, in [1], multi-vector querying refers to query retrieval based on a list of vectors and indexing entities that have more than one representation. Additional information on the toolkit roadmap indicates that these (and other) features should be expected in the next stable release.

More Questions

How do I install ChatGPT plugins? Updated 11 months ago

Go to the plugin store A button will appear to go to the plugin store. Here you can view a list of ChatGPT plugins. If you find a plugin you want to try, click on the green Install button next to it. Once installed, you will be able to access it anytime from the same menu.

Is there a Python API for ChatGPT? Updated 11 months ago

To use Chat GPT for Python, you need to install the OpenAI API client and create an API key. Once you have the API key, you can integrate ChatGPT directly into your applications, using environment variables or the ChatGPT messaging prompt for help writing and fixing code.

Is the ChatGPT API free? Updated 11 months ago

Is the ChatGPT API key free to use? No, the ChatGPI API Key is not free, however, users receive a free credit of about $18 when they create an account on OpenAPI. To do this, you need to open your preferred browser, click on the OpenAI API Key link, and log in.

What is the main advantage of custom system development? Updated 11 months ago

Targeted solutions. Perhaps the most important reason to invest in custom software development is to create a product that meets your exact needs. It's not uncommon for businesses to choose an off-the-shelf software option and then realize it's not right for them.

What is Milvus used for? Updated 11 months ago

Build powerful machine learning applications and manage massive vector data with Milvus. Searching data by easily definable criteria, such as querying a movie database by actor, director, genre, or release date, is easy.

Are AI developers in demand? Updated 11 months ago

Job Outlook for Artificial Intelligence Engineers Jobs for Artificial Intelligence Engineers are projected to grow 21% between 2021 and 2031, significantly higher than the average for all occupations (5%). AI engineers typically work for companies to help them improve their products, software, operations, and delivery.

Let's get in touch!

Please feel free to send us a message through the contact form.

Drop us a line at mailrequest@nosota.com / Give us a call over skypenosota.skype