Introduction
A couple of weeks ago, in our ChatGPT plugins blog, we talked about how plugins can extend the functionality of ChatGPT by letting it use third-party resources to act on your prompts. These plugins are most valuable when they compensate for ChatGPT's shortcomings. For example, ChatGPT is built on GPT-4, a large language model that does not handle mathematical and algebraic reasoning as well as it handles written language, so using a WolframAlpha plugin as a "math mode" when solving math problems makes sense!
Another disadvantage of ChatGPT that we mentioned is that it cannot use context when answering questions unless that context is provided in the query text itself. The ChatGPT retrieval plugin addresses this drawback by connecting ChatGPT to a vector database, which provides a robust solution to the above problem. The vector database connected to ChatGPT can be used to store and retrieve relevant information when answering queries and acts as long-term memory for the LLM.
Plugins are a very powerful way that you and I can contribute to improving how LLMs are used without having to retrain the underlying GPT model. Let's say you're using ChatGPT and you realize it can't carry on a conversation well when you ask about the weather, or that it doesn't know enough about your health to suggest delicious and healthy recipes based on your previous blood sugar, blood pressure, and health status. You can create a plugin to address these issues and thereby improve usability for everyone, since anyone can simply install your plugin and use it!
The only remaining questions are how to get access to the exclusive plugin alpha hosted by OpenAI, and how to create a plugin for ChatGPT! Don't worry, we have good news on both counts.
Weaviate, in partnership with OpenAI and Cortical Ventures, is organizing a full-day generative artificial intelligence hackathon at ODSC East on May 11 in Boston at the Hynes Convention Center. There, you'll have access to OpenAI's APIs and ChatGPT plugin tokens provided by OpenAI, and you'll be able to create your own plugins as well as AutoGPT-like applications to solve problems near and dear to you using tools like ChatGPT and Weaviate! You can register at the link above. Seats are limited, so don't delay!
Now let's move on to how you can create your own plugin for ChatGPT. Here we will go through the step-by-step process of creating a Weaviate Retrieval Plugin. The Weaviate retrieval plugin connects ChatGPT to a Weaviate instance and allows it to query relevant documents from the vector database, insert documents to "remember" information for the future, and delete documents to "forget" them! The process we used to create this plugin is very similar to what you might use to create a general purpose plugin, so we found it very instructive and hope it helps you!
How to create a ChatGPT plugin?
The code repository for the entire Weaviate Retrieval Plugin can be found here. Let's walk through the steps, including code snippets and some of the problems we encountered and how we eventually solved them.
The technology stack we used to develop this plugin is as follows:
- Python: everything is written in Python
- FastAPI: the server used to run the plugin
- Pytest: for writing and executing our tests
- Docker: we create containers for building, testing and deploying the plugin
Below are the steps we took to develop the plugin. The first part covers creating a web application with the desired endpoints, the second part covers developing the ChatGPT plugin itself, and the third part covers remote deployment using Fly.io. We cover the steps in order, but you can skip steps depending on how familiar you already are with the material.
Part 1: Creating a Web Application
Step 1: Setting up the development environment
We used Dev Containers to create our development environment. Fly.io, Docker and Poetry were added to the devcontainer.json file. Other dev container templates can be found here.
Step 2: Testing the installation
- After setting up the environment, we checked that everything worked: we created a dummy endpoint that simply responds with a {"Hello": "World"} object when called.
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    """
    Say hello to the world
    """
    return {"Hello": "World"}
- Set up tests with PyTest that accomplish two things: first, verify that our Weaviate instance is up and running (configured here), and second, that the FastAPI endpoint is responding. Both of these tests are defined here (a minimal sketch of what they could look like is shown after this list).
- We also created a makefile to automate running the tests and starting the endpoint. In the makefile we also specified a run command that starts the server locally so you can verify that the network settings are configured correctly. You can also connect to port 8000, which FastAPI listens on by default, to test connectivity.
- The final step to make sure everything is working correctly is to navigate to localhost:8000/docs, which should open the Swagger UI for your endpoint. The Swagger UI lets you work with the server and interact with any endpoints you've defined, all updated in real time - this is especially handy later when we want to call endpoints manually to query, add, and delete objects in the Weaviate instance.
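As a reference, here is a minimal sketch of what those two smoke tests could look like. The import path of the FastAPI app and the WEAVIATE_HOST variable are assumptions, not the exact code from the repo.

import os

import weaviate
from fastapi.testclient import TestClient

from server.main import app  # hypothetical import path for the FastAPI app

client = TestClient(app)

def test_weaviate_is_live():
    # Verify the Weaviate instance is up and reachable
    host = os.environ.get("WEAVIATE_HOST", "http://localhost:8080")
    assert weaviate.Client(host).is_live()

def test_read_root():
    # Verify the FastAPI endpoint responds with the dummy object
    response = client.get("/")
    assert response.status_code == 200
    assert response.json() == {"Hello": "World"}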
Once all of the above is done and everything looks in order, you can start implementing plugin-specific features.
Step 3: Implement a function to obtain vector embeddings
Since we are implementing a plugin that connects a vector database to ChatGPT, we need to define a method for generating vector embeddings. It is used when upserting documents into our database, to generate and store an embedding for each document, and it is also used to vectorize queries when performing vector searches against the database. This function is implemented here.
import openai

def get_embedding(text):
    """
    Get embedding for the given text
    """
    results = openai.Embedding.create(input=text, model="text-embedding-ada-002")
    return results["data"][0]["embedding"]
Here we simply chose the ada-002 model because OpenAI indicates it is the model used in their retrieval plugin template; however, since the query runs against a vector database, we could have used any vectorizer.
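As a quick sanity check (not part of the plugin itself), you can call the function directly; assuming OPENAI_API_KEY is set in your environment, ada-002 should return a 1536-dimensional vector:

vector = get_embedding("The lion is king of the jungle")
print(len(vector))  # text-embedding-ada-002 produces 1536-dimensional embeddings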
Step 4: Implement the Weaviate client and vector database initialization functions
Next, we implement a couple of functions: one to initialize the Weaviate Python client, and one to initialize the Weaviate instance through the client by checking whether the schema exists and adding it if it doesn't.
import weaviate
import os
import logging

INDEX_NAME = "Document"

SCHEMA = {
    "class": INDEX_NAME,
    "properties": [
        {"name": "text", "dataType": ["text"]},
        {"name": "document_id", "dataType": ["string"]},
    ],
}

def get_client():
    """
    Get client to Weaviate server
    """
    host = os.environ.get("WEAVIATE_HOST", "http://localhost:8080")
    return weaviate.Client(host)

def init_db():
    """
    Create a schema for the database if it doesn't already exist
    """
    client = get_client()
    if not client.schema.contains(SCHEMA):
        logging.debug("Creating schema")
        client.schema.create_class(SCHEMA)
    else:
        class_name = SCHEMA["class"]
        logging.debug(f"Schema for {class_name} already exists")
        logging.debug("Skipping schema creation")
Step 5: Initialize the database at server startup and add a dependency for the Weaviate client
Now we need to integrate these functions so that when the ChatGPT plugin server starts, the Weaviate instance and the client connection are initialized automatically. To do this, we use FastAPI's lifespan feature in the server's main Python script, which runs every time the server starts. This simple function calls the database initialization function defined above, which in turn creates a Weaviate client object. Any logic that needs to run when the server shuts down can be placed after the yield statement. Since we don't need anything special for our plugin, we leave it empty.
from fastapi import FastAPI
from contextlib import asynccontextmanager
from .database import get_client, init_db

@asynccontextmanager
async def lifespan(app: FastAPI):
    init_db()
    yield

app = FastAPI(lifespan=lifespan)

def get_weaviate_client():
    """
    Get client to Weaviate server
    """
    yield get_client()
This completes the initial configuration and testing of the server. Now we move on to the most interesting part - implementing the endpoints that will give ChatGPT different ways to interact with our plugin!
Part 2: Implementing OpenAI Specific Functionality
Step 1: Develop Weaviate Retrieval Plugin specific endpoints
Our plugin has three specific endpoints: /upsert, /query and /delete. These give ChatGPT the ability to add objects to a Weaviate instance, query and search for objects in the Weaviate instance, and delete objects when needed. When the plugin is enabled, you can explicitly instruct ChatGPT to use a specific endpoint, but it will also decide on its own when to use the appropriate endpoint to answer a query! These endpoints extend the functionality of ChatGPT and allow it to interact with a vector database.
We developed these three endpoints using test-driven development, so we will first show the tests that each endpoint must pass and then the implementation that satisfies them. To prepare the Weaviate instance for these tests, we added the following test documents using a fixture:
@pytest.fixture
def documents(weaviate_client):
    docs = [
        {"text": "The lion is king of the jungle", "document_id": "1"},
        {"text": "The lion is a predator", "document_id": "2"},
        {"text": "The lion is a large animal", "document_id": "3"},
        {"text": "The capital of France is Paris", "document_id": "4"},
        {"text": "The capital of Germany is Berlin", "document_id": "5"},
    ]
    for doc in docs:
        client.post("/upsert", json=doc)
Implementing the /upsert endpoint:
After calling the /upsert endpoint, we want to verify that we receive the appropriate status code and also make sure that the content, id, and vector were inserted correctly.
Here's a test that accomplishes that:
def test_upsert(weaviate_client):
    response = client.post("/upsert", json={"text": "Hello World", "document_id": "1"})
    assert response.status_code == 200
    docs = weaviate_client.data_object.get(with_vector=True)["objects"]
    assert len(docs) == 1
    assert docs[0]["properties"]["text"] == "Hello World"
    assert docs[0]["properties"]["document_id"] == "1"
    assert docs[0]["vector"] is not None
The implementation presented below satisfies all of the above requirements and tests:
@app.post("/upsert")
defupsert(doc: Document, client=Depends(get_weaviate_client)):
""""
Insert document into weaviate
"""
with client.batch as batch:
batch.add_data_object(
data_object=doc.dict(),
class_name=INDEX_NAME,
vector=get_embedding(doc.text),
)
return{"status": "ok"}
The /query and /delete endpoints were designed in a similar way; if you're interested, you can read about them below!
Implementing the /query endpoint:
For this endpoint, we mainly want to verify that it returns the right number of objects and that the expected document is among the objects returned.
def test_query(documents):
    LIMIT = 3
    response = client.post("/query", json={"text": "lion", "limit": LIMIT})
    results = response.json()
    assert len(results) == LIMIT
    for result in results:
        assert "lion" in result["document"]["text"]
The implementation below accepts a query and returns a list of retrieved documents and metadata.
@app.post("/query", response_model=List[QueryResult])
]
defquery(query: Query, client=Depends(get_weaviate_client))-> List[Document]:
""""
Query weaviate for documents
"""
query_vector = get_embedding(query.text)
results =(
client.query.get(INDEX_NAME,["document_id", "text"])
.with_near_vector({"vector": query_vector})
.with_limit(query.limit)
.with_additional("certainty")
.do()
)
docs = results["data"]["Get"][INDEX_NAME]
return[
QueryResult(
document={"text": doc["text"], "document_id": doc["document_id"]},
score=doc["_additional"]["certainty"],
)
for doc in docs
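To get a feel for the response shape, you can call this endpoint manually (for example via the Swagger UI). A hypothetical call using the TestClient from the tests above would look like this; score holds Weaviate's certainty value:

response = client.post("/query", json={"text": "big cats", "limit": 2})
for hit in response.json():
    print(round(hit["score"], 3), "-", hit["document"]["text"])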
Implementing the /delete endpoint:
Here we just want to check that the response is returned correctly and that the total number of objects in the Weaviate instance decreases by one after deleting a single object.
def test_delete(documents, weaviate_client):
    num_docs_before_delete = weaviate_client.data_object.get()["totalResults"]
    response = client.post("/delete", json={"document_id": "3"})
    assert response.status_code == 200
    num_docs_after_delete = weaviate_client.data_object.get()["totalResults"]
    assert num_docs_after_delete == num_docs_before_delete - 1
And the endpoint implementation is as follows:
@app.post("/delete")
":
defdelete(delete_request: DeleteRequest, client=Depends(get_weaviate_client)):
""""
Deleting a document from weaviate
"""
result = client.batch.delete_objects(
class_name=INDEX_NAME,
where={
"operator": "Equal",
"path":["document_id"],
"valueText": delete_request.document_id,
},
)
if result["results"]["successful"]==1:
return{"status"ok"}
else:
return{"status": "not found"}
Here we have shown how our endpoints work. This is where your plugin will be most unique: depending on what functionality you want to implement, you can create the appropriate endpoints and test them.
Pay attention to the documentation we have included in all of our endpoints - it will be very important in the next step!
Step 2: Preparing plugin manifest files
This is where you tell OpenAI, and ChatGPT in particular, what endpoints your plugin exposes, how it can use those endpoints to perform specific tasks, what errors to expect if endpoints are used incorrectly, and more! The OpenAI instructions say to create two files: openapi.yaml and ai-plugin.json.
As you can see, both of these files must be in the .well-known directory, which must be mounted in the application as follows in order for ChatGPT to use them correctly.
app.mount("/.well-known", StaticFiles(directory=".well-known"), name="static")
Let's take a closer look at these two files:
ai-plugin.json
{
    "schema_version": "v1",
    "name_for_human": "Weaviate Retrieval Plugin V2",
    "name_for_model": "Weaviate_Retrieval_Plugin",
    "description_for_human": "A plugin for interacting with documents using natural language. You can request, add and delete documents.",
    "description_for_model": "A plugin for interacting with documents using natural language. You can request, add and delete documents.",
    "auth": {
        "type": "user_http",
        "authorization_type": "bearer"
    },
    "api": {
        "type": "openapi",
        "url": "https://demo-retrieval-app.fly.dev/.well-known/openapi.yaml",
        "is_user_authenticated": false
    },
    "logo_url": "https://demo-retrieval-app.fly.dev/.well-known/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "http://www.example.com/legal"
}
This specifies data such as the name of the application and logo assets; most interestingly, the name_for_model field specifies how the model (in this case ChatGPT/GPT-4) will refer to the plugin, and description_for_model gives a description of the plugin that the model can read and understand.
openapi.yaml
This file is most important for defining endpoints and describing each one for ChatGPT.
Creating this .yaml file was quite a challenge until we realized that we could simply generate the spec in JSON format by going into the Swagger UI via /docs and clicking on the /openapi.json link. You can then use this site to convert the .json to .yaml.
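If you would rather not use an online converter, a small local script can do the same conversion. This is a sketch that assumes the requests and PyYAML packages are installed and the server is running on localhost:8000:

import requests
import yaml

# Fetch the spec FastAPI generates and write it out as YAML
spec = requests.get("http://localhost:8000/openapi.json").json()
with open(".well-known/openapi.yaml", "w") as f:
    yaml.dump(spec, f, sort_keys=False)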
These two files are necessary for ChatGPT to properly understand and use your plugin's open endpoints.
A very interesting result of our experiments was that ChatGPT reads these files to understand not only when to use endpoints, but also how to use them correctly! So if ChatGPT is not using your endpoints correctly, you should try improving your plugin and endpoint descriptions. OpenAI offers several best practices for creating such descriptions. In our experiments, we found that if the description does not sufficiently describe how an endpoint should be used, ChatGPT calls the endpoint with the wrong syntax and retries if it fails. See the example below:
Conclusions: ChatGPT does not have hard-coded instructions on how to use plugin endpoints. You must be very careful about how you describe your plugin and endpoints to ChatGPT so that they are used as intended! The openapi.json specification that FastAPI generates is based on how you document the endpoints in your code, i.e. the function names, docstrings, query descriptions, and field descriptions in your pydantic models. The steps used to do this are beyond the scope of this article; please refer to the FastAPI documentation for more information. In general you want complete and comprehensive documentation for your plugin, as it is the documentation that will allow it to be used correctly!
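For example, descriptions attached to pydantic fields flow straight into the generated spec that ChatGPT reads. A sketch of what this could look like for the Query model (the exact descriptions in the repo may differ):

from pydantic import BaseModel, Field

class Query(BaseModel):
    text: str = Field(..., description="Natural-language text to search for in the vector database")
    limit: int = Field(3, description="Maximum number of documents to return")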
In addition, take care when writing descriptions, docstrings, etc. not to exceed the context length, since the plugin description, API requests, and API responses are all inserted into the ChatGPT conversation and count against the model's context limit.
Step 3: Local deployment of the plugin and testing using the ChatGPT user interface
Allow http://localhost:8000 and https://chat.openai.com to make cross-origin requests to the plugin server. This can be done easily with FastAPI's CORSMiddleware.
from fastapi.middleware.cors import CORSMiddleware

if os.getenv("ENV", "dev") == "dev":
    origins = [
        "http://localhost:8000",
        "https://chat.openai.com",
    ]
    app.add_middleware(
        CORSMiddleware,
        allow_origins=origins,
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )
The above code is only used for local testing and allows your locally deployed application to communicate with ChatGPT. Note that our plugin has two configurations that differ in the type of authentication used: for local deployment and testing we use no authentication, while for remote deployment we use bearer tokens and HTTPS, which we cover later. You can then run the plugin locally by following the instructions here. This will allow you to test all the endpoints through the ChatGPT UI and make sure they are working correctly.
Before we move on to the next point, take a look at some of the results of the endpoint testing we've been doing locally:
Make sure that the upsert and query endpoints are working correctly. Note that, depending on how our prompt is phrased, ChatGPT chooses the appropriate endpoints to call.
Part 3: Remote deployment on Fly.io
Step 1: Preparing to remotely deploy the plugin to Fly.io
Once we have tested the plugin locally, we can deploy it remotely and install it in ChatGPT. Below are the steps we follow to share our plugin with those who have access to the alpha version:
- Create a remote instance of Weaviate: this was done using Weaviate Cloud Services.
- Add a Dockerfile. This Dockerfile is a modified version of the template provided by OpenAI and is used to set up the environment remotely and start the server.
- Update the ai-plugin.json and openapi.yaml plugin manifest configuration files to use bearer token authentication and the newly created WCS instance instead of localhost.
- Update the application so that all communication is authenticated (a minimal sketch of what this could look like is shown after this list).
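As an illustration of the last point, here is a minimal sketch of how bearer token checks could be added with FastAPI's security utilities. The BEARER_TOKEN environment variable and the dependency wiring are assumptions; the actual repo may implement this differently.

import os

from fastapi import Depends, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

bearer_scheme = HTTPBearer()

def validate_token(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)):
    # Compare the presented token against the expected secret
    if credentials.credentials != os.environ["BEARER_TOKEN"]:
        raise HTTPException(status_code=401, detail="Invalid or missing bearer token")
    return credentials

# Endpoints can then require authentication, e.g.:
# @app.post("/upsert", dependencies=[Depends(validate_token)])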
Here you can see the full diff for a project configured for local deployment and how it has been modified to prepare for remote deployment.
Step 2: Deploy to Fly.io and install in ChatGPT
This is the last step to deploy the plugin to Fly.io, and you can follow the detailed instructions here. You can then open ChatGPT in a browser and, if you have access to the alpha plugin, install your plugin by specifying the URL where it is hosted and providing a bearer token for authentication.
Conclusions
That, ladies and gentlemen, is how we created our Weaviate retrieval plugin, which augments ChatGPT with long-term memory. The process for creating other plugins is fairly similar, and we believe most of these steps can be followed in much the same way to build a wide variety of plugins, with the most variation in Part 2, where you define the endpoints specific to your plugin.
Finally, let's visualize how ChatGPT can use the plugin we created. The figure below shows how the /query endpoint can be used. Depending on the query, it can also call the /delete and /upsert endpoints.
More specifically, when the user prompts ChatGPT, it looks at the openapi.yaml file to read the endpoint descriptions and decides which endpoint to use for the request. Above, it decides to use the /query endpoint. It will then try to construct a valid query and retry if it fails. The query returns to ChatGPT the relevant documents from Weaviate, which it uses to answer the original prompt!