Warning: This is an experimental feature. To our knowledge, this is stable, but there are still rough edges in the experience. Contributions are welcome!
Overview
A vector database allows users to store and retrieve embeddings. Feast provides general APIs for storing and retrieving them.
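To make "retrieving embeddings" concrete, here is a minimal sketch of what a similarity lookup does conceptually: score every stored vector against a query vector and keep the best matches. This is not Feast's implementation. The `cosine_similarity` and `top_k` helpers and the sample vectors are hypothetical, and a real vector database would typically use an index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [
        (name, cosine_similarity(query, vector))
        for name, vector in documents.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical two-dimensional embeddings, for illustration only.
documents = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [0.7, 0.7],
}

results = top_k([1.0, 0.1], documents, k=2)
for name, score in results:
    print(name, round(score, 3))
```

The brute-force scan keeps the sketch short. It is only meant to show the ranking idea behind a top-k query.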
Integration
Below are supported vector databases and implemented features:
Vector Database | Retrieval | Indexing
Note: SQLite support is in limited access and currently works only on Python 3.10. It will be updated as sqlite_vec progresses.
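    # run_model, TOKENIZER, and MODEL are imported from the batch_score_documents module.
    from batch_score_documents import run_model, TOKENIZER, MODEL
    from transformers import AutoTokenizer, AutoModel

    question = "the most populous city in the U.S. state of Texas?"

    # Load the tokenizer and model, then embed the question.
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
    model = AutoModel.from_pretrained(MODEL)
    query_embedding = run_model(question, tokenizer, model)

    # Convert the tensor to a plain Python list and take its first element.
    query = query_embedding.detach().cpu().numpy().tolist()[0]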
Retrieve the top 5 similar documents
First create a feature store instance, then use the retrieve_online_documents API to retrieve the 5 documents most similar to the specified query.
    from feast import FeatureStore

    store = FeatureStore(repo_path=".")
    features = store.retrieve_online_documents(
        feature="city_embeddings:Embeddings",
        query=query,
        top_k=5
    ).to_dict()

    def print_online_features(features):
        for key, value in sorted(features.items()):
            print(key, " : ", value)

    print_online_features(features)
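The dictionary returned by to_dict() is column-oriented: each key maps to a list with one entry per retrieved document. When it is more convenient to inspect one result at a time, the columns can be regrouped into rows. The sketch below assumes that shape. The `to_rows` helper is hypothetical, and the keys and values in `sample` are made up for illustration; they are not actual Feast output.

```python
def to_rows(columns):
    """Turn {"col": [v1, v2, ...]} into [{"col": v1}, {"col": v2}, ...]."""
    names = sorted(columns)
    return [
        dict(zip(names, values))
        for values in zip(*(columns[name] for name in names))
    ]


# Made-up sample in the assumed column-oriented shape (not real output).
sample = {
    "Embeddings": [[0.1, 0.2], [0.3, 0.4]],
    "distance": [0.12, 0.48],
}

rows = to_rows(sample)
for rank, row in enumerate(rows, start=1):
    print(rank, row)
```

Each printed row then groups every field that belongs to a single retrieved document.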
Configuration
We offer two Online Store options for vector databases: PGVector and SQLite.
Installation with SQLite
If you are using pyenv to manage your Python versions, you can install the SQLite extension with the following command: