Warning: This is an experimental feature. To our knowledge, this is stable, but there are still rough edges in the experience. Contributions are welcome!
Overview
A vector database allows users to store and retrieve embeddings. Feast provides general APIs for storing and retrieving embeddings across the supported vector databases.
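To make "retrieving embeddings" concrete, here is a minimal brute-force sketch of similarity search in plain Python. It is not how Feast or any of the supported databases implement retrieval (they use their own distance metrics and indexes); the function names and the 3-dimensional sample vectors are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: List[float], stored: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical stored embeddings (real embeddings have hundreds of dimensions).
stored_embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}

results = top_k_similar([1.0, 0.0, 0.0], stored_embeddings, k=2)
for name, score in results:
    print(name, round(score, 4))
```

A vector database performs the same kind of ranking, but typically with an index so that it does not have to compare the query against every stored vector.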
Integration
Below are supported vector databases and implemented features:
| Vector Database | Retrieval | Indexing |
|-----------------|-----------|----------|
| Pgvector        | [x]       | [ ]      |
| Elasticsearch   | [x]       | [x]      |
| Milvus          | [ ]       | [ ]      |
| Faiss           | [ ]       | [ ]      |
| SQLite          | [x]       | [ ]      |
| Qdrant          | [x]       | [x]      |
Note: SQLite support is in limited access and currently works only on Python 3.10. It will be updated as sqlite_vec progresses.
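    # run_model, TOKENIZER, and MODEL come from the batch_score_documents module
    from batch_score_documents import run_model, TOKENIZER, MODEL
    from transformers import AutoTokenizer, AutoModel

    question = "the most populous city in the U.S. state of Texas?"

    # Load the tokenizer and model, then embed the question
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
    model = AutoModel.from_pretrained(MODEL)
    query_embedding = run_model(question, tokenizer, model)

    # Convert the tensor to a nested list and take its first element as the query vector
    query = query_embedding.detach().cpu().numpy().tolist()[0]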
Retrieve the top 5 similar documents
First, create a feature store instance, then use the retrieve_online_documents API to retrieve the 5 documents most similar to the specified query.
    from feast import FeatureStore

    store = FeatureStore(repo_path=".")

    # Retrieve the 5 documents whose embeddings are closest to the query vector
    features = store.retrieve_online_documents(
        feature="city_embeddings:Embeddings",
        query=query,
        top_k=5
    ).to_dict()

    def print_online_features(features):
        # Print each returned field and its values, sorted by field name
        for key, value in sorted(features.items()):
            print(key, " : ", value)

    print_online_features(features)
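The dictionary printed above is organized by field rather than by document. If one record per retrieved document is more convenient, a small helper can transpose it. The sketch below assumes the dictionary maps each field name to a list with one entry per document; the helper name to_rows and the sample field names and values are hypothetical stand-ins, not actual Feast output.

```python
from typing import Any, Dict, List


def to_rows(features: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Turn {field: [v0, v1, ...]} into [{field: v0}, {field: v1}, ...]."""
    if not features:
        return []
    lengths = {len(values) for values in features.values()}
    if len(lengths) != 1:
        raise ValueError("all fields must have the same number of values")
    keys = sorted(features)
    columns = [features[key] for key in keys]
    # zip(*columns) walks the columns in parallel, yielding one tuple per document.
    return [dict(zip(keys, row)) for row in zip(*columns)]


# Hypothetical stand-in for the dictionary returned by .to_dict()
sample = {
    "Embeddings": [[0.1, 0.2], [0.3, 0.4]],
    "distance": [0.12, 0.34],
}

rows = to_rows(sample)
for rank, row in enumerate(rows, start=1):
    print(rank, row)
```

With real results, calling to_rows(features) in place of the sample would give one dictionary per retrieved document, in the order the store returned them.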