# RAG Fine Tuning with Feast and Milvus

## Introduction

This example notebook provides a step-by-step demonstration of building and using a RAG system with Feast and the custom FeastRagRetriever. The notebook walks through:

1. Data Preparation
   * Loads a subset of the [Wikipedia DPR dataset](https://huggingface.co/datasets/facebook/wiki_dpr) (1% of training data)
   * Implements text chunking with configurable chunk size and overlap
   * Processes text into manageable passages with unique IDs
2. Embedding Generation
   * Uses `all-MiniLM-L6-v2` sentence transformer model
   * Generates 384-dimensional embeddings for text passages
   * Demonstrates batch processing with GPU support
3. Feature Store Setup
   * Creates a Parquet file as the historical data source
   * Configures Feast with the feature repository
   * Demonstrates writing embeddings from data source to Milvus online store which can be used for model training later
4. RAG System Implementation
   * **Embedding Model**: `all-MiniLM-L6-v2` (configurable)
   * **Generator Model**: `granite-3.2-2b-instruct` (configurable)
   * **Vector Store**: Custom implementation with Feast integration
   * **Retriever**: Custom implementation extending HuggingFace's RagRetriever
5. Query Demonstration
   * Perform inference with retrieved context

## Requirements

* A Kubernetes cluster with:
  * GPU nodes available (for model inference)
  * At least 200GB of storage
  * A standalone Milvus deployment. See example [here](https://github.com/milvus-io/milvus-helm/tree/master/charts/milvus).

## Running the example

Clone this repository: <https://github.com/feast-dev/feast.git> Navigate to the examples/rag-retriever directory. Here you will find the following files:

* **feature\_repo/feature\_store.yaml** This is the core configuration file for the RAG project's feature store, configuring a Milvus online store on a local provider.
  * In order to configure Milvus you should:
    * Update `feature_store.yaml` with your Milvus connection details:
      * host
      * port (default: 19530)
      * credentials (if required)
* **feature\_repo/ragproject\_repo.py** This is the Feast feature repository configuration that defines the schema and data source for Wikipedia passage embeddings.
* **rag\_feast.ipynb** This is a notebook demonstrating the implementation of a RAG system using Feast. The notebook provides:
  * A complete end-to-end example of building a RAG system with:
    * Data preparation using the Wiki DPR dataset
    * Text chunking and preprocessing
    * Vector embedding generation using sentence-transformers
    * Integration with Milvus vector store
    * Inference utilising a custom RagRetriever: FeastRagRetriever
  * Uses `all-MiniLM-L6-v2` for generating embeddings
  * Implements `granite-3.2-2b-instruct` as the generator model

Open `rag_feast.ipynb` and follow the steps in the notebook to run the example.

## FeastRagRetriver Low Level Design

![Low level design for feast rag retriever](/files/yhIWc6vHWgwX6oFMKptt)

## Helpful Information

* Ensure your Milvus instance is properly configured and running
* Vector dimensions and similarity metrics can be adjusted in the feature store configuration
* The example uses Wikipedia data, but the system can be adapted for other datasets


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.feast.dev/v0.54-branch/tutorials/rag-retriever.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
