As part of the Linux Foundation, we ask community members to adhere to the Linux Foundation Code of Conduct.
GitHub Repository: Find the complete Feast codebase on GitHub.
Community Governance Doc: See the governance model of Feast, including who the maintainers are and how decisions are made.
Google Folder: This folder is used as a central repository for all Feast resources. For example:
Design proposals in the form of Request for Comments (RFC).
User surveys and meeting minutes.
Slide decks of conferences our contributors have spoken at.
Feast Linux Foundation Wiki: Our LFAI wiki page contains links to resources for contributors and maintainers.
GitHub Issues: Found a bug or need a feature? Create an issue on GitHub.
Feast (Feature Store) is an open-source feature store that helps teams operate production ML systems at scale by allowing them to define, manage, validate, and serve features for production AI/ML.
Feast's feature store is composed of two foundational components: (1) an offline store for historical feature extraction used in model training and (2) an online store for low-latency feature serving in production systems and applications.
Feast is a configurable operational data system that reuses existing infrastructure to manage and serve machine learning features to real-time models. For more details, please review our architecture.
Concretely, Feast provides:
A Python SDK for programmatically defining features, entities, sources, and (optionally) transformations (see the sketch after this list)
A Python SDK for reading and writing features to configured offline and online data stores
An optional feature server for reading and writing features (useful for non-python languages)
A UI for viewing and exploring information about features defined in the project
A CLI tool for viewing and updating feature information
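For illustration, a minimal feature definition with the Python SDK might look like the sketch below. The file path, entity, and field names are hypothetical, and exact imports can vary slightly across Feast versions.

```python
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64

# Illustrative batch source containing hourly driver statistics (Parquet file).
driver_stats_source = FileSource(
    path="data/driver_stats.parquet",
    timestamp_field="event_timestamp",
)

# The entity declares the join key used to look features up.
driver = Entity(name="driver", join_keys=["driver_id"])

# A feature view groups related features and ties them to their source.
driver_hourly_stats = FeatureView(
    name="driver_hourly_stats",
    entities=[driver],
    ttl=timedelta(days=1),
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="avg_daily_trips", dtype=Int64),
    ],
    source=driver_stats_source,
)
```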
Feast allows ML platform teams to:
Make features consistently available for training and low-latency serving by managing an offline store (to process historical data for scale-out batch scoring or model training), a low-latency online store (to power real-time prediction), and a battle-tested feature server (to serve pre-computed features online).
Avoid data leakage by generating point-in-time correct feature sets so data scientists can focus on feature engineering rather than debugging error-prone dataset joining logic. This ensures that future feature values do not leak to models during training (see the retrieval example after this list).
Decouple ML from data infrastructure by providing a single data access layer that abstracts feature storage from feature retrieval, ensuring models remain portable as you move from training models to serving models, from batch models to real-time models, and from one data infra system to another.
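To illustrate point-in-time correct retrieval, the sketch below assumes a configured feature repository containing the hypothetical driver_hourly_stats view from the earlier example.

```python
from datetime import datetime

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # assumes a feature repo in the current directory

# The entity dataframe lists the entities and the timestamps to join "as of",
# so no feature values from after each event timestamp leak into training.
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": [datetime(2024, 1, 1), datetime(2024, 1, 2)],
        "label_trip_completed": [1, 0],
    }
)

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
).to_df()
```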
Note: Feast today primarily addresses timestamped structured data.
Note: Feast uses a push model for online serving. This means that the feature store pushes feature values to the online store, which reduces the latency of feature retrieval. This is more efficient than a pull model, where the model serving system must make a request to the feature store to retrieve feature values. See this document for a more detailed discussion.
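Because values are pushed ahead of time, online retrieval is a simple read. A minimal sketch, reusing the hypothetical driver_hourly_stats view and assuming a configured feature repo, might look like this:

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Feature values were pushed to the online store ahead of time, so this is a
# low-latency key-value lookup rather than an on-the-fly computation.
online_features = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```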
Feast helps ML platform/MLOps teams with DevOps experience productionize real-time models. Feast also helps these teams build a feature platform that improves collaboration between data engineers, software engineers, machine learning engineers, and data scientists.
For Data Scientists: Feast is a tool where you can easily define, store, and retrieve your features for both model development and model deployment. By using Feast, you can focus on what you do best: build features that power your AI/ML models and maximize the value of your data.
For MLOps Engineers: Feast is a library that connects your existing infrastructure (e.g., online database, application server, microservice, analytical database, and orchestration tooling) so that your Data Scientists can ship features for their models to production using a friendly SDK, without having to worry about the software engineering challenges of serving real-time production systems. By using Feast, you can focus on maintaining a resilient system instead of implementing features for Data Scientists.
For Data Engineers: Feast provides a centralized catalog for storing feature definitions, allowing one to maintain a single source of truth for feature data. It provides the abstraction for reading and writing to many different types of offline and online data stores. Using either the provided Python SDK or the feature server service, users can write data to the online and/or offline stores and then read that data out again in either low-latency online scenarios for model inference, or in batch scenarios for model training.
For AI Engineers: Feast provides a platform designed to scale your AI applications by enabling seamless integration of richer data and facilitating fine-tuning. With Feast, you can optimize the performance of your AI models while ensuring a scalable and efficient data pipeline.
An ETL / ELT system. Feast is not a general purpose data pipelining system. Users often leverage tools like dbt to manage upstream data transformations. Feast does support some transformations.
A data orchestration tool: Feast does not manage or orchestrate complex workflow DAGs. It relies on upstream data pipelines to produce feature values and integrations with tools like Airflow to make features consistently available.
A data warehouse: Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a lightweight downstream layer that can serve data from an existing data warehouse (or other data sources) to models in production.
A database: Feast is not a database, but helps manage data stored in other systems (e.g. BigQuery, Snowflake, DynamoDB, Redis) to make features consistently available at training / serving time
batch feature engineering: Feast supports on-demand and streaming transformations. Feast is also investing in supporting batch transformations.
native streaming feature integration: Feast enables users to push streaming features, but does not pull from streaming sources or manage streaming pipelines.
data quality / drift detection: Feast has experimental integrations with Great Expectations, but is not purpose built to solve data drift / data quality issues. This requires more sophisticated monitoring across data pipelines, served feature values, labels, and model versions.
Many companies have used Feast to power real-world ML use cases such as:
Personalizing online recommendations by leveraging pre-computed historical user or item features.
Online fraud detection, using features that compare against (pre-computed) historical transaction patterns
Churn prediction (an offline model), generating feature values for all users at a fixed cadence in batch
Credit scoring, using pre-computed historical features to compute the probability of default
The best way to learn Feast is to use it. Head over to our Quickstart and try it out!
Explore the following resources to get started with Feast:
Quickstart is the fastest way to get started with Feast
Concepts describes all important Feast API concepts
Architecture describes Feast's overall architecture.
Tutorials shows full examples of using Feast in machine learning applications.
Running Feast with Snowflake/GCP/AWS provides a more in-depth guide to using Feast.
Reference contains detailed API and design documents.
Contributing contains resources for anyone who wants to contribute to Feast.
A feature transformation is a function that takes some set of input data and returns some set of output data. Feature transformations can happen on either raw data or derived data.
Feature transformations can be executed by three types of "transformation engines":
The Feast Feature Server
An Offline Store (e.g., Snowflake, BigQuery, DuckDB, Spark, etc.)
A Stream processor (e.g., Flink or Spark Streaming)
The three transformation engines are coupled with the communication pattern used for writes.
Importantly, this implies that different feature transformation code may be used under different transformation engines, so understanding the tradeoffs of when to use which transformation engine/communication pattern is critical to the success of your implementation.
In general, we recommend choosing the transformation engine and network call pattern that is most appropriate for the data producer, feature/model usage, and overall product.
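As one example of a transformation executed by the Feast feature server, an on-demand feature view can combine a precomputed feature with request-time data. The sketch below reuses the hypothetical driver_hourly_stats view defined earlier and is indicative rather than exhaustive.

```python
import pandas as pd

from feast import Field, RequestSource, on_demand_feature_view
from feast.types import Float64, Int64

# Request-time input provided by the caller at serving time (illustrative).
vals_to_add = RequestSource(
    name="vals_to_add",
    schema=[Field(name="val_to_add", dtype=Int64)],
)

# driver_hourly_stats is the FeatureView from the earlier definition sketch.
@on_demand_feature_view(
    sources=[driver_hourly_stats, vals_to_add],
    schema=[Field(name="conv_rate_adjusted", dtype=Float64)],
)
def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
    # Runs in the feature server (or SDK) at request time.
    df = pd.DataFrame()
    df["conv_rate_adjusted"] = inputs["conv_rate"] + inputs["val_to_add"]
    return df
```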
Feast's architecture is designed to be flexible and scalable. It is composed of several components that work together to provide a feature store that can be used to serve features for training and inference.
Feast uses a push model to ingest data from different sources and store feature values in the online store. This allows Feast to serve features in real-time with low latency.
Feast supports feature transformations for On Demand and Streaming data sources and will support Batch transformations in the future. For Streaming and Batch data sources, Feast requires a separate transformation engine (in the batch case, this is typically your Offline Store). We are exploring adding a default streaming engine to Feast.
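A minimal sketch of the push model from the producer side follows, assuming a PushSource named driver_stats_push_source has been registered in the feature repo (the name and fields are hypothetical):

```python
from datetime import datetime

import pandas as pd
from feast import FeatureStore
from feast.data_source import PushMode

store = FeatureStore(repo_path=".")

# Fresh feature values computed by an upstream pipeline or stream consumer.
event_df = pd.DataFrame(
    {
        "driver_id": [1001],
        "event_timestamp": [datetime.utcnow()],
        "conv_rate": [0.85],
        "avg_daily_trips": [14],
    }
)

# Push the values so they are immediately available for online lookups
# (PushMode.ONLINE_AND_OFFLINE would also log them to the offline store).
store.push("driver_stats_push_source", event_df, to=PushMode.ONLINE)
```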
Domain expertise is recommended when integrating a data source with Feast, to understand the tradeoffs of different write patterns for your application.
We recommend using Python for your Feature Store microservice. As discussed below, precomputing features is the recommended path to ensure low-latency performance. Reducing feature serving to a lightweight database lookup is the ideal pattern, which means the marginal overhead of Python should be tolerable. Because of this, we believe the pros of Python outweigh the costs, as reimplementing feature logic is undesirable. Java and Go clients are also available for online feature retrieval.
Role-based access control (RBAC) is a security mechanism that restricts access to resources based on the roles of individual users within an organization. In the context of Feast, RBAC ensures that only authorized users or groups can access or modify specific resources, thereby maintaining data security and operational integrity.
Use Python to serve your features.
Python has emerged as the primary language for machine learning, and this extends to feature serving. There are five main reasons Feast recommends using a microservice written in Python.
You should meet your users where they are. Python’s popularity in the machine learning community is undeniable. Its simplicity and readability make it an ideal language for writing and understanding complex algorithms. Python boasts a rich ecosystem of libraries such as TensorFlow, PyTorch, XGBoost, and scikit-learn, which provide robust support for developing and deploying machine learning models and we want Feast in this ecosystem.
Precomputing features is the recommended optimal path to ensure low latency performance. Reducing feature serving to a lightweight database lookup is the ideal pattern, which means the marginal overhead of Python should be tolerable. Precomputation ensures product experiences for downstream services are also fast. Slow user experiences are bad user experiences. Precompute and persist data as much as you can.
Ensuring that features used during model training (offline serving) and online serving are available in production to make real-time predictions is critical. When features are initially developed, they are typically written in Python. This is due to the convenience and efficiency provided by Python's data manipulation libraries. However, in a production environment, there is often interest or pressure to rewrite these features in a different language, like Java, Go, or C++, for performance reasons. This reimplementation introduces a significant risk: training and serving skew. Note that there will always be some minor exceptions (e.g., any Time Since Last Event types of features) but this should not be the rule.
Training and serving skew occurs when there are discrepancies between the features used during model training and those used during prediction. This can lead to degraded model performance, unreliable predictions, and reduced velocity in releasing new features and new models. The process of rewriting features in another language is prone to errors and inconsistencies, which exacerbate this issue.
Rewriting features in another language is not only risky but also resource-intensive. It requires significant time and effort from engineers to ensure that the features are correctly translated. This process can introduce bugs and inconsistencies, further increasing the risk of training and serving skew. Additionally, maintaining two versions of the same feature codebase adds unnecessary complexity and overhead. More importantly, the opportunity cost of this work is high and requires twice the amount of resourcing. Reimplementing code should only be done when the performance gains are worth the investment. Features should largely be precomputed so the latency performance gains should not be the highest impact work that your team can accomplish.
Rather than switching languages, it is more efficient to optimize the performance of your feature store while keeping Python as the primary language. Optimization is a two-step process.
Use tools like cProfile to understand latency bottlenecks in your code. This will help you prioritize the biggest inefficiencies first. When we initially launched Python-native transformations, profiling the code helped us identify that Pandas resulted in a 10x overhead due to type conversion.
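For instance, a quick way to find hotspots with the standard-library profiler is sketched below; compute_features is a hypothetical stand-in for your own feature code.

```python
import cProfile
import pstats

def compute_features(rows):
    # Hypothetical feature computation to be profiled.
    return [{"ratio": r["clicks"] / max(r["views"], 1)} for r in rows]

rows = [{"clicks": i % 7, "views": i % 13} for i in range(100_000)]

profiler = cProfile.Profile()
profiler.enable()
compute_features(rows)
profiler.disable()

# Show the most expensive calls first so the biggest bottlenecks surface at the top.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```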
As mentioned, precomputation is the recommended path. In some cases, you may want fully synchronous writes from your data producer to your online feature store, in which case you will want your feature computations and writes to be very fast. In this case, we recommend optimizing the feature calculation code first.
You should optimize your code using libraries, tools, and caching. For example, identify whether your feature calculations can be optimized through vectorized calculations in NumPy; explore tools like Numba for faster execution; and cache frequently accessed data using tools like an lru_cache.
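A small sketch of these ideas follows; the functions are hypothetical examples, not part of the Feast API.

```python
from functools import lru_cache

import numpy as np

def conversion_ratio(purchases: np.ndarray, visits: np.ndarray) -> np.ndarray:
    # Vectorized calculation over whole arrays instead of a Python loop.
    return purchases / np.clip(visits, 1, None)

@lru_cache(maxsize=1024)
def category_weight(category_id: int) -> float:
    # Cache frequently accessed, rarely changing reference data; in practice
    # this might wrap a database or config-service lookup.
    return 1.0 + (category_id % 10) / 10.0
```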
Lastly, Feast will continue to optimize serving in Python and make the overall infrastructure more performant. This will better serve the community.
So we recommend focusing on optimizing your feature-specific code, reporting latency bottlenecks to the maintainers, and contributing to help the infrastructure be more performant.
By keeping features in Python and optimizing performance, you can ensure consistency between training and serving, reduce the risk of errors, and focus on launching more product experiences for your customers.
Embrace Python for feature serving, and leverage its strengths to build robust and reliable machine learning systems.
Note: this ML Infrastructure diagram highlights an orchestration pattern that is driven by a client application. This is not the only approach that can be taken and different patterns will result in different trade-offs.
Production machine learning systems can choose from four approaches to serving machine learning predictions (the output of model inference):
Online model inference with online features
Offline model inference without online features
Online model inference with online features and cached predictions
Online model inference without features
Note: online features can be sourced from batch, streaming, or request data sources.
These approaches have different tradeoffs and, in general, significant implementation differences.
Online model inference with online features is a powerful approach to serving data-driven machine learning applications. This requires a feature store to serve online features and a model server to serve model predictions (e.g., KServe). This approach is particularly useful for applications where request-time data is required to run inference.
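A sketch of this pattern, assuming a configured feature repo and a hypothetical model-server endpoint (the URL and payload shape depend on your model server):

```python
import requests
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# 1. Fetch precomputed features for the entities in the incoming request.
feature_vector = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()

# 2. Call the model server with the assembled feature vector (hypothetical endpoint).
response = requests.post(
    "http://model-server/v1/models/driver_ranker:predict",
    json={"instances": [feature_vector]},
)
prediction = response.json()
```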
Typically, Machine Learning teams find serving precomputed model predictions to be the most straightforward approach to implement. It simply treats the model predictions as a feature and serves them from the feature store using the standard Feast SDK. These predictions are typically generated by some batch process in which the model scores are precomputed. As a concrete example, the batch process can be as simple as a script that runs model inference locally for a set of users and outputs a CSV. This output file could then be materialized so that the model's predictions are served online, as shown in the code below.
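The sketch below illustrates this pattern. The entity, file path, and feature names are hypothetical, and the batch job's CSV output would typically be converted to Parquet, since FileSource expects Parquet data.

```python
from datetime import timedelta

from feast import Entity, FeatureStore, FeatureView, Field, FileSource
from feast.types import Float32

# Output of the batch scoring job (converted to Parquet for FileSource).
scores_source = FileSource(
    path="data/user_churn_scores.parquet",
    timestamp_field="event_timestamp",
)

user = Entity(name="user", join_keys=["user_id"])

# The precomputed prediction is treated like any other feature.
user_churn_scores = FeatureView(
    name="user_churn_scores",
    entities=[user],
    ttl=timedelta(days=7),
    schema=[Field(name="churn_score", dtype=Float32)],
    source=scores_source,
)

# After `feast apply` and `feast materialize`, serving a "prediction" is a lookup:
store = FeatureStore(repo_path=".")
scores = store.get_online_features(
    features=["user_churn_scores:churn_score"],
    entity_rows=[{"user_id": 42}],
).to_dict()
```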
Notice that the model server is not involved in this approach. Instead, the model predictions are precomputed and materialized to the online store.
While this approach can lead to quick impact for different business use cases, it suffers from stale data as well as only serving users/entities that were available at the time of the batch computation. In some cases, this tradeoff may be tolerable.
This is the most sophisticated approach: inference is optimized for low latency by caching predictions and running model inference when data producers write features to the online store. This approach is particularly useful for applications where features come from multiple data sources, the model is computationally expensive to run, or latency is a significant constraint.
Note that in this case a separate call to write_to_online_store is required when the underlying data changes, so that the cached predictions change along with it.
While this requires additional writes for every data producer, this approach will result in the lowest latency for model inference.
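A sketch of the write path for this pattern follows; user_predictions, run_model_inference, and the field names are hypothetical, and the feature view would need to be defined with a matching schema.

```python
from datetime import datetime

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")

def run_model_inference(features: dict) -> float:
    # Hypothetical stand-in for a real model call (e.g., a request to a model server).
    return 0.5

def on_feature_update(user_id: int, fresh_features: dict) -> None:
    """Hypothetical hook a data producer calls whenever a user's features change."""
    prediction = run_model_inference(fresh_features)

    # Cache the fresh features and the resulting prediction in the online store,
    # so serving becomes a read of the already-computed prediction.
    row = {
        "user_id": user_id,
        "event_timestamp": datetime.utcnow(),
        **fresh_features,
        "cached_prediction": prediction,
    }
    store.write_to_online_store(
        feature_view_name="user_predictions",
        df=pd.DataFrame([row]),
    )
```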
This approach does not require Feast. The model server can directly serve predictions without any features. This approach is common in Large Language Models (LLMs) and other models that do not require features to make predictions.
Implicit in the code examples above is a design choice about how clients orchestrate calls to get features and run model inference. The examples follow a Feast-centric pattern because the features are inputs to the model, so the sequencing is fairly obvious. An alternative approach is inference-centric, where a client calls an inference endpoint and the inference service is responsible for orchestration.
The list below contains the functionality that contributors are planning to develop for Feast.
We welcome contributions to all items in the roadmap!
Data Sources
Offline Stores
Online Stores
Feature Engineering
Streaming
Deployments
Feature Serving
Data Quality Management
Feature Discovery and Governance
Natural Language Processing
Note that generative models using Retrieval Augmented Generation (RAG) do require features, where the retrieved documents and their embeddings are treated as features, which Feast supports (this would fall under "Online Model Inference with Online Features").