Architecture

Sequence description

  1. Log Raw Events: Production backend applications are configured to emit internal state changes as events to a stream.

  2. Create Stream Features: Stream processing systems like Flink, Spark, and Beam are used to transform and refine events and to produce features that are logged back to the stream.

  3. Log Streaming Features: Both raw and refined events are logged into a data lake or batch storage location.

  4. Create Batch Features: ELT/ETL systems like Spark and SQL are used to transform data in the batch store.

  5. Define and Ingest Features: The Feast user defines feature tables based on the features available in batch and streaming sources and publishes these definitions to Feast Core (see the first sketch after this list).

  6. Poll Feature Definitions: The Feast Job Service polls for new or changed feature definitions.

  7. Start Ingestion Jobs: Every new feature table definition results in a new ingestion job being provisioned (see limitations).

  8. Batch Ingestion: Batch ingestion jobs are short-lived jobs that load data from batch sources into either an offline or online store (see limitations).

  9. Stream Ingestion: Streaming ingestion jobs are long-lived jobs that load data from stream sources into online stores. A stream source and batch source on a feature table must have the same features/fields.

  10. Model Training: A model training pipeline is launched. It uses the Feast Python SDK to retrieve a training dataset and trains a model.

  11. Get Historical Features: Feast exports a point-in-time correct training dataset based on the list of features and entity DataFrame provided by the model training pipeline.

  12. Deploy Model: The trained model binary (along with its list of features) is deployed into a model serving system.

  13. Get Prediction: A backend system makes a request for a prediction from the model serving service.

  14. Retrieve Online Features: The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK (see the second sketch after this list).

  15. Return Prediction: The model serving service makes a prediction using the returned features and returns the outcome.
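
As a concrete illustration of step 5, the snippet below sketches how a feature table could be defined and registered with Feast Core through the Python SDK. This is a minimal sketch that assumes the Feast 0.9-style SDK used by the Kubernetes deployment; the entity, feature names, source path, and Core address are hypothetical, and exact constructor arguments may differ between releases.

```python
from feast import Client, Entity, Feature, FeatureTable, ValueType
from feast.data_format import ParquetFormat
from feast.data_source import FileSource

# Connect to Feast Core (hypothetical in-cluster address).
client = Client(core_url="feast-release-feast-core:6565")

# Entity that the features are keyed on.
driver = Entity(
    name="driver_id",
    description="Driver identifier",
    value_type=ValueType.INT64,
)

# Batch source holding the features produced in steps 2-4.
driver_statistics_source = FileSource(
    event_timestamp_column="event_timestamp",
    file_format=ParquetFormat(),
    file_url="file:///data/driver_statistics",  # hypothetical location
)

# Feature table grouping the features for this entity.
driver_statistics = FeatureTable(
    name="driver_statistics",
    entities=["driver_id"],
    features=[
        Feature("trips_today", ValueType.INT32),
        Feature("acc_rate", ValueType.FLOAT),
    ],
    batch_source=driver_statistics_source,
)

# Publish the definitions to Feast Core.
client.apply(driver)
client.apply(driver_statistics)
```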
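
For step 14, a model serving service could look up online features roughly as follows. Again a sketch assuming the Feast 0.9-style Python SDK; the serving address, feature references, and entity key are hypothetical.

```python
from feast import Client

# Connect to Feast Online Serving (hypothetical in-cluster address).
client = Client(serving_url="feast-release-feast-online-serving:6566")

# Look up the latest feature values for the entities in the prediction request.
response = client.get_online_features(
    feature_refs=[
        "driver_statistics:trips_today",
        "driver_statistics:acc_rate",
    ],
    entity_rows=[{"driver_id": 1001}],
)

# Maps each feature reference (and the entity key) to a list of values,
# which the model serving service passes to the model for a prediction.
features = response.to_dict()
```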

Limitations

  • Only Redis is supported for online storage.

  • Batch ingestion jobs must be triggered from your own scheduler, such as Airflow (a sketch follows below). Streaming ingestion jobs are launched automatically by the Feast Job Service.
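
Because Feast does not schedule batch ingestion itself, teams typically wrap the ingestion call in their own scheduler. The sketch below shows one way this could look as an Airflow DAG. It assumes the separately installed feast-spark package; the start_offline_to_online_ingestion call, the Core address, and the feature table name are assumptions for illustration, not taken from this page.

```python
from datetime import datetime, timedelta

import feast
import feast_spark
from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_driver_statistics(**context):
    # Connect to Feast Core, then wrap the client with the Spark-specific SDK.
    client = feast.Client(core_url="feast-release-feast-core:6565")  # hypothetical address
    spark_client = feast_spark.Client(client)

    feature_table = client.get_feature_table("driver_statistics")

    # Load the previous day's batch data from the offline source into the online store.
    end = datetime.utcnow()
    start = end - timedelta(days=1)
    spark_client.start_offline_to_online_ingestion(feature_table, start, end)


with DAG(
    dag_id="feast_driver_statistics_ingestion",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="offline_to_online_ingestion",
        python_callable=ingest_driver_statistics,
    )
```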

Components:

A complete Feast deployment contains the following components:

  • Feast Core: Acts as the central registry for feature and entity definitions in Feast.

  • Feast Job Service: Manages data processing jobs that load data from sources into stores, and jobs that export training datasets.

  • Feast Serving: Provides low-latency access to feature values in an online store.

  • Feast Python SDK / CLI: The primary user-facing SDK. Used to:

    • Manage feature definitions with Feast Core.

    • Launch jobs through the Feast Job Service.

    • Retrieve training datasets.

    • Retrieve online features.

  • Online Store: The online store is a database that stores only the latest feature values for each entity. It can be populated either by batch ingestion jobs (when the user has no streaming source) or by a streaming ingestion job reading from a streaming source. Feast Online Serving looks up feature values from the online store.

  • Offline Store: The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets.

  • Feast Spark SDK: A Spark-specific Feast SDK. Allows teams to use Spark for loading features into an online store and for building training datasets over offline sources (see the sketch below).
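
To make the Feast Spark SDK more concrete, the sketch below outlines how a training pipeline could request a point-in-time correct training dataset (step 11 above). It assumes the separately installed feast-spark package; the entity source path, feature references, and the get_output_file_uri call are assumptions based on the 0.9-era API rather than anything stated on this page.

```python
import feast
import feast_spark
from feast.data_format import ParquetFormat
from feast.data_source import FileSource

# Wrap the core Python SDK client with the Spark-specific client.
client = feast.Client(core_url="feast-release-feast-core:6565")  # hypothetical address
spark_client = feast_spark.Client(client)

# Entity rows (entities plus event timestamps) to join features onto,
# stored as Parquet at a hypothetical location.
entity_source = FileSource(
    event_timestamp_column="event_timestamp",
    file_format=ParquetFormat(),
    file_url="file:///data/entity_rows",
)

# Launch a Spark job that produces a point-in-time correct training dataset.
job = spark_client.get_historical_features(
    feature_refs=[
        "driver_statistics:trips_today",
        "driver_statistics:acc_rate",
    ],
    entity_source=entity_source,
)

# Wait for the job to finish and read the exported dataset from the offline store.
output_uri = job.get_output_file_uri()
```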

Please see the configuration reference for more details on configuring these components.

Java and Go clients are also available for online feature retrieval. See the API Reference.
