Introduction

What is Feast?

Feast (Feature Store) is an operational data system for managing and serving machine learning features to models in production.

Problems Feast Solves

Models need consistent access to data: ML systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A result of this coupling, however, is that any change in data infrastructure may break dependent ML systems. Another challenge is that dual implementations of data retrieval for training and serving can lead to inconsistencies in data, which in turn can lead to training-serving skew.

Feast decouples your models from your data infrastructure by providing a single data access layer that abstracts feature storage from feature retrieval. Feast also provides a consistent means of referencing feature data for retrieval, and therefore ensures that models remain portable when moving from training to serving.

Deploying new features into production is difficult: Many ML teams consist of members with different objectives. Data scientists, for example, aim to deploy features into production as soon as possible, while engineers want to ensure that production systems remain stable. These differing objectives can create an organizational friction that slows time-to-market for new features.

Feast addresses this friction by providing both a centralized registry to which data scientists can publish features, and a battle-hardened serving layer. Together, these enable non-engineering teams to ship features into production with minimal oversight.

Models need point-in-time correct data: ML models in production require a view of data consistent with the one on which they are trained, otherwise the accuracy of these models could be compromised. Despite this need, many data science projects suffer from inconsistencies introduced by future feature values being leaked to models during training.

Feast solves the challenge of data leakage by providing point-in-time correct feature retrieval when exporting feature datasets for model training.
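Point-in-time correctness can be illustrated without Feast at all. The sketch below (hypothetical data and helper, not Feast's API) shows why a training row may only see feature values observed at or before its own timestamp:

```python
from datetime import datetime

def point_in_time_value(feature_events, entity_ts):
    """Return the latest feature value observed at or before entity_ts.

    feature_events: list of (timestamp, value) pairs for one entity.
    Values with timestamps after entity_ts are never returned, which is
    exactly the leakage that point-in-time retrieval prevents.
    """
    eligible = [(ts, v) for ts, v in feature_events if ts <= entity_ts]
    if not eligible:
        return None
    return max(eligible)[1]

# Hypothetical feature history for one entity.
events = [
    (datetime(2021, 4, 10), 0.80),
    (datetime(2021, 4, 12), 0.85),  # arrives *after* the training row below
]

# Training row observed on April 11: only the April 10 value may be used.
print(point_in_time_value(events, datetime(2021, 4, 11)))  # 0.8
```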

Features aren't reused across projects: Different teams within an organization are often unable to reuse features across projects. The siloed nature of development and the monolithic design of end-to-end ML systems contribute to duplication of feature creation and usage across teams and projects.

Feast addresses this problem by introducing feature reuse through a centralized system (a registry). This registry enables multiple teams working on different projects not only to contribute features, but also to reuse these same features. With Feast, data scientists can start new ML projects by selecting previously engineered features from a centralized registry, and are no longer required to develop new features for each project.

Problems Feast does not yet solve

Feature engineering: We aim for Feast to support light-weight feature engineering as part of our API.

Feature discovery: We also aim for Feast to include a first-class user interface for exploring and discovering entities and features.

Feature validation: We additionally aim for Feast to improve support for statistics generation of feature data and subsequent validation of these statistics. Current support is limited.

What Feast is not

An ETL or ELT system: Feast is not (and does not plan to become) a general purpose data transformation or pipelining system. Feast plans to include a light-weight feature engineering toolkit, but we encourage teams to integrate Feast with upstream ETL/ELT systems that are specialized in transformation.

Data warehouse: Feast is not a replacement for your data warehouse or the source of truth for all transformed data in your organization. Rather, Feast is a light-weight downstream layer that can serve data from an existing data warehouse (or other data sources) to models in production.

Data catalog: Feast is not a general purpose data catalog for your organization. Feast is purely focused on cataloging features for use in ML pipelines or systems, and only to the extent of facilitating the reuse of features.

How can I get started?

The best way to learn Feast is to use it. Head over to our Quickstart and try it out!

Explore the following resources to get started with Feast:

  • Quickstart is the fastest way to get started with Feast.

  • Getting started provides a step-by-step guide to using Feast.

  • Concepts describes all important Feast API concepts.

Online store

The Feast online store is used for low-latency online feature value lookups. Feature values are loaded into the online store from data sources in feature views using the materialize command.

The storage schema of features within the online store mirrors that of the data source used to populate the online store. One key difference between the online store and data sources is that only the latest feature values are stored per entity key. No historical values are stored.
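The "latest values only" behavior can be sketched in plain Python (a hypothetical helper, not Feast's implementation): materialization reduces a time-series of feature rows to a single row per entity key:

```python
def latest_per_entity(rows):
    """Reduce a time-series of feature rows to the latest value per entity
    key, mimicking what materialization writes into the online store."""
    latest = {}
    for entity_key, ts, value in rows:
        if entity_key not in latest or ts > latest[entity_key][0]:
            latest[entity_key] = (ts, value)
    return {k: v for k, (ts, v) in latest.items()}

# Hypothetical rows: (entity_key, event_timestamp, feature_value).
rows = [
    ("driver_1001", 1, 0.30),
    ("driver_1001", 2, 0.35),  # newer value for the same key wins
    ("driver_1002", 1, 0.50),
]
print(latest_per_entity(rows))  # {'driver_1001': 0.35, 'driver_1002': 0.5}
```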

Example: once a batch data source is materialized into Feast (using feast materialize), only the latest feature value per entity key is written to the online store.

Data sources

Please see the Data Source concept page for an explanation of data sources.

File

Description

File data sources allow for the retrieval of historical feature values from files on disk for building training datasets, as well as for materializing features into an online store.

Offline stores

Please see the Offline Store concept page for an explanation of offline stores.

Online stores

Please see the Online Store concept page for an explanation of online stores.

Datastore

Description

The Datastore online store provides support for materializing feature values into Cloud Datastore. The data model used to store feature values in Datastore is described in more detail in the Feast documentation.

Providers

Please see the Provider concept page for an explanation of providers.

Usage

How Feast SDK usage is measured

The Feast project logs anonymous usage statistics and errors in order to inform our planning. Several client methods are tracked, beginning in Feast 0.9. Users are assigned a UUID which is sent along with the name of the method, the Feast version, the OS (using sys.platform), and the current time.

The relevant source code is available in the Feast repository.

How to disable usage logging

Set the environment variable FEAST_USAGE to False.
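For example, in a shell session (FEAST_USAGE is the variable documented above):

```shell
# Opt out of anonymous usage logging for this session.
export FEAST_USAGE=False
echo "$FEAST_USAGE"
```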



Install Feast

Install Feast using pip:

pip install feast

Install Feast with GCP dependencies (required when using BigQuery or Firestore):

pip install 'feast[gcp]'

Python SDK

Install the Feast Python SDK using pip:

pip install feast==0.9.*

Connect to an existing Feast Core deployment:

from feast import Client

# Connect to an existing Feast Core deployment
client = Client(core_url='feast.example.com:6565')

# Ensure that your client is connected by printing out some feature tables
client.list_feature_tables()

Feast CLI

Install the Feast CLI using pip:

pip install feast==0.9.*

Configure the CLI to connect to your Feast Core deployment:

feast config set core_url your.feast.deployment

By default, all configuration is stored in ~/.feast/config.

The CLI is a wrapper around the Feast Python SDK:

$ feast

Usage: feast [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  config          View and edit Feast properties
  entities        Create and manage entities    
  feature-tables  Create and manage feature tables
  jobs            Create and manage jobs
  projects        Create and manage projects
  version         Displays version and connectivity information
Example

Configuration options are available in the API reference.

from feast import FileSource
from feast.data_format import ParquetFormat

parquet_file_source = FileSource(
    file_format=ParquetFormat(),
    file_url="file:///feast/customer.parquet",
)
Example

Configuration options are available in the API reference.
feature_store.yaml
project: my_feature_repo
registry: data/registry.db
provider: gcp
online_store:
  type: datastore
  project_id: my_gcp_project
  namespace: my_datastore_namespace

  • Reference contains detailed API and design documents.

  • Contributing contains resources for anyone who wants to contribute to Feast.


    Create a feature repository

    A feature repository is a directory that contains the configuration of the feature store and individual features. This configuration is written as code (Python/YAML) and it's highly recommended that teams track it centrally using git. See Feature Repository for a detailed explanation of feature repositories.

    The easiest way to create a new feature repository is to use the feast init command:

    feast init
    
    Creating a new Feast repository in /<...>/tiny_pika.

    The init command can also bootstrap from a template, for example GCP:

    feast init -t gcp
    
    Creating a new Feast repository in /<...>/tiny_pika.

    The init command creates a Python file with feature definitions, sample data, and a Feast configuration file for local development:

    Enter the directory:

    You can now use this feature repository for development. You can try the following:

    • Run feast apply to apply these definitions to Feast.

    • Edit the example feature definitions in example.py and run feast apply again to change feature definitions.

    Offline store

    Feast uses offline stores as storage and compute systems. Offline stores store historic time-series feature values. Feast does not generate these features, but instead uses the offline store as the interface for querying existing features in your organization.

    Offline stores are used primarily for two reasons:

    1. Building training datasets from time-series features.

    2. Materializing (loading) features from the offline store into an online store in order to serve those features at low latency for prediction.

    Offline stores are configured through the feature_store.yaml file. When building training datasets or materializing features into an online store, Feast will use the configured offline store along with the data sources you have defined as part of feature views to execute the necessary data operations.

    It is not possible to query all data sources from all offline stores, and only a single offline store can be used at a time. For example, it is not possible to query a BigQuery table from a File offline store, nor is it possible for a BigQuery offline store to query files from your local file system.

    Please see the offline store reference for more details on configuring offline stores.

    BigQuery

    Description

    BigQuery data sources allow for the retrieval of historical feature values from BigQuery for building training datasets as well as materializing features into an online store.

    • Either a table reference or a SQL query can be provided.

    • No performance guarantees can be provided over SQL query-based sources. Please use table references where possible.

    Examples

    Using a table reference

    Using a query

    Configuration options are available in the API reference.

    File

    Description

    The File offline store provides support for reading FileSources.

    • Only Parquet files are currently supported.

    • All data is downloaded and joined using Python and may not scale to production workloads.

    Example

    Configuration options are available in the API reference.

    BigQuery

    Description

    The BigQuery offline store provides support for reading BigQuerySources.

    • BigQuery tables and views are allowed as sources.

    • All joins happen within BigQuery.

    • Entity dataframes can be provided as a SQL query or can be provided as a Pandas dataframe. Pandas dataframes will be uploaded to BigQuery in order to complete join operations.

    • A BigQueryRetrievalJob is returned when calling get_historical_features().

    Example

    Configuration options are available in the API reference.

    Redis

    Description

    The Redis online store provides support for materializing feature values into Redis.

    • Both Redis and Redis Cluster are supported

    • The data model used to store feature values in Redis is described in more detail in the Feast documentation.

    Examples

    Connecting to a single Redis instance

    Connecting to a Redis Cluster with SSL enabled and password authentication

    Configuration options are available in the API reference.

    Local

    Description

    • Offline Store: Uses the File offline store by default. Also supports BigQuery as the offline store.

    • Online Store: Uses the SQLite online store by default. Also supports Datastore as an online store.

    Example

    feature_store.yaml

    Overview

    feature_store.yaml is used to configure a feature store. The file must be located at the root of a feature repository. An example feature_store.yaml is shown below:

    feature_store.yaml
    project: loyal_spider
    registry: data/registry.db
    provider: local
    online_store:
        type: sqlite
        path: data/online_store.db

    Options

    The following top-level configuration options exist in the feature_store.yaml file.

    • provider — Configures the environment in which Feast will deploy and operate.

    • registry — Configures the location of the feature registry.

    • online_store — Configures the online store.

    Please see the RepoConfig API reference for the full list of configuration options.

    Getting started

    Feast on Kubernetes is only supported using Feast 0.9 (and below). We are working to add support for Feast on Kubernetes with the latest release of Feast (0.10+). Please see our roadmap for more details.

    Install Feast

    If you would like to deploy a new installation of Feast, click on Install Feast.

    Connect to Feast

    If you would like to connect to an existing Feast deployment, click on Connect to Feast.

    Learn Feast

    If you would like to learn more about Feast, click on Learn Feast.

    Install Feast

    A production deployment of Feast is deployed using Kubernetes.

    Kubernetes (with Helm)

    This guide installs Feast into an existing Kubernetes cluster using Helm. The installation is not specific to any cloud platform or environment, but requires Kubernetes and Helm.

    Kubernetes (with Helm)chevron-right

    Amazon EKS (with Terraform)

    This guide installs Feast into an AWS environment using Terraform. The Terraform script is opinionated and intended to allow you to start quickly.

    Azure AKS (with Helm)

    This guide installs Feast into an Azure AKS environment with Helm.

    Azure AKS (with Terraform)

    This guide installs Feast into an Azure environment using Terraform. The Terraform script is opinionated and intended to allow you to start quickly.

    Google Cloud GKE (with Terraform)

    This guide installs Feast into a Google Cloud environment using Terraform. The Terraform script is opinionated and intended to allow you to start quickly.

    IBM Cloud Kubernetes Service (IKS) and Red Hat OpenShift (using Kustomize)

    This guide installs Feast into an existing IBM Cloud Kubernetes Service or Red Hat OpenShift cluster using Kustomize.

    Quickstart

    In this tutorial we will:

    1. Deploy a local feature store with a Parquet file offline store and Sqlite online store.

    2. Build a training dataset using our time series features from our Parquet files.

    Deploy a feature store

    The Feast CLI can be used to deploy a feature store to your infrastructure, spinning up any necessary persistent resources like buckets or tables in data stores. The deployment target and effects depend on the provider that has been configured in your feature_store.yaml file, as well as the feature definitions found in your feature repository.

    Here we'll be using the example repository we created in the previous guide, Create a feature repository. You can re-create it by running feast init in a new directory.

    Load data into the online store

    Feast allows users to load their feature data into an online store in order to serve the latest features to models for online prediction.

    Materializing features

    Read features from the online store

    The Feast Python SDK allows users to retrieve feature values from an online store. This API is used to look up feature values at low latency during model serving in order to make online predictions.

    Online stores only maintain the current state of features, i.e. the latest feature values. No historical data is stored or served.

    Overview

    The top-level namespace within Feast is a project. Users define one or more feature views within a project. Each feature view contains one or more features that relate to a specific entity. A feature view must always have a data source, which in turn is used during the generation of training datasets and when materializing feature values into the online store.

    Project

    Projects provide complete isolation of feature stores at the infrastructure level. This is accomplished through resource namespacing, e.g., prefixing table names with the associated project. Each project should be considered a completely separate universe of entities and features. It is not possible to retrieve features from multiple projects in a single request. We recommend having a single feature store and a single project per environment (dev, staging, prod).
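As a rough sketch of resource namespacing (illustrative only; not Feast's exact naming scheme):

```python
def namespaced_table(project, table):
    """Prefix a table name with its project so that two projects never
    collide at the storage level (illustrative, not Feast's exact scheme)."""
    return f"{project}_{table}"

# Two environments materializing the same feature view land in distinct tables.
print(namespaced_table("prod", "driver_hourly_stats"))  # prod_driver_hourly_stats
print(namespaced_table("dev", "driver_hourly_stats"))   # dev_driver_hourly_stats
```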

    Data model

    Dataset

    A dataset is a collection of rows that is produced by a historical retrieval from Feast in order to train a model. A dataset is produced by a join from one or more feature views onto an entity dataframe. Therefore, a dataset may consist of features from multiple feature views.

    Dataset vs Feature View: Feature views contain the schema of data and a reference to where data can be found (through its data source). Datasets are the actual data manifestation of querying those data sources.

    Dataset vs Data Source: Datasets are the output of historical retrieval, whereas data sources are the inputs. One or more data sources can be used in the creation of a dataset.

    Architecture

    Functionality

    • Create Batch Features: ELT/ETL systems like Spark and SQL are used to transform data in the batch store.

    SQLite

    Description

    The SQLite online store provides support for materializing feature values into an SQLite database for serving online features.

    • All feature values are stored in an on-disk SQLite database

    Kubernetes (with Helm)

    Overview

    This guide installs Feast on an existing Kubernetes cluster, and ensures the following services are running:

    • Feast Core

    Azure AKS (with Terraform)

    Overview

    This guide installs Feast on Azure using our reference Terraform configuration.

    Learn Feast

    Explore the following resources to learn more about Feast:

    • Concepts describes all important Feast API concepts.

    • User guide provides guidance on completing Feast workflows.

    Overview

    Concepts

    Entities are objects in an organization like customers, transactions, and drivers, products, etc.

    Sources are external sources of data where feature data can be found.

    Feature tables are objects that define logical groupings of features, data sources, and other related metadata.

    Define and ingest features

    In order to retrieve features for both training and serving, Feast requires data to be ingested into its offline and online stores.

    Users are expected to already have either a batch or stream source with data stored in it, ready to be ingested into Feast. Once a feature table (with the corresponding sources) has been registered with Feast, it is possible to load data from this source into stores.

    The following depicts an example ingestion flow from a data source to the online store.

    Batch Source to Online Store

    Connect to Feast

    Feast Python SDK

    The Feast Python SDK is used as a library to interact with a Feast deployment.

    • Define, register, and manage entities and features

    Stores

    In Feast, a store is a database that is populated with feature data that will ultimately be served to models.

    Offline (Historical) Store

    The offline store maintains historical copies of feature values. These features are grouped and stored in feature tables. During retrieval of historical data, features are queried from these feature tables in order to produce training datasets.

    $ tree
    .
    └── tiny_pika
        ├── data
        │   └── driver_stats.parquet
        ├── example.py
        └── feature_store.yaml
    
    1 directory, 3 files
    Online Store

    The online store maintains only the latest values for a specific feature.

    • Feature values are stored based on their entity keys

    • Feast currently supports Redis as an online store.

    • Online stores are meant for very high throughput writes from ingestion jobs and very low latency access to features during online serving.

    Feast only supports a single online store in production.

    Additional top-level options for feature_store.yaml:

  • offline_store — Configures the offline store.

  • project — Defines a namespace for the entire feature store. Can be used to isolate multiple deployments in a single installation of Feast.

  • Examples contains Jupyter notebooks that you can run on your Feast deployment.
  • Advanced contains information about both advanced and operational aspects of Feast.

  • Reference contains detailed API and design documents for advanced users.

  • Contributing contains resources for anyone who wants to contribute to Feast.

    The best way to learn Feast is to use it. Jump over to our Quickstart guide to have one of our examples running in no time at all!


  • Ingest data into Feast

  • Build and retrieve training datasets

  • Retrieve online features

    Feast CLI

    The Feast CLI is a command line implementation of the Feast Python SDK.

    • Define, register, and manage entities and features from the terminal

    • Ingest data into Feast

    • Manage ingestion jobs

    Online Serving Clients

    The following clients can be used to retrieve online feature values:

    • Feast Python SDK

    • Feast Go SDK

    • Feast Java SDK

    Initialize a git repository in the same directory and check the feature repository into version control.
    from feast import BigQuerySource
    
    my_bigquery_source = BigQuerySource(
        table_ref="gcp_project:bq_dataset.bq_table",
    )
    feature_store.yaml
    project: my_feature_repo
    registry: data/registry.db
    provider: local
    offline_store:
      type: file
    feature_store.yaml
    project: my_feature_repo
    registry: data/registry.db
    provider: local

  • Materialize feature values from the offline store into the online store.

  • Read the latest features from the online store for inference.

    Install Feast

    Install the Feast SDK and CLI using pip:

    Create a feature repository

    Bootstrap a new feature repository using feast init from the command line:

    Register feature definitions and deploy your feature store

    The apply command registers all the objects in your feature repository and deploys a feature store:

    Generating training data

    To build a training dataset, call get_historical_features() with an entity dataframe; Feast joins the time-series features defined in the feature repository onto it:

    Load features into your online store

    The materialize command loads the latest feature values from your feature views into your online store:

    Fetching feature vectors for inference

    Next steps

    • Follow our Getting Started guide for a hands-on tutorial on using Feast.

    • Join other Feast users and contributors in Slack and become part of the community!

    Deploying

    To have Feast deploy your infrastructure, run feast apply from your command line while inside a feature repository:

    Depending on whether the feature repository is configured to use a local provider or one of the cloud providers like GCP or AWS, it may take from a couple of seconds to a minute to run to completion.

    At this point, no data has been materialized to your online store. feast apply simply registers the feature definitions with Feast and spins up any necessary infrastructure such as tables. To load data into the online store, run feast materialize. See Load data into the online store for more details.

    Cleaning up

    If you need to clean up the infrastructure created by feast apply, use the teardown command.

    Warning: teardown is an irreversible command and will remove all feature store infrastructure. Proceed with caution!
    1. Register feature views

    Before proceeding, please ensure that you have applied (registered) the feature views that should be materialized.

    2.a Materialize

    The materialize command allows users to materialize features over a specific historical time range into the online store.
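A sketch of the invocation (the timestamps are illustrative; the command is echoed rather than executed here, since it requires a deployed feature repository):

```shell
# Illustrative window; run the echoed command inside a feature repository.
START_TS=2021-04-07T00:00:00
END_TS=2021-04-08T00:00:00
echo "feast materialize $START_TS $END_TS"
```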

    The above command will query the batch sources for all feature views over the provided time range, and load the latest feature values into the configured online store.

    It is also possible to materialize for specific feature views by using the -v / --views argument.

    The materialize command is completely stateless. It requires the user to provide the time ranges that will be loaded into the online store. This command is best used from a scheduler that tracks state, like Airflow.

    2.b Materialize Incremental (Alternative)

    For simplicity, Feast also provides a materialize-incremental command that only ingests new data that has arrived in the offline store. Unlike materialize, materialize-incremental will track the state of previous ingestion runs inside of the feature registry.

    The example command below will load only new data that has arrived for each feature view up to the end date and time (2021-04-08T00:00:00).

    The materialize-incremental command functions similarly to materialize in that it loads data over a specific time range for all feature views (or the selected feature views) into the online store.

    Unlike materialize, materialize-incremental automatically determines the start time from which to load features from batch sources of each feature view. The first time materialize-incremental is executed it will set the start time to the oldest timestamp of each data source, and the end time as the one provided by the user. For each run of materialize-incremental, the end timestamp will be tracked.

    Subsequent runs of materialize-incremental will then set the start time to the end time of the previous run, thus only loading new data that has arrived into the online store. Note that the end time that is tracked for each run is at the feature view level, not globally for all feature views, i.e., different feature views may have different periods that have been materialized into the online store.
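The start/end bookkeeping can be sketched in plain Python (hypothetical names; in Feast the state lives in the feature registry):

```python
def materialize_incremental(registry, view, source_oldest_ts, end_ts):
    """Sketch of how materialize-incremental picks its window per feature view.

    registry maps feature-view name -> end timestamp of the previous run.
    The first run starts from the oldest event in the view's source; later
    runs start where the previous run ended.
    """
    start_ts = registry.get(view, source_oldest_ts)
    registry[view] = end_ts          # track this run's end for next time
    return (start_ts, end_ts)        # the window that would be loaded

registry = {}
# First run: whole history up to April 7.
print(materialize_incremental(registry, "driver_hourly_stats", "2021-01-01", "2021-04-07"))
# Second run: only new data since April 7.
print(materialize_incremental(registry, "driver_hourly_stats", "2021-04-07", "2021-04-08"))
```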

    Retrieving online features

    1. Ensure that feature values have been loaded into the online store

    Please ensure that you have materialized (loaded) your feature values into the online store before starting.

    2. Define feature references

    Create a list of features that you would like to retrieve. This list typically comes from the model training step and should accompany the model binary.
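For example, a feature reference list in the "feature_view:feature" format used elsewhere in this guide (the driver_hourly_stats names are illustrative):

```python
# Hypothetical feature references saved alongside the trained model binary.
feature_refs = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
    "driver_hourly_stats:avg_daily_trips",
]

# Each reference names a feature view and a feature within it.
for ref in feature_refs:
    view, feature = ref.split(":")
    print(view, feature)
```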

    3. Read online features

    Next, we will create a feature store object and call get_online_features() which reads the relevant feature values directly from the online store.


    • Only the latest feature values are persisted

    Example

    Configuration options are available in the API reference.

    feature_store.yaml
    project: my_feature_repo
    registry: data/registry.db
    provider: local
    online_store:
      type: sqlite
      path: data/online_store.db
    Stream Source to Online Store

    Batch Source to Offline Store

    Not supported in Feast 0.8

    Stream Source to Offline Store

    Not supported in Feast 0.8

    # Replace "tiny_pika" with your auto-generated dir name
    cd tiny_pika
    from feast import BigQuerySource
    
    BigQuerySource(
        query="SELECT timestamp as ts, created, f1, f2 "
              "FROM `my_project.my_dataset.my_features`",
    )
    feature_store.yaml
    project: my_feature_repo
    registry: gs://my-bucket/data/registry.db
    provider: gcp
    offline_store:
      type: bigquery
      dataset: feast_bq_dataset
    feature_store.yaml
    project: my_feature_repo
    registry: data/registry.db
    provider: local
    online_store:
      type: redis
      connection_string: "localhost:6379"
    feature_store.yaml
    project: my_feature_repo
    registry: data/registry.db
    provider: local
    online_store:
      type: redis
      redis_type: redis_cluster
      connection_string: "redis1:6379,redis2:6379,ssl=true,password=my_password"
    pip install feast
    feast init feature_repo
    cd feature_repo
    Creating a new Feast repository in /home/Jovyan/feature_repo.
    feast apply
    Registered entity driver_id
    Registered feature view driver_hourly_stats
    Deploying infrastructure for driver_hourly_stats
    from datetime import datetime
    
    import pandas as pd
    
    from feast import FeatureStore
    
    entity_df = pd.DataFrame.from_dict(
        {
            "driver_id": [1001, 1002, 1003, 1004],
            "event_timestamp": [
                datetime(2021, 4, 12, 10, 59, 42),
                datetime(2021, 4, 12, 8, 12, 10),
                datetime(2021, 4, 12, 16, 40, 26),
                datetime(2021, 4, 12, 15, 1, 12),
            ],
        }
    )
    
    store = FeatureStore(repo_path=".")
    
    training_df = store.get_historical_features(
        entity_df=entity_df,
        feature_refs=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate",
            "driver_hourly_stats:avg_daily_trips",
        ],
    ).to_df()
    
    print(training_df.head())
    event_timestamp   driver_id  driver_hourly_stats__conv_rate  driver_hourly_stats__acc_rate  driver_hourly_stats__avg_daily_trips
    2021-04-12        1002       0.328245                        0.993218                       329
    2021-04-12        1001       0.448272                        0.873785                       767
    2021-04-12        1004       0.822571                        0.571790                       673
    2021-04-12        1003       0.556326                        0.605357                       335
    CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
    feast materialize-incremental $CURRENT_TIME
    from pprint import pprint
    from feast import FeatureStore
    
    store = FeatureStore(repo_path=".")
    
    feature_vector = store.get_online_features(
        feature_refs=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate",
            "driver_hourly_stats:avg_daily_trips",
        ],
        entity_rows=[{"driver_id": 1001}],
    ).to_dict()
    
    pprint(feature_vector)
    {
        'driver_id': [1001],
        'driver_hourly_stats__conv_rate': [0.49274],
        'driver_hourly_stats__acc_rate': [0.92743],
        'driver_hourly_stats__avg_daily_trips': [72],
    }
    feast apply
    
    # Processing example.py as example
    # Done!
    feast teardown
    feast materialize 2021-04-07T00:00:00 2021-04-08T00:00:00
    feast materialize 2021-04-07T00:00:00 2021-04-08T00:00:00 \
    --views driver_hourly_stats
    feast materialize-incremental 2021-04-08T00:00:00
    feature_refs = [
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate"
    ]
    fs = FeatureStore(repo_path="path/to/feature/repo")
    online_features = fs.get_online_features(
        feature_refs=feature_refs,
        entity_rows=[
            {"driver_id": 1001},
            {"driver_id": 1002}]
    ).to_dict()
    {
       "driver_hourly_stats__acc_rate":[
          0.2897740304470062,
          0.6447265148162842
       ],
       "driver_hourly_stats__conv_rate":[
          0.6508077383041382,
          0.14802511036396027
       ],
       "driver_id":[
          1001,
          1002
       ]
    }
    from feast import Client
    from datetime import datetime, timedelta
    
    client = Client(core_url="localhost:6565")
    driver_ft = client.get_feature_table("driver_trips")
    
    # Initialize date ranges
    today = datetime.now()
    yesterday = today - timedelta(1)
    
    # Launches a short-lived job that ingests data over the provided date range.
    client.start_offline_to_online_ingestion(
        driver_ft, yesterday, today
    )
    from feast import Client
    from datetime import datetime, timedelta
    
    client = Client(core_url="localhost:6565")
    driver_ft = client.get_feature_table("driver_trips")
    
    # Launches a long running streaming ingestion job
    client.start_stream_to_online_ingestion(driver_ft)
    We recommend having a single project per environment (dev, staging, prod).
    circle-info

    Projects are currently supported for backward compatibility reasons. Projects may change in the future as we simplify the Feast API.

    project
    feature views
    features
    entity
    data source
    datasets
    Datasets are the output of historical retrieval, whereas data sources are the inputs. One or more data sources can be used in the creation of a dataset.

    hashtag
    Feature References

    Feature references uniquely identify feature values in Feast. The structure of a feature reference in string form is as follows: <feature_table>:<feature>

    Feature references are used for the retrieval of features from Feast:

    It is possible to retrieve features from multiple feature views with a single request, and Feast is able to join features from multiple tables in order to build a training dataset. However, it is not possible to reference (or retrieve) features from multiple projects at the same time.
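    A minimal sketch of how a string reference splits into its two parts (the helper function is hypothetical, not part of the Feast API):

```python
def parse_feature_ref(ref: str):
    """Split a "<feature_table>:<feature>" reference into its two components."""
    feature_table, feature = ref.split(":", 1)
    return feature_table, feature

parts = parse_feature_ref("driver_hourly_stats:conv_rate")
# → ("driver_hourly_stats", "conv_rate")
```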

    hashtag
    Entity key

    Entity keys are one or more entity values that uniquely describe an entity. In the case of an entity (like a driver) that only has a single entity field, the entity value alone is the entity key. However, it is also possible for an entity key to consist of multiple entity values. For example, a feature view with the composite entity of (customer, country) might have an entity key of (1001, 5).

    Entity keys act as primary keys. They are used during the lookup of features from the online store, and they are also used to match feature rows across feature views during point-in-time joins.
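    As an illustration of the primary-key role (a toy sketch, not Feast internals), a composite entity key maps to the latest feature values for that entity:

```python
# Toy "online store": composite entity key (customer, country) -> latest features.
# Field names and values are illustrative.
online_store = {
    (1001, 5): {"daily_transactions": 3},
    (1002, 5): {"daily_transactions": 1},
}

# Online lookup uses the entity key directly.
features = online_store[(1001, 5)]
# → {"daily_transactions": 3}
```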

    hashtag
    Event timestamp

    The timestamp on which an event occurred, as found in a feature view's data source. The event timestamp describes the event time at which a feature was observed or generated.

    Event timestamps are used during point-in-time joins to ensure that the latest feature values are joined from feature views onto entity rows. Event timestamps are also used to ensure that old feature values aren't served to models during online serving.
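    The point-in-time behavior can be sketched with pandas (illustrative only; Feast performs this join inside the offline store):

```python
import pandas as pd

# Feature rows with event timestamps.
features = pd.DataFrame({
    "driver_id": [1001, 1001],
    "event_timestamp": pd.to_datetime(["2021-04-12 08:00", "2021-04-12 12:00"]),
    "conv_rate": [0.4, 0.5],
})

# Entity rows to enrich.
entity_rows = pd.DataFrame({
    "driver_id": [1001],
    "event_timestamp": pd.to_datetime(["2021-04-12 10:59"]),
})

# For each entity row, join the latest feature value at or before its timestamp.
joined = pd.merge_asof(
    entity_rows.sort_values("event_timestamp"),
    features.sort_values("event_timestamp"),
    on="event_timestamp",
    by="driver_id",
)
# joined["conv_rate"] → 0.4 (the 08:00 value; the later 12:00 value is ignored)
```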

    hashtag
    Entity row

    An entity key at a specific point in time.

    hashtag
    Entity dataframe

    A collection of entity rows. Entity dataframes are the "left table" that is enriched with feature values when building training datasets. The entity dataframe is provided to Feast by users during historical retrieval:

    Example of an entity dataframe with feature values joined to it:
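    A small pandas sketch of an entity dataframe before and after enrichment (the feature value is illustrative; the column name follows the examples elsewhere in these docs):

```python
import pandas as pd
from datetime import datetime

# Entity dataframe: timestamps plus entity keys (the "left table").
entity_df = pd.DataFrame({
    "event_timestamp": [datetime(2021, 4, 12, 10, 59, 42)],
    "driver_id": [1001],
})

# After historical retrieval, feature columns are appended, e.g.:
enriched = entity_df.assign(driver_hourly_stats__conv_rate=[0.49])
```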

  • Feast Apply: The user (or CI) publishes version-controlled feature definitions using feast apply. This CLI command updates infrastructure and persists definitions in the object store registry.

  • Feast Materialize: The user (or scheduler) executes feast materialize which loads features from the offline store into the online store.

  • Model Training: A model training pipeline is launched. It uses the Feast Python SDK to retrieve a training dataset and trains a model.

  • Get Historical Features: Feast exports a point-in-time correct training dataset based on the list of features and entity dataframe provided by the model training pipeline.

  • Deploy Model: The trained model binary (and list of features) are deployed into a model serving system. This step is not executed by Feast.

  • Prediction: A backend system makes a request for a prediction from the model serving service.

  • Get Online Features: The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK.

    hashtag
    Components

    A complete Feast deployment contains the following components:

    • Feast Online Serving: Provides low-latency access to feature values stored in the online store. This component is optional. Teams can also read feature values directly from the online store if necessary.

    • Feast Registry: An object store (GCS, S3) based registry used to persist feature definitions that are registered with the feature store. Systems can discover feature data by interacting with the registry through the Feast SDK.

    • Feast Python SDK/CLI: The primary user facing SDK. Used to:

      • Manage version controlled feature definitions.

      • Materialize (load) feature values into the online store.

      • Build and retrieve training datasets from the offline store.

    • Online Store: The online store is a database that stores only the latest feature values for each entity. The online store is populated by materialization jobs.

    • Offline Store: The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets. Feast does not manage the offline store directly, but runs queries against it.

    circle-info

    Java and Go Clients are also available for online feature retrieval. See API Reference.

    Feast Architecture Diagram

    Feast Online Serving

  • Postgres

  • Redis

  • Feast Jupyter (Optional)

  • Prometheus (Optional)

    hashtag
    1. Requirements

    1. Install and configure Kubectlarrow-up-right

    2. Install Helm 3arrow-up-right

    hashtag
    2. Preparation

    Add the Feast Helm repository and download the latest charts:

    Feast includes a Helm chart that installs all necessary components to run Feast Core, Feast Online Serving, and an example Jupyter notebook.

    Feast Core requires Postgres to run, which requires a secret to be set on Kubernetes:

    hashtag
    3. Installation

    Install Feast using Helm. The pods may take a few minutes to initialize.

    hashtag
    4. Use Jupyter to connect to Feast

    After all the pods are in a RUNNING state, port-forward to the Jupyter Notebook Server in the cluster:

    You can now connect to the bundled Jupyter Notebook Server at localhost:8888 and follow the example Jupyter notebook.

    hashtag
    5. Further Reading

    • Feast Concepts

    • Feast Examples/Tutorialsarrow-up-right

    • Feast Helm Chart Documentationarrow-up-right

    The Terraform configuration used here is a greenfield installation that neither assumes anything about, nor integrates with, existing resources in your Azure account. The Terraform configuration presents an easy way to get started, but you may want to customize this set up before using Feast in production.

    This Terraform configuration creates the following resources:

    • Kubernetes cluster on Azure AKS

    • Kafka managed by HDInsight

    • Postgres database for Feast metadata, running as a pod on AKS

    • Redis cluster, using Azure Cache for Redis

    • spark-on-k8s-operatorarrow-up-right to run Spark

    • Staging Azure blob storage container to store temporary data

    hashtag
    1. Requirements

    • Create an Azure account and configure credentials locallyarrow-up-right

    • Install Terraformarrow-up-right (tested with 0.13.5)

    • Install Helmarrow-up-right (tested with v3.4.2)

    hashtag
    2. Configure Terraform

    Create a .tfvars file under feast/infra/terraform/azure. Name the file; in our example, we use my_feast.tfvars. You can see the full list of configuration variables in variables.tf. At a minimum, you need to set name_prefix and resource_group:

    hashtag
    3. Apply

    After completing the configuration, initialize Terraform and apply:

    hashtag
    4. Connect to Feast using Jupyter

    After all pods are running, connect to the Jupyter Notebook Server running in the cluster.

    To connect to the remote Feast server you just created, forward a port from the remote k8s cluster to your local machine.

    You can now connect to the bundled Jupyter Notebook Server at localhost:8888 and follow the example Jupyter notebook.

    reference Terraform configurationarrow-up-right
    hashtag
    Concept Hierarchy

    Feast contains the following core concepts:

    • Projects: Serve as a top level namespace for all Feast resources. Each project is a completely independent environment in Feast. Users can only work in a single project at a time.

    • Entities: Entities are the objects in an organization on which features occur. They map to your business domain (users, products, transactions, locations).

    • Feature Tables: Each feature table defines a group of features that occur on a specific entity.

    • Features: An individual feature within a feature table.

    Entities
    Sources
    Feature Tables

    Build a training dataset

    Feast allows users to build a training dataset from time-series feature data that already exists in an offline store. Users are expected to provide a list of features to retrieve (which may span multiple feature views), and a dataframe to join the resulting features onto. Feast will then execute a point-in-time join of multiple feature views onto the provided dataframe, and return the full resulting dataframe.

    hashtag
    Retrieving historical features

    hashtag
    1. Register your feature views

    Please ensure that you have created a feature repository and that you have registered (applied) your feature views with Feast.

    hashtag
    2. Define feature references

    Start by defining the feature references (e.g., driver_trips:average_daily_rides) for the features that you would like to retrieve from the offline store. These features can come from multiple feature tables. The only requirement is that the feature tables that make up the feature references have the same entity (or composite entity), and that they are located in the same offline store.

    hashtag
    3. Create an entity dataframe

    An entity dataframe is the target dataframe on which you would like to join feature values. The entity dataframe must contain a timestamp column called event_timestamp and all entities (primary keys) necessary to join feature tables onto. All entities found in feature views that are being joined onto the entity dataframe must be found as a column on the entity dataframe.

    It is possible to provide entity dataframes as either a Pandas dataframe or a SQL query.

    Pandas:

    In the example below we create a Pandas based entity dataframe that has a single row with an event_timestamp column and a driver_id entity column. Pandas based entity dataframes may need to be uploaded into an offline store, which may result in longer wait times compared to a SQL based entity dataframe.

    SQL (Alternative):

    Below is an example of an entity dataframe built from a BigQuery SQL query. It is only possible to use this query when all feature views being queried are available in the same offline store (BigQuery).

    hashtag
    4. Launch historical retrieval

    Once the feature references and an entity dataframe are defined, it is possible to call get_historical_features(). This method launches a job that executes a point-in-time join of features from the offline store onto the entity dataframe. Once completed, a job reference will be returned. This job reference can then be converted to a Pandas dataframe by calling to_df().

    Community

    circle-check

    Office Hours: Have a question, feature request, idea, or just looking to speak to a real person? Come and join the Feast Office Hoursarrow-up-right on Friday and chat with a Feast contributor!

    hashtag
    Links & Resources

    • Slackarrow-up-right: Feel free to ask questions or say hello!

    • Mailing listarrow-up-right: We have both a user and developer mailing list.

      • Feast users should join [email protected]envelope group by clicking herearrow-up-right.

    hashtag
    How can I get help?

    • Slack: Need to speak to a human? Come ask a question in our Slack channel (link above).

    • GitHub Issues: Found a bug or need a feature? Create an issue on GitHubarrow-up-right.

    • StackOverflow: Need to ask a question on how to use Feast? We also monitor and respond to StackOverflowarrow-up-right.

    hashtag
    Community Calls

    We have a user and contributor community call every two weeks (Asia & US friendly).

    circle-info

    Please join the above Feast user groups in order to see calendar invites to the community calls

    hashtag
    Frequency (alternating times every 2 weeks)

    • Tuesday 18:00 to 18:30 (US, Asia)

    • Tuesday 10:00 am to 10:30 am (US, Europe)

    hashtag
    Links

    • Zoom: https://zoom.us/j/6325193230arrow-up-right

    • Meeting notes: https://bit.ly/feast-notesarrow-up-right

    .feastignore

    hashtag
    Overview

    .feastignore is a file that is placed at the root of the Feature Repository. This file contains paths that should be ignored when running feast apply. An example .feastignore is shown below:

    .feastignore
    # Ignore virtual environment
    venv
    
    # Ignore a specific Python file
    scripts/foo.py
    
    # Ignore all Python files directly under scripts directory
    scripts/*.py
    
    # Ignore all "foo.py" anywhere under scripts directory
    scripts/**/foo.py

    The .feastignore file is optional. If the file cannot be found, every Python file in the feature repo directory will be parsed by feast apply.

    hashtag
    Feast Ignore Patterns

    API Reference

    Please see the following API specific reference documentation:

    • Feast Core gRPC APIarrow-up-right: This is the gRPC API used by Feast Core. This API contains RPCs for creating and managing feature sets, stores, projects, and jobs.

    • Feast Serving gRPC APIarrow-up-right: This is the gRPC API used by Feast Serving. It contains RPCs used for the retrieval of online feature data or historical feature data.

    • Feast gRPC Typesarrow-up-right: These are the gRPC types used by Feast Core, Feast Serving, and the Go, Java, and Python clients.

    • Go Client SDKarrow-up-right: The Go library used for the retrieval of online features from Feast.

    • Java Client SDKarrow-up-right: The Java library used for the retrieval of online features from Feast.

    • Python SDKarrow-up-right: This is the complete reference to the Feast Python SDK. The SDK is used to manage feature sets, features, jobs, projects, and entities. It can also be used to retrieve training datasets or online features from Feast Serving.

    hashtag
    Community Contributions

    The following community provided SDKs are available:

    • Node.js SDKarrow-up-right: A Node.js SDK written in TypeScript. The SDK can be used to manage feature sets, features, jobs, projects, and entities.

    Overview

    hashtag
    Using Feast

    Feast development happens through three key workflows:

    1. Define and load feature data into Feast

    2. Retrieve historical features for training models

    3. Retrieve online features for serving models

    hashtag
    Defining feature tables and ingesting data into Feast

    Feature creators model the data within their organization into Feast through the definition of feature tables that contain data sources. Feature tables are both a schema and a means of identifying data sources for features, and they allow Feast to know how to interpret your data, and where to find it.

    After registering a feature table with Feast, users can trigger an ingestion from their data source into Feast. This loads feature values from an upstream data source into Feast stores through ingestion jobs.

    Visit feature tables to learn more about them.

    hashtag
    Retrieving historical features for training

    In order to generate a training dataset it is necessary to provide both an entity dataframe and feature references through the Feast SDKarrow-up-right to retrieve historical features. For historical serving, Feast requires that you provide the entities and timestamps for the corresponding feature data. Feast produces a point-in-time correct dataset using the requested features. These features can be requested from an unlimited number of feature sets.

    hashtag
    Retrieving online features for online serving

    Online retrieval uses feature references through the Feast Online Serving APIarrow-up-right to retrieve online features. Online serving allows for very low latency requests to feature data at very high throughput.

    Metrics

    circle-exclamation

    This page applies to Feast 0.7. The content may be out of date for Feast 0.8+

    hashtag
    Overview

    Feast Components export metrics that can provide insight into Feast behavior:

    See the Metrics Reference for documentation on the metrics exported by Feast.

    circle-info

    Feast Job Controller currently does not export any metrics on its own. However its application.yml is used to configure metrics export for ingestion jobs.

    hashtag
    Pushing Ingestion Metrics to StatsD

    hashtag
    Feast Ingestion Job

    Feast Ingestion Job can be configured to push Ingestion metrics to a StatsD instance. Metrics export to StatsD for Ingestion Job is configured in Job Controller's application.yml under feast.jobs.metrics

    circle-info

    If you need Ingestion Metrics in Prometheus or some other metrics backend, use a metrics forwarder to forward Ingestion Metrics from StatsD to the metrics backend of choice (e.g., use prometheus-statsd-exporterarrow-up-right to forward metrics to Prometheus).

    hashtag
    Exporting Feast Metrics to Prometheus

    hashtag
    Feast Core and Serving

    Feast Core and Serving export metrics to a Prometheus instance via Prometheus scraping their /metrics endpoints. Metrics export to Prometheus for Core and Serving can be configured via their corresponding application.yml.

    Direct Prometheusarrow-up-right to scrape directly from Core and Serving's /metrics endpoint.

    hashtag
    Further Reading

    See the Metrics Reference for documentation on the metrics exported by Feast.

    Contribution process

    We use RFCsarrow-up-right and GitHub issuesarrow-up-right to communicate development ideas. The simplest way to contribute to Feast is to leave comments in our RFCsarrow-up-right in the Feast Google Drivearrow-up-right or our GitHub issues. You will need to join our Google Group in order to get access.

    We follow a process of lazy consensusarrow-up-right. If you believe you know what the project needs then just start development. If you are unsure about which direction to take with development then please communicate your ideas through a GitHub issue or through our Slack Channel before starting development.

    Please submit a PRarrow-up-right to the master branch of the Feast repository once you are ready to submit your contribution. Code submissions to Feast (including submissions from project maintainers) require review and approval from maintainers or code owners.

    PRs that are submitted by the general public need to be identified as ok-to-test. Once enabled, Prowarrow-up-right will run a range of tests to verify the submission, after which community members will help to review the pull request.

    circle-check

    Please sign the Google CLAarrow-up-right in order to have your code merged into the Feast repository.

    Google Cloud GKE (with Terraform)

    hashtag
    Overview

    This guide installs Feast on GKE using our reference Terraform configurationarrow-up-right.

    circle-info

    The Terraform configuration used here is a greenfield installation that neither assumes anything about, nor integrates with, existing resources in your GCP account. The Terraform configuration presents an easy way to get started, but you may want to customize this set up before using Feast in production.

    This Terraform configuration creates the following resources:

    • GKE cluster

    • Feast services running on GKE

    • Google Memorystore (Redis) as online store

    hashtag
    1. Requirements

    • Install Terraformarrow-up-right >= 0.12 (tested with 0.13.3)

    • Install Helmarrow-up-right (tested with v3.3.4)

    • GCP authenticationarrow-up-right and sufficient privilegearrow-up-right to create the resources listed above.

    hashtag
    2. Configure Terraform

    Create a .tfvars file under feast/infra/terraform/gcp. Name the file; in our example, we use my_feast.tfvars. You can see the full list of configuration variables in variables.tf. Sample configurations are provided below:

    hashtag
    3. Apply

    After completing the configuration, initialize Terraform and apply:

    Limitations

    hashtag
    Feast API

    Limitation

    Motivation

    Feature names and entity names cannot overlap in feature table definitions

    Features and entities become columns in historical stores which may cause conflicts

    The following field names are reserved in feature tables

    • event_timestamp

    • datetime

    hashtag
    Ingestion

    hashtag
    Storage

    Entities

    hashtag
    Overview

    An entity is any domain object that can be modeled and about which information can be stored. Entities are usually recognizable concepts, either concrete or abstract, such as persons, places, things, or events.

    Examples of entities in the context of ride-hailing and food delivery: customer, order, driver, restaurant, dish, area.

    Entities are important in the context of feature stores since features are always properties of a specific entity. For example, we could have a feature total_trips_24h for driver D011234 with a feature value of 11.

    Feast uses entities in the following way:

    • Entities serve as the keys used to look up features for producing training datasets and online feature values.

    • Entities serve as a natural grouping of features in a feature table. A feature table must belong to an entity (which could be a composite entity).

    hashtag
    Structure of an Entity

    When creating an entity specification, consider the following fields:

    • Name: Name of the entity

    • Description: Description of the entity

    • Value Type: Value type of the entity. Feast will attempt to coerce entity columns in your data sources into this type.

    A valid entity specification is shown below:

    hashtag
    Working with an Entity

    hashtag
    Creating an Entity:

    hashtag
    Updating an Entity:

    Permitted changes include:

    • The entity's description and labels

    The following changes are not permitted:

    • Project

    • Name of an entity

    • Type

    Feature view

    hashtag
    Feature View

    A feature view is an object that represents a logical group of time-series feature data as it is found in a data source. Feature views consist of one or more entities, features, and a data source. Feature views allow Feast to model your existing feature data in a consistent way in both an offline (training) and online (serving) environment.
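    A sketch of a feature view definition, assuming the Feast 0.11 Python API (the entity name, file path, and schema are illustrative):

```python
from datetime import timedelta

from feast import Entity, Feature, FeatureView, FileSource, ValueType

driver = Entity(name="driver_id", value_type=ValueType.INT64)

driver_hourly_stats = FeatureView(
    name="driver_hourly_stats",
    entities=["driver_id"],
    ttl=timedelta(days=1),
    features=[
        Feature(name="conv_rate", dtype=ValueType.FLOAT),
        Feature(name="acc_rate", dtype=ValueType.FLOAT),
    ],
    # The data source backing this view (a Parquet file in this sketch).
    input=FileSource(
        path="data/driver_stats.parquet",
        event_timestamp_column="event_timestamp",
        created_timestamp_column="created",
    ),
)
```

    Like the entity examples elsewhere in these docs, such a definition lives in a feature repository and is registered with feast apply.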

    Amazon EKS (with Terraform)

    hashtag
    Overview

    This guide installs Feast on AWS using our reference Terraform configurationarrow-up-right.

    circle-info

    Provider

    A provider is an implementation of a feature store using specific feature store components targeting a specific environment. More specifically, a provider is the target environment in which your feature store is configured to deploy and run.

    Providers are built to orchestrate various components (offline store, online store, infrastructure, compute) inside an environment. For example, the gcp provider supports BigQueryarrow-up-right as an offline store and Datastorearrow-up-right as an online store, ensuring that these components can work together seamlessly.

    Providers also come with default configurations which makes it easier for users to start a feature store in a specific environment.

    Please see feature_store.yaml for configuring providers.

    online_features = fs.get_online_features(
        feature_refs=[
            'driver_locations:lon',
            'drivers_activity:trips_today'
        ],
        entity_rows=[{'driver': 'driver_1001'}]
    )
    training_df = store.get_historical_features(
        entity_df=entity_df,
        feature_refs=[
            'drivers_activity:trips_today',
            'drivers_activity:rating'
        ],
    )
    helm repo add feast-charts https://feast-helm-charts.storage.googleapis.com
    helm repo update
    kubectl create secret generic feast-postgresql --from-literal=postgresql-password=password
    helm install feast-release feast-charts/feast
    kubectl port-forward \
    $(kubectl get pod -l app=feast-jupyter -o custom-columns=:metadata.name) 8888:8888
    Forwarding from 127.0.0.1:8888 -> 8888
    Forwarding from [::1]:8888 -> 8888
    my_feast.tfvars
    name_prefix = "feast"
    resource_group = "Feast" # pre-existing resource group
    $ cd feast/infra/terraform/azure
    $ terraform init
    $ terraform apply -var-file=my_feast.tfvars
    kubectl port-forward $(kubectl get pod -o custom-columns=:metadata.name | grep jupyter) 8888:8888
    Forwarding from 127.0.0.1:8888 -> 8888
    Forwarding from [::1]:8888 -> 8888
  • created_timestamp

  • ingestion_id

  • job_id

    These keywords are used for column names when persisting metadata in historical stores

    Limitation

    Motivation

    Once data has been ingested into Feast, there is currently no way to delete the data without manually going to the database and deleting it. However, during retrieval only the latest rows will be returned for a specific key (event_timestamp, entity) based on its created_timestamp.

    This functionality simply doesn't exist yet as a Feast API
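    The retrieval behavior can be sketched with pandas (an illustration of the rule, not Feast's implementation):

```python
import pandas as pd

# When duplicate (event_timestamp, entity) rows exist, retrieval keeps
# only the row with the greatest created_timestamp.
rows = pd.DataFrame({
    "driver_id": [1001, 1001],
    "event_timestamp": pd.to_datetime(["2021-04-12", "2021-04-12"]),
    "created_timestamp": pd.to_datetime(["2021-04-12 01:00", "2021-04-12 02:00"]),
    "trips": [10, 11],
})

latest = (rows.sort_values("created_timestamp")
              .groupby(["driver_id", "event_timestamp"], as_index=False)
              .last())
# latest["trips"] → [11] (the row created at 02:00 wins)
```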

    Limitation

    Motivation

    Feast does not support offline storage in Feast 0.8

    As part of our re-architecture of Feast, we moved from GCP to cloud-agnostic deployments. Developing offline storage support that is available in all cloud environments is a pending action.


    Feast developers should join [email protected]envelope group by clicking herearrow-up-right.

  • Google Folderarrow-up-right: This folder is used as a central repository for all Feast resources. For example:

    • Design proposals in the form of Request for Comments (RFC).

    • User surveys and meeting minutes.

    • Slide decks of conferences our contributors have spoken at.

  • Feast GitHub Repositoryarrow-up-right: Find the complete Feast codebase on GitHub.

  • Feast Linux Foundation Wikiarrow-up-right: Our LFAI wiki page contains links to resources for contributors and maintainers.


    You can specify a double asterisk (**) anywhere in the expression. A double asterisk matches zero or more directories.

    Pattern

    Example matches

    Explanation

    venv

    venv/foo.py venv/a/foo.py

    You can specify a path to a specific directory. Everything in that directory will be ignored.

    scripts/foo.py

    scripts/foo.py

    You can specify a path to a specific file. Only that file will be ignored.

    scripts/*.py

    scripts/foo.py scripts/bar.py

    You can specify an asterisk (*) anywhere in the expression. An asterisk matches zero or more characters, except "/".

    scripts/**/foo.py

    scripts/foo.py scripts/a/foo.py scripts/a/b/foo.py
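    A small matcher approximating these documented semantics (an illustrative sketch, not Feast's implementation):

```python
import re

def feastignore_match(pattern: str, path: str) -> bool:
    """Approximate .feastignore matching: '**' spans directories, '*' does not."""
    regex = re.escape(pattern)
    regex = regex.replace(r"\*\*/", r"(?:.*/)?")   # '**/' -> zero or more directories
    regex = regex.replace(r"\*\*", r".*")          # bare '**' -> anything
    regex = regex.replace(r"\*", r"[^/]*")         # '*' -> anything except '/'
    # A pattern that names a directory also ignores everything beneath it.
    return re.fullmatch(regex + r"(/.*)?", path) is not None

assert feastignore_match("venv", "venv/a/foo.py")
assert feastignore_match("scripts/*.py", "scripts/foo.py")
assert not feastignore_match("scripts/*.py", "scripts/a/foo.py")
assert feastignore_match("scripts/**/foo.py", "scripts/a/b/foo.py")
```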

    Deploy a feature storechevron-right
    Feast Ingestion Jobs can be configured to push metrics into StatsD
    Prometheus can be configured to scrape metrics from Feast Core and Serving.
    Metrics Reference
    prometheus-statsd-exporterarrow-up-right
    Direct Prometheusarrow-up-right
    Metrics Reference
    Dataproc cluster
  • Kafka running on GKE, exposed to the dataproc cluster via internal load balancer

  • Terraformarrow-up-right
    Helmarrow-up-right
    authenticationarrow-up-right
    privilegearrow-up-right

    Labels: Labels are maps that allow users to attach their own metadata to entities

    feature_refs = [
        "driver_trips:average_daily_rides",
        "driver_trips:maximum_daily_rides",
        "driver_trips:rating",
        "driver_trips:rating:trip_completed",
    ]
    import pandas as pd
    from datetime import datetime
    
    entity_df = pd.DataFrame(
        {
            "event_timestamp": [pd.Timestamp(datetime.now(), tz="UTC")],
            "driver_id": [1001]
        }
    )
    entity_df = "SELECT event_timestamp, driver_id FROM my_gcp_project.table"
    from feast import FeatureStore
    
    fs = FeatureStore(repo_path="path/to/your/feature/repo")
    
    training_df = fs.get_historical_features(
        feature_refs=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate"
        ],
        entity_df=entity_df
    ).to_df()
    feast:
      jobs:
        metrics:
          # Enables StatsD metrics export if true.
          enabled: true
          type: statsd
          # Host and port of the StatsD instance to export to.
          host: localhost
          port: 9125
    server:
      # Configures the port where metrics are exposed via /metrics for Prometheus to scrape.
      port: 8081
    my_feast.tfvars
    gcp_project_name        = "kf-feast"
    name_prefix             = "feast-0-8"
    region                  = "asia-east1"
    gke_machine_type        = "n1-standard-2"
    network                 = "default"
    subnetwork              = "default"
    dataproc_staging_bucket = "feast-dataproc"
    $ cd feast/infra/terraform/gcp
    $ terraform init
    $ terraform apply -var-file=my_feast.tfvars
    customer = Entity(
        name="customer_id",
        description="Customer id for ride customer",
        value_type=ValueType.INT64,
        labels={}
    )
    # Create a customer entity
    customer_entity = Entity(name="customer_id", description="ID of car customer")
    client.apply(customer_entity)
    # Update a customer entity
    customer_entity = client.get_entity("customer_id")
    customer_entity.description = "ID of bike customer"
    client.apply(customer_entity)
    Feature views are used during:
    • The generation of training datasets by querying the data source of feature views in order to find historical feature values. A single training dataset may consist of features from multiple feature views.

    • Loading of feature values into an online store. Feature views determine the storage schema in the online store.

    • Retrieval of features from the online store. Feature views provide the schema definition to Feast in order to look up features from the online store.

    circle-info

    Feast does not generate feature values. It acts as the ingestion and serving system. The data sources described within feature views should reference feature values in their already computed form.

    hashtag
    Data Source

    Feast uses a time-series data model to represent data. This data model is used to interpret feature data in data sources in order to build training datasets or when materializing features into an online store.

    Below is an example data source with a single entity (driver) and two features (trips_today and rating).

    Ride-hailing data source
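    The original figure is not reproduced here; the rows below are a hypothetical illustration of such a time-series source (column names and values are invented for the example):

```python
# Hypothetical rows for a ride-hailing driver source: one entity column
# (driver_id), an event timestamp, and two feature columns.
driver_source_rows = [
    {"driver_id": 1001, "event_timestamp": "2021-04-12T08:00:00", "trips_today": 5, "rating": 4.7},
    {"driver_id": 1001, "event_timestamp": "2021-04-12T09:00:00", "trips_today": 7, "rating": 4.7},
    {"driver_id": 1002, "event_timestamp": "2021-04-12T08:00:00", "trips_today": 2, "rating": 4.9},
]

# Under the time-series model, the latest row per entity is what an online
# store serves, while the full history feeds point-in-time training joins.
latest = {}
for row in sorted(driver_source_rows, key=lambda r: r["event_timestamp"]):
    latest[row["driver_id"]] = row
```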

    hashtag
    Entity

    An entity is a collection of semantically related features. Users define entities to map to the domain of their use case. For example, a ride-hailing service could have customers and drivers as their entities, which group related features that correspond to these customers and drivers.

    Entities are defined as part of feature views. Entities are used to identify the primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities (more than one entity object) in a feature view.

    Entities should be reused across feature views.

    hashtag
    Feature

    A feature is an individual measurable property observed on an entity. For example, a feature of a customer entity could be the number of transactions they have made on an average month.

    Features are defined as part of feature views. Since Feast does not transform data, a feature is essentially a schema that only contains a name and a type:

    Together with data sources, they indicate to Feast where to find your feature values, e.g., in a specific parquet file or BigQuery table. Feature definitions are also used when reading features from the feature store, using feature references.

    Feature names must be unique within a feature view.

    driver_stats_fv = FeatureView(
        name="driver_activity",
        entities=["driver"],
        features=[
            Feature(name="trips_today", dtype=ValueType.INT64),
            Feature(name="rating", dtype=ValueType.FLOAT),
        ],
        input=BigQuerySource(
            table_ref="feast-oss.demo_data.driver_activity"
        )
    )
    The Terraform configuration used here is a greenfield installation that neither assumes anything about, nor integrates with, existing resources in your AWS account. The Terraform configuration presents an easy way to get started, but you may want to customize this set up before using Feast in production.

    This Terraform configuration creates the following resources:

    • Kubernetes cluster on Amazon EKS (3x r3.large nodes)

    • Kafka managed by Amazon MSK (2x kafka.t3.small nodes)

    • Postgres database for Feast metadata, using serverless Aurora (min capacity: 2)

    • Redis cluster, using Amazon Elasticache (1x cache.t2.micro)

    • Amazon EMR cluster to run Spark (3x spot m4.xlarge)

    • Staging S3 bucket to store temporary data

    hashtag
    1. Requirements

    • Create an AWS account and configure credentials locallyarrow-up-right

    • Install Terraformarrow-up-right >= 0.12 (tested with 0.13.3)

    • Install Helmarrow-up-right (tested with v3.3.4)

    hashtag
    2. Configure Terraform

    Create a .tfvars file under feast/infra/terraform/aws. In our example, we name the file my_feast.tfvars. You can see the full list of configuration variables in variables.tf. At a minimum, you need to set name_prefix and an AWS region:

    hashtag
    3. Apply

    After completing the configuration, initialize Terraform and apply:

    Startup may take a few minutes. A kubectl configuration file is also created in this directory; its name will start with kubeconfig_ and end with a random suffix.

    hashtag
    4. Connect to Feast using Jupyter

    After all pods are running, connect to the Jupyter Notebook Server running in the cluster.

    To connect to the remote Feast server you just created, forward a port from the remote k8s cluster to your local machine. Replace kubeconfig_XXXXXXX below with the kubeconfig file name Terraform generates for you.

    You can now connect to the bundled Jupyter Notebook Server at localhost:8888 and follow the example Jupyter notebook.


    Feature repository

    Feast manages two important sets of configuration: feature definitions, and configuration about how to run the feature store. With Feast, this configuration can be written declaratively and stored as code in a central location. This central location is called a feature repository, and it's essentially just a directory that contains some code files.

    The feature repository is the declarative source of truth for what the desired state of a feature store should be. The Feast CLI uses the feature repository to configure your infrastructure, e.g., migrate tables.

    hashtag
    What is a feature repository?

    A feature repository consists of:

    • A collection of Python files containing feature declarations.

    • A feature_store.yaml file containing infrastructural configuration.

    • A .feastignore file containing paths in the feature repository to ignore.

    circle-info

    Typically, users store their feature repositories in a Git repository, especially when working in teams. However, using Git is not a requirement.

    hashtag
    Structure of a feature repository

    The structure of a feature repository is as follows:

    • The root of the repository should contain a feature_store.yaml file and may contain a .feastignore file.

    • The repository should contain Python files that contain feature definitions.

    • The repository can contain other files as well, including documentation and potentially data files.

    An example structure of a feature repository is shown below:

    A couple of things to note about the feature repository:

    • Feast reads all Python files recursively when feast apply is run, including subdirectories, even if they don't contain feature definitions.

    • It's recommended to add a .feastignore file and list the paths of any imperative scripts that you need to store inside the feature repository.

    hashtag
    The feature_store.yaml configuration file

    The configuration for a feature store is stored in a file named feature_store.yaml , which must be located at the root of a feature repository. An example feature_store.yaml file is shown below:

    The feature_store.yaml file configures how the feature store should run. See feature_store.yaml for more details.

    hashtag
    The .feastignore file

    This file contains paths that should be ignored when running feast apply. An example .feastignore is shown below:

    See .feastignore for more details.

    hashtag
    Feature definitions

    A feature repository can also contain one or more Python files that contain feature definitions. An example feature definition file is shown below:

    To declare new feature definitions, just add code to the feature repository, either in existing files or in a new file. For more information on how to define features, see Feature Views.

    hashtag
    Next steps

    • See Create a feature repository to get started with an example feature repository.

    • See feature_store.yaml, .feastignore, or Feature Views for more information on the configuration files that live in a feature repository.

    Feast CLI reference

    hashtag
    Overview

    The Feast CLI comes bundled with the Feast Python package. It is immediately available after installing Feast.

    hashtag
    Global Options

    The Feast CLI provides one global top-level option that can be used with other commands:

    chdir (-c, --chdir)

    This command allows users to run Feast CLI commands in a different folder from the current working directory.

    hashtag
    Apply

    Creates or updates a feature store deployment

    What does Feast apply do?

    1. Feast will scan Python files in your feature repository and find all Feast object definitions, such as feature views, entities, and data sources.

    2. Feast will validate your feature definitions

    3. Feast will sync the metadata about Feast objects to the registry. If a registry does not exist, then it will be instantiated. The standard registry is a simple protobuf binary file that is stored on disk (locally or in an object store).

    circle-exclamation

    feast apply (when configured to use cloud provider like gcp or aws) will create cloud infrastructure. This may incur costs.

    hashtag
    Entities

    List all registered entities

    hashtag
    Feature views

    List all registered feature views

    hashtag
    Init

    Creates a new feature repository

    It's also possible to use other templates

    or to set the name of the new project

    hashtag
    Materialize

    Load data from feature views into the online store between two dates

    Load data for specific feature views into the online store between two dates

    hashtag
    Materialize incremental

    Load data from feature views into the online store, beginning from either the previous materialize or materialize-incremental end date, or the beginning of time.

    hashtag
    Teardown

    Tear down deployed feature store infrastructure

    hashtag
    Version

    Print the current Feast version

    Sources

    hashtag
    Overview

    Sources are descriptions of external feature data and are registered to Feast as part of feature tables. Once registered, Feast can ingest feature data from these sources into stores.

    Currently, Feast supports the following source types:

    hashtag
    Batch Source

    • File (as in Spark): Parquet (only).

    • BigQuery

    hashtag
    Stream Source

    • Kafka

    • Kinesis

    The following encodings are supported on streams:

    • Avro

    • Protobuf

    hashtag
    Structure of a Source

    For both batch and stream sources, the following configurations are necessary:

    • Event timestamp column: Name of the column containing the timestamp when the event data occurred. Used during point-in-time joins of feature values to entity timestamps.

    • Created timestamp column: Name of the column containing the timestamp when the data is created. Used to deduplicate data when multiple copies of the same data are ingested.
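    To make the deduplication role of the created timestamp concrete, here is a plain-Python sketch (not Feast API code; column names are hypothetical) that keeps one row per entity and event timestamp, preferring the latest created timestamp:

```python
from datetime import datetime

def deduplicate(rows):
    """Keep one copy per (entity key, event timestamp), preferring the row
    with the latest created timestamp."""
    best = {}
    for row in rows:
        key = (row["driver_id"], row["event_timestamp"])
        if key not in best or row["created_timestamp"] > best[key]["created_timestamp"]:
            best[key] = row
    return list(best.values())

rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 8),
     "created_timestamp": datetime(2021, 4, 12, 9), "rating": 4.0},
    # Re-ingested copy of the same event, created later: this one wins.
    {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 8),
     "created_timestamp": datetime(2021, 4, 12, 10), "rating": 4.5},
]
deduped = deduplicate(rows)
```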

    Example data source specifications:

    The Feast Python API documentationarrow-up-right provides more information about options to specify for the above sources.

    hashtag
    Working with a Source

    hashtag
    Creating a Source

    Sources are defined as part of feature tables:

    Feast ensures that the source complies with the schema of the feature table. These specified data sources can then be included inside a feature table specification and registered to Feast Core.

    Architecture

    hashtag
    Sequence description

    1. Log Raw Events: Production backend applications are configured to emit internal state changes as events to a stream.

    2. Create Stream Features: Stream processing systems like Flink, Spark, and Beam are used to transform and refine events and to produce features that are logged back to the stream.

    3. Log Streaming Features: Both raw and refined events are logged into a data lake or batch storage location.

    4. Create Batch Features: ELT/ETL systems like Spark and SQL are used to transform data in the batch store.

    5. Define and Ingest Features: The Feast user defines feature tables based on the features available in batch and streaming sources and publishes these definitions to Feast Core.

    6. Poll Feature Definitions: The Feast Job Service polls for new or changed feature definitions.

    7. Start Ingestion Jobs: Every new feature table definition results in a new ingestion job being provisioned (see limitations).

    8. Batch Ingestion: Batch ingestion jobs are short-lived jobs that load data from batch sources into either an offline or online store (see limitations).

    9. Stream Ingestion: Streaming ingestion jobs are long-lived jobs that load data from stream sources into online stores. A stream source and batch source on a feature table must have the same features/fields.

    10. Model Training: A model training pipeline is launched. It uses the Feast Python SDK to retrieve a training dataset and trains a model.

    11. Get Historical Features: Feast exports a point-in-time correct training dataset based on the list of features and entity DataFrame provided by the model training pipeline.

    12. Deploy Model: The trained model binary (and list of features) are deployed into a model serving system.

    13. Get Prediction: A backend system makes a request for a prediction from the model serving service.

    14. Retrieve Online Features: The model serving service makes a request to the Feast Online Serving service for online features using a Feast SDK.

    15. Return Prediction: The model serving service makes a prediction using the returned features and returns the outcome.

    circle-exclamation

    Limitations

    • Only Redis is supported for online storage.

    hashtag
    Components:

    A complete Feast deployment contains the following components:

    • Feast Core: Acts as the central registry for feature and entity definitions in Feast.

    • Feast Job Service: Manages data processing jobs that load data from sources into stores, and jobs that export training datasets.

    • Feast Serving: Provides low-latency access to feature values in an online store.

    Please see the configuration reference for more details on configuring these components.

    circle-info

    Java and Go Clients are also available for online feature retrieval. See the API Reference.

    Extending Feast

    hashtag
    Custom OnlineStore

    Feast allows users to create their own OnlineStore implementations, enabling Feast to read and write feature values to stores other than the first-party implementations already included in Feast. The interface for the OnlineStore consists of four methods that need to be implemented.

    Troubleshooting

    circle-exclamation

    This page applies to Feast 0.7. The content may be out of date for Feast 0.8+

    If at any point in time you cannot resolve a problem, please see the Community section for ways of reaching out to the Feast community.

    hashtag

    Docker Compose

    circle-check

    This guide is meant for exploratory purposes only. It allows users to run Feast locally using Docker Compose instead of Kubernetes. The goal of this guide is for users to be able to quickly try out the full Feast stack without needing to deploy to Kubernetes. It is not meant for production use.

    hashtag
    Overview

    driver = Entity(name='driver', value_type=ValueType.STRING, join_key='driver_id')
    trips_today = Feature(
        name="trips_today",
        dtype=ValueType.FLOAT
    )
    my_feast.tfvars
    name_prefix = "my-feast"
    region      = "us-east-1"
    $ cd feast/infra/terraform/aws
    $ terraform init
    $ terraform apply -var-file=my_feast.tfvars
    KUBECONFIG=kubeconfig_XXXXXXX kubectl port-forward \
    $(kubectl get pod -o custom-columns=:metadata.name | grep jupyter) 8888:8888
    Forwarding from 127.0.0.1:8888 -> 8888
    Forwarding from [::1]:8888 -> 8888
    Usage: feast [OPTIONS] COMMAND [ARGS]...
    
      Feast CLI
    
      For more information, see our public docs at https://docs.feast.dev/
    
      For any questions, you can reach us at https://slack.feast.dev/
    
    Options:
      -c, --chdir TEXT  Switch to a different feature repository directory before
                        executing the given subcommand.
    
      --help            Show this message and exit.
    
    Commands:
      apply                    Create or update a feature store deployment
      entities                 Access entities
      feature-views            Access feature views
      init                     Create a new Feast repository
      materialize              Run a (non-incremental) materialization job to...
      materialize-incremental  Run an incremental materialization job to ingest...
      registry-dump            Print contents of the metadata registry
      teardown                 Tear down deployed feature store infrastructure
      version                  Display Feast SDK version

    Feast CLI will create all necessary feature store infrastructure. The exact infrastructure that is deployed or configured depends on the provider configuration that you have set in feature_store.yaml. For example, setting local as your provider will result in a sqlite online store being created.

    hashtag
    Update/Teardown methods

    The update method should set up any state in the OnlineStore that is required before any data can be ingested into it. This can include things like tables in SQLite or keyspaces in Cassandra. The update method should be idempotent. Similarly, the teardown method should remove any state in the online store.

    hashtag
    Write/Read methods

    The online_write_batch method is responsible for writing data into the online store, and the online_read method is responsible for reading data from the online store.
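    To make the four methods concrete, here is a dict-backed sketch. Feast's actual types (RepoConfig, EntityKeyProto, ValueProto, FeatureView) are replaced with plain Python values to keep the example self-contained; a real implementation must follow the interface signatures shown in the listings.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

class InMemoryOnlineStore:
    """Dict-backed sketch of the four-method OnlineStore interface."""

    def __init__(self) -> None:
        # table name -> entity key -> (event timestamp, feature values)
        self._tables: Dict[str, Dict[str, Tuple[datetime, dict]]] = {}

    def update(self, tables_to_keep: List[str], tables_to_delete: List[str]) -> None:
        # Idempotent: create missing tables, drop removed ones.
        for name in tables_to_keep:
            self._tables.setdefault(name, {})
        for name in tables_to_delete:
            self._tables.pop(name, None)

    def teardown(self, tables: List[str]) -> None:
        for name in tables:
            self._tables.pop(name, None)

    def online_write_batch(self, table: str, data: List[Tuple[str, dict, datetime]]) -> None:
        for entity_key, values, event_ts in data:
            self._tables[table][entity_key] = (event_ts, values)

    def online_read(self, table: str, entity_keys: List[str]) -> List[Optional[Tuple[datetime, dict]]]:
        return [self._tables[table].get(key) for key in entity_keys]

store = InMemoryOnlineStore()
store.update(tables_to_keep=["driver_stats"], tables_to_delete=[])
store.update(tables_to_keep=["driver_stats"], tables_to_delete=[])  # safe to repeat
store.online_write_batch(
    "driver_stats", [("driver:1001", {"rating": 4.7}, datetime(2021, 4, 12))]
)
result = store.online_read("driver_stats", ["driver:1001", "driver:9999"])
```

    Note how online_read returns None for entity keys that have no stored values, mirroring the Optional return type of the real interface.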

    hashtag
    Custom OfflineStore

    Feast allows users to create their own OfflineStore implementations, enabling Feast to read and write feature values to stores other than the first-party implementations already included in Feast. The interface for the OfflineStore is found herearrow-up-right, and consists of two methods that need to be implemented.

    hashtag
    Write method

    The pull_latest_from_table_or_query method is used to read data from a source for materialization into the OnlineStore.

    hashtag
    Read method

    The read method is responsible for reading historical features from the OfflineStore. The feature retrieval may be asynchronous, so the read method is expected to return an object that should produce a DataFrame representing the historical features once the feature retrieval job is complete.
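    The deferred-retrieval pattern can be sketched as follows. Lists of dicts stand in for a pandas DataFrame so the sketch has no external dependencies, and the stub function and its "conv_rate" output are invented for the example:

```python
from typing import Callable, List

class InMemoryRetrievalJob:
    """Sketch of a RetrievalJob: the result is only produced when
    to_df() is called."""

    def __init__(self, compute: Callable[[], List[dict]]):
        self._compute = compute

    def to_df(self) -> List[dict]:
        # A real RetrievalJob would block here until the (possibly
        # asynchronous) retrieval finishes, then return a DataFrame.
        return self._compute()

def get_historical_features_stub(entity_rows: List[dict]) -> InMemoryRetrievalJob:
    # Hypothetical stand-in: a real offline store would run a
    # point-in-time join against the feature data here.
    def compute() -> List[dict]:
        return [{**row, "conv_rate": 0.5} for row in entity_rows]
    return InMemoryRetrievalJob(compute)

job = get_historical_features_stub([{"driver_id": 1001}])
rows = job.to_df()
```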

    How can I verify that all services are operational?

    hashtag
    Docker Compose

    The containers should be in an up state:

    hashtag
    Google Kubernetes Engine

    All services should be in either a RUNNING or a COMPLETED state:

    hashtag
    How can I verify that I can connect to all services?

    First, locate the host and port of the Feast services.

    hashtag
    Docker Compose (from inside the docker network)

    You will probably need to connect using the hostnames of services and standard Feast ports:

    hashtag
    Docker Compose (from outside the docker network)

    You will probably need to connect using localhost and standard ports:

    hashtag
    Google Kubernetes Engine (GKE)

    You will need to find the external IP of one of the nodes as well as the NodePorts. Please make sure that your firewall is open for these ports:

    netcat, telnet, or even curl can be used to test whether all services are available and ports are open, but grpc_cli is the most powerful. It can be installed from herearrow-up-right.

    hashtag
    Testing Connectivity From Feast Services:

    Use grpc_cli to test connectivity by listing the gRPC methods exposed by Feast services:

    hashtag
    How can I print logs from the Feast Services?

    Feast will typically have four services that you need to monitor if something goes wrong.

    • Feast Core

    • Feast Job Controller

    • Feast Serving (Online)

    • Feast Serving (Batch)

    In order to print the logs from these services, please run the commands below.

    hashtag
    Docker Compose

    Use docker-compose logs to obtain Feast component logs:

    hashtag
    Google Kubernetes Engine

    Use kubectl logs to obtain Feast component logs:

    $ tree -a
    .
    ├── data
    │   └── driver_stats.parquet
    ├── driver_features.py
    ├── feature_store.yaml
    └── .feastignore
    
    1 directory, 4 files
    feature_store.yaml
    project: my_feature_repo_1
    registry: data/metadata.db
    provider: local
    online_store:
        path: data/online_store.db
    .feastignore
    # Ignore virtual environment
    venv
    
    # Ignore a specific Python file
    scripts/foo.py
    
    # Ignore all Python files directly under scripts directory
    scripts/*.py
    
    # Ignore all "foo.py" anywhere under scripts directory
    scripts/**/foo.py
    driver_features.py
    from datetime import timedelta
    
    from feast import BigQuerySource, Entity, Feature, FeatureView, ValueType
    
    driver_locations_source = BigQuerySource(
        table_ref="rh_prod.ride_hailing_co.drivers",
        event_timestamp_column="event_timestamp",
        created_timestamp_column="created_timestamp",
    )
    
    driver = Entity(
        name="driver",
        value_type=ValueType.INT64,
        description="driver id",
    )
    
    driver_locations = FeatureView(
        name="driver_locations",
        entities=["driver"],
        ttl=timedelta(days=1),
        features=[
            Feature(name="lat", dtype=ValueType.FLOAT),
            Feature(name="lon", dtype=ValueType.STRING),
        ],
        input=driver_locations_source,
    )
    feast -c path/to/my/feature/repo apply
    feast apply
    feast entities list
    NAME       DESCRIPTION    TYPE
    driver_id  driver id      ValueType.INT64
    feast feature-views list
    NAME                 ENTITIES
    driver_hourly_stats  ['driver_id']
    feast init my_repo_name
    Creating a new Feast repository in /projects/my_repo_name.
    .
    ├── data
    │   └── driver_stats.parquet
    ├── example.py
    └── feature_store.yaml
    feast init -t gcp
    feast init -t gcp my_feature_repo
    feast materialize 2020-01-01T00:00:00 2022-01-01T00:00:00
    feast materialize -v driver_hourly_stats 2020-01-01T00:00:00 2022-01-01T00:00:00
    Materializing 1 feature views from 2020-01-01 to 2022-01-01
    
    driver_hourly_stats:
    100%|██████████████████████████| 5/5 [00:00<00:00, 5949.37it/s]
    feast materialize-incremental 2022-01-01T00:00:00
    feast teardown
    feast version
    from feast import FileSource
    from feast.data_format import ParquetFormat
    
    batch_file_source = FileSource(
        file_format=ParquetFormat(),
        file_url="file:///feast/customer.parquet",
        event_timestamp_column="event_timestamp",
        created_timestamp_column="created_timestamp",
    )
    from feast import KafkaSource
    from feast.data_format import ProtoFormat
    
    stream_kafka_source = KafkaSource(
        bootstrap_servers="localhost:9094",
        message_format=ProtoFormat(class_path="class.path"),
        topic="driver_trips",
        event_timestamp_column="event_timestamp",
        created_timestamp_column="created_timestamp",
    )
    batch_bigquery_source = BigQuerySource(
        table_ref="gcp_project:bq_dataset.bq_table",
        event_timestamp_column="event_timestamp",
        created_timestamp_column="created_timestamp",
    )
    
    stream_kinesis_source = KinesisSource(
        bootstrap_servers="localhost:9094",
        record_format=ProtoFormat(class_path="class.path"),
        region="us-east-1",
        stream_name="driver_trips",
        event_timestamp_column="event_timestamp",
        created_timestamp_column="created_timestamp",
    )
    def update(
        self,
        config: RepoConfig,
        tables_to_delete: Sequence[Union[FeatureTable, FeatureView]],
        tables_to_keep: Sequence[Union[FeatureTable, FeatureView]],
        entities_to_delete: Sequence[Entity],
        entities_to_keep: Sequence[Entity],
        partial: bool,
    ):
        ...
    
    def teardown(
        self,
        config: RepoConfig,
        tables: Sequence[Union[FeatureTable, FeatureView]],
        entities: Sequence[Entity],
    ):
        ...
    def online_write_batch(
        self,
        config: RepoConfig,
        table: Union[FeatureTable, FeatureView],
        data: List[
            Tuple[EntityKeyProto, Dict[str, ValueProto], datetime, Optional[datetime]]
        ],
        progress: Optional[Callable[[int], Any]],
    ) -> None:
    
        ...
    
    def online_read(
        self,
        config: RepoConfig,
        table: Union[FeatureTable, FeatureView],
        entity_keys: List[EntityKeyProto],
        requested_features: Optional[List[str]] = None,
    ) -> List[Tuple[Optional[datetime], Optional[Dict[str, ValueProto]]]]:
        ...
    def pull_latest_from_table_or_query(
        data_source: DataSource,
        join_key_columns: List[str],
        feature_name_columns: List[str],
        event_timestamp_column: str,
        created_timestamp_column: Optional[str],
        start_date: datetime,
        end_date: datetime,
    ) -> pyarrow.Table:
        ...
    class RetrievalJob:
    
        @abstractmethod
        def to_df(self):
            pass
    
    def get_historical_features(
        config: RepoConfig,
        feature_views: List[FeatureView],
        feature_refs: List[str],
        entity_df: Union[pd.DataFrame, str],
        registry: Registry,
        project: str,
    ) -> RetrievalJob:
        pass
    docker ps
    kubectl get pods
    export FEAST_CORE_URL=core:6565
    export FEAST_ONLINE_SERVING_URL=online_serving:6566
    export FEAST_HISTORICAL_SERVING_URL=historical_serving:6567
    export FEAST_JOBCONTROLLER_URL=jobcontroller:6570
    export FEAST_CORE_URL=localhost:6565
    export FEAST_ONLINE_SERVING_URL=localhost:6566
    export FEAST_HISTORICAL_SERVING_URL=localhost:6567
    export FEAST_JOBCONTROLLER_URL=localhost:6570
    export FEAST_IP=$(kubectl describe nodes | grep ExternalIP | awk '{print $2}' | head -n 1)
    export FEAST_CORE_URL=${FEAST_IP}:32090
    export FEAST_ONLINE_SERVING_URL=${FEAST_IP}:32091
    export FEAST_HISTORICAL_SERVING_URL=${FEAST_IP}:32092
    grpc_cli ls ${FEAST_CORE_URL} feast.core.CoreService
    grpc_cli ls ${FEAST_JOBCONTROLLER_URL} feast.core.JobControllerService
    grpc_cli ls ${FEAST_HISTORICAL_SERVING_URL} feast.serving.ServingService
    grpc_cli ls ${FEAST_ONLINE_SERVING_URL} feast.serving.ServingService
     docker logs -f feast_core_1
     docker logs -f feast_jobcontroller_1
    docker logs -f feast_historical_serving_1
    docker logs -f feast_online_serving_1
    kubectl logs $(kubectl get pods | grep feast-core | awk '{print $1}')
    kubectl logs $(kubectl get pods | grep feast-jobcontroller | awk '{print $1}')
    kubectl logs $(kubectl get pods | grep feast-serving-batch | awk '{print $1}')
    kubectl logs $(kubectl get pods | grep feast-serving-online | awk '{print $1}')
    Batch ingestion jobs must be triggered from your own scheduler like Airflow. Streaming ingestion jobs are automatically launched by the Feast Job Service.

    Feast Python SDK CLI: The primary user facing SDK. Used to:

    • Manage feature definitions with Feast Core.

    • Launch jobs through the Feast Job Service.

    • Retrieve training datasets.

    • Retrieve online features.

  • Online Store: The online store is a database that stores only the latest feature values for each entity. The online store can be populated by either batch ingestion jobs (in the case the user has no streaming source), or can be populated by a streaming ingestion job from a streaming source. Feast Online Serving looks up feature values from the online store.

  • Offline Store: The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets.

  • Feast Spark SDK: A Spark specific Feast SDK. Allows teams to use Spark for loading features into an online store and for building training datasets over offline sources.

    This guide shows you how to deploy Feast using Docker Composearrow-up-right. Docker Compose allows you to explore the functionality provided by Feast while requiring only minimal infrastructure.

    This guide includes the following containerized components:

    • A complete Feast deployment

      • Feast Core with Postgres

      • Feast Online Serving with Redis.

      • Feast Job Service

    • A Jupyter Notebook Server with built-in Feast example(s). For demo purposes only.

    • A Kafka cluster for testing streaming ingestion. For demo purposes only.

    hashtag
    Get Feast

    Clone the latest stable version of Feast from the Feast repositoryarrow-up-right:

    Create a new configuration file:

    hashtag
    Start Feast

    Start Feast with Docker Compose:

    Wait until all containers are in a running state:

    hashtag
    Try our example(s)

    You can now connect to the bundled Jupyter Notebook Server running at localhost:8888 and follow the example Jupyter notebook.

    hashtag
    Troubleshooting

    hashtag
    Open ports

    Please ensure that the following ports are available on your host machine:

    • 6565

    • 6566

    • 8888

    • 9094

    • 5432

    If a port conflict cannot be resolved, you can modify the port mappings in the provided docker-compose.ymlarrow-up-right file to use different ports on the host.

    hashtag
    Containers are restarting or unavailable

    If some of the containers continue to restart, or you are unable to access a service, inspect the logs using the following command:

    If you are unable to resolve the problem, visit GitHubarrow-up-right to create an issue.

    hashtag
    Configuration

    The Feast Docker Compose setup can be configured by modifying properties in your .env file.
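For illustration, a minimal .env could look like the fragment below. The GCP_SERVICE_ACCOUNT property is described in the next section; the other property names and values here are assumptions — check the bundled .env.sample for the authoritative list:

```
# Hypothetical .env fragment -- see .env.sample for the authoritative names
FEAST_VERSION=develop                                # Feast image tag to run (assumption)
GCP_SERVICE_ACCOUNT=./gcp-service-accounts/key.json  # path to a GCP key, if used
```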

    hashtag
    Accessing Google Cloud Storage (GCP)

    To access Google Cloud Storage as a data source, the Docker Compose installation requires access to a GCP service account.

    • Create a new service accountarrow-up-right and save a JSON key.

    • Grant the service account access to your bucket(s).

    • Copy the service account JSON key to the path you have configured in .env under GCP_SERVICE_ACCOUNT.

    • Restart your Docker Compose setup of Feast.

    Feature Tables

    hashtag
    Overview

    Feature tables are both a schema and a logical means of grouping features, data sources, and other related metadata.

    Feature tables serve the following purposes:

    • Feature tables are a means for defining the location and properties of data sources.

    • Feature tables are used to create a database-level structure within Feast for the storage of feature values.

    • The data sources described within feature tables allow Feast to find and ingest feature data into stores within Feast.

    • Feature tables ensure data is efficiently stored during ingestion by grouping feature values that occur on the same event timestamp.

    circle-info

    Feast does not yet apply feature transformations. Transformations are currently expected to happen before data is ingested into Feast. The data sources described within feature tables should reference feature values in their already transformed form.

    hashtag
    Features

    A feature is an individual measurable property observed on an entity. For example, the number of transactions (feature) a customer (entity) has completed. Features are used for both model training and scoring (batch, online).

    Features are defined as part of feature tables. Since Feast does not apply transformations, a feature is basically a schema that only contains a name and a type:

    Visit FeatureSpec for the complete feature specification API.

    hashtag
    Structure of a Feature Table

    Feature tables contain the following fields:

    • Name: Name of feature table. This name must be unique within a project.

    • Entities: List of entities to associate with the features defined in this feature table. Entities are used as lookup keys when retrieving features from a feature table.

    • Features: List of features within a feature table.

    Here is a ride-hailing example of a valid feature table specification:

    By default, Feast assumes that the features specified in a feature table correspond one-to-one to the fields found in its sources. All features defined in a feature table should be available in the defined sources.

    Field mappings can be used to map features defined in Feast to fields as they occur in data sources.

    In the example feature table specification above, we use a field mapping to map the field named driver_rating in the batch source to the feature named rating.

    hashtag
    Working with a Feature Table

    hashtag
    Creating a Feature Table

    hashtag
    Updating a Feature Table

    hashtag
    Feast currently supports the following changes to feature tables:

    • Adding new features.

    • Removing features.

    • Updating source, max age, and labels.

    circle-exclamation

    Deleted features are archived, rather than removed completely. Importantly, new features cannot use the names of these deleted features.

    hashtag
    Feast currently does not support the following changes to feature tables:

    • Changes to the project or name of a feature table.

    • Changes to entities related to a feature table.

    • Changes to names and types of existing features.

    hashtag
    Deleting a Feature Table

    triangle-exclamation

    Feast currently does not support the deletion of feature tables.

    Getting training features

    Feast provides a historical retrieval interface for exporting feature data in order to train machine learning models. Essentially, users are able to enrich their data with features from any feature tables.

    hashtag
    Retrieving historical features

    Below is an example of the process required to produce a training dataset:

    hashtag
    1. Define feature references

    Feature references define the specific features that will be retrieved from Feast. These features can come from multiple feature tables. The only requirement is that the feature tables that make up the feature references share the same entity (or composite entity).

    hashtag
    2. Define an entity dataframe

    Feast needs to join feature values onto specific entities at specific points in time. Thus, it is necessary to provide an entity dataframe as part of the get_historical_features method. In the example above we define an entity source. This source is an external file that provides Feast with the entity dataframe.

    hashtag
    3. Launch historical retrieval job

    Once the feature references and an entity source are defined, it is possible to call get_historical_features(). This method launches a job that extracts features from the sources defined in the provided feature tables, joins them onto the provided entity source, and returns a reference to the training dataset that is produced.

    Please see the Feast SDK for more details.

    hashtag
    Point-in-time Joins

    Feast always joins features onto entity data in a point-in-time correct way. The process can be described through an example.

    In the example below there are two tables (or dataframes):

    • The dataframe on the left is the entity dataframe, which contains timestamps, entities, and the target variable (trip_completed). This dataframe is provided to Feast through an entity source.

    • The dataframe on the right contains driver features. This dataframe is represented in Feast through a feature table and its accompanying data source(s).

    The user would like to have the driver features joined onto the entity dataframe to produce a training dataset that contains both the target (trip_completed) and features (average_daily_rides, maximum_daily_rides, rating). This dataset will then be used to train their model.

    Feast is able to intelligently join feature data with different timestamps to a single entity dataframe. It does this through a point-in-time join as follows:

    1. Feast loads the entity dataframe and all feature tables (driver dataframe) into the same location. This can either be a database or in memory.

    2. For each entity row in the entity dataframe, Feast tries to find feature values in each feature table to join to it. Feast extracts the timestamp and entity key of each row in the entity dataframe and scans backward through the feature table until it finds a matching entity key.

    3. If the event timestamp of the matching entity key within the driver feature table is within the maximum age configured for the feature table, then the features at that entity key are joined onto the entity dataframe. If the event timestamp is outside of the maximum age, then only null values are returned.

    circle-info

    Point-in-time correct joins attempt to prevent feature leakage by recreating the state of the world at a single point in time, instead of joining features based on exact timestamps only.
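The join described above can be sketched with pandas merge_asof, which also scans backward per entity key. This is only an illustration with made-up data, not Feast's actual implementation (Feast performs this inside its retrieval job):

```python
import pandas as pd

# Entity dataframe: event timestamps, entity keys, and the target variable.
entity_df = pd.DataFrame({
    "event_timestamp": pd.to_datetime(["2021-04-12 10:00", "2021-04-12 17:00"]),
    "driver_id": [1001, 1001],
    "trip_completed": [1, 0],
})

# Feature table data: feature values observed at various timestamps.
driver_df = pd.DataFrame({
    "event_timestamp": pd.to_datetime(["2021-04-12 08:00", "2021-04-12 16:00"]),
    "driver_id": [1001, 1001],
    "rating": [4.5, 4.7],
})

# merge_asof picks, per entity key, the latest feature row at or before each
# entity timestamp; `tolerance` plays the role of the feature table's max age.
joined = pd.merge_asof(
    entity_df.sort_values("event_timestamp"),
    driver_df.sort_values("event_timestamp"),
    on="event_timestamp",
    by="driver_id",
    tolerance=pd.Timedelta("1d"),
)
print(joined[["driver_id", "trip_completed", "rating"]])
```

Rows whose nearest feature value is older than the tolerance would get null (NaN) features, matching the max-age behaviour described above.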

    Release process

    hashtag
    Release process

    For Feast maintainers, these are the concrete steps for making a new release.

    1. For a new major or minor release, create and check out the release branch for the new stream, e.g. v0.6-branch. For a patch version, check out the stream's existing release branch.

    2. Update the CHANGELOG.md. See the Creating a change log guide below and commit the change.

      • Make sure to review each PR in the changelog to flag any breaking changes and deprecations.

    3. Update versions for the release/release candidate with a commit:

      1. In the root pom.xml, remove -SNAPSHOT from the <revision> property, update versions, and commit.

    4. Push the commits and tags. Make sure the CI passes.

      • If the CI does not pass, or if there are new patches for the release, repeat steps 2 and 3 with new release candidates until a stable release is achieved.

    5. Bump to the next patch version in the release branch, append -SNAPSHOT in pom.xml and push.

    6. Create a PR against master to:

      1. Bump to the next major/minor version and append -SNAPSHOT .

      2. Add the change log by applying the change log commit created in step 2.

    7. Create a GitHub release which includes a summary of important changes as well as any artifacts associated with the release. Make sure to include the same change log as added in CHANGELOG.md. Use Feast vX.Y.Z as the title.

    8. Update the Upgrade Guide to include the action-required instructions for users to upgrade to this new release. Instructions should include a migration for each breaking change made in this release.

    When a tag that matches a Semantic Version string is pushed, CI will automatically build and push the relevant artifacts to their repositories or package managers (docker images, Python wheels, etc). JVM artifacts are promoted from Sonatype OSSRH to Maven Central, but it sometimes takes some time for them to be available. The sdk/go/v tag is required to version the Go SDK go module so that users can go get a specific tagged release of the Go SDK.

    hashtag
    Creating a change log

    We use an open source change log generator to generate change logs. The process still requires a little bit of manual effort.

    1. Create a GitHub token as per these instructions. The token is used as an input argument (-t) to the change log generator.

    2. The change log generator configuration below will look for unreleased changes on a specific branch. The branch will be master for a major/minor release, or a release branch (v0.4-branch) for a patch release. You will need to set the branch using the --release-branch argument.

    1. Review each change log item.

      • Make sure that sentences are grammatically correct and well formatted (although we will try to enforce this at the PR review stage).

      • Make sure that each item is categorised correctly. You will see the following categories: Breaking changes

    hashtag
    Flag Breaking Changes & Deprecations

    It's important to flag breaking changes and deprecations to the API for each release so that we can maintain API compatibility.

    Developers should have flagged PRs containing breaking changes with the compat/breaking label. However, it's important to double-check each PR's release notes and contents for changes that break API compatibility, and to manually apply the compat/breaking label to PRs with undeclared breaking changes. The change log will have to be regenerated if any new labels are added.

    Azure AKS (with Helm)

    hashtag
    Overview

    This guide installs Feast on an Azure Kubernetes Service (AKS) cluster and ensures the following services are running:

    • Feast Core

    • Feast Online Serving

    • Postgres

    • Redis

    • Spark

    • Kafka

    • Feast Jupyter (Optional)

    • Prometheus (Optional)

    hashtag
    1. Requirements

    1. Install and configure the Azure CLI

    2. Install and configure kubectl

    3. Install Helm 3

    hashtag
    2. Preparation

    Create an AKS cluster with the Azure CLI. The detailed steps can be found in the Azure documentation; a high-level walk-through includes:

    Add the Feast Helm repository and download the latest charts:

    Feast includes a Helm chart that installs all necessary components to run Feast Core, Feast Online Serving, and an example Jupyter notebook.

    Feast Core requires Postgres to run, which requires a secret to be set on Kubernetes:

    hashtag
    3. Feast installation

    Install Feast using Helm. The pods may take a few minutes to initialize.

    hashtag
    4. Spark operator installation

    Follow the documentation to install the Spark operator on Kubernetes, and the Feast documentation to configure Spark roles. Ensure the service account used by Feast has permissions to manage SparkApplication resources. This depends on your Kubernetes setup, but typically you'd need to configure a Role and a RoleBinding like the one below:
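As a sketch, such a Role and RoleBinding could look like the fragment below; the resource names, the feast namespace, and the default service account are assumptions for your cluster:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: use-spark-operator
  namespace: feast
rules:
- apiGroups: ["sparkoperator.k8s.io"]
  resources: ["sparkapplications"]
  verbs: ["create", "delete", "get", "list", "update", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: use-spark-operator
  namespace: feast
subjects:
- kind: ServiceAccount
  name: default
roleRef:
  kind: Role
  name: use-spark-operator
  apiGroup: rbac.authorization.k8s.io
```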

    hashtag
    5. Use Jupyter to connect to Feast

    After all the pods are in a RUNNING state, port-forward to the Jupyter Notebook Server in the cluster:

    You can now connect to the bundled Jupyter Notebook Server at localhost:8888 and follow the example Jupyter notebook.

    hashtag
    6. Environment variables

    If you are running the Minimal Ride Hailing Example, you may want to make sure the following environment variables are correctly set:

    hashtag
    7. Further Reading

    Getting online features

    Feast provides an API through which online feature values can be retrieved. This allows teams to look up feature values at low latency in production during model serving, in order to make online predictions.

    circle-info

    Online stores only maintain the current state of features, i.e. the latest feature values. No historical data is stored or served.

    The online store must be populated through ingestion jobs prior to being used for online serving.

    Feast Serving provides a gRPC APIarrow-up-right that is backed by Redisarrow-up-right. We have native clients in Python, Go, and Java.

    hashtag
    Online Field Statuses

    Feast also returns status codes when retrieving features from the Feast Serving API. These status codes give useful insight into the quality of the data being served.

    IBM Cloud Kubernetes Service (IKS) and Red Hat OpenShift (with Kustomize)

    hashtag
    Overview

    This guide installs Feast on an existing IBM Cloud Kubernetes Service cluster or Red Hat OpenShift on IBM Cloud, and ensures the following services are running:

    • Feast Core

    Upgrading Feast

    hashtag
    Migration from v0.6 to v0.7

    hashtag
    Feast Core Validation changes

    git clone https://github.com/feast-dev/feast.git
    cd feast/infra/docker-compose
    cp .env.sample .env
    docker-compose pull && docker-compose up -d
    docker-compose ps
    docker-compose logs -f -t
    # Feature references
    feature_refs = [
        "driver_trips:average_daily_rides",
        "driver_trips:maximum_daily_rides",
        "driver_trips:rating",
    ]
    
    # Define entity source
    entity_source = FileSource(
       "event_timestamp",
       ParquetFormat(),
       "gs://some-bucket/customer"
    )
    
    # Retrieve historical dataset from Feast.
    historical_feature_retrieval_job = client.get_historical_features(
        feature_refs=feature_refs,
        entity_rows=entity_source
    )
    
    output_file_uri = historical_feature_retrieval_job.get_output_file_uri()
    from feast import Client
    
    online_client = Client(
       core_url="localhost:6565",
       serving_url="localhost:6566",
    )
    
    entity_rows = [
       {"driver_id": 1001},
       {"driver_id": 1002},
    ]
    
    # Features in <featuretable_name:feature_name> format
    feature_refs = [
       "driver_trips:average_daily_rides",
       "driver_trips:maximum_daily_rides",
       "driver_trips:rating",
    ]
    
    response = online_client.get_online_features(
       feature_refs=feature_refs, # Contains only feature references
       entity_rows=entity_rows, # Contains only entities (driver ids)
    )
    
    # Print features in dictionary format
    response_dict = response.to_dict()
    print(response_dict)

    Status

    Meaning

    NOT_FOUND

    The feature value was not found in the online store. This might mean that no feature value was ingested for this feature.

    NULL_VALUE

    An entity key was successfully found, but no feature values had been set. This status code should not occur during normal operation.

    OUTSIDE_MAX_AGE

    The age of the feature row in the online store (in terms of its event timestamp) has exceeded the maximum age defined within the feature table.

    PRESENT

    The feature values have been found and are within the maximum age.

    UNKNOWN

    Indicates a system failure.

    Pythonarrow-up-right
    Goarrow-up-right
    Javaarrow-up-right

    • Labels: Labels are arbitrary key-value properties that can be defined by users.

  • Max age: Max age affects the retrieval of features from a feature table. Age is measured as the duration of time between the event timestamp of a feature and the lookup time on an entity key used to retrieve the feature. Feature values outside max age will be returned as unset values. Max age allows for the eviction of keys from online stores and limits the amount of historical scanning required for historical feature values during retrieval.

  • Batch Source: The batch data source from which Feast will ingest feature values into stores. This can either be used to back-fill stores before switching over to a streaming source, or it can be used as the primary source of data for a feature table. Visit Sources to learn more about batch sources.

  • Stream Source: The streaming data source from which you can ingest streaming feature values into Feast. Streaming sources must be paired with a batch source containing the same feature values. A streaming source is only used to populate online stores. The batch equivalent source that is paired with a streaming source is used during the generation of historical feature datasets. Visit Sources to learn more about stream sources.

  • sources
    ingestion
    FeatureSpecarrow-up-right
    entities
    Tag the commit with the release version, using the v and sdk/go/v prefixes:
    • for a release candidate, create tags vX.Y.Z-rc.N and sdk/go/vX.Y.Z-rc.N

    • for a stable release X.Y.Z, create tags vX.Y.Z and sdk/go/vX.Y.Z

  • Check that versions are updated with make lint-versions.

  • If changes required are flagged by the version lint, make the changes, amend the commit and move the tag to the new commit.

  • Check that versions are updated with env TARGET_MERGE_BRANCH=master make lint-versions

    You should also set the --future-release argument. This is the version you are releasing. The version can still be changed at a later date.

  • Update the arguments below and run the command to generate the change log to the console.

  • The categories are Breaking changes, Implemented enhancements, Fixed bugs, and Merged pull requests. Any unlabelled PRs will be found in Merged pull requests. It's important to make sure that any breaking changes, enhancements, or bug fixes are pulled up out of Merged pull requests into the correct category. Housekeeping, tech debt clearing, infra changes, or refactoring do not count as enhancements. Only enhancements a user benefits from should be listed in that category.
  • Make sure that the "Full Change log" link is actually comparing the correct tags (normally your released version against the previous version).

  • Make sure that release notes and breaking changes are present.

  • CHANGELOG.mdarrow-up-right
    Creating a change log
    flag any breaking changes and deprecation.
    GitHub releasearrow-up-right
    CHANGELOG.mdarrow-up-right
    Upgrade Guide
    open source change log generatorarrow-up-right
    per these instructionsarrow-up-right
    avg_daily_ride = Feature("average_daily_rides", ValueType.FLOAT)
    from feast import BigQuerySource, FeatureTable, Feature, ValueType
    from google.protobuf.duration_pb2 import Duration
    
    driver_ft = FeatureTable(
        name="driver_trips",
        entities=["driver_id"],
        features=[
          Feature("average_daily_rides", ValueType.FLOAT),
          Feature("rating", ValueType.FLOAT)
        ],
        max_age=Duration(seconds=3600),
        labels={
          "team": "driver_matching" 
        },
        batch_source=BigQuerySource(
            table_ref="gcp_project:bq_dataset.bq_table",
            event_timestamp_column="datetime",
            created_timestamp_column="timestamp",
            field_mapping={
              "rating": "driver_rating"
            }
        )
    )
    driver_ft = FeatureTable(...)
    client.apply(driver_ft)
    driver_ft = FeatureTable(...)
    
    client.apply(driver_ft)
    
    driver_ft.labels = {"team": "marketplace"}
    
    client.apply(driver_ft)
    docker run -it --rm ferrarimarco/github-changelog-generator \
    --user feast-dev \
    --project feast  \
    --release-branch <release-branch-to-find-changes>  \
    --future-release <proposed-release-version>  \
    --unreleased-only  \
    --no-issues  \
    --bug-labels kind/bug  \
    --enhancement-labels kind/feature  \
    --breaking-labels compat/breaking  \
    -t <your-github-token>  \
    --max-issues 1 \
    -o
  • If multiple entity keys are found with the same event timestamp, then they are deduplicated by the created timestamp, with newer values taking precedence.

  • Feast repeats this joining process for all feature tables and returns the resulting dataset.

  • Feature references
    entity dataframe
    Feast SDKarrow-up-right
    entity dataframe
    entity row
    entity dataframe
    Configuring Feast components
  • Feast and Spark

  • Azure CLIarrow-up-right
    Kubectlarrow-up-right
    Helm 3arrow-up-right
    herearrow-up-right
    to install Spark operator on Kubernetes arrow-up-right
    configure Spark roles
    Minimal Ride Hailing Examplearrow-up-right
    Feast Concepts
    Feast Examples/Tutorialsarrow-up-right
    Feast Helm Chart Documentationarrow-up-right

    Feast Online Serving

  • Postgres

  • Redis

  • Kafka (Optional)

  • Feast Jupyter (Optional)

  • Prometheus (Optional)

  • hashtag
    1. Prerequisites

    1. IBM Cloud Kubernetes Servicearrow-up-right or Red Hat OpenShift on IBM Cloudarrow-up-right

    2. Install Kubectlarrow-up-right that matches the major.minor versions of your IKS or Install the OpenShift CLIarrow-up-right that matches your local operating system and OpenShift cluster version.

    3. Install Helm 3arrow-up-right

    4. Install Kustomize

    hashtag
    2. Preparation

    hashtag
    IBM Cloud Block Storage Setup (IKS only)

    :warning: If you have a Red Hat OpenShift cluster on IBM Cloud, skip to the Security Context Constraint Setup section.

    By default, an IBM Cloud Kubernetes cluster uses IBM Cloud File Storagearrow-up-right, based on NFS, as the default storage class, and non-root users do not have write permission on the volume mount path for NFS-backed storage. Some common container images in Feast, such as Redis, Postgres, and Kafka, specify a non-root user to access the mount path. When containers are deployed using these images, they fail to start because the non-root user has insufficient permissions to create folders on the mount path.

    IBM Cloud Block Storagearrow-up-right allows for the creation of raw storage volumes and provides faster performance without the permission restrictions of NFS-backed storage.

    Therefore, to deploy Feast we need to set up IBM Cloud Block Storagearrow-up-right as the default storage class, so that all functionality works and you get the best experience from Feast.

    1. Follow the instructionsarrow-up-right to install the Helm version 3 client on your local machine.

    2. Add the IBM Cloud Helm chart repository to the cluster where you want to use the IBM Cloud Block Storage plug-in.

    3. Install the IBM Cloud Block Storage plug-in. When you install the plug-in, pre-defined block storage classes are added to your cluster.

      Example output:

    4. Verify that all block storage plugin pods are in a "Running" state.

    5. Verify that the storage classes for Block Storage were added to your cluster.

    6. Set the Block Storage as the default storageclass.

      Example output:

      hashtag
      Security Context Constraint Setup (OpenShift only)

    By default, in OpenShift, all pods and containers use the Restricted SCCarrow-up-right, which limits the UIDs pods can run with, causing the Feast installation to fail. To overcome this, you can allow Feast pods to run with any UID by executing the following:
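For example, a command of the following shape grants the anyuid SCC; the feast namespace and the default service account are assumptions, so adjust them to your installation:

```shell
oc adm policy add-scc-to-user anyuid -z default -n feast
```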

    hashtag
    3. Installation

    Install Feast using kustomize. The pods may take a few minutes to initialize.

    hashtag
    Optional: Enable Feast Jupyter and Kafka

    You may optionally enable the Feast Jupyter component which contains code examples to demonstrate Feast. Some examples require Kafka to stream real time features to the Feast online serving. To enable, edit the following properties in the values.yaml under the manifests/contrib/feast folder:

    Then regenerate the resource manifests and deploy:

    hashtag
    4. Use Feast Jupyter Notebook Server to connect to Feast

    After all the pods are in a RUNNING state, port-forward to the Jupyter Notebook Server in the cluster:

    You can now connect to the bundled Jupyter Notebook Server at localhost:8888 and follow the example Jupyter notebook.

    hashtag
    5. Uninstall Feast

    hashtag
    6. Troubleshooting

    When running the minimal_ride_hailing_example Jupyter Notebook, the following errors may occur:

    1. When running job = client.get_historical_features(...):

      or

      Add the following environment variable:

    2. When running job.get_status()

      Add the following environment variable:

    3. When running job = client.start_stream_to_online_ingestion(...)

      Add the following environment variable:

    In v0.7, Feast Core no longer accepts names that start with a number (0-9) or contain a dash (-) for:
    • Project

    • Feature Set

    • Entities

    • Features

    Migrate all project, feature set, entity, and feature names:

    • recreate any names containing '-', with the '-' replaced by '_'

    • recreate any names with a number (0-9) as the first letter so that they no longer start with a number.

    Feast now prevents feature sets from being applied if no store is subscribed to that Feature Set.

    • Ensure that a store is configured to subscribe to the Feature Set before applying the Feature Set.

    hashtag
    Feast Core's Job Coordinator is now Feast Job Controller

    In v0.7, Feast Core's Job Coordinator has been decoupled from Feast Core and runs as a separate Feast Job Controller application. See its Configuration reference for how to configure Feast Job Controller.

    hashtag
    Ingestion Job API

    In v0.7, the following changes are made to the Ingestion Job API:

    • Changed the List Ingestion Job API to return a list of FeatureSetReference instead of a list of FeatureSet in the response.

    • Moved ListIngestionJobs, StopIngestionJob, RestartIngestionJob calls from CoreService to JobControllerService.

    • Python SDK/CLI: Added a new Job Controller client and a jobcontroller_url config option.

    Users of the Ingestion Job API via gRPC should migrate by:

    • Add a new client that connects to the Job Controller endpoint, and call ListIngestionJobs, StopIngestionJob, and RestartIngestionJob on JobControllerService from that client.

    • Migrate code to accept feature references instead of feature sets returned in ListIngestionJobs response.

    Users of the Ingestion Job API via the Python SDK (i.e. feast ingest-jobs list or client.stop_ingest_job(), etc.) should migrate by:

    • ingest_job()methods only: Create a new separate Job Controller clientarrow-up-right to connect to the job controller and call ingest_job() methods using the new client.

    • Configure the Feast Job Controller endpoint url via jobcontroller_url config option.

    hashtag
    Configuration Properties Changes

    • Rename feast.jobs.consolidate-jobs-per-source property to feast.jobs.controller.consolidate-jobs-per-sources

    • Rename feast.security.authorization.options.subjectClaim to feast.security.authentication.options.subjectClaim

    • Rename feast.logging.audit.messageLoggingEnabled to feast.audit.messageLogging.enabled
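Applied to a Feast Core application.yaml, the renamed properties above would nest as sketched below. The structure is inferred from the property paths, and the values shown are placeholders:

```yaml
feast:
  jobs:
    controller:
      consolidate-jobs-per-sources: true
  security:
    authentication:
      options:
        subjectClaim: sub
  audit:
    messageLogging:
      enabled: true
```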

    hashtag
    Migration from v0.5 to v0.6

    hashtag
    Database schema

    In release 0.6 we introduced Flywayarrow-up-right to handle schema migrations in PostgreSQL. Flyway is integrated into Core, and from now on all migrations will be run automatically on Core start. It uses the table flyway_schema_history in the same database (also created automatically) to keep track of already-applied migrations, so no specific maintenance should be needed.

    If you already have an existing deployment of Feast 0.5, Flyway will detect the existing tables and skip the first baseline migration.

    After Core has started, flyway_schema_history should look like this:

    The following major schema changes were made in this release:

    • Source is no longer shared between FeatureSets. It has been changed to a 1:1 relation, and the source's primary key is now an auto-incremented number.

    • Due to the generalization of Source, the sources.topics and sources.bootstrap_servers columns are deprecated. They will be replaced with sources.config; data migration is handled in code when the respective Source is used. topics and bootstrap_servers will be deleted in the next release.

    • Job (table jobs) is no longer connected to Source (table sources), since it uses a consolidated source for optimization purposes. All data required by a Job is embedded in its table.

    New Models (tables):

    • feature_statistics

    Minor changes:

    • FeatureSet has a new column version (see protoarrow-up-right for details)

    • The connecting table jobs_feature_sets, a many-to-many relation between jobs and feature sets, now has version and delivery_status columns.

    hashtag
    Migration from v0.4 to v0.6

    hashtag
    Database

    For all versions earlier than 0.5, seamless migration is not feasible due to earlier breaking changes, and creation of a new database will be required.

    Since the database will be empty, the first (baseline) migration will be applied:

    Versioning policy

    Versioning policies and status of Feast components

    hashtag
    Versioning policy and branch workflow

    Feast uses semantic versioningarrow-up-right.

    Contributors are encouraged to understand our branch workflow described below, for choosing where to branch when making a change (and thus the merge base for a pull request).

    • Major and minor releases are cut from the master branch.

    • Each major and minor release has a long-lived maintenance branch, e.g., v0.3-branch. This is called a "release branch".

    • From the release branch, pre-release release candidates are tagged, e.g., v0.3.0-rc.1

    • From the release candidates, the stable patch version releases are tagged, e.g., v0.3.0.

    A release branch should be substantially feature complete with respect to the intended release. Code that is committed to master may be merged or cherry-picked on to a release branch, but code that is directly committed to a release branch should be solely applicable to that release (and should not be committed back to master).

    In general, unless you're committing code that only applies to a particular release stream (for example, temporary hot-fixes, back-ported security fixes, or image hashes), you should base changes from master and then merge or cherry-pick to the release branch.

    hashtag
    Feast Component Matrix

    The following table shows the status (stable, beta, or alpha) of Feast components.

    Application status indicators for Feast:

    • Stable means that the component has reached a sufficient level of stability and adoption that the Feast community has deemed the component stable. Please see the stability criteria below.

    • Beta means that the component is working towards a version 1.0 release. Beta does not mean a component is unstable, it simply means the component has not met the full criteria of stability.

    • Alpha means that the component is in the early phases of development and/or integration into Feast.

    Criteria for reaching stable status:

    • Contributors from at least two organizations

    • Complete end-to-end test suite

    • Scalability and load testing if applicable

    Criteria for reaching beta status

    • Contributors from at least two organizations

    • End-to-end test suite

    • API reference documentation

    hashtag
    Levels of support

    Feast components have various levels of support based on the component status.

    hashtag
    Support from the Feast community

    Feast has an active and helpful community of users and contributors.

    The Feast community offers support on a best-effort basis for stable and beta applications. Best-effort support means that there’s no formal agreement or commitment to solve a problem but the community appreciates the importance of addressing the problem as soon as possible. The community commits to helping you diagnose and address the problem if all the following are true:

    • The cause falls within the technical framework that Feast controls. For example, the Feast community may not be able to help if the problem is caused by a specific network configuration within your organization.

    • Community members can reproduce the problem.

    • The reporter of the problem can help with further diagnosis and troubleshooting.

    Please see the page for channels through which support can be requested.

    Roadmap

    hashtag
    Backlog

    • Add On-demand transformations support

    Configuration Reference

    hashtag
    Overview

    This reference describes how to configure Feast components:

    Development guide

    hashtag
    Overview

    This guide is targeted at developers looking to contribute to Feast:

    Feast and Spark

    Configuring Feast to use Spark for ingestion.

    Feast relies on Spark for ingesting data from the offline store into the online store, for streaming ingestion, and for running queries to retrieve historical data from the offline store. Feast supports several Spark deployment options.

    hashtag
    Option 1. Use Kubernetes Operator for Apache Spark

    To install the Spark on K8s Operator:

    Currently, Feast is tested using the v1beta2-1.1.2-2.4.5 version of the operator image.

    Google Cloud Platform

    hashtag
    Description

    • Offline Store: Uses the BigQuery offline store by default. Also supports File as the offline store.

    az group create --name myResourceGroup  --location eastus
    az acr create --resource-group myResourceGroup  --name feast-AKS-ACR --sku Basic
    az aks create -g myResourceGroup  -n feast-AKS --location eastus --attach-acr feast-AKS-ACR --generate-ssh-keys
    
    az aks install-cli
    az aks get-credentials --resource-group myResourceGroup  --name  feast-AKS
    helm version # make sure you have the latest Helm installed
    helm repo add feast-charts https://feast-helm-charts.storage.googleapis.com
    helm repo update
    kubectl create secret generic feast-postgresql --from-literal=postgresql-password=password
    helm install feast-release feast-charts/feast
    helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator 
    helm install my-release spark-operator/spark-operator  --set serviceAccounts.spark.name=spark --set image.tag=v1beta2-1.1.2-2.4.5
    cat <<EOF | kubectl apply -f -
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: use-spark-operator
      namespace: <REPLACE ME>
    rules:
    - apiGroups: ["sparkoperator.k8s.io"]
      resources: ["sparkapplications"]
      verbs: ["create", "delete", "deletecollection", "get", "list", "update", "watch", "patch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: RoleBinding
    metadata:
      name: use-spark-operator
      namespace: <REPLACE ME>
    roleRef:
      kind: Role
      name: use-spark-operator
      apiGroup: rbac.authorization.k8s.io
    subjects:
      - kind: ServiceAccount
        name: default
    EOF
    kubectl port-forward \
    $(kubectl get pod -o custom-columns=:metadata.name | grep jupyter) 8888:8888
    Forwarding from 127.0.0.1:8888 -> 8888
    Forwarding from [::1]:8888 -> 8888
    demo_data_location = "wasbs://<container_name>@<storage_account_name>.blob.core.windows.net/"
    os.environ["FEAST_AZURE_BLOB_ACCOUNT_NAME"] = "<storage_account_name>"
    os.environ["FEAST_AZURE_BLOB_ACCOUNT_ACCESS_KEY"] = "<insert your key here>"
    os.environ["FEAST_HISTORICAL_FEATURE_OUTPUT_LOCATION"] = "wasbs://<container_name>@<storage_account_name>.blob.core.windows.net/out/"
    os.environ["FEAST_SPARK_STAGING_LOCATION"] = "wasbs://<container_name>@<storage_account_name>.blob.core.windows.net/artifacts/"
    os.environ["FEAST_SPARK_LAUNCHER"] = "k8s"
    os.environ["FEAST_SPARK_K8S_NAMESPACE"] = "default"
    os.environ["FEAST_HISTORICAL_FEATURE_OUTPUT_FORMAT"] = "parquet"
    os.environ["FEAST_REDIS_HOST"] = "feast-release-redis-master.default.svc.cluster.local"
    os.environ["DEMO_KAFKA_BROKERS"] = "feast-release-kafka.default.svc.cluster.local:9092"
     helm repo add iks-charts https://icr.io/helm/iks-charts
     helm repo update
     helm install v2.0.2 iks-charts/ibmcloud-block-storage-plugin -n kube-system
     KeyError: 'historical_feature_output_location'
     KeyError: 'spark_staging_location'
     os.environ["FEAST_HISTORICAL_FEATURE_OUTPUT_LOCATION"] = "file:///home/jovyan/historical_feature_output"
     os.environ["FEAST_SPARK_STAGING_LOCATION"] = "file:///home/jovyan/test_data"
     <SparkJobStatus.FAILED: 2>
    oc adm policy add-scc-to-user anyuid -z default,kf-feast-kafka -n feast
    git clone https://github.com/kubeflow/manifests
    cd manifests/contrib/feast/
    kustomize build feast/base | kubectl apply -n feast -f -
    kafka.enabled: true
    feast-jupyter.enabled: true
    make feast/base
    kustomize build feast/base | kubectl apply -n feast -f -
    kubectl port-forward \
    $(kubectl get pod -l app=feast-jupyter -o custom-columns=:metadata.name) 8888:8888 -n feast
    Forwarding from 127.0.0.1:8888 -> 8888
    Forwarding from [::1]:8888 -> 8888
    kustomize build feast/base | kubectl delete -n feast -f -
    >> select version, description, script, checksum from flyway_schema_history
    
    version |              description                |                          script         |  checksum
    --------+-----------------------------------------+-----------------------------------------+------------
     1       | << Flyway Baseline >>                   | << Flyway Baseline >>                   | 
     2       | RELEASE 0.6 Generalizing Source AND ... | V2__RELEASE_0.6_Generalizing_Source_... | 1537500232
    >> select version, description, script, checksum from flyway_schema_history
    
    version |              description                |                          script         |  checksum
    --------+-----------------------------------------+-----------------------------------------+------------
     1       | Baseline                                | V1__Baseline.sql                        | 1091472110
     2       | RELEASE 0.6 Generalizing Source AND ... | V2__RELEASE_0.6_Generalizing_Source_... | 1537500232
    Job Controller client arrow-up-right

    Beta

    Alpha

    Alpha

    Alpha

    At risk of deprecation

    Beta

    • Automated release process (docker images, PyPI packages, etc)

    • API reference documentation

    • No deprecative changes

    • Must include logging and monitoring

    • Deprecative changes must span multiple minor versions and allow for an upgrade path.

    Application

    Status

    Notes

    Feast Servingarrow-up-right

    Beta

    APIs are considered stable and will not have breaking changes within 3 minor versions.

    Feast Corearrow-up-right

    Beta

    At risk of deprecation

    Feast Java Clientarrow-up-right

    Beta

    Feast Python SDKarrow-up-right

    Application status

    Level of support

    Stable

    The Feast community offers best-effort support for stable applications. Stable components will be offered long term support.

    Beta

    The Feast community offers best-effort support for beta applications. Beta applications will be supported for at least 2 more minor releases.

    Alpha

    The response differs per application in alpha status, depending on the size of the community for that application and the current level of active development of the application.

    Community

    Beta

    • Add Data quality monitoring

    • Add Snowflake offline store support

    • Add Bigtable support

    • Add Push/Ingestion API support

    hashtag
    Scheduled for development (next 3 months)

    Roadmap discussionarrow-up-right

    • Ensure Feast Serving is compatible with the new Feast

      • Decouple Feast Serving from Feast Core

      • Add FeatureView support to Feast Serving

      • Update Helm Charts (remove Core, Postgres, Job Service, Spark)

    • Add Redis support for Feast

    • Add direct deployment support to AWS and GCP

    • Add Dynamo support

    • Add Redshift support

    hashtag
    Feast 0.10

    hashtag
    New Functionality

    1. Full local mode support (Sqlite and Parquet)

    2. Provider model for added extensibility

    3. Firestore support

    4. Native (No-Spark) BigQuery support

    5. Added support for object store based registry

    6. Add support for FeatureViews

    7. Added support for infrastructure configuration through apply

    hashtag
    Technical debt, refactoring, or housekeeping

    1. Remove dependency on Feast Core

    2. Feast Serving made optional

    3. Moved Python API documentation to Read The Docs

    4. Moved Feast Java components to

    5. Moved Feast Spark components to

    hashtag
    Feast 0.9

    Discussionarrow-up-right

    hashtag
    New Functionality

    • Added Feast Job Service for management of ingestion and retrieval jobs

    • Added support for Spark on K8s Operatorarrow-up-right as Spark job launcher

    • Added Azure deployment and storage support (#1241arrow-up-right)

    Note: Please see discussion thread above for functionality that did not make this release.

    hashtag
    Feast 0.8

    Discussionarrow-up-right

    Feast 0.8 RFCarrow-up-right

    hashtag
    New Functionality

    1. Add support for AWS (data sources and deployment)

    2. Add support for local deployment

    3. Add support for Spark based ingestion

    4. Add support for Spark based historical retrieval

    hashtag
    Technical debt, refactoring, or housekeeping

    1. Move job management functionality to SDK

    2. Remove Apache Beam based ingestion

    3. Allow direct ingestion from batch sources that does not pass through the stream

    4. Remove Feast Historical Serving abstraction to allow direct access from Feast SDK to data sources for retrieval

    hashtag
    Feast 0.7

    Discussionarrow-up-right

    GitHub Milestonearrow-up-right

    hashtag
    New Functionality

    1. Label based Ingestion Job selector for Job Controller #903arrow-up-right

    2. Authentication Support for Java & Go SDKs #971arrow-up-right

    3. Automatically Restart Ingestion Jobs on Upgrade #949arrow-up-right

    4. Structured Audit Logging

    5. Request Response Logging support via Fluentd

    6. Feast Core Rest Endpoints

    hashtag
    Technical debt, refactoring, or housekeeping

    1. Improved integration testing framework #886arrow-up-right

    2. Rectify all flaky batch tests #953arrow-up-right, #982arrow-up-right

    3. Decouple job management from Feast Core #951arrow-up-right

    hashtag
    Feast 0.6

    Discussionarrow-up-right

    GitHub Milestonearrow-up-right

    hashtag
    New functionality

    1. Batch statistics and validation #612arrow-up-right

    2. Authentication and authorization #554arrow-up-right

    3. Online feature and entity status metadata #658arrow-up-right

    4. Improved searching and filtering of features and entities

    5. Python support for labels

    hashtag
    Technical debt, refactoring, or housekeeping

    1. Improved job life cycle management #761arrow-up-right

    2. Compute and write metrics for rows prior to store writes #763arrow-up-right

    hashtag
    Feast 0.5

    Discussionarrow-up-right

    hashtag
    New functionality

    1. Streaming statistics and validation (M1 from Feature Validation RFCarrow-up-right)

    2. Support for Redis Clusters (#478arrow-up-right, #502arrow-up-right)

    3. Add feature and feature set labels, i.e. key/value registry metadata (#463arrow-up-right)

    4. Job management API ()

    hashtag
    Technical debt, refactoring, or housekeeping

    1. Clean up and document all configuration options (#525arrow-up-right)

    2. Externalize storage interfaces (#402arrow-up-right)

    3. Reduce memory usage in Redis (#515arrow-up-right)

    4. Support for handling out of order ingestion ()

    5. Remove feature versions and enable automatic data migration () ()

    6. Tracking of batch ingestion with dataset_id/job_id ()

    7. Write Beam metrics after ingestion to store (not prior) ()

    • Feast CLI and Feast Python SDK

    • Feast Go and Feast Java SDK

    hashtag
    1. Feast Core and Feast Online Serving

    Available configuration properties for Feast Core and Feast Online Serving can be referenced from the corresponding application.yml of each component:

    Component

    Configuration Reference

    Core

    Serving (Online)

    Configuration properties for Feast Core and Feast Online Serving are defined depending on how Feast is deployed:

    • Docker Compose deployment - Feast is deployed with Docker Compose.

    • Kubernetes deployment - Feast is deployed with Kubernetes.

    • Direct Configuration - Feast is built and run from source code.

    hashtag
    Docker Compose Deployment

    For each Feast component deployed using Docker Compose, configuration properties from application.yml can be set at:

    Component

    Configuration Path

    Core

    infra/docker-compose/core/core.yml

    Online Serving

    infra/docker-compose/serving/online-serving.yml

    hashtag
    Kubernetes Deployment

    The Kubernetes Feast Deployment is configured using values.yaml in the Helm chartarrow-up-right included with Feast:

    A reference of the sub-chart-specific configuration can be found in its values.yaml:

    • feast-corearrow-up-right

    • feast-servingarrow-up-right

    Configuration properties can be set via application-override.yaml for each component in values.yaml:

    Visit the Helm chartarrow-up-right included with Feast to learn more about configuration.

    hashtag
    Direct Configuration

    If Feast is built and running from source, configuration properties can be set directly in the Feast component's application.yml:

    Component

    Configuration Path

    Core

    Serving (Online)

    hashtag
    2. Feast CLI and Feast Python SDK

    Configuration options for both the Feast CLI and Feast Python SDKarrow-up-right can be defined in the following locations, in order of precedence:

    1. Command line arguments or initialized arguments: Passing parameters to the Feast CLI or instantiating the Feast Client object with specific parameters will take precedence above other parameters.

    2. Environmental variables: Environmental variables can be set to provide configuration options. They must be prefixed with FEAST_. For example FEAST_CORE_URL.

    3. Configuration file: Options with the lowest precedence are configured in the Feast configuration file. Feast looks for or creates this configuration file in ~/.feast/config if it does not already exist. All options must be defined in the [general] section of this file.
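    The precedence order above can be sketched with the standard library alone (the `resolve_option` helper is illustrative, not part of the Feast SDK; the option name and `[general]` file layout follow the description above):

```python
import configparser
import os

def resolve_option(name, cli_value=None, config_path=None):
    """Resolve a config option: explicit arguments win, then FEAST_-prefixed
    environment variables, then the [general] section of the config file."""
    if cli_value is not None:
        return cli_value
    env_value = os.environ.get("FEAST_" + name.upper())
    if env_value is not None:
        return env_value
    parser = configparser.ConfigParser()
    if config_path and parser.read(config_path):
        return parser.get("general", name, fallback=None)
    return None

os.environ["FEAST_CORE_URL"] = "my_feast:6565"
# An explicit argument takes precedence over the environment variable.
print(resolve_option("core_url", cli_value="localhost:6565"))  # localhost:6565
print(resolve_option("core_url"))                              # my_feast:6565
```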

    Visit the available configuration parametersarrow-up-right for Feast Python SDK and Feast CLI to learn more.

    hashtag
    3. Feast Java and Go SDK

    The Feast Java SDKarrow-up-right and Feast Go SDKarrow-up-right are configured via arguments passed when instantiating the respective Clients:

    hashtag
    Go SDK

    Visit the Feast Go SDK API referencearrow-up-right to learn more about available configuration parameters.

    hashtag
    Java SDK

    Visit the Feast Java SDK API referencearrow-up-right to learn more about available configuration parameters.

    Feast Core and Feast Online Serving

    • Making a Pull Request

    • Feast Data Storage Format

    • Feast Protobuf API

    • Learn How the Feast Contributing Processarrow-up-right works.

    hashtag
    Project Structure

    Feast is composed of multiple componentsarrow-up-right distributed into multiple repositories:

    Repository

    Description

    Component(s)

    Hosts all required code to run Feast. This includes the Feast Python SDK and Protobuf definitions. For legacy reasons this repository still contains Terraform config and a Go Client for Feast.

    • Python SDK / CLI

    • Protobuf APIs

    • Documentation

    Java-specific Feast components. Includes the Feast Core Registry, Feast Serving for serving online feature values, and the Feast Java Client for retrieving feature values.

    • Core

    • Serving

    • Java Client

    Feast Spark SDK & Feast Job Service for launching ingestion jobs and for building training datasets with Spark.

    • Spark SDK

    • Job Service

    hashtag
    Making a Pull Request

    hashtag
    Incorporating upstream changes from master

    Our preference is to use git rebase instead of git merge when incorporating upstream changes: git pull -r

    hashtag
    Signing commits

    Commits have to be signed before they are allowed to be merged into the Feast codebase:

    hashtag
    Good practices to keep in mind

    • Fill in the description based on the default template configured when you first open the PR

      • What this PR does/why we need it

      • Which issue(s) this PR fixes

      • Does this PR introduce a user-facing change

    • Include kind label when opening the PR

    • Add WIP: to PR name if more work needs to be done prior to review

    • Avoid force-pushing as it makes reviewing difficult

    Managing CI-test failures

    • GitHub runner tests

      • Click the Checks tab to analyse failed tests

    • Prow tests

      • Visit to analyse failed tests

    hashtag
    Feast Data Storage Format

    Feast data storage contracts are documented in the following locations:

    • Feast Offline Storage Formatarrow-up-right: Used by BigQuery, Snowflake (Future), Redshift (Future).

    • Feast Online Storage Formatarrow-up-right: Used by Redis, Google Datastore.

    hashtag
    Feast Protobuf API

    Feast Protobuf API defines the common API used by Feast's Components:

    • Feast Protobuf API specifications are written in proto3arrow-up-right in the Main Feast Repository.

    • Changes to the API should be proposed via a GitHub Issuearrow-up-right for discussion first.

    hashtag
    Generating Language Bindings

    The language specific bindings have to be regenerated when changes are made to the Feast Protobuf API:

    Repository

    Language

    Regenerating Language Bindings

    Python

    Run make compile-protos-python to generate bindings

    Golang

    Run make compile-protos-go to generate bindings

    Java

    No action required: bindings are generated automatically during compilation.

    Project Structure
    To configure Feast to use it, set the following options in Feast config:

    Feast Setting

    Value

    SPARK_LAUNCHER

    "k8s"

    SPARK_STAGING_LOCATION

    S3/GCS/Azure Blob Storage URL to use as a staging location, must be readable and writable by Feast. For S3, use s3a:// prefix here. Ex.: s3a://some-bucket/some-prefix/artifacts/

    HISTORICAL_FEATURE_OUTPUT_LOCATION

    S3/GCS/Azure Blob Storage URL used to store results of historical retrieval queries, must be readable and writable by Feast. For S3, use s3a:// prefix here. Ex.: s3a://some-bucket/some-prefix/out/

    SPARK_K8S_NAMESPACE

    Only needs to be set if you are customizing the spark-on-k8s-operator. The name of the Kubernetes namespace to run Spark jobs in. This should match the value of sparkJobNamespace set on the spark-on-k8s-operator Helm chart. Typically this is also the namespace Feast itself runs in.

    SPARK_K8S_JOB_TEMPLATE_PATH

    Only needs to be set if you are customizing the Spark job template. Local file path with the template of the SparkApplication resource. No prefix required. Ex.: /home/jovyan/work/sparkapp-template.yaml. An example template is and the spec is defined in the .
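    Following the FEAST_ environment variable convention used elsewhere in these docs, the settings above might be supplied as follows (the bucket names and namespace are placeholders):

```python
import os

# Placeholder values; point the staging and output locations at a bucket
# that Feast can read and write.
os.environ["FEAST_SPARK_LAUNCHER"] = "k8s"
os.environ["FEAST_SPARK_STAGING_LOCATION"] = "s3a://some-bucket/some-prefix/artifacts/"
os.environ["FEAST_HISTORICAL_FEATURE_OUTPUT_LOCATION"] = "s3a://some-bucket/some-prefix/out/"
os.environ["FEAST_SPARK_K8S_NAMESPACE"] = "default"
```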

    Lastly, make sure that the service account used by Feast has permissions to manage Spark Application resources. This depends on your k8s setup, but typically you'd need to configure a Role and a RoleBinding like the one below:

    hashtag
    Option 2. Use GCP and Dataproc

    If you're running Feast in Google Cloud, you can use Dataproc, a managed Spark platform. To configure Feast to use it, set the following options in Feast config:

    Feast Setting

    Value

    SPARK_LAUNCHER

    "dataproc"

    DATAPROC_CLUSTER_NAME

    Dataproc cluster name

    DATAPROC_PROJECT

    Dataproc project name

    SPARK_STAGING_LOCATION

    GCS URL to use as a staging location, must be readable and writable by Feast. Ex.: gs://some-bucket/some-prefix
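    Using the same FEAST_ environment variable convention, a Dataproc configuration might be sketched as follows (cluster, project, and bucket names are placeholders):

```python
import os

# Placeholder values; replace with your Dataproc cluster, project, and bucket.
os.environ["FEAST_SPARK_LAUNCHER"] = "dataproc"
os.environ["FEAST_DATAPROC_CLUSTER_NAME"] = "my-cluster"
os.environ["FEAST_DATAPROC_PROJECT"] = "my-project"
os.environ["FEAST_SPARK_STAGING_LOCATION"] = "gs://some-bucket/some-prefix"
```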

    See Feast documentationarrow-up-right for more configuration options for Dataproc.

    hashtag
    Option 3. Use AWS and EMR

    If you're running Feast in AWS, you can use EMR, a managed Spark platform. To configure Feast to use it, set at least the following options in Feast config:

    Feast Setting

    Value

    SPARK_LAUNCHER

    "emr"

    SPARK_STAGING_LOCATION

    S3 URL to use as a staging location, must be readable and writable by Feast. Ex.: s3://some-bucket/some-prefix
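    As with the other launchers, a minimal EMR setup might be sketched via environment variables (the bucket name is a placeholder):

```python
import os

# Placeholder bucket; see the EMR configuration options in the Feast docs
# for cluster-specific settings.
os.environ["FEAST_SPARK_LAUNCHER"] = "emr"
os.environ["FEAST_SPARK_STAGING_LOCATION"] = "s3://some-bucket/some-prefix"
```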

    See Feast documentationarrow-up-right for more configuration options for EMR.

    Online Store: Uses the Datastore online store by default. Also supports Sqlite as an online store.

    hashtag
    Example

    hashtag
    Permissions

    Command

    Component

    Permissions

    Recommended Role

    Apply

    BigQuery (source)

    bigquery.jobs.create

    bigquery.readsessions.create

    bigquery.readsessions.getData

    roles/bigquery.user

    Apply

    Datastore (destination)

    datastore.entities.allocateIds

    datastore.entities.create

    datastore.entities.delete

    datastore.entities.get

    datastore.entities.list

    datastore.entities.update

    roles/datastore.owner

    Materialize

    Kustomizearrow-up-right

    Audit Logging

    circle-exclamation

    This page applies to Feast 0.7. The content may be out of date for Feast 0.8+

    hashtag
    Introduction

    # values.yaml
    feast-core:
      enabled: true # whether to deploy the feast-core subchart to deploy Feast Core.
      # feast-core subchart specific config.
      gcpServiceAccount:
        enabled: true 
      # ....
    # values.yaml
    feast-core:
      # ....
      application-override.yaml: 
         # application.yml config properties for Feast Core.
         # ...
    # Set option as command line arguments.
    feast config set core_url "localhost:6565"
    # Pass options as initialized arguments.
    client = Client(
        core_url="localhost:6565",
        project="default"
    )
    FEAST_CORE_URL=my_feast:6565 FEAST_PROJECT=default feast projects list
    [general]
    project = default
    core_url = localhost:6565
    // configure serving host and port.
    cli := feast.NewGrpcClient("localhost", 6566)
    // configure serving host and port.
    client = FeastClient.create(servingHost, servingPort);
    # Include -s flag to signoff
    git commit -s -m "My first commit"
    helm repo add spark-operator \
        https://googlecloudplatform.github.io/spark-on-k8s-operator
    
    helm install my-release spark-operator/spark-operator \
        --set serviceAccounts.spark.name=spark
    cat <<EOF | kubectl apply -f -
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: use-spark-operator
      namespace: default  # replace if using different namespace
    rules:
    - apiGroups: ["sparkoperator.k8s.io"]
      resources: ["sparkapplications"]
      verbs: ["create", "delete", "deletecollection", "get", "list", "update", "watch", "patch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: RoleBinding
    metadata:
      name: use-spark-operator
      namespace: default  # replace if using different namespace
    roleRef:
      kind: Role
      name: use-spark-operator
      apiGroup: rbac.authorization.k8s.io
    subjects:
      - kind: ServiceAccount
        name: default
    EOF
    feature_store.yaml
    project: my_feature_repo
    registry: gs://my-bucket/data/registry.db
    provider: gcp
    NAME: v2.0.2
    LAST DEPLOYED: Fri Feb  5 12:29:50 2021
    NAMESPACE: kube-system
    STATUS: deployed
    REVISION: 1
    NOTES:
    Thank you for installing: ibmcloud-block-storage-plugin.   Your release is named: v2.0.2
     ...
     kubectl get pods -n kube-system | grep ibmcloud-block-storage
     kubectl get storageclasses | grep ibmc-block
     kubectl patch storageclass ibmc-block-gold -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
     kubectl patch storageclass ibmc-file-gold -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
    
     # Check the default storageclass is block storage
     kubectl get storageclass | grep \(default\)
     ibmc-block-gold (default)   ibm.io/ibmc-block   65s
     os.environ["FEAST_REDIS_HOST"] = "feast-release-redis-master"
     org.apache.kafka.vendor.common.KafkaException: Failed to construct kafka consumer
     os.environ["DEMO_KAFKA_BROKERS"] = "feast-release-kafka:9092"

    BigQuery (source)

    bigquery.jobs.create

    roles/bigquery.user

    Materialize

    Datastore (destination)

    datastore.entities.allocateIds

    datastore.entities.create

    datastore.entities.delete

    datastore.entities.get

    datastore.entities.list

    datastore.entities.update

    datastore.databases.get

    roles/datastore.owner

    Get Online Features

    Datastore

    datastore.entities.get

    roles/datastore.user

    Get Historical Features

    BigQuery (source)

    bigquery.datasets.get

    bigquery.tables.get

    bigquery.tables.create

    bigquery.tables.updateData

    bigquery.tables.update

    bigquery.tables.delete

    bigquery.tables.getData

    roles/bigquery.dataEditor

    feast-javaarrow-up-right
    feast-sparkarrow-up-right
    #891arrow-up-right
    #961arrow-up-right
    #878arrow-up-right
    #663arrow-up-right
    #302arrow-up-right
    #273arrow-up-right
    #386arrow-up-right
    #462arrow-up-right
    #461arrow-up-right
    #489arrow-up-right
    Feast Go Clientarrow-up-right
    Feast Spark Python SDKarrow-up-right
    Feast Spark Launchersarrow-up-right
    Feast Job Servicearrow-up-right
    Feast Helm Chartarrow-up-right
    core/src/main/resources/application.ymlarrow-up-right
    serving/src/main/resources/application.ymlarrow-up-right
    core/src/main/resources/application.ymlarrow-up-right
    serving/src/main/resources/application.ymlarrow-up-right

    Go Client

  • Terraform

  • Helm Chart for deploying Feast on Kubernetes & Spark.

    • Helm Chart

    Prow status page arrow-up-right
    Main Feast Repositoryarrow-up-right
    Feast Javaarrow-up-right
    Feast Sparkarrow-up-right
    Feast Helm Chartarrow-up-right
    Main Feast Repositoryarrow-up-right
    Main Feast Repositoryarrow-up-right
    Feast Javaarrow-up-right
    herearrow-up-right
    k8s-operator User Guidearrow-up-right
    Feast provides audit logging functionality in order to debug problems and to trace the lineage of events.

    hashtag
    Audit Log Types

    Audit Logs produced by Feast come in three flavors:

    Audit Log Type

    Description

    Message Audit Log

    Logs service calls that can be used to track Feast request handling. Currently only gRPC request/response is supported. Enabling Message Audit Logs can be resource intensive and significantly increase latency, so it is not recommended for Online Serving.

    Transition Audit Log

    Logs transitions in status of resources managed by Feast (e.g., an Ingestion Job becoming RUNNING).

    Action Audit Log

    Logs actions performed on a specific resource managed by Feast (e.g., an Ingestion Job being aborted).

    hashtag
    Configuration

    Audit Log Type

    Description

    Message Audit Log

    Enabled when both feast.logging.audit.enabled and feast.logging.audit.messageLogging.enabled are set to true

    Transition Audit Log

    Enabled when feast.logging.audit.enabled is set to true

    Action Audit Log

    Enabled when feast.logging.audit.enabled is set to true
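    In application.yml these flags correspond to a block like the following (a sketch assuming the dotted property names above map directly onto YAML nesting):

```yaml
feast:
  logging:
    audit:
      enabled: true        # enables Transition and Action Audit Logs
      messageLogging:
        enabled: true      # together with audit.enabled, enables Message Audit Logs
```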

    hashtag
    JSON Format

    Audit Logs produced by Feast are written to the console similar to normal logs but in a structured, machine parsable JSON. Example of a Message Audit Log JSON entry produced:

    hashtag
    Log Entry Schema

    Fields common to all Audit Log Types:

    Field

    Description

    logType

    Log Type. Always set to FeastAuditLogEntry. Useful for filtering out Feast audit logs.

    application

    Application. Always set to Feast.

    component

    Feast Component producing the Audit Log. Set to feast-core for Feast Core and feast-serving for Feast Serving. Use to filter Audit Logs by component.

    version

    Version of Feast producing this Audit Log. Use to filter Audit Logs by version.
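    As a sketch, structured entries can be selected from a mixed log stream by parsing each line as JSON and filtering on the common fields above (the `feast_audit_entries` helper is illustrative, not part of Feast):

```python
import json

def feast_audit_entries(lines, component=None):
    """Yield parsed Feast audit log entries from a mixed stream of log lines,
    optionally filtered by the component field (feast-core / feast-serving)."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip plain-text log lines
        if entry.get("logType") != "FeastAuditLogEntry":
            continue
        if component is not None and entry.get("component") != component:
            continue
        yield entry

logs = [
    "plain text log line",
    '{"logType": "FeastAuditLogEntry", "component": "feast-core", "application": "Feast"}',
    '{"logType": "FeastAuditLogEntry", "component": "feast-serving", "application": "Feast"}',
]
print(len(list(feast_audit_entries(logs))))                          # 2
print(len(list(feast_audit_entries(logs, component="feast-core"))))  # 1
```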

    Fields in Message Audit Log Type

    Field

    Description

    id

    Generated UUID that uniquely identifies the service call.

    service

    Name of the Service that handled the service call.

    method

    Name of the Method that handled the service call. Useful for filtering Audit Logs by method (e.g., ApplyFeatureTable calls).

    request

    Full request submitted by client in the service call as JSON.

    response

    Full response returned to client by the service after handling the service call as JSON.

    Fields in Action Audit Log Type

    Field

    Description

    action

    Name of the action taken on the resource.

    resource.type

    Type of resource on which the action was taken (e.g., FeatureTable)

    resource.id

    Identifier specifying the specific resource on which the action was taken.

    Fields in Transition Audit Log Type

    Field

    Description

    status

    The new status that the resource transitioned to

    resource.type

    Type of resource for which the transition occurred (e.g., FeatureTable)

    resource.id

    Identifier specifying the specific resource for which the transition occurred.

    hashtag
    Log Forwarder

    Feast currently only supports forwarding Request/Response (Message Audit Log Type) logs to an external Fluentd service with the feast.** Fluentd tag.

    hashtag
    Request/Response Log Example

    hashtag
    Configuration

    The Fluentd Log Forwarder is configured with the following configuration options in application.yml:

    Settings

    Example Value

    feast.logging.audit.messageLogging.destination

    fluentd

    feast.logging.audit.messageLogging.fluentdHost

    localhost

    feast.logging.audit.messageLogging.fluentdPort

    24224
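    Assuming the dotted property names above map directly onto YAML nesting, the corresponding application.yml block would look like this sketch:

```yaml
feast:
  logging:
    audit:
      messageLogging:
        destination: fluentd
        fluentdHost: localhost
        fluentdPort: 24224
```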

    When using Fluentd as the Log Forwarder, a Feast release_name can be logged instead of the IP address (e.g., the IP of a Kubernetes pod deployment) by setting an environment variable RELEASE_NAME when deploying Feast.

    {
      "message": {
        "logType": "FeastAuditLogEntry",
        "kind": "MESSAGE",
        "statusCode": "OK",
        "request": {
          "filter": {
            "project": "dummy"
          }
        },
        "application": "Feast",
        "response": {},
        "method": "ListFeatureTables",
        "identity": "105960238928959148073",
        "service": "CoreService",
        "component": "feast-core",
        "id": "45329ea9-0d48-46c5-b659-4604f6193711",
        "version": "0.10.0-SNAPSHOT"
      },
      "hostname": "feast.core",
      "timestamp": "2020-10-20T04:45:24Z",
      "severity": "INFO"
    }
    {
      "id": "45329ea9-0d48-46c5-b659-4604f6193711",
      "service": "CoreService",
      "status_code": "OK",
      "identity": "105960238928959148073",
      "method": "ListProjects",
      "request": {},
      "response": {
        "projects": [
          "default", "project1", "project2"
        ]
      },
      "release_name": "506.457.14.512"
    }

    identity

    Identity of the client making the service call, as a user Id. Only set when Authentication is enabled.

    statusCode

    The status code returned by the service handling the service call (i.e. OK if the service call is handled without error).
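Since each entry is plain JSON, it can be inspected with any JSON tooling. A minimal sketch (the entry is abridged from the example above):

```python
import json

# Example Request/Response audit log entry, abridged from the sample above.
entry = json.loads("""
{
  "message": {
    "logType": "FeastAuditLogEntry",
    "kind": "MESSAGE",
    "statusCode": "OK",
    "method": "ListFeatureTables",
    "service": "CoreService",
    "identity": "105960238928959148073"
  },
  "severity": "INFO"
}
""")

msg = entry["message"]
# A statusCode other than OK indicates the service call failed.
is_failure = msg["statusCode"] != "OK"
print(msg["service"], msg["method"], is_failure)
```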

    Metrics Reference

    circle-exclamation

    This page applies to Feast 0.7. The content may be out of date for Feast 0.8+

    Reference of the metrics that each Feast component exports:

    • Feast Core

    • Feast Serving

    • Feast Ingestion Job

    For how to configure Feast to export metrics, see the Metrics user guide.

    hashtag
    Feast Core

    Exported Metrics

    Feast Core exports the following metrics:

    • feast_core_request_latency_seconds: Feast Core's latency in serving requests, in seconds. Tags: service, method, status_code

    • feast_core_feature_set_total: Number of Feature Sets registered with Feast Core. Tags: none

    • feast_core_store_total: Number of Stores registered with Feast Core. Tags: none

    • feast_core_max_memory_bytes: Max amount of memory the Java virtual machine will attempt to use. Tags: none

    • feast_core_total_memory_bytes: Total amount of memory in the Java virtual machine. Tags: none

    • feast_core_free_memory_bytes: Total amount of free memory in the Java virtual machine. Tags: none

    • feast_core_gc_collection_seconds: Time spent in a given JVM garbage collector, in seconds. Tags: none

    Metric Tags

    Exported Feast Core metrics may be filtered by the following tags/keys:

    • service: Name of the Service that the request is made to. Should be set to CoreService.

    • method: Name of the Method that the request is calling (i.e. ListFeatureSets).

    • status_code: Status code returned as a result of handling the request (i.e. OK). Can be used to find request failures.

    hashtag
    Feast Serving

    Exported Metrics

    Feast Serving exports the following metrics:

    • feast_serving_request_latency_seconds: Feast Serving's latency in serving requests, in seconds. Tags: method

    • feast_serving_request_feature_count: Number of requests retrieving a Feature from Feast Serving. Tags: project, feature_name

    • feast_serving_not_found_feature_count: Number of requests retrieving a Feature that resulted in a NOT_FOUND field status. Tags: project, feature_name

    • feast_serving_stale_feature_count: Number of requests retrieving a Feature that resulted in an OUTSIDE_MAX_AGE field status. Tags: project, feature_name

    • feast_serving_grpc_request_count: Total gRPC requests served. Tags: method

    Metric Tags

    Exported Feast Serving metrics may be filtered by the following tags/keys:

    • method: Name of the Method that the request is calling (i.e. ListFeatureSets).

    • status_code: Status code returned as a result of handling the request (i.e. OK). Can be used to find request failures.

    • project: Name of the project that the FeatureSet of the retrieved Feature belongs to.

    • feature_name: Name of the Feature being retrieved.

    hashtag
    Feast Ingestion Job

    Feast Ingestion computes both metrics and statistics on data ingestion. Make sure you are familiar with data ingestion concepts before proceeding.

    Metrics Namespace

    Metrics are computed at two stages of the Feature Row's/Feature Value's life cycle when being processed by the Ingestion Job:

    • Inflight: prior to writing data to stores, but after successful validation of data.

    • WriteToStoreSuccess: after a successful store write.

    Metrics computed at each stage are tagged with metrics_namespace set to the stage where the metric was computed.

    Metrics Bucketing

    Metrics with a {BUCKET} suffix are computed on a 60 second window/bucket. Suffix with the following to select the bucket to use (for example, feast_ingestion_feature_row_lag_ms_min selects the minimum):

    • min: minimum value.

    • max: maximum value.

    • mean: mean value.

    • percentile_90: 90th percentile.

    • percentile_95: 95th percentile.

    • percentile_99: 99th percentile.

    Exported Metrics

    Feast Ingestion Job exports the following metrics:

    • feast_ingestion_feature_row_lag_ms_{BUCKET}: Lag time in milliseconds between succeeding ingested Feature Rows. Tags: feast_store, feast_project_name, feast_featureSet_name, ingestion_job_name, metrics_namespace

    • feast_ingestion_feature_value_lag_ms_{BUCKET}: Lag time in milliseconds between succeeding ingested values for each Feature. Tags: feast_store, feast_project_name, feast_featureSet_name, feast_feature_name, ingestion_job_name, metrics_namespace

    • feast_ingestion_feature_value_{BUCKET}: Last value of each Feature. Tags: feast_store, feast_project_name, feast_featureSet_name, feast_feature_name, ingestion_job_name, metrics_namespace

    • feast_ingestion_feature_row_ingested_count: Number of ingested Feature Rows. Tags: feast_store, feast_project_name, feast_featureSet_name, ingestion_job_name, metrics_namespace

    • feast_ingestion_feature_value_missing_count: Number of times an ingested Feature Row did not provide a value for the Feature. Tags: feast_store, feast_project_name, feast_featureSet_name, feast_feature_name, ingestion_job_name, metrics_namespace

    • feast_ingestion_deadletter_row_count: Number of Feature Rows that the Ingestion Job did not successfully write to store. Tags: feast_store, feast_project_name, feast_featureSet_name, ingestion_job_name

    Metric Tags

    Exported Feast Ingestion Job metrics may be filtered by the following tags/keys:

    • feast_store: Name of the target store the Ingestion Job is writing to.

    • feast_project_name: Name of the project that the ingested FeatureSet belongs to.

    • feast_featureSet_name: Name of the Feature Set being ingested.

    • feast_feature_name: Name of the Feature being ingested.

    • ingestion_job_name: Name of the Ingestion Job performing data ingestion. Typically this is set to the Id of the Ingestion Job.

    • metrics_namespace: Stage where the metric was computed. Either Inflight or WriteToStoreSuccess.
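The relationship between the bucket suffixes can be sketched in a few lines (illustrative values only; this is not Feast code):

```python
import statistics

# One 60-second window of observed values (made-up feature row lag, in ms).
window = [120, 95, 130, 110, 250, 90]

# Each {BUCKET} suffix is a different aggregate over the same window.
sorted_window = sorted(window)
buckets = {
    "min": min(window),
    "max": max(window),
    "mean": statistics.mean(window),
    "percentile_90": sorted_window[int(0.9 * (len(window) - 1))],
}
print(buckets["min"], buckets["max"], buckets["mean"])
```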

    Security

    Secure Feast with SSL/TLS, Authentication and Authorization.

    circle-exclamation

    This page applies to Feast 0.7. The content may be out of date for Feast 0.8+

    hashtag
    Overview

    Feast supports the following security methods:

    • SSL/TLS on messaging between Feast Core, Feast Online Serving and Feast SDKs.

    • Authentication to Feast Core and Serving based on Open ID Connect ID tokens.

    • Authorization based on project membership, delegating authorization grants to an external Authorization Server.

    hashtag
    SSL/TLS

    Feast supports SSL/TLS encrypted inter-service communication among Feast Core, Feast Online Serving, and Feast SDKs.

    hashtag
    Configuring SSL/TLS on Feast Core and Feast Serving

    The following properties configure SSL/TLS. These properties are located in their corresponding application.yml files:

    Read more on enabling SSL/TLS in the gRPC starter docs.

    hashtag
    Configuring SSL/TLS on Python SDK/CLI

    To enable SSL/TLS in the Feast Python SDK or the Feast CLI, set the config options via feast config:

    circle-info

    The Python SDK automatically uses SSL/TLS when connecting to Feast Core and Feast Online Serving via port 443.

    hashtag
    Configuring SSL/TLS on Go SDK

    Configure SSL/TLS on the Go SDK by passing configuration via SecurityConfig:

    hashtag
    Configuring SSL/TLS on Java SDK

    Configure SSL/TLS on the Feast Java SDK by passing configuration via SecurityConfig:

    hashtag
    Authentication

    circle-exclamation

    To prevent man-in-the-middle attacks, we recommend that SSL/TLS be implemented prior to authentication.

    Authentication can be implemented to identify and validate client requests to Feast Core and Feast Online Serving. Currently, Feast uses Open ID Connect (OIDC) ID tokens to authenticate client requests.

    hashtag
    Configuring Authentication in Feast Core and Feast Online Serving

    Authentication can be configured for Feast Core and Feast Online Serving via properties in their corresponding application.yml files:

    circle-info

    jwkEndpointURI is set to retrieve Google's OIDC JWK by default, allowing OIDC ID tokens issued by Google to be used for authentication.

    Behind the scenes, Feast Core and Feast Online Serving authenticate by:

    • Extracting the OIDC ID token TOKEN from gRPC metadata submitted with the request:

    • Validating the token's authenticity using the JWK retrieved from the jwkEndpointURI
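To illustrate the identity-extraction step, the sketch below decodes the payload of a toy, unsigned token and reads its subject claim. This is illustrative only; real OIDC ID tokens are signed and must be verified against the JWK set, as Feast does:

```python
import base64
import json

def b64url(data: dict) -> str:
    """Base64url-encode a JSON object without padding, as JWTs do."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# Toy, unsigned token for illustration only; a real OIDC ID token is signed
# and must be verified before its claims are trusted.
token = ".".join([b64url({"alg": "none"}), b64url({"sub": "105960238928959148073"}), ""])

# Decode the payload segment and read the subject claim, which serves as
# the request identity.
payload_b64 = token.split(".")[1]
padded = payload_b64 + "=" * (-len(payload_b64) % 4)
claims = json.loads(base64.urlsafe_b64decode(padded))
print(claims["sub"])
```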

    hashtag
    Authenticating Serving with Feast Core

    Feast Online Serving communicates with Feast Core during normal operation. When both authentication and authorization are enabled on Feast Core, Feast Online Serving must authenticate its requests to Feast Core; otherwise, Feast Online Serving produces an Authentication failure error when connecting to Feast Core.

    Properties used to configure Serving authentication via application.yml:

    Google Provider automatically extracts the credential from the credential JSON file.

    • Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the credential JSON file.

    OAuth Provider makes an OAuth request to obtain the credential. OAuth requires the following options to be set under feast.security.core-authentication.options:
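For illustration, the form body of such a client-credentials request looks roughly like this (field names come from the options table in these docs; the values are placeholders):

```python
from urllib.parse import urlencode

# Sketch of an OAuth client-credentials request body; the values here are
# placeholders, not real credentials.
body = urlencode({
    "grant_type": "client_credentials",
    "client_id": "some_id",
    "client_secret": "secret",
    "audience": "https://localhost",
})
print(body)
```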

    hashtag
    Enabling Authentication in Python SDK/CLI

    Configure the Feast Python SDK and the Feast CLI to use authentication via feast config:

    Google Provider automatically finds and uses Google Credentials to authenticate requests:

    • Google Provider automatically uses established credentials for authenticating requests if you are already authenticated with the gcloud CLI via:

    • Alternatively, Google Provider can be configured to use the credentials in a JSON file via the GOOGLE_APPLICATION_CREDENTIALS environment variable:

    hashtag
    Enabling Authentication in Go SDK

    Configure the Go SDK to use authentication by specifying the credential via SecurityConfig:

    Google Credential uses the Service Account credentials JSON file set via the GOOGLE_APPLICATION_CREDENTIALS environment variable (see the Google Cloud authentication documentation) to obtain tokens for authenticating Feast requests:

    • Exporting GOOGLE_APPLICATION_CREDENTIALS

    hashtag
    Enabling Authentication in Java SDK

    Configure the Feast Java SDK to use authentication by setting credentials via SecurityConfig:

    GoogleAuthCredentials uses the Service Account credentials JSON file set via the GOOGLE_APPLICATION_CREDENTIALS environment variable (see the Google Cloud authentication documentation) to obtain tokens for authenticating Feast requests:

    • Exporting GOOGLE_APPLICATION_CREDENTIALS

    hashtag
    Authorization

    circle-info

    Authorization requires that authentication be configured to obtain a user identity for use in authorizing requests.

    Authorization provides access control to FeatureTables and/or Features based on project membership. Users who are members of a project are authorized to:

    • Create and/or Update a Feature Table in the Project.

    • Retrieve Feature Values for Features in that Project.

    hashtag
    Authorization API/Server

    Feast delegates authorization grants to an external Authorization Server that implements the Authorization Open API specification.

    • Feast checks whether a user is authorized to make a request by making a checkAccessRequest to the Authorization Server.

    • The Authorization Server should return an AuthorizationResult indicating whether the user is allowed to make the request.

    Authorization can be configured for Feast Core and Feast Online Serving via properties in their corresponding application.yml files:

    circle-info

    This example of an Authorization Server with Keto can be used as a reference implementation for implementing an Authorization Server that Feast supports.

    hashtag
    Authentication & Authorization

    When using Authentication & Authorization, consider:

    • Enabling Authentication without Authorization makes authentication optional. You can still send unauthenticated requests.

    • Enabling Authorization forces all requests to be authenticated. Requests that are not authenticated are dropped.

    Configuration Property

    Description

    oauth_url

    Target URL receiving the client-credentials request.

    grant_type

    OAuth grant type. Set as client_credentials

    client_id

    Client Id used in the client-credentials request.

    client_secret

    Client secret used in the client-credentials request.

    audience

    Target audience of the credential. Set to host URL of Feast Core.

    (i.e. https://localhost if Feast Core listens on localhost).

    jwkEndpointURI

    HTTPS URL used to retrieve a JWK that can be used to decode the credential.


    OAuth Provider makes an OAuth client credentials request to obtain the credential/token used to authenticate Feast requests. The OAuth Provider requires the following config options to be set via feast config:

    Configuration Property

    Description

    oauth_token_request_url

    Target URL receiving the client-credentials request.

    oauth_grant_type

    Create a Google Credential with target audience.

    Target audience of the credential should be set to the host URL of the target Service (i.e. https://localhost if the Service listens on localhost):

    OAuth Credential makes an OAuth client credentials request to obtain the credential/token used to authenticate Feast requests:

    • Create OAuth Credential with parameters:

    Parameter

    Description

    audience

    Create a Google Credential with target audience.

    Target audience of the credentials should be set to the host URL of the target Service (i.e. https://localhost if the Service listens on localhost):

    OAuthCredentials makes an OAuth client credentials request to obtain the credential/token used to authenticate Feast requests:

    • Create OAuthCredentials with parameters:

    Parameter

    Description

    audience

    Configuration Property

    Description

    grpc.server.security.enabled

    Enables SSL/TLS functionality if true

    grpc.server.security.certificateChain

    Provide the path to the certificate chain.

    grpc.server.security.privateKey

    Provide the path to the private key.

    Configuration Option

    Description

    core_enable_ssl

    Enables SSL/TLS functionality on connections to Feast core if true

    serving_enable_ssl

    Enables SSL/TLS functionality on connections to Feast Online Serving if true

    core_server_ssl_cert

    Optional. Specifies the path of the root certificate used to verify Core Service's identity. If omitted, uses system certificates.

    serving_server_ssl_cert

    Optional. Specifies the path of the root certificate used to verify Serving Service's identity. If omitted, uses system certificates.

    Config Option

    Description

    EnableTLS

    Enables SSL/TLS functionality when connecting to Feast if true

    TLSCertPath

    Optional. Provides the path of the root certificate used to verify Feast Service's identity. If omitted, uses system certificates.

    Config Option

    Description

    setTLSEnabled()

    Enables SSL/TLS functionality when connecting to Feast if true

    setCertificatesPath()

    Optional. Set the path of the root certificate used to verify Feast Service's identity. If omitted, uses system certificates.

    Configuration Property

    Description

    feast.security.authentication.enabled

    Enables Authentication functionality if true

    feast.security.authentication.provider

    Authentication Provider type. Currently only supports jwt

    feast.security.authentication.option.jwkEndpointURI

    HTTPS URL used by Feast to retrieve the JWK used to verify OIDC ID tokens.

    Configuration Property

    Description

    feast.core-authentication.enabled

    Requires Feast Online Serving to authenticate when communicating with Feast Core.

    feast.core-authentication.provider

    Selects provider Feast Online Serving uses to retrieve credentials then used to authenticate requests to Feast Core. Valid providers are google and oauth.

    Configuration Option

    Description

    enable_auth

    Enables authentication functionality if set to true.

    auth_provider

    Use an authentication provider to obtain a credential for authentication. Currently supports google and oauth.

    auth_token

    Manually specify a static token for use in authentication. Overrides auth_provider if both are set.

    Configuration Property

    Description

    feast.security.authorization.enabled

    Enables authorization functionality if true.

    feast.security.authorization.provider

    Authorization Provider type. Currently only supports http

    feast.security.authorization.option.authorizationUrl

    URL endpoint of Authorization Server to make check access requests to.

    feast.security.authorization.option.subjectClaim

    Optional. Name of the claim to extract from the ID Token and include in the check access request as the Subject.

    Overview of Feast's Security Methods.
    Feast Authorization Flow
    cred := feast.NewOAuthCredential("localhost:6566", "client_id", "secret", "https://oauth.endpoint/auth")
    CallCredentials credentials = new OAuthCredentials(Map.of(
      "audience", "localhost:6566",
      "grant_type", "client_credentials",
      "client_id", "some_id",
      "client_secret", "secret",
      "oauth_url", "https://oauth.endpoint/auth",
      "jwkEndpointURI", "https://jwk.endpoint/jwk"));
    cli, err := feast.NewSecureGrpcClient("localhost", 6566, feast.SecurityConfig{
        EnableTLS:   true,
        TLSCertPath: "/path/to/cert.pem",
    })
    FeastClient client = FeastClient.createSecure("localhost", 6566, 
        SecurityConfig.newBuilder()
          .setTLSEnabled(true)
          .setCertificatePath(Optional.of("/path/to/cert.pem"))
          .build());
    ('authorization', 'Bearer: TOKEN')
    $ feast config set enable_auth true
    $ gcloud auth application-default login
    // Use Google Credential as provider; error handling omitted.
    cred, _ := feast.NewGoogleCredential("localhost:6566")
    cli, _ := feast.NewSecureGrpcClient("localhost", 6566, feast.SecurityConfig{
        // Specify the credential to provide tokens for Feast Authentication.
        Credential: cred,
    })
    $ export GOOGLE_APPLICATION_CREDENTIALS="path/to/key.json"
    // Use GoogleAuthCredential as provider.
    CallCredentials credentials = new GoogleAuthCredentials(
        Map.of("audience", "localhost:6566"));

    FeastClient client = FeastClient.createSecure("localhost", 6566,
        SecurityConfig.newBuilder()
          // Specify the credentials to provide tokens for Feast Authentication.
          .setCredentials(Optional.of(credentials))
          .build());
    $ export GOOGLE_APPLICATION_CREDENTIALS="path/to/key.json"
    $ export GOOGLE_APPLICATION_CREDENTIALS="path/to/key.json"
    cred, _ := feast.NewGoogleCredential("localhost:6566")
    CallCredentials credentials = new GoogleAuthCredentials(
        Map.of("audience", "localhost:6566"));

    OAuth grant type. Set as client_credentials

    oauth_client_id

    Client Id used in the client-credentials request.

    oauth_client_secret

    Client secret used in the client-credentials request.

    oauth_audience

    Target audience of the credential. Set to host URL of target Service.

    (https://localhost if Service listens on localhost).

    Target audience of the credential. Set to host URL of target Service.

    ( https://localhost if Service listens on localhost).

    clientId

    Client Id used in the client-credentials request.

    clientSecret

    Client secret used in the client-credentials request.

    endpointURL

    Target URL to make the client-credentials request to.

    Target audience of the credential. Set to host URL of target Service.

    ( https://localhost if Service listens on localhost).

    grant_type

    OAuth grant type. Set as client_credentials

    client_id

    Client Id used in the client-credentials request.

    client_secret

    Client secret used in the client-credentials request.

    oauth_url

    Target URL to make the client-credentials request to obtain credential.

    jwkEndpointURI

    HTTPS URL used to retrieve a JWK that can be used to decode the credential.
