Development Guide: Main Feast Repository

Table of Contents

  • Overview
  • Compatibility
  • Community
  • Making a pull request
  • Pull request checklist
  • Good practices to keep in mind
  • Forking the repo
  • Pre-commit Hooks
  • Signing off commits
  • Incorporating upstream changes from master
  • Feast Python SDK and CLI
  • Environment Setup
  • Code Style and Linting
  • Unit Tests
  • Integration Tests
  • Local integration tests
  • (Advanced) Full integration tests
  • (Advanced) Running specific provider tests or running your test against specific online or offline stores
  • (Experimental) Run full integration tests against containerized services
  • Contrib integration tests
  • (Contrib) Running tests for Spark offline store
  • (Contrib) Running tests for Trino offline store
  • (Contrib) Running tests for Postgres offline store
  • (Contrib) Running tests for Postgres online store
  • (Contrib) Running tests for HBase online store
  • (Experimental) Feast UI
  • Feast Java Serving
  • Developing the Feast Helm charts
  • Feast Java Feature Server Helm Chart
  • Feast Python Feature Server Helm Chart
  • Testing with Github Actions workflows
  • Feast Data Storage Format

Overview

This guide is targeted at developers looking to contribute to Feast components in the main Feast repository:

  • Feast Python SDK and CLI

  • Feast Java Serving

Please see the Codebase Structure page for more details on the structure of the entire codebase.

Compatibility

The compatibility policy for Feast can be found in the Backwards Compatibility Policy, and should be followed for all changes proposed, whether by maintainers or contributors.

Community

See the Contribution process and Community pages for details on how to get more involved in the community.

Making a pull request

PRs that are submitted by the general public need to be identified as ok-to-test. Once enabled, Prow will run a range of tests to verify the submission, after which community members will help to review the pull request.

We use the convention that the assignee of a PR is the person with the next action.

If the assignee is empty, it means that no reviewer has been found yet. If a reviewer has been found, they should also be assigned the PR. Finally, if there are comments to be addressed, the PR author should be the one assigned the PR.

Pull request checklist

A quick list of things to keep in mind as you're making changes:

  • As you make changes

    • Make your changes in a forked repo (instead of making a branch on the main Feast repo)

    • Sign your commits as you go (to avoid DCO checks failing)

    • Rebase from master instead of using git pull on your PR branch

    • Install pre-commit hooks to ensure all the default linters / formatters are run when you push

  • When you make the PR

    • Please run tests locally before submitting a PR (e.g. for Python, the local integration tests)

    • Make a pull request from the forked repo you made

    • Ensure the title of the PR matches semantic release conventions (e.g. it starts with feat:, fix:, ci:, chore:, or docs:). Keep in mind that any PR with feat: or fix: will go directly into the changelog of a release, so make sure the title is understandable!

    • Ensure you add a GitHub label (i.e. a kind tag such as kind/bug or kind/housekeeping) to the PR, or else checks will fail.

    • Ensure you leave a release note for any user-facing changes in the PR. There is a field automatically generated in the PR description for this. You can write NONE in that field if there are no user-facing changes.

    • Try to keep PRs smaller. This makes them easier to review.
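
For example, hypothetical PR titles following this convention (the specifics here are made up for illustration) might look like:

feat: Add TTL configuration to the example online store
fix: Handle empty entity rows during online retrieval
docs: Clarify offline store configuration options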

Good practices to keep in mind

  • Fill in the description based on the default template configured when you first open the PR

    • What this PR does/why we need it

    • Which issue(s) this PR fixes

    • Does this PR introduce a user-facing change

  • Add WIP: to PR name if more work needs to be done prior to review

Forking the repo

Fork the Feast Github repo and clone your fork locally. Then make your changes in a local branch on the fork.
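
As a sketch, the fork-and-branch flow might look like this (replace <your-github-username> and the branch name with your own):

git clone https://github.com/<your-github-username>/feast.git
cd feast
git remote add upstream https://github.com/feast-dev/feast.git
git checkout -b my-feature-branch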

Pre-commit Hooks

Setup pre-commit hooks to automatically lint and format the codebase on commit:

  1. Ensure that you have Python (3.7 or above) with pip installed.

  2. Install pre-commit with pip & install pre-push hooks

pip install pre-commit
pre-commit install --hook-type pre-commit --hook-type pre-push
  3. On push, the pre-commit hooks will run. This runs make format and make lint.

Signing off commits

Use git signoffs to sign your commits. See https://docs.github.com/en/github/authenticating-to-github/managing-commit-signature-verification for details.

Warning: using the default integrations with IDEs like VSCode or IntelliJ will not sign commits. When you submit a PR, you'll have to re-sign commits to pass the DCO check.

Then, you can sign off commits with the -s flag:

git commit -s -m "My first commit"

GPG-signing commits with -S is optional.
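
If you forget the -s flag, one way to add the missing sign-off afterwards (a sketch, assuming the unsigned commits are the most recent ones on your branch) is:

# Add a sign-off to the most recent commit without changing its message
git commit --amend -s --no-edit

# Or add sign-offs to every commit on your branch that is not yet on master
git rebase --signoff master

Rewriting commits this way means you will need to force push your branch afterwards.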

Incorporating upstream changes from master

Our preference is to use git rebase [master] instead of git merge: git pull -r.

Note that this means if you are midway through working through a PR and rebase, you'll have to force push: git push --force-with-lease origin [branch name]
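
For example, assuming your fork's remote is origin and the upstream Feast repo has been added as upstream (as in the forking sketch above), an update might look like:

git checkout my-feature-branch
git pull -r upstream master
git push --force-with-lease origin my-feature-branch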

Feast Python SDK and CLI

Environment Setup

Tools

  • Docker: Docker is used to provision service dependencies during testing, and to build images for feature servers and other components. Please note that we use Docker with BuildKit.

    • Alternatively, to use podman on a Fedora or RHEL machine, follow this guide.

  • make is used to run various scripts.

  • uv is used for managing python dependencies (see the uv installation instructions).

  • (Optional): Node & Yarn (needed for building the Feast UI)

  • (Optional): Pixi for recompiling python lock files. This is only needed when you make changes to requirements or want to update the lock files to reflect the latest versions.

  • (M1 Mac only): Follow the dev guide if you have issues.

Quick start

  • Create a new virtual env: uv venv --python 3.11 (replace the python version with your desired version)

  • Activate the venv: source .venv/bin/activate

  • Install dependencies: make install-python-dependencies-dev
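
Put together, a minimal setup might look like this (assuming uv is installed and creates the environment under .venv, its default location):

uv venv --python 3.11
source .venv/bin/activate
make install-python-dependencies-dev

# Sanity check that the install worked
feast version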

Building the UI

make build-ui

Recompiling python lock files

Recompile the python lock files. This only needs to be run when you make changes to requirements, or when you simply want to update the lock files to reflect the latest versions.

make lock-python-dependencies-all

Building protos

make compile-protos-python

Building a docker image for development

docker build -t docker-whale -f ./sdk/python/feast/infra/feature_servers/multicloud/Dockerfile .
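
To sanity-check the image you just built, one option (an assumption rather than an official workflow; docker-whale is simply the tag used in the command above) is to invoke the Feast CLI inside the container:

docker run --rm --entrypoint feast docker-whale version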

Code Style and Linting

Feast Python SDK and CLI codebase:

  • Conforms to Black code style

  • Has type annotations as enforced by mypy

  • Has imports sorted by ruff (see the isort (I) rules)

  • Is lintable by ruff

To ensure your Python code conforms to Feast Python code standards:

  • Autoformat your code to conform to the code style:

make format-python
  • Lint your Python code before submitting it for review:

make lint-python

Setup pre-commit hooks to automatically format and lint on commit.

Unit Tests

Unit tests (pytest) for the Feast Python SDK and CLI can be run as follows:

make test-python-unit

Local configuration can interfere with unit tests and cause them to fail:

  • Ensure Feast Python SDK and CLI is not configured with configuration overrides (i.e. ~/.feast/config should be empty).

  • Ensure no AWS configuration is present and no AWS credentials can be accessed by boto3.
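
If you want to iterate on a single test rather than the whole suite, you can also invoke pytest directly; for example (assuming, as in recent versions, that the unit tests live under sdk/python/tests/unit, and substituting your own test name):

python -m pytest sdk/python/tests/unit -k "test_name_substring"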

Integration Tests

There are two sets of tests you can run:

  1. Local integration tests (faster for development; these test the file-based offline store and key online stores)

  2. Full integration tests (requires cloud environment setups)

Local integration tests

For this approach of running tests, you'll need to have docker set up locally: Get Docker.

These tests leverage a file-based offline store and run against emulated versions of Datastore, DynamoDB, and Redis, using ephemeral containers. The tests create new temporary tables / datasets locally only, and they are cleaned up when the containers are torn down.

make test-python-integration-local

Note: you can manually control which tests are run today by inspecting RepoConfiguration and commenting out tests that are added to DEFAULT_FULL_REPO_CONFIGS.

(Advanced) Full integration tests

To test across clouds, on top of setting up Redis, you also need GCP / AWS / Snowflake setup.

GCP

  1. You can get free credits here.

  2. You will need to setup a service account, enable the BigQuery API, and create a staging location for a bucket.

    • Setup your service account and project using steps 1-5 here.

    • Remember to save your PROJECT_ID and your key.json. These will be the secrets that you will need to configure in Github actions, namely secrets.GCP_PROJECT_ID and secrets.GCP_SA_KEY. The GCP_SA_KEY value is the contents of your key.json file.

    • Follow these instructions in your project to create a bucket for running the GCP tests, and remember to save the bucket name.

    • Make sure to add the service account email that you created in the previous step to the users that can access your bucket. Then, make sure to give the account the correct access roles, namely objectCreator, objectViewer, objectAdmin, and admin, so that your tests can use the bucket.

  3. Install the Cloud SDK.

  4. Login to gcloud if you haven't already:

gcloud auth login
gcloud auth application-default login
  • When you run gcloud auth application-default login, you should see some output of the form:

    Credentials saved to file: [$HOME/.config/gcloud/application_default_credentials.json]
  • You should add export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/gcloud/application_default_credentials.json" to your .zshrc or .bashrc.

  5. Add export GCLOUD_PROJECT=[your project id from step 2] to your .zshrc or .bashrc.

  6. Running gcloud config list should give you something like this:

$ gcloud config list
[core]
account = [your email]
disable_usage_reporting = True
project = [your project id]

Your active configuration is: [default]
  7. Export GCP specific environment variables in your workflow. Namely,

export GCS_REGION='[your gcs region e.g US]'
export GCS_STAGING_LOCATION='[your gcs staging location]'

NOTE: Your GCS_STAGING_LOCATION should be in the form gs://<bucket name> where the bucket name is from step 2.
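
For illustration, one way to create such a staging bucket with the Cloud SDK is shown below (your-staging-bucket is a placeholder, and the -l region should match your GCS_REGION):

gsutil mb -l US gs://your-staging-bucket
export GCS_STAGING_LOCATION='gs://your-staging-bucket'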

  8. Once authenticated, you should be able to run the integration tests for BigQuery without any failures.

AWS

  1. Setup AWS by creating an account, database, and cluster. You will need to enable Redshift and Dynamo.

  2. To run the AWS Redshift and Dynamo integration tests, you will have to export your own AWS credentials. Namely,

export AWS_REGION='[your aws region]'
export AWS_CLUSTER_ID='[your aws cluster id]'
export AWS_USER='[your aws user]'
export AWS_DB='[your aws database]'
export AWS_STAGING_LOCATION='[your s3 staging location uri]'
export AWS_IAM_ROLE='[redshift and s3 access role]'
export AWS_LAMBDA_ROLE='[your aws lambda execution role]'
export AWS_REGISTRY_PATH='[your aws registry path]'
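
Before running the tests, it can help to confirm that these credentials are actually picked up, for example (assuming the AWS CLI is installed):

aws sts get-caller-identity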

Snowflake

  1. See https://signup.snowflake.com/ to set up a trial. You can get free credits here.

  2. Setup your account, and if you are not an ACCOUNTADMIN (if you created your own account, you should be), give yourself the SYSADMIN role:

grant role accountadmin, sysadmin to user user2;

    • Remember to save your account name, username, and role.

  3. Create a Dashboard and add a Tile.

  4. Create a warehouse and database named FEAST with the schemas OFFLINE and ONLINE:

create or replace warehouse feast_tests_wh with
warehouse_size='MEDIUM' --set your warehouse size to whatever your budget allows--
auto_suspend = 180
auto_resume = true
initially_suspended=true;

create or replace database FEAST;
use database FEAST;
create schema OFFLINE;
create schema ONLINE;
  5. Create a data unloading location (either on S3, GCP, or Azure); detailed instructions are here. You will need to save the storage export location and the storage export name. You will also need to create a storage integration in your warehouse to make this work; name this storage integration FEAST_S3.

  6. Then, to run the tests successfully, you'll need some environment variables set up:

export SNOWFLAKE_CI_DEPLOYMENT='[your snowflake account name]'
export SNOWFLAKE_CI_USER='[your snowflake username]'
export SNOWFLAKE_CI_PASSWORD='[your snowflake pw]'
export SNOWFLAKE_CI_ROLE='[your CI role e.g. SYSADMIN]'
export SNOWFLAKE_CI_WAREHOUSE='[your warehouse]'
export BLOB_EXPORT_STORAGE_NAME='[your data unloading storage name]'
export BLOB_EXPORT_URI='[your data unloading blob uri]'
  7. Once everything is set up, running the Snowflake integration tests should pass without failures.

Note that for Snowflake / GCP / AWS, running make test-python-integration will create new temporary tables / datasets in your cloud environment.

(Advanced) Running specific provider tests or running your test against specific online or offline stores

  1. If you don't need your test to run against all of the providers (gcp, aws, and snowflake), or don't need to run against all of the online stores, you can tag your test with the specific providers or stores that you need (@pytest.mark.universal_offline_stores or @pytest.mark.universal_online_stores, with the only parameter). The only parameter selects the specific offline providers and online stores that your test will run against. Example:

# Only parametrizes this test with the sqlite online store
@pytest.mark.universal_online_stores(only=["sqlite"])
def test_feature_get_online_features_types_match():
  2. You can also filter which tests run by using pytest's CLI filtering. Instead of using the make commands to test Feast, you can filter tests by name with the -k parameter. The parametrized integration tests are all uniquely identified by their provider and online store, so the -k option can select only the tests that you need to run. For example, to run only Redshift-related tests, you can use the following command:

python -m pytest -n 8 --integration -k Redshift sdk/python/tests

(Experimental) Run full integration tests against containerized services

Testing across clouds requires existing accounts on GCP / AWS / Snowflake, and may incur costs when using these services.

It's possible to run some integration tests against emulated local versions of these services, using ephemeral containers. These tests create new temporary tables / datasets locally only, and they are cleaned up when the containers are torn down.

The services with containerized replacements currently implemented are:

  • Datastore

  • DynamoDB

  • Redis

  • Trino

  • HBase

  • Postgres

  • Cassandra

For this approach of running tests, you'll need to have docker set up locally: Get Docker.

You can run make test-python-integration-container to run tests against the containerized versions of dependencies.

Contrib integration tests

(Contrib) Running tests for Spark offline store

You can run make test-python-universal-spark to run all tests against the Spark offline store. (Note: you'll have to run pip install -e ".[dev]" first).

Not all tests are passing yet

(Contrib) Running tests for Trino offline store

You can run make test-python-universal-trino to run all tests against the Trino offline store. (Note: you'll have to run pip install -e ".[dev]" first)

(Contrib) Running tests for Postgres offline store

You can run make test-python-universal-postgres-offline to run all tests against the Postgres offline store. (Note: you'll have to run pip install -e ".[dev]" first.)

(Contrib) Running tests for Postgres online store

You can run make test-python-universal-postgres-online to run all tests against the Postgres online store. (Note: you'll have to run pip install -e ".[dev]" first.)

(Contrib) Running tests for HBase online store

TODO

(Experimental) Feast UI

See the Feast contributing guide.

Feast Java Serving

See the Java contributing guide.

See also the development instructions related to the helm chart below, at Developing the Feast Helm charts.

Developing the Feast Helm charts

There are 2 helm charts:

  • Feast Java feature server

  • Feast Python feature server

Generally, you can override the images in the helm charts with locally built Docker images, and install the local helm chart.

All READMEs for the helm charts are generated using helm-docs. You can install it (e.g. with brew install norwoodj/tap/helm-docs) and then run make build-helm-docs.

Feast Java Feature Server Helm Chart

See the Java demo example (it has development instructions too, using minikube). It will:

  • run make build-java-docker-dev to build local Java feature server binaries

  • configure the included application-override.yaml to override the image tag to use the locally built dev images.

  • install the local chart with helm install feast-release ../../../infra/charts/feast --values application-override.yaml

Feast Python Feature Server Helm Chart

See the Python demo example (it has development instructions too, using minikube). It will:

  • run make build-feature-server-dev to build a local python feature server binary

  • install the local chart with helm install feast-release ../../../infra/charts/feast-feature-server --set image.tag=dev --set feature_store_yaml_base64=$(base64 feature_store.yaml)

Testing with Github Actions workflows

Please refer to the maintainers doc if you would like to locally test out the Github Actions workflow changes. This document will help you set up your fork to test the CI integration tests and other workflows without needing to make a pull request against feast-dev master.

Feast Data Storage Format

Feast data storage contracts are documented in the following locations:

  • Feast Offline Storage Format: Used by BigQuery, Snowflake (Future), Redshift (Future).

  • Feast Online Storage Format: Used by Redis, Google Datastore.
