Docker Compose

Overview

This guide will bring Feast up using Docker Compose. This will allow you to:

  • Create, register, and manage feature sets

  • Ingest feature data into Feast

  • Retrieve features for online serving

  • Retrieve features for batch serving (only if using Google Cloud Platform)

This guide is split into three parts:

  1. Setting up your environment

  2. Starting Feast with online serving support only (does not require GCP)

  3. Starting Feast with support for both online and batch serving (requires GCP)

The Docker Compose setup uses the Apache Beam Direct Runner for the jobs that populate the data stores. The Direct Runner removes the need for a dedicated runner such as Flink or Dataflow, but this comes at the cost of performance. We recommend a dedicated runner when running Feast with very large workloads.

0. Requirements

  • Docker Compose must be installed.

  • The following TCP ports must be free (a quick way to check is shown after this list):

    • 6565, 6566, 8888, and 9094.

    • Alternatively, it is possible to modify the port mappings in infra/docker-compose/docker-compose.yml.

  • (for batch serving only) A GCP service account key that has access to Google Cloud Storage and BigQuery.

  • (for batch serving only) Google Cloud SDK installed, authenticated, and configured to the project you will use.
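
Before starting, you can verify that the required ports are free. This is a minimal check, assuming a Linux or macOS host with lsof available:

for port in 6565 6566 8888 9094; do
  # lsof prints a line for every process listening on the port; no output means the port is free
  lsof -nP -iTCP:$port -sTCP:LISTEN
done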

1. Set up environment

Clone the Feast repository and navigate to the docker-compose sub-directory:

git clone https://github.com/feast-dev/feast.git && \
cd feast && export FEAST_HOME_DIR=$(pwd) && \
cd infra/docker-compose

Make a copy of the .env.sample file:

cp .env.sample .env

2. Docker Compose for Online Serving Only

2.1 Start Feast (without batch retrieval support)

If you do not require batch serving, it is possible to simply bring up Feast:

docker-compose up -d

A Jupyter Notebook environment for working with Feast is now available at:

http://localhost:8888/tree/feast/examples
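
To confirm that everything came up correctly, you can list the running containers; each service should report an Up state:

# Inspect a failing service with: docker-compose logs <service>
docker-compose ps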

3. Docker Compose for Online and Batch Serving

Batch serving requires Google Cloud Platform (GCP) to function, specifically Google Cloud Storage (GCS) and BigQuery.

3.1 Set up Google Cloud Platform

Create a service account in the GCP console, download its JSON key, and copy the key file into the infra/docker-compose/gcp-service-accounts folder:

cp my-service-account.json ${FEAST_HOME_DIR}/infra/docker-compose/gcp-service-accounts

Create a Google Cloud Storage bucket. Make sure that your service account above has read/write permissions to this bucket:

gsutil mb gs://my-feast-staging-bucket
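
One way to grant these permissions is a bucket-level IAM binding through gsutil. A sketch, where my-service-account@my-project.iam.gserviceaccount.com is a placeholder for your own service account's email:

# Grant object read/write on the staging bucket to the service account
gsutil iam ch \
  serviceAccount:my-service-account@my-project.iam.gserviceaccount.com:roles/storage.objectAdmin \
  gs://my-feast-staging-bucket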

3.2 Configure .env

Configure the .env file based on your environment. At the very least you have to modify:

  • FEAST_CORE_GCP_SERVICE_ACCOUNT_KEY: the file name of your service account key, for example key.json.

  • FEAST_BATCH_SERVING_GCP_SERVICE_ACCOUNT_KEY: the file name of your service account key, for example key.json.

  • FEAST_JUPYTER_GCP_SERVICE_ACCOUNT_KEY: the file name of your service account key, for example key.json.

  • FEAST_JOB_STAGING_LOCATION: the Google Cloud Storage location that Feast will use to stage data exports and batch retrieval requests, for example gs://your-gcs-bucket/staging.
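
Put together, the relevant lines of .env might look like the following sketch, where key.json and my-feast-staging-bucket are placeholders for your own key file and bucket:

# Name of the key file copied into infra/docker-compose/gcp-service-accounts
FEAST_CORE_GCP_SERVICE_ACCOUNT_KEY=key.json
FEAST_BATCH_SERVING_GCP_SERVICE_ACCOUNT_KEY=key.json
FEAST_JUPYTER_GCP_SERVICE_ACCOUNT_KEY=key.json
# Google Cloud Storage location for staging data exports and batch retrieval requests
FEAST_JOB_STAGING_LOCATION=gs://my-feast-staging-bucket/staging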

3.3 Configure bq-store.yml

We will also need to edit the bq-store.yml file inside infra/docker-compose/serving/ to set the BigQuery storage configuration and the feature sets that the store subscribes to. At a minimum you will need to set:

  • bigquery_config.project_id: your GCP project ID.

  • bigquery_config.dataset_id: the name of the BigQuery dataset that tables will be created in. Each feature set will have one table in BigQuery.
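
As an illustration, the BigQuery section of bq-store.yml might look like the following sketch; my-gcp-project and feast_dataset are placeholders, and the rest of the shipped file should be left intact:

bigquery_config:
  # GCP project that owns the BigQuery dataset
  project_id: my-gcp-project
  # Dataset in which Feast creates one table per feature set
  dataset_id: feast_dataset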

3.4 Start Feast (with batch retrieval support)

Start Feast:

docker-compose up -d

A Jupyter Notebook environment for working with Feast is now available at:

http://localhost:8888/tree/feast/examples
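
Once the containers are up, you can sanity-check the GCP side with the Google Cloud SDK. This assumes gcloud is authenticated as described in the requirements; the dataset from bq-store.yml should appear once Feast has written to it:

# List BigQuery datasets in the configured project
bq ls
# Confirm the staging bucket is reachable with your credentials
gsutil ls gs://my-feast-staging-bucket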