feature_store.yaml

Overview

feature_store.yaml is used to configure a feature store. The file must be located at the root of a feature repository. An example feature_store.yaml is shown below:

feature_store.yaml
project: loyal_spider
registry: data/registry.db
provider: local
online_store:
    type: sqlite
    path: data/online_store.db

Options

The following top-level configuration options exist in the feature_store.yaml file.

  • provider — Configures the environment in which Feast will deploy and operate.

  • registry — Configures the location of the feature registry.

  • online_store — Configures the online store.

  • offline_store — Configures the offline store.

  • project — Defines a namespace for the entire feature store. Can be used to isolate multiple deployments in a single installation of Feast. Should only contain letters, numbers, and underscores.

  • engine - Configures the batch materialization engine.

Please see the RepoConfig API reference for the full list of configuration options.
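
As a rough illustration of the naming rule for project described above, the following sketch checks a candidate name against "letters, numbers, and underscores". The helper name is_valid_project_name is hypothetical and not part of Feast; Feast's own validation may apply additional rules.

```python
import re

# Rule taken from the description above: letters, numbers, and underscores only.
# This is an illustrative check, not Feast's own validator.
_PROJECT_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if `name` consists only of letters, digits, and underscores."""
    return _PROJECT_NAME.fullmatch(name) is not None


print(is_valid_project_name("loyal_spider"))  # name from the example above
print(is_valid_project_name("loyal-spider"))  # hyphen is not allowed by the rule
```

Running a check like this before feast apply can catch an invalid project value early, though the authoritative validation remains whatever Feast itself performs.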

Feature repository

Feast is used to manage two important sets of configuration:

  • Configuration about how to run Feast on your infrastructure

  • Feature definitions

With Feast, the above configuration can be written declaratively and stored as code in a central location. This central location is called a feature repository. The feature repository is the declarative source of truth for what the desired state of a feature store should be.

The Feast CLI uses the feature repository to configure, deploy, and manage your feature store.

What is a feature repository?

A feature repository consists of:

  • A collection of Python files containing feature declarations.

  • A feature_store.yaml file containing infrastructural configuration.

  • A .feastignore file containing paths in the feature repository to ignore.

Typically, users store their feature repositories in a Git repository, especially when working in teams. However, using Git is not a requirement.

Structure of a feature repository

The structure of a feature repository is as follows:

  • The root of the repository should contain a feature_store.yaml file and may contain a .feastignore file.

  • The repository should contain Python files that contain feature definitions.

  • The repository can contain other files as well, including documentation and potentially data files.

An example structure of a feature repository is shown below:

$ tree -a
.
├── data
│   └── driver_stats.parquet
├── driver_features.py
├── feature_store.yaml
└── .feastignore

1 directory, 4 files

A couple of things to note about the feature repository:

  • Feast reads all Python files recursively when feast apply is run, including those in subdirectories, even if they don't contain feature definitions.

  • It's recommended to add a .feastignore file that lists the paths of any imperative scripts you need to store inside the feature repository.
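
To make the recursive behaviour concrete, here is a minimal sketch of how the Python files in a repository could be enumerated. It is not Feast's implementation: find_python_files and build_example_repo are hypothetical helpers, and scripts/backfill.py is an invented file added to the example tree to show that nested scripts are picked up too.

```python
import tempfile
from pathlib import Path
from typing import List


def find_python_files(repo_root: Path) -> List[str]:
    """Return every *.py file under repo_root as sorted, POSIX-style relative paths."""
    return sorted(
        path.relative_to(repo_root).as_posix()
        for path in repo_root.rglob("*.py")
        if path.is_file()
    )


def build_example_repo(repo_root: Path) -> None:
    """Create a throwaway repo shaped like the example tree, plus one nested script."""
    for relative in (
        "data/driver_stats.parquet",
        "driver_features.py",
        "feature_store.yaml",
        ".feastignore",
        "scripts/backfill.py",  # hypothetical imperative script, not in the original tree
    ):
        target = repo_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_example_repo(root)
    discovered = find_python_files(root)
    print(discovered)
```

Because the nested script is discovered alongside driver_features.py, it would be a natural candidate for an entry in .feastignore.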

The feature_store.yaml configuration file

The configuration for a feature store is stored in a file named feature_store.yaml, which must be located at the root of a feature repository. An example feature_store.yaml file is shown below:

feature_store.yaml
project: my_feature_repo_1
registry: data/metadata.db
provider: local
online_store:
    path: data/online_store.db

The feature_store.yaml file configures how the feature store should run. See feature_store.yaml for more details.

The .feastignore file

This file contains paths that should be ignored when running feast apply. An example .feastignore is shown below:

.feastignore
# Ignore virtual environment
venv

# Ignore a specific Python file
scripts/foo.py

# Ignore all Python files directly under scripts directory
scripts/*.py

# Ignore all "foo.py" anywhere under scripts directory
scripts/**/foo.py

See .feastignore for more details.

Feature definitions

A feature repository can also contain one or more Python files that contain feature definitions. An example feature definition file is shown below:

driver_features.py
from datetime import timedelta

from feast import BigQuerySource, Entity, FeatureView, Field
from feast.types import Float32, Int64

# Batch source: a BigQuery table holding driver location events.
driver_locations_source = BigQuerySource(
    table_ref="rh_prod.ride_hailing_co.drivers",
    timestamp_field="event_timestamp",
    created_timestamp_column="created_timestamp",
)

# Entity that the features below are keyed on.
driver = Entity(
    name="driver",
    description="driver id",
)

# Feature view grouping the location features for each driver.
driver_locations = FeatureView(
    name="driver_locations",
    entities=[driver],
    ttl=timedelta(days=1),
    schema=[
        Field(name="lat", dtype=Float32),
        Field(name="lon", dtype=Float32),
        Field(name="driver", dtype=Int64),
    ],
    source=driver_locations_source,
)

To declare new feature definitions, just add code to the feature repository, either in existing files or in a new file. For more information on how to define features, see Feature Views.

Next steps

  • See Create a feature repository to get started with an example feature repository.

  • See feature_store.yaml, .feastignore, or Feature Views for more information on the configuration files that live in a feature repository.

.feastignore

Overview

.feastignore is a file placed at the root of the feature repository. It contains paths that should be ignored when running feast apply. An example .feastignore is shown below:

.feastignore
# Ignore virtual environment
venv

# Ignore a specific Python file
scripts/foo.py

# Ignore all Python files directly under scripts directory
scripts/*.py

# Ignore all "foo.py" anywhere under scripts directory
scripts/**/foo.py

The .feastignore file is optional. If the file cannot be found, every Python file in the feature repo directory will be parsed by feast apply.
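
The example file above mixes comments, blank lines, and patterns. A minimal sketch of extracting just the patterns might look like the following; read_ignore_patterns is a hypothetical helper, and treating lines that start with "#" as comments is an assumption based on the example rather than a documented parsing rule.

```python
from typing import List

# Contents of the example .feastignore shown above.
EXAMPLE_FEASTIGNORE = """\
# Ignore virtual environment
venv

# Ignore a specific Python file
scripts/foo.py

# Ignore all Python files directly under scripts directory
scripts/*.py

# Ignore all "foo.py" anywhere under scripts directory
scripts/**/foo.py
"""


def read_ignore_patterns(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of a .feastignore-style file."""
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines and lines starting with "#" carry no pattern.
        if not stripped or stripped.startswith("#"):
            continue
        patterns.append(stripped)
    return patterns


patterns = read_ignore_patterns(EXAMPLE_FEASTIGNORE)
print(patterns)
```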

Feast Ignore Patterns

| Pattern | Example matches | Explanation |
| --- | --- | --- |
| venv | venv/foo.py, venv/a/foo.py | You can specify a path to a specific directory. Everything in that directory will be ignored. |
| scripts/foo.py | scripts/foo.py | You can specify a path to a specific file. Only that file will be ignored. |
| scripts/*.py | scripts/foo.py, scripts/bar.py | You can specify an asterisk (*) anywhere in the expression. An asterisk matches zero or more characters, except "/". |
| scripts/**/foo.py | scripts/foo.py, scripts/a/foo.py, scripts/a/b/foo.py | You can specify a double asterisk (**) anywhere in the expression. A double asterisk matches zero or more directories. |
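
The pattern semantics described above resemble those of Python's pathlib globbing, so they can be explored with a small sketch. This only approximates the documented behaviour and is not Feast's implementation: ignored_files and make_files are hypothetical helpers, and the file layout is invented to mirror the example matches.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Set


def ignored_files(repo_root: Path, patterns: Iterable[str]) -> Set[str]:
    """Return relative paths of files matched by any of the ignore patterns."""
    ignored: Set[str] = set()
    for pattern in patterns:
        for match in repo_root.glob(pattern):
            if match.is_dir():
                # A directory pattern ignores everything inside that directory.
                ignored.update(
                    p.relative_to(repo_root).as_posix()
                    for p in match.rglob("*")
                    if p.is_file()
                )
            elif match.is_file():
                ignored.add(match.relative_to(repo_root).as_posix())
    return ignored


def make_files(repo_root: Path, relative_paths: Iterable[str]) -> None:
    """Create empty files (and parent directories) under repo_root."""
    for relative in relative_paths:
        target = repo_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_files(root, [
        "venv/foo.py", "venv/a/foo.py",
        "scripts/foo.py", "scripts/bar.py",
        "scripts/a/foo.py", "scripts/a/b/foo.py",
        "driver_features.py",
    ])
    # Evaluate each pattern on its own to compare against the examples above.
    by_pattern = {
        pattern: sorted(ignored_files(root, [pattern]))
        for pattern in ("venv", "scripts/foo.py", "scripts/*.py", "scripts/**/foo.py")
    }
    for pattern, files in by_pattern.items():
        print(pattern, "->", files)
```

In this sketch driver_features.py is never matched, so it would still be parsed, while every file listed under "Example matches" is excluded by its corresponding pattern.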