OfflineStore class.
OfflineStoreConfig class.
RetrievalJob class for this offline store.
DataSource class for the offline store.
OfflineStore in a feature repo's feature_store.yaml file.
OfflineStore class.
get_historical_features and pull_latest_from_table_or_query.
pull_latest_from_table_or_query is invoked when running materialization (using the feast materialize or feast materialize-incremental commands, or the corresponding FeatureStore.materialize() method). This method pulls data from the offline store, and the FeatureStore class takes care of writing this data into the online store.
get_historical_features is invoked when reading values from the offline store using the FeatureStore.get_historical_features() method. Typically, this method is used to retrieve features when training ML models.
pull_all_from_table_or_query is a method that pulls all the data from an offline store from a specified start date to a specified end date.
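
To make the materialization read path more concrete, here is a minimal pure-Python sketch of a "latest row per entity within a time window" selection, which is what a pull for materialization is generally expected to produce. It does not use Feast. The function name pull_latest_rows, the row layout, and the column names driver_id, event_timestamp and trips are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from typing import Dict, Iterable, List


def pull_latest_rows(
    rows: Iterable[Dict],
    join_key: str,
    timestamp_field: str,
    start_date: datetime,
    end_date: datetime,
) -> List[Dict]:
    """Keep only the newest row per join key inside [start_date, end_date]."""
    latest: Dict[object, Dict] = {}
    for row in rows:
        ts = row[timestamp_field]
        if not (start_date <= ts <= end_date):
            continue  # outside the requested window
        current = latest.get(row[join_key])
        if current is None or ts > current[timestamp_field]:
            latest[row[join_key]] = row
    return list(latest.values())


# Hypothetical sample data: two drivers, one row falls outside the window.
rows = [
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 1), "trips": 3},
    {"driver_id": 1, "event_timestamp": datetime(2024, 1, 3), "trips": 5},
    {"driver_id": 2, "event_timestamp": datetime(2024, 1, 2), "trips": 7},
    {"driver_id": 2, "event_timestamp": datetime(2024, 2, 1), "trips": 9},
]
result = pull_latest_rows(
    rows, "driver_id", "event_timestamp", datetime(2024, 1, 1), datetime(2024, 1, 31)
)
```

A real offline store would normally push this selection down into the backing store's query engine rather than filter rows in Python.
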
FeastConfigBaseModel class, which is defined here.
FeastConfigBaseModel is a pydantic class, which parses YAML configuration into Python objects. Pydantic also allows the model classes to define validators for the config classes, to make sure that the config classes are correctly defined.
type field, which contains the fully qualified class name of its corresponding OfflineStore class.
Config suffix.
feature_store.yaml as follows:
config: RepoConfig parameter which is passed into the methods of the OfflineStore interface, specifically at the config.offline_store field of the config parameter.
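
The conventions above (a type field holding a fully qualified class name, and a config class named after the store class plus a Config suffix) can be sketched without Feast or pydantic. In the sketch below a plain dataclass stands in for a pydantic model, and its __post_init__ plays the role of a validator. The names CustomFileOfflineStore, path_prefix, my_package.offline and config_matches_store are hypothetical.

```python
from dataclasses import dataclass


class CustomFileOfflineStore:
    """Hypothetical store class; only its name matters for this sketch."""


@dataclass
class CustomFileOfflineStoreConfig:
    """Stand-in for a pydantic config model (a plain dataclass is used here)."""

    type: str  # fully qualified class name of the matching store class
    path_prefix: str = ""  # hypothetical extra option

    def __post_init__(self) -> None:
        # Plays the role of a validator: reject obviously malformed values.
        if "." not in self.type:
            raise ValueError("type must be a fully qualified class name")


def config_matches_store(store_cls: type, config_cls: type) -> bool:
    """Check the naming convention: <StoreClassName> + 'Config'."""
    return config_cls.__name__ == store_cls.__name__ + "Config"


# Parsed YAML typically arrives as a dict; unpack it into the config object.
raw = {"type": "my_package.offline.CustomFileOfflineStore", "path_prefix": "/data"}
config = CustomFileOfflineStoreConfig(**raw)
```

With a real pydantic model, the parsing and validation would be handled by the model class itself rather than by hand-written checks.
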
RetrievalJob instance, which represents the execution of the actual query against the underlying store.
RetrievalJob interface.
RetrievalJob interface exposes two methods: to_df and to_arrow. The expectation is for the retrieval job to be able to return the rows read from the offline store as a pandas DataFrame, or as an Arrow table respectively.
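
One way to read "represents the execution of the actual query" is that the job object holds the query and only runs it when a result is requested. The sketch below illustrates that idea with the standard library only. LazyRetrievalJob, to_rows and to_columns are hypothetical stand-ins (for a job class, to_df and to_arrow respectively) so that the example runs without pandas or pyarrow; it is not Feast's interface.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class LazyRetrievalJob:
    """Hypothetical stand-in for a retrieval job: holds a query, runs it on demand."""

    def __init__(self, evaluation_function: Callable[[], List[Row]]) -> None:
        self._evaluation_function = evaluation_function
        self.executions = 0  # instrumentation for the demo only

    def _execute(self) -> List[Row]:
        self.executions += 1
        return self._evaluation_function()

    def to_rows(self) -> List[Row]:
        """Row-oriented result (a real job would return a DataFrame from to_df)."""
        return self._execute()

    def to_columns(self) -> Dict[str, List[object]]:
        """Column-oriented result (a real job would return an Arrow table from to_arrow)."""
        columns: Dict[str, List[object]] = {}
        for row in self._execute():
            for name, value in row.items():
                columns.setdefault(name, []).append(value)
        return columns


def read_store() -> List[Row]:
    # Placeholder for the actual query against the underlying store.
    return [{"driver_id": 1, "trips": 5}, {"driver_id": 2, "trips": 7}]


job = LazyRetrievalJob(read_store)  # nothing has been read yet
executions_before = job.executions
columns = job.to_columns()  # the query runs here
```
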
DataSource base class needs to be defined. This class is responsible for holding information needed by specific feature views to support reading historical values from the offline store. For example, a feature view using Redshift as the offline store may need to know which table contains historical feature values.
from_proto and to_proto.
custom_options field should be used to store any configuration needed by the data source. In this case, the implementer is responsible for serializing this configuration into bytes in the to_proto method and reading the value back from bytes in the from_proto method.
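
The bytes round trip described above can be sketched as follows. This is a minimal illustration under stated assumptions: JSON is just one possible encoding, CustomOptionsMessage is a hypothetical stand-in for the proto message (only a bytes field is modeled), and CustomFileSource with its path and timestamp_field options is invented for the example.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass
class CustomOptionsMessage:
    """Hypothetical stand-in for the proto message; only a bytes field is modeled."""

    configuration: bytes


@dataclass
class CustomFileSource:
    """Hypothetical data source holding the options a feature view needs."""

    path: str
    timestamp_field: str

    def to_proto(self) -> CustomOptionsMessage:
        # Serialize the options to bytes.
        payload: Dict[str, str] = {
            "path": self.path,
            "timestamp_field": self.timestamp_field,
        }
        return CustomOptionsMessage(configuration=json.dumps(payload).encode("utf-8"))

    @classmethod
    def from_proto(cls, message: CustomOptionsMessage) -> "CustomFileSource":
        # Read the same options back from bytes.
        payload = json.loads(message.configuration.decode("utf-8"))
        return cls(path=payload["path"], timestamp_field=payload["timestamp_field"])


source = CustomFileSource(path="data/driver_stats.parquet", timestamp_field="event_timestamp")
message = source.to_proto()
restored = CustomFileSource.from_proto(message)
```

Whatever encoding is chosen, to_proto and from_proto need to agree on it, so that a source written by one can always be read back by the other.
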
feature_store.yaml file, specifically in the offline_store field. The value specified should be the fully qualified class name of the OfflineStore.
feature_store.yaml:
type of the offline store class as the value for the offline_store.
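
A fully qualified class name can be turned back into a class object with importlib. The sketch below shows one way this could work, for an offline_store value given either as a bare string or as a mapping with a type key. The helpers import_class and offline_store_type are hypothetical and are not Feast functions, and a standard-library class stands in for a custom offline store so that the example runs anywhere.

```python
import importlib


def import_class(fully_qualified_name: str) -> type:
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, _, class_name = fully_qualified_name.rpartition(".")
    if not module_name:
        raise ValueError(f"not a fully qualified class name: {fully_qualified_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def offline_store_type(offline_store_section) -> str:
    """Accept either a bare string or a mapping with a 'type' key."""
    if isinstance(offline_store_section, str):
        return offline_store_section
    return offline_store_section["type"]


# A standard-library class stands in for a custom offline store here.
section = {"type": "collections.OrderedDict"}
store_cls = import_class(offline_store_type(section))
```

This only works if the module that defines the class is importable in the current Python environment.
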
If you have created the OfflineStore class in a separate repo, you can still test your implementation against the Feast test suite, as long as you have Feast as a submodule in your repo. In the Feast submodule, we can run all the unit tests with:
FULL_REPO_CONFIGS variable defined in sdk/python/tests/integration/feature_repos/repo_configuration.py. To override these configurations, you can create your own file that defines FULL_REPO_CONFIGS, and point Feast to it by setting the FULL_REPO_CONFIGS_MODULE environment variable. The main challenge there will be to write a DataSourceCreator for the offline store. In this repo, the file that overrides FULL_REPO_CONFIGS is feast_custom_offline_store/feast_tests.py, so you would run: