RetrievalJob class for this offline store.
DataSource class for the offline store
OfflineStore in a feature repo's
pull_latest_from_table_or_query is invoked when running materialization (using the feast materialize-incremental commands, or the corresponding FeatureStore.materialize() method). This method pulls data from the offline store, and the FeatureStore class takes care of writing this data into the online store.
get_historical_features is invoked when reading values from the offline store using the FeatureStore.get_historical_features() method. Typically, this method is used to retrieve features when training ML models.
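To make the materialization read path more concrete, here is a minimal, standard-library-only sketch of the kind of "latest row per entity within a time window" selection that a pull_latest_from_table_or_query implementation typically has to perform. The function name pull_latest_rows, its parameters, and the sample rows are hypothetical and are not part of the Feast API; a real offline store would usually push this logic down into a query against the underlying store.

```python
from datetime import datetime
from typing import Dict, List

Row = Dict[str, object]


def pull_latest_rows(
    rows: List[Row],
    join_keys: List[str],
    timestamp_field: str,
    start_date: datetime,
    end_date: datetime,
) -> List[Row]:
    """Return the most recent row per entity key within [start_date, end_date].

    Hypothetical helper for illustration only; not a Feast function.
    """
    latest: Dict[tuple, Row] = {}
    for row in rows:
        ts = row[timestamp_field]
        # Skip rows that fall outside the requested materialization window.
        if not (start_date <= ts <= end_date):
            continue
        key = tuple(row[k] for k in join_keys)
        # Keep only the newest row seen so far for this entity key.
        if key not in latest or ts > latest[key][timestamp_field]:
            latest[key] = row
    return list(latest.values())


# Made-up sample data: two drivers, one row outside the window.
rows = [
    {"driver_id": 1, "conv_rate": 0.10, "event_timestamp": datetime(2024, 1, 1)},
    {"driver_id": 1, "conv_rate": 0.25, "event_timestamp": datetime(2024, 1, 3)},
    {"driver_id": 2, "conv_rate": 0.40, "event_timestamp": datetime(2024, 1, 2)},
    {"driver_id": 2, "conv_rate": 0.90, "event_timestamp": datetime(2024, 2, 1)},
]

latest = pull_latest_rows(
    rows,
    join_keys=["driver_id"],
    timestamp_field="event_timestamp",
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31),
)
```

The rows returned here are what would then be handed over for writing into the online store.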
FeastConfigBaseModel class, which is defined here.
FeastConfigBaseModel is a pydantic class, which parses YAML configuration into Python objects. Pydantic also allows the model classes to define validators, so that incorrectly defined configuration is rejected when it is parsed.
type field, which contains the fully qualified class name of its corresponding OfflineStore class.
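The role of the config class can be sketched without any dependencies. The example below is a standard-library stand-in, not the pydantic-based approach itself: a real config class would subclass FeastConfigBaseModel and rely on pydantic for parsing and validation. The class name CustomOfflineStoreConfig, the default value of type, and the parse_config helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomOfflineStoreConfig:
    """Stand-in for a config class; a real one would extend FeastConfigBaseModel."""

    # Fully qualified class name of the corresponding offline store (hypothetical).
    type: str = "my_package.my_module.MyOfflineStore"

    def __post_init__(self) -> None:
        # Plays the role of a validator: the value must look like "module.ClassName".
        module_path, _, class_name = self.type.rpartition(".")
        if not module_path or not class_name.isidentifier():
            raise ValueError(
                f"'type' must be a fully qualified class name, got {self.type!r}"
            )


def parse_config(raw: dict) -> CustomOfflineStoreConfig:
    """Turn a dict (as it might be loaded from YAML) into a validated config object."""
    return CustomOfflineStoreConfig(**raw)


config = parse_config({"type": "my_package.my_module.MyOfflineStore"})
```

Validating at construction time means a malformed type value fails when the configuration is loaded rather than later, when the store is first used.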
config: RepoConfig parameter which is passed into the methods of the OfflineStore interface, specifically at the config.offline_store field of the
RetrievalJob instance, which represents the execution of the actual query against the underlying store.
RetrievalJob interface exposes two methods -
to_arrow. The expectation is for the retrieval job to be able to return the rows read from the offline store as a pandas DataFrame, or as an Arrow table respectively.
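One common way to structure such a job is to defer the query until a conversion method is called. The sketch below illustrates that idea using only the standard library: LazyRetrievalJob, to_rows, and to_columns are hypothetical names, with row-oriented and column-oriented output standing in for the DataFrame and Arrow table results. It does not implement the actual RetrievalJob interface.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class LazyRetrievalJob:
    """Illustrative stand-in: holds a query and runs it only on demand."""

    def __init__(self, query_fn: Callable[[], List[Row]]) -> None:
        self._query_fn = query_fn
        self.executions = 0  # counts how many times the query actually ran

    def _execute(self) -> List[Row]:
        self.executions += 1
        return self._query_fn()

    def to_rows(self) -> List[Row]:
        """Row-oriented result (stand-in for the DataFrame output)."""
        return self._execute()

    def to_columns(self) -> Dict[str, list]:
        """Column-oriented result (stand-in for the Arrow table output)."""
        columns: Dict[str, list] = {}
        for row in self._execute():
            for name, value in row.items():
                columns.setdefault(name, []).append(value)
        return columns


# Creating the job does not run the query; calling a conversion method does.
job = LazyRetrievalJob(
    lambda: [
        {"driver_id": 1, "conv_rate": 0.25},
        {"driver_id": 2, "conv_rate": 0.40},
    ]
)
```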
DataSource base class needs to be defined. This class is responsible for holding information needed by specific feature views to support reading historical values from the offline store. For example, a feature view using Redshift as the offline store may need to know which table contains historical feature values.
custom_options field should be used to store any configuration needed by the data source. In this case, the implementer is responsible for serializing this configuration into bytes in the to_proto method and reading the value back from bytes in the
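The bytes round trip for custom options can be sketched as follows. This is a dependency-free illustration under stated assumptions: CustomOptions is a plain dataclass standing in for the protobuf message, CustomFileSource and its path attribute are hypothetical, the from_proto counterpart is named here by assumption, and JSON is just one reasonable encoding choice.

```python
import json
from dataclasses import dataclass


@dataclass
class CustomOptions:
    """Stand-in for the protobuf message that carries opaque bytes."""

    configuration: bytes


@dataclass
class CustomFileSource:
    """Hypothetical data source whose only setting is a file path."""

    path: str

    def to_proto(self) -> CustomOptions:
        # Serialize the source-specific configuration into bytes.
        payload = json.dumps({"path": self.path}).encode("utf-8")
        return CustomOptions(configuration=payload)

    @staticmethod
    def from_proto(proto: CustomOptions) -> "CustomFileSource":
        # Read the configuration back from the same bytes.
        options = json.loads(proto.configuration.decode("utf-8"))
        return CustomFileSource(path=options["path"])


source = CustomFileSource(path="data/driver_stats.parquet")
restored = CustomFileSource.from_proto(source.to_proto())
```

Whatever encoding is chosen, the two methods need to agree on it, since the bytes are opaque to everything else that handles them.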
feature_store.yaml file, specifically in the offline_store field. The value specified should be the fully qualified class name of the OfflineStore.
type of the offline store class as the value for the
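A fully qualified class name is useful because it can be resolved to a class at runtime. The sketch below shows one way this can be done with importlib; the helper load_class is hypothetical, Feast's own loading logic may differ in detail, and collections.OrderedDict is used only as a class that is guaranteed to be importable.

```python
import importlib


def load_class(fully_qualified_name: str) -> type:
    """Resolve a string such as 'package.module.ClassName' to the class object.

    Hypothetical helper for illustration; not a Feast function.
    """
    module_path, _, class_name = fully_qualified_name.rpartition(".")
    if not module_path:
        raise ValueError(
            f"Expected 'module.ClassName', got {fully_qualified_name!r}"
        )
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Standard-library class used as a stand-in for a custom offline store class.
cls = load_class("collections.OrderedDict")
instance = cls()
```

For this to work with a custom offline store, the module that defines the class has to be importable from the environment in which Feast runs.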
OfflineStore class in a separate repo, you can still test your implementation against the Feast test suite, as long as you have Feast as a submodule in your repo. In the Feast submodule, you can run all the unit tests with:
FULL_REPO_CONFIGS variable defined in sdk/python/tests/integration/feature_repos/repo_configuration.py. To override these configurations, create your own file that contains a FULL_REPO_CONFIGS variable, and point Feast to that file by setting the environment variable FULL_REPO_CONFIGS_MODULE. The main challenge will be writing a DataSourceCreator for the offline store. In this repo, the file that overrides
feast_custom_offline_store/feast_tests.py, so you would run