All Feast operations execute through a provider. This includes materializing data from the offline to the online store, updating infrastructure like databases, launching streaming ingestion jobs, building training datasets, and reading features from the online store.
Custom providers allow Feast users to extend Feast to execute any custom logic. Examples include:
Launching custom streaming ingestion jobs (Spark, Beam)
Launching custom batch ingestion (materialization) jobs (Spark, Beam)
Adding custom validation to feature repositories during feast apply
Adding custom infrastructure setup logic which runs during feast apply
Extending Feast commands with in-house metrics, logging, or tracing
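The last item in the list, wrapping Feast commands with in-house timing or logging, can be sketched without any Feast imports. In the minimal sketch below, BaseProvider is a hypothetical stand-in for whichever real provider you extend, and timed and CALL_LOG are illustrative names that are not part of Feast.

```python
import functools
import time

# Hypothetical in-memory sink; a real setup would forward to a metrics or logging system.
CALL_LOG = []


def timed(method):
    """Record the method name and wall-clock duration of each call."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return method(self, *args, **kwargs)
        finally:
            CALL_LOG.append((method.__name__, time.perf_counter() - start))

    return wrapper


class BaseProvider:
    """Stand-in for an existing provider (an assumption, not a Feast class)."""

    def update_infra(self, project):
        return f"infra updated for {project}"


class InstrumentedProvider(BaseProvider):
    @timed
    def update_infra(self, project):
        # Delegate to the parent; the decorator records the timing afterwards.
        return super().update_infra(project)


result = InstrumentedProvider().update_infra("repo")
```

The same decorator could presumably be applied to any overridden provider method, since it only relies on the method being called on an instance.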
Feast comes with built-in providers, e.g., AwsProvider. However, users can develop their own providers by creating a class that implements the contract in the Provider class.
The fastest way to add custom logic to Feast is to extend an existing provider. The most generic provider is the LocalProvider, which contains no cloud-specific logic. The guide that follows will extend the LocalProvider with operations that print text to the console. It is up to you as a developer to add your custom code to the provider methods, but the guide below will provide the necessary scaffolding to get you started.
The first step is to define a custom provider class. We've created the MyCustomProvider class below.
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple, Union

from tqdm import tqdm  # needed for the tqdm_builder type hint below

from feast.entity import Entity
from feast.feature_table import FeatureTable
from feast.feature_view import FeatureView
from feast.infra.local import LocalProvider
from feast.infra.offline_stores.offline_store import RetrievalJob
from feast.protos.feast.types.EntityKey_pb2 import EntityKey as EntityKeyProto
from feast.protos.feast.types.Value_pb2 import Value as ValueProto
from feast.registry import Registry
from feast.repo_config import RepoConfig


class MyCustomProvider(LocalProvider):
    def __init__(self, config: RepoConfig, repo_path):
        super().__init__(config)
        # Add your custom init code here. This code runs on every Feast operation.

    def update_infra(
        self,
        project: str,
        tables_to_delete: Sequence[Union[FeatureTable, FeatureView]],
        tables_to_keep: Sequence[Union[FeatureTable, FeatureView]],
        entities_to_delete: Sequence[Entity],
        entities_to_keep: Sequence[Entity],
        partial: bool,
    ):
        # Run the default infrastructure update first, then the custom logic.
        super().update_infra(
            project,
            tables_to_delete,
            tables_to_keep,
            entities_to_delete,
            entities_to_keep,
            partial,
        )
        print("Launching custom streaming jobs is pretty easy...")

    def materialize_single_feature_view(
        self,
        config: RepoConfig,
        feature_view: FeatureView,
        start_date: datetime,
        end_date: datetime,
        registry: Registry,
        project: str,
        tqdm_builder: Callable[[int], tqdm],
    ) -> None:
        # Run the default materialization first, then the custom logic.
        super().materialize_single_feature_view(
            config, feature_view, start_date, end_date, registry, project, tqdm_builder
        )
        print("Launching custom batch jobs is pretty easy...")
Notice how in the above provider we have only overridden two of the methods on the LocalProvider: update_infra and materialize_single_feature_view. These two methods are convenient to replace if you are planning to launch custom batch or streaming jobs. update_infra can be used for launching idempotent streaming jobs, and materialize_single_feature_view can be used for launching batch ingestion jobs.
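Idempotency matters for streaming jobs because the same repository may be applied many times, and each run should leave exactly one job per feature view. The following is a minimal sketch of that reconciliation logic, not Feast code: StreamingJobManager and its job ids are hypothetical, and in a real provider the names would be derived from the tables_to_keep and tables_to_delete arguments of update_infra.

```python
class StreamingJobManager:
    """Hypothetical helper that tracks one streaming job per feature view name."""

    def __init__(self):
        self.running = {}  # feature view name -> job id
        self.launch_count = 0

    def reconcile(self, names_to_keep, names_to_delete):
        # Stop jobs whose feature views were removed from the repository.
        for name in names_to_delete:
            self.running.pop(name, None)
        # Launch a job only when none is running yet, so repeated calls are no-ops.
        for name in names_to_keep:
            if name not in self.running:
                self.launch_count += 1
                self.running[name] = f"job-{self.launch_count}"
        return dict(self.running)


manager = StreamingJobManager()
first = manager.reconcile(["driver_hourly_stats"], [])
second = manager.reconcile(["driver_hourly_stats"], [])  # same input, no new job
third = manager.reconcile([], ["driver_hourly_stats"])  # view deleted, job stopped
```

Keying the jobs by feature view name is one possible design choice; any stable identifier would work as long as a repeated call can detect that the job already exists.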
It is possible to override all the methods on the provider class. In fact, it isn't even necessary to subclass an existing provider like LocalProvider. The only requirement for the provider class is that it follows the Provider contract.
Configure your feature_store.yaml file to point to your new provider class:
project: repo
registry: registry.db
provider: feast_custom_provider.custom_provider.MyCustomProvider
online_store:
    type: sqlite
    path: online_store.db
offline_store:
    type: file
Notice how the provider field above points to the module and class where your provider can be found.
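A dotted path of this form is conventionally resolved by importing everything before the last dot as a module and then looking up the final segment as an attribute. The sketch below shows that general Python technique. It is an assumption about how such a path could be resolved, not a copy of Feast's loader, and load_class is a hypothetical helper name.

```python
import importlib


def load_class(dotted_path):
    """Split 'package.module.ClassName' and import the class it names."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# A standard-library class stands in here for
# feast_custom_provider.custom_provider.MyCustomProvider.
cls = load_class("collections.OrderedDict")
```

If the module cannot be imported, importlib.import_module raises ModuleNotFoundError, which is usually a sign that the module root is missing from the Python search path.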
Now you should be able to use your provider by running a Feast command such as feast apply, which prints output like the following:
Registered entity driver_id
Registered feature view driver_hourly_stats
Deploying infrastructure for driver_hourly_stats
Launching custom streaming jobs is pretty easy...
It may also be necessary to add the module root path to your PYTHONPATH as follows:
PYTHONPATH=$PYTHONPATH:/home/my_user/my_custom_provider feast apply
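To see why the search path matters, the sketch below builds a throwaway package in a temporary directory and shows that Python can only find it once its root directory is on sys.path. The package name demo_provider_pkg is hypothetical and used only for this illustration. Appending to sys.path at runtime has a similar effect to setting PYTHONPATH before the interpreter starts.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Build a throwaway module root; the package and module names are hypothetical.
root = Path(tempfile.mkdtemp())
package = root / "demo_provider_pkg"
package.mkdir()
(package / "__init__.py").write_text("")
(package / "custom_provider.py").write_text("class MyCustomProvider:\n    pass\n")

# Before the root is on the search path, the package cannot be found.
found_before = importlib.util.find_spec("demo_provider_pkg") is not None

# Add the module root to the search path, then look again.
sys.path.append(str(root))
importlib.invalidate_caches()
found_after = importlib.util.find_spec("demo_provider_pkg") is not None

# With the root on the path, the provider class can be imported by its dotted name.
module = importlib.import_module("demo_provider_pkg.custom_provider")
provider_cls = module.MyCustomProvider
```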
That's it. You should now have a fully functional custom provider!
Have a look at the custom provider demo repository for a fully functional example of a custom provider. Feel free to fork it when creating your own custom provider!