Adding a custom provider

Overview

All Feast operations execute through a provider. These operations include materializing data from the offline to the online store, updating infrastructure such as databases, launching streaming ingestion jobs, building training datasets, and reading features from the online store.
Custom providers allow Feast users to extend Feast to execute any custom logic. Examples include:
    Launching custom streaming ingestion jobs (Spark, Beam)
    Launching custom batch ingestion (materialization) jobs (Spark, Beam)
    Adding custom validation to feature repositories during feast apply
    Adding custom infrastructure setup logic which runs during feast apply
    Extending Feast commands with in-house metrics, logging, or tracing
Feast comes with built-in providers, e.g., LocalProvider, GcpProvider, and AwsProvider. However, users can develop their own providers by creating a class that implements the contract in the Provider class.
This guide also comes with a fully functional custom provider demo repository. Please have a look at the repository for a representative example of what a custom provider looks like, or fork the repository when creating your own provider.
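As a rough illustration of the metrics, logging, and tracing use case listed above, the sketch below wraps a provider method in a timing decorator. StubProvider, TimedProvider, the timed decorator, and the simplified update_infra signature are all hypothetical stand-ins chosen so the example runs without Feast installed; in a real provider you would apply the same decorator to methods of a class derived from an actual Feast provider.

```python
import functools
import logging
import time

logger = logging.getLogger("my_custom_provider")  # hypothetical logger name


def timed(method):
    """Log how long a provider method takes, then return its result unchanged."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return method(self, *args, **kwargs)
        finally:
            # Runs whether the wrapped method returned or raised.
            elapsed = time.perf_counter() - start
            logger.info("%s took %.3fs", method.__name__, elapsed)

    return wrapper


class StubProvider:
    """Stand-in for a real Feast provider (assumption: simplified signature)."""

    def update_infra(self, project):
        return f"infra updated for {project}"


class TimedProvider(StubProvider):
    @timed
    def update_infra(self, project):
        # Delegate to the parent so the built-in behavior is preserved.
        return super().update_infra(project)
```

Because the decorator only observes the call, the wrapped method's return value and exceptions pass through untouched, which keeps the instrumentation separate from the provider logic.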

Guide

The fastest way to add custom logic to Feast is to extend an existing provider. The most generic provider is the LocalProvider which contains no cloud-specific logic. The guide that follows will extend the LocalProvider with operations that print text to the console. It is up to you as a developer to add your custom code to the provider methods, but the guide below will provide the necessary scaffolding to get you started.

Step 1: Define a Provider class

The first step is to define a custom provider class. We've created the MyCustomProvider below.
    from datetime import datetime
    from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple, Union

    from tqdm import tqdm  # needed for the tqdm_builder type annotation below

    from feast.entity import Entity
    from feast.feature_table import FeatureTable
    from feast.feature_view import FeatureView
    from feast.infra.local import LocalProvider
    from feast.infra.offline_stores.offline_store import RetrievalJob
    from feast.protos.feast.types.EntityKey_pb2 import EntityKey as EntityKeyProto
    from feast.protos.feast.types.Value_pb2 import Value as ValueProto
    from feast.registry import Registry
    from feast.repo_config import RepoConfig


    class MyCustomProvider(LocalProvider):
        def __init__(self, config: RepoConfig, repo_path):
            super().__init__(config)
            # Add your custom init code here. This code runs on every Feast operation.

        def update_infra(
            self,
            project: str,
            tables_to_delete: Sequence[Union[FeatureTable, FeatureView]],
            tables_to_keep: Sequence[Union[FeatureTable, FeatureView]],
            entities_to_delete: Sequence[Entity],
            entities_to_keep: Sequence[Entity],
            partial: bool,
        ):
            # Run the built-in infrastructure update first, then the custom logic.
            super().update_infra(
                project,
                tables_to_delete,
                tables_to_keep,
                entities_to_delete,
                entities_to_keep,
                partial,
            )
            print("Launching custom streaming jobs is pretty easy...")

        def materialize_single_feature_view(
            self,
            config: RepoConfig,
            feature_view: FeatureView,
            start_date: datetime,
            end_date: datetime,
            registry: Registry,
            project: str,
            tqdm_builder: Callable[[int], tqdm],
        ) -> None:
            # Run the built-in materialization first, then the custom logic.
            super().materialize_single_feature_view(
                config, feature_view, start_date, end_date, registry, project, tqdm_builder
            )
            print("Launching custom batch jobs is pretty easy...")
Notice that the provider above overrides only two of the methods on the LocalProvider, namely update_infra and materialize_single_feature_view. These two methods are convenient to replace if you are planning to launch custom batch or streaming jobs. update_infra can be used for launching idempotent streaming jobs, and materialize_single_feature_view can be used for launching batch ingestion jobs.
It is possible to override all the methods on the provider class. In fact, it isn't even necessary to subclass an existing provider like LocalProvider. The only requirement for the provider class is that it follows the Provider contract.
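The same override pattern can carry custom validation that runs during feast apply. The sketch below is only an illustration under stated assumptions: StubLocalProvider and StubFeatureView are stand-ins with a deliberately simplified update_infra signature (so the example runs without Feast), and NAME_PATTERN is a hypothetical in-house naming rule. In a real provider the check would sit at the top of the full update_infra method shown in Step 1.

```python
import re
from dataclasses import dataclass
from typing import Sequence

# Hypothetical naming rule: lowercase snake_case starting with a letter.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


@dataclass
class StubFeatureView:
    """Stand-in for a Feast feature view; only the name is modeled."""

    name: str


class StubLocalProvider:
    """Stand-in for LocalProvider with a simplified update_infra signature."""

    def update_infra(self, project: str, tables_to_keep: Sequence[StubFeatureView]) -> None:
        # Record what would have been deployed.
        self.deployed = [table.name for table in tables_to_keep]


class ValidatingProvider(StubLocalProvider):
    def update_infra(self, project: str, tables_to_keep: Sequence[StubFeatureView]) -> None:
        # Validate first, so nothing is deployed if any name is rejected.
        bad = [t.name for t in tables_to_keep if not NAME_PATTERN.match(t.name)]
        if bad:
            raise ValueError(f"Invalid feature view names: {bad}")
        super().update_infra(project, tables_to_keep)
```

Raising before the call to super() means a failed check leaves the existing infrastructure untouched.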

Step 2: Configuring Feast to use the provider

Configure your feature_store.yaml file to point to your new provider class:
    project: repo
    registry: registry.db
    provider: feast_custom_provider.custom_provider.MyCustomProvider
    online_store:
        type: sqlite
        path: online_store.db
    offline_store:
        type: file
Notice how the provider field above points to the module and class where your provider can be found.
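To make the "module and class" format concrete, here is a minimal sketch of how a dotted path of this shape can be resolved in Python. The load_class helper is hypothetical and is not Feast's own loader, whose behavior may differ; it only shows the general technique of splitting on the last dot, importing the module, and looking up the class by name.

```python
import importlib


def load_class(dotted_path: str):
    """Resolve 'package.module.ClassName' to the class object it names."""
    # Everything before the last dot is the module; the remainder is the class.
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

With the configuration above, the module part would be feast_custom_provider.custom_provider and the class part MyCustomProvider, so the module must be importable from wherever the Feast command runs.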

Step 3: Using the provider

Now you should be able to use your provider by running a Feast command:
    feast apply

    Registered entity driver_id
    Registered feature view driver_hourly_stats
    Deploying infrastructure for driver_hourly_stats
    Launching custom streaming jobs is pretty easy...
It may also be necessary to add the module root path to your PYTHONPATH as follows:
    PYTHONPATH=$PYTHONPATH:/home/my_user/my_custom_provider feast apply
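If you are unsure whether the PYTHONPATH change is needed, one way to check is to ask Python whether it can locate the provider module. The is_importable helper below is a hypothetical convenience built on the standard library, not part of Feast. Run it from the same environment and working directory you use for Feast commands.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the module on the current import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted module name is missing.
        return False
```

For the configuration in Step 2, calling is_importable("feast_custom_provider.custom_provider") and getting False would suggest that the module root still needs to be added to PYTHONPATH.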
That's it. You should now have a fully functional custom provider!

Next steps

Have a look at the custom provider demo repository for a fully functional example of a custom provider. Feel free to fork it when creating your own custom provider!