Push

Warning: This is an experimental feature. It's intended for early testing and feedback, and could change without warning in future releases.

Description

Push sources allow feature values to be pushed to the online store and offline store in real time. This allows fresh feature values to be made available to applications. Push sources supersede FeatureStore.write_to_online_store.

Push sources can be used by multiple feature views. When data is pushed to a push source, Feast propagates the feature values to all the consuming feature views.

Push sources must have a batch source specified, which is used for retrieving historical features. Users are therefore also responsible for pushing data to a batch data source, such as a data warehouse table. When a push source is used as the stream source in a feature view definition, the batch source doesn't need to be specified again in the feature view definition.

Stream sources

Streaming data sources are important sources of feature values. A typical setup with streaming data looks like:

  1. Raw events come in (stream 1)

  2. Streaming transformations applied (e.g. generating features like last_N_purchased_categories) (stream 2)

  3. Write stream 2 values to an offline store as a historical log for training (optional)

  4. Write stream 2 values to an online store for low latency feature serving

  5. Periodically materialize feature values from the offline store into the online store for decreased training-serving skew and improved model performance

Feast allows users to push features previously registered in a feature view to the online store for fresher features. It also allows users to push batches of stream data to the offline store by specifying that the push be directed to the offline store. This will push the data to the offline store declared in the repository configuration used to initialize the feature store.

Example

Defining a push source

Note that the push schema must also include the entity.

    from feast import BigQuerySource, FeatureView, Field, PushSource
    from feast.types import Int64

    # The batch source backs historical retrieval for anything pushed here.
    push_source = PushSource(
        name="push_source",
        batch_source=BigQuerySource(table="test.test"),
    )

    # Using the push source as the view's source means no separate batch
    # source has to be declared on the feature view.
    fv = FeatureView(
        name="feature view",
        entities=["user_id"],
        schema=[Field(name="life_time_value", dtype=Int64)],
        source=push_source,
    )

Pushing data

Note that the to parameter is optional and defaults to online, but we can specify these options: PushMode.ONLINE, PushMode.OFFLINE, or PushMode.ONLINE_AND_OFFLINE.

    import pandas as pd
    from feast import FeatureStore
    from feast.data_source import PushMode

    fs = FeatureStore(...)
    # Placeholder: a real frame carries the entity key and the feature columns.
    feature_data_frame = pd.DataFrame()
    # The first argument must match the name given to the PushSource.
    fs.push("push_source", feature_data_frame, to=PushMode.ONLINE_AND_OFFLINE)

See also Python feature server for instructions on how to push data to a deployed feature server.
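The push example uses an empty data frame, so it may help to see how a populated one could be assembled from stream output. The following is a minimal sketch, not Feast's own tooling: the helper names, the raw event shape (user_id, amount), and the event_timestamp column name are assumptions made for illustration, and the push source is assumed to be registered under the name push_source as in the definition example. Only store.push and PushMode come from Feast.

```python
from collections import defaultdict
from datetime import datetime, timezone


def accumulate_life_time_value(events):
    """Sum purchase amounts per user (a stand-in for a streaming transformation)."""
    totals = defaultdict(int)
    for event in events:
        totals[event["user_id"]] += event["amount"]
    return dict(totals)


def build_push_columns(ltv_by_user, event_time):
    """Shape per-user values into columns: entity key, feature, timestamp."""
    user_ids = sorted(ltv_by_user)
    return {
        "user_id": user_ids,  # entity key used by the feature view
        "life_time_value": [ltv_by_user[u] for u in user_ids],  # feature in the view schema
        "event_timestamp": [event_time] * len(user_ids),  # assumed timestamp column name
    }


def push_columns(store, columns, push_source_name="push_source"):
    """Push the columns through an initialized FeatureStore (needs feast and pandas)."""
    # Imported here so the helpers above can be exercised without Feast installed.
    import pandas as pd
    from feast.data_source import PushMode

    store.push(push_source_name, pd.DataFrame(columns), to=PushMode.ONLINE_AND_OFFLINE)


# Hypothetical raw purchase events (stream 1).
raw_events = [
    {"user_id": 1, "amount": 30},
    {"user_id": 2, "amount": 15},
    {"user_id": 1, "amount": 70},
]
columns = build_push_columns(
    accumulate_life_time_value(raw_events),
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(columns["user_id"], columns["life_time_value"])  # prints [1, 2] [100, 15]
```

With an initialized FeatureStore in hand, push_columns(fs, columns) would send these rows to both stores. Keeping the column-building step free of Feast imports makes it easy to unit test separately from the store.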