[Alpha] Streaming feature computation with Denormalized
Denormalized makes it easy to compute real-time features and write them directly to your Feast online store. This guide walks you through setting up a streaming pipeline that computes feature aggregations and pushes them to Feast in real time.
Denormalized/Feast integration diagram
Prerequisites
Python 3.12+
Kafka cluster (local or remote) OR Docker installed
For a full working demo, check out the feast-example repo.
Quick Start
First, create a new Python project or use our template:
```bash
mkdir my-feature-project
cd my-feature-project
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows
pip install denormalized[feast] feast
```
Set up your Feast feature repository:
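If you don't already have a feature repository, one way to scaffold it is with the Feast CLI and then register your definitions with `feast apply`. This is a minimal sketch; the scaffolded layout varies between Feast versions, and the `feature_repo` name is only an assumption matching the paths used later in this guide.

```bash
feast init feature_repo   # scaffold a new repository (exact layout varies by Feast version)
cd feature_repo           # change into the directory that contains feature_store.yaml
feast apply               # register entities and feature views in the Feast registry
```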
Project Structure
Your project should look something like this:
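A minimal sketch of that layout, assuming the file names used in the rest of this guide (feature_store.yaml is the standard Feast configuration file):

```
my-feature-project/
├── stream_job.py           # Denormalized streaming pipeline
└── feature_repo/
    ├── feature_store.yaml  # Feast configuration
    └── sensor_data.py      # entity and feature view definitions
```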
Run a test Kafka instance in Docker

```bash
docker run --rm -p 9092:9092 emgeee/kafka_emit_measurements:latest
```

This spins up a Docker container that runs a Kafka instance and a simple script that emits fake data to two topics.
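To verify the broker is reachable and see which topics the container is producing, you can list the cluster metadata with a Kafka client such as kcat (not included in the demo container; this assumes you have it installed locally):

```bash
kcat -b localhost:9092 -L   # list brokers, topics, and partitions on the test cluster
```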
Define Your Features
In feature_repo/sensor_data.py, define your feature view and entity:
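The demo's actual definitions live in the feast-example repo; as a rough, hedged sketch, a push-based feature view keyed on a sensor entity might look like the following (the field names, TTL, and the sensor_push_source / batch source path are illustrative assumptions, not the demo's schema):

```python
# feature_repo/sensor_data.py -- a hedged sketch, not the demo's exact schema.
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource, PushSource
from feast.types import Float64, String

# Entity keyed on the sensor that produced the readings.
sensor = Entity(name="sensor", join_keys=["sensor_name"])

# A PushSource lets the streaming job write rows straight into the online store.
# Feast requires a batch_source for offline/historical retrieval.
sensor_push_source = PushSource(
    name="sensor_push_source",
    batch_source=FileSource(
        path="data/sensor_data.parquet",
        timestamp_field="timestamp",
    ),
)

sensor_statistics = FeatureView(
    name="sensor_statistics",
    entities=[sensor],
    ttl=timedelta(days=1),
    schema=[
        Field(name="sensor_name", dtype=String),
        Field(name="avg_reading", dtype=Float64),
        Field(name="max_reading", dtype=Float64),
    ],
    source=sensor_push_source,
    online=True,
)
```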
Create Your Streaming Pipeline
In stream_job.py, define your streaming computations:
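The streaming computation itself uses the Denormalized API, and the authoritative version of this file is in the feast-example repo. To keep this sketch to APIs already introduced above, the example below only illustrates the final step: pushing a batch of computed aggregates into the online store through Feast's push API. The column names and the sensor_push_source name are assumptions carried over from the sketch above; in the real pipeline, Denormalized produces these rows from windowed Kafka aggregations.

```python
# stream_job.py (Feast side only) -- a hedged sketch of writing computed
# aggregates to the online store; the Denormalized pipeline that produces
# them is shown in the feast-example repo.
from datetime import datetime, timezone

import pandas as pd
from feast import FeatureStore
from feast.data_source import PushMode

store = FeatureStore(repo_path="feature_repo/")

# In the real job, each micro-batch of windowed aggregations from Kafka
# would be converted into a DataFrame like this one (values are illustrative).
batch = pd.DataFrame(
    {
        "sensor_name": ["sensor-1"],
        "avg_reading": [21.7],
        "max_reading": [23.1],
        "timestamp": [datetime.now(timezone.utc)],
    }
)

# Route the rows through the push source and into the online store.
store.push("sensor_push_source", batch, to=PushMode.ONLINE)
```

Once the rows are pushed, they can be retrieved at serving time with store.get_online_features.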