OpenLineage Integration

This module provides native integration between Feast and OpenLineagearrow-up-right, enabling automatic data lineage tracking for ML feature engineering workflows.

Overview

When enabled, the integration automatically emits OpenLineage events for:

  • Registry changes - Events when feature views, feature services, and entities are applied

  • Feature materialization - START, COMPLETE, and FAIL events when features are materialized

No code changes required - just enable OpenLineage in your feature_store.yaml!

Installation

OpenLineage is an optional dependency. Install it with:

pip install openlineage-python

Or install Feast with the OpenLineage extra:

pip install feast[openlineage]

Configuration

Add the openlineage section to your feature_store.yaml:

project: my_project
registry: data/registry.db
provider: local
online_store:
  type: sqlite
  path: data/online_store.db

openlineage:
  enabled: true
  transport_type: http
  transport_url: http://localhost:5000
  transport_endpoint: api/v1/lineage
  namespace: feast
  emit_on_apply: true
  emit_on_materialize: true

Once configured, all Feast operations will automatically emit lineage events.

Environment Variables

You can also configure via environment variables:

Usage

Once configured, lineage is tracked automatically:

Configuration Options

Option
Default
Description

enabled

false

Enable/disable OpenLineage integration

transport_type

http

Transport type: http, file, kafka

transport_url

-

URL for HTTP transport (required)

transport_endpoint

api/v1/lineage

API endpoint for HTTP transport

api_key

-

Optional API key for authentication

namespace

feast

Namespace for lineage events (uses project name if set to "feast")

producer

feast

Producer identifier

emit_on_apply

true

Emit events on feast apply

emit_on_materialize

true

Emit events on materialization

Lineage Graph Structure

When you run feast apply, Feast creates a lineage graph that matches the Feast UI:

Jobs created:

  • feast_feature_views_{project}: Shows DataSources + Entities → FeatureViews

  • feature_service_{name}: Shows specific FeatureViews → FeatureService (one per service)

Datasets include:

  • Schema with feature names, types, descriptions, and tags

  • Feast-specific facets with metadata (TTL, entities, owner, etc.)

  • Documentation facets with descriptions

Transport Types

File Transport

Kafka Transport

Custom Feast Facets

The integration includes custom Feast-specific facets in lineage events:

FeastFeatureViewFacet

Captures metadata about feature views:

  • name: Feature view name

  • ttl_seconds: Time-to-live in seconds

  • entities: List of entity names

  • features: List of feature names

  • online_enabled / offline_enabled: Store configuration

  • description: Feature view description

  • tags: Key-value tags

FeastFeatureServiceFacet

Captures metadata about feature services:

  • name: Feature service name

  • feature_views: List of feature view names

  • feature_count: Total number of features

  • description: Feature service description

  • tags: Key-value tags

FeastMaterializationFacet

Captures materialization run metadata:

  • feature_views: Feature views being materialized

  • start_date / end_date: Materialization window

  • rows_written: Number of rows written

Lineage Visualization

Use Marquezarrow-up-right to visualize your Feast lineage:

Then access the Marquez UI at http://localhost:3000 to see your feature lineage.

Namespace Behavior

  • If namespace is set to "feast" (default): Uses project name as namespace (e.g., my_project)

  • If namespace is set to a custom value: Uses {namespace}/{project} (e.g., custom/my_project)

Feast to OpenLineage Mapping

Feast Concept
OpenLineage Concept

DataSource

InputDataset

FeatureView

OutputDataset (of feature views job) / InputDataset (of feature service job)

Feature

Schema field

Entity

InputDataset

FeatureService

OutputDataset

Materialization

RunEvent (START/COMPLETE/FAIL)

Last updated

Was this helpful?