Importing Features from dbt

Alpha Feature: The dbt integration is currently in early development and subject to change.

Current Limitations:

Supported data sources: BigQuery, Snowflake, and File-based sources only
Manual entity column specification required

Breaking changes may occur in future releases.

This guide explains how to use Feast's dbt integration to automatically import dbt models as Feast FeatureViews. This enables you to leverage your existing dbt transformations as feature definitions without manual duplication.

Overview

dbt (data build tool) is a popular tool for transforming data in your warehouse. Many teams already use dbt to create feature tables. Feast's dbt integration allows you to:

Discover dbt models tagged for feature engineering
Import model metadata (columns, types, descriptions) as Feast objects
Generate Python code for Entity, DataSource, and FeatureView definitions

This eliminates the need to manually define Feast objects that mirror your dbt models.

Prerequisites

A dbt project with compiled artifacts (target/manifest.json)
Feast installed with dbt support:

pip install 'feast[dbt]'

Or install the parser directly:

pip install dbt-artifacts-parser

Quick Start

1. Tag your dbt models

In your dbt project, add a feast tag to models you want to import:

models/driver_features.sql

{{ config(
    materialized='table',
    tags=['feast']
) }}

SELECT
    driver_id,
    event_timestamp,
    avg_rating,
    total_trips,
    is_active
FROM {{ ref('stg_drivers') }}

2. Define column types in schema.yml

Feast uses column metadata from your schema.yml to determine feature types:

models/schema.yml

version: 2
models:
  - name: driver_features
    description: "Driver aggregated features for ML models"
    columns:
      - name: driver_id
        description: "Unique driver identifier"
        data_type: STRING
      - name: event_timestamp
        description: "Feature timestamp"
        data_type: TIMESTAMP
      - name: avg_rating
        description: "Average driver rating"
        data_type: FLOAT64
      - name: total_trips
        description: "Total completed trips"
        data_type: INT64
      - name: is_active
        description: "Whether driver is currently active"
        data_type: BOOLEAN

3. Compile your dbt project

cd your_dbt_project
dbt compile

This generates target/manifest.json, which Feast reads in the following steps.

4. List available models

Use the Feast CLI to discover tagged models:

feast dbt list target/manifest.json --tag-filter feast

Output:

Found 1 model(s) with tag 'feast':

  driver_features
    Description: Driver aggregated features for ML models
    Columns: driver_id, event_timestamp, avg_rating, total_trips, is_active
    Tags: feast
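For a sense of what this discovery step inspects, here is a minimal sketch that filters a manifest by tag using only the standard library. It is not Feast's implementation; find_tagged_models and load_manifest are hypothetical helper names, and the sketch assumes the usual dbt manifest layout in which each entry under nodes carries resource_type, name, tags, description, and columns.

```python
import json
from pathlib import Path


def load_manifest(path: str) -> dict:
    """Read a compiled dbt manifest from disk."""
    return json.loads(Path(path).read_text())


def find_tagged_models(manifest: dict, tag: str) -> list[dict]:
    """Return name, description, and column names for each model carrying `tag`.

    Hypothetical helper for illustration only; not part of the Feast API.
    """
    found = []
    for node in manifest.get("nodes", {}).values():
        # The manifest also lists tests, seeds, and snapshots; keep models only.
        if node.get("resource_type") != "model":
            continue
        if tag not in node.get("tags", []):
            continue
        found.append(
            {
                "name": node["name"],
                "description": node.get("description", ""),
                "columns": list(node.get("columns", {})),
            }
        )
    return found
```

Calling find_tagged_models(load_manifest("target/manifest.json"), "feast") on the example project should return a single entry for driver_features, assuming the manifest was compiled after the tag was added.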

5. Import models as Feast definitions

Generate a Python file with Feast object definitions:

feast dbt import target/manifest.json \
    --entity-column driver_id \
    --data-source-type bigquery \
    --tag-filter feast \
    --output features/driver_features.py

This generates:

features/driver_features.py

"""
Feast feature definitions generated from dbt models.

Source: target/manifest.json
Project: my_dbt_project
Generated by: feast dbt import
"""

from datetime import timedelta

from feast import Entity, FeatureView, Field
from feast.types import Bool, Float64, Int64
from feast.infra.offline_stores.bigquery_source import BigQuerySource


# Entities
driver_id = Entity(
    name="driver_id",
    join_keys=["driver_id"],
    description="Entity key for dbt models",
    tags={'source': 'dbt'},
)


# Data Sources
driver_features_source = BigQuerySource(
    name="driver_features_source",
    table="my_project.my_dataset.driver_features",
    timestamp_field="event_timestamp",
    description="Driver aggregated features for ML models",
    tags={'dbt.model': 'driver_features', 'dbt.tag.feast': 'true'},
)


# Feature Views
driver_features_fv = FeatureView(
    name="driver_features",
    entities=[driver_id],
    ttl=timedelta(days=1),
    schema=[
        Field(name="avg_rating", dtype=Float64, description="Average driver rating"),
        Field(name="total_trips", dtype=Int64, description="Total completed trips"),
        Field(name="is_active", dtype=Bool, description="Whether driver is currently active"),
    ],
    online=True,
    source=driver_features_source,
    description="Driver aggregated features for ML models",
    tags={'dbt.model': 'driver_features', 'dbt.tag.feast': 'true'},
)

Multiple Entity Support

The dbt integration supports feature views with multiple entities, useful for modeling relationships involving multiple keys.

Usage

Specify multiple entity columns using repeated -e flags:

feast dbt import target/manifest.json \
  -e user_id \
  -e merchant_id \
  --tag-filter feast \
  -o features/transactions.py

This creates a FeatureView with both user_id and merchant_id as entities, useful for:

Transaction features keyed by both user and merchant
Interaction features keyed by multiple parties
Association tables in many-to-many relationships

Single entity usage:

feast dbt import target/manifest.json -e driver_id --tag-filter feast

Requirements

All specified entity columns must exist in each dbt model being imported. Models missing any entity column will be skipped with a warning.
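The rule above can be sketched as a small pre-flight check. This is an illustration of the stated behavior, not Feast's own code; split_by_entity_columns is a hypothetical helper, and its input is assumed to be a plain mapping from model name to column names.

```python
def split_by_entity_columns(
    models: dict[str, list[str]], entity_columns: list[str]
) -> tuple[list[str], dict[str, list[str]]]:
    """Split models into (importable, skipped) by entity-column presence.

    `models` maps each model name to its column names. `skipped` maps each
    rejected model to the entity columns it lacks. Hypothetical helper.
    """
    importable: list[str] = []
    skipped: dict[str, list[str]] = {}
    for name, columns in models.items():
        missing = [col for col in entity_columns if col not in columns]
        if missing:
            # A model lacking any one entity column is skipped as a whole.
            skipped[name] = missing
        else:
            importable.append(name)
    return importable, skipped
```

Running a check like this before the import makes it easier to see in advance which models would be skipped and which entity columns they lack.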

Generated Code

With multiple entity columns, the generated code looks like:

user_id = Entity(name="user_id", join_keys=["user_id"], ...)
merchant_id = Entity(name="merchant_id", join_keys=["merchant_id"], ...)

transaction_fv = FeatureView(
    name="transactions",
    entities=[user_id, merchant_id],  # Multiple entities
    schema=[...],
    ...
)

CLI Reference

`feast dbt list`

Discover dbt models available for import.

feast dbt list <manifest_path> [OPTIONS]

Arguments:

manifest_path: Path to dbt's manifest.json file

Options:

--tag-filter, -t: Filter models by dbt tag (e.g., feast)
--model, -m: Filter to specific model name(s)

`feast dbt import`

Import dbt models as Feast object definitions.

feast dbt import <manifest_path> [OPTIONS]

Arguments:

manifest_path: Path to dbt's manifest.json file

Options:

| Option | Description | Default |
| --- | --- | --- |
| --entity-column, -e | Entity column name (can be specified multiple times) | (required) |
| --data-source-type, -d | Data source type: bigquery, snowflake, file | bigquery |
| --tag-filter, -t | Filter models by dbt tag | None |
| --model, -m | Import specific model(s) only | None |
| --timestamp-field | Timestamp column name | event_timestamp |
| --ttl-days | Feature TTL in days | 1 |
| --exclude-columns | Columns to exclude from features | None |
| --no-online | Disable online serving | False |
| --output, -o | Output Python file path | None (stdout) |
| --dry-run | Preview without generating code | False |

Type Mapping

Feast automatically maps dbt/warehouse column types to Feast types:

| dbt/SQL Type | Feast Type |
| --- | --- |
| STRING, VARCHAR, TEXT | String |
| INT, INTEGER, BIGINT | Int64 |
| SMALLINT, TINYINT | Int32 |
| FLOAT, REAL | Float32 |
| DOUBLE, FLOAT64 | Float64 |
| BOOLEAN, BOOL | Bool |
| TIMESTAMP, DATETIME | UnixTimestamp |
| BYTES, BINARY | Bytes |
| ARRAY<type> | Array(type) |
| JSON, JSONB | Map (or Json if declared in schema) |
| VARIANT, OBJECT | Map |
| SUPER | Map |
| MAP<string,string> | Map |
| STRUCT, RECORD | Struct (BigQuery) |
| struct<...> | Struct (Spark) |

Snowflake NUMBER(precision, scale) types are handled specially:

Scale > 0: Float64
Precision <= 9: Int32
Precision <= 18: Int64
Precision > 18: Float64
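The NUMBER rules can be read as a short decision chain. The sketch below is one interpretation rather than Feast's actual mapper: it assumes the rules are checked in the order listed, that a missing scale means 0, and it returns Feast type names as plain strings. snowflake_number_to_feast is a hypothetical name.

```python
import re

# Matches NUMBER(precision) or NUMBER(precision, scale), case-insensitively.
_NUMBER_RE = re.compile(
    r"^\s*NUMBER\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)\s*$", re.IGNORECASE
)


def snowflake_number_to_feast(data_type: str) -> str:
    """Map a Snowflake NUMBER(p, s) type string to a Feast type name.

    Hypothetical helper; assumes the documented rules apply top to bottom.
    """
    match = _NUMBER_RE.match(data_type)
    if match is None:
        raise ValueError(f"not a NUMBER(precision, scale) type: {data_type!r}")
    precision = int(match.group(1))
    scale = int(match.group(2) or 0)  # assumption: omitted scale means 0
    if scale > 0:
        return "Float64"  # fractional digits need a float
    if precision <= 9:
        return "Int32"
    if precision <= 18:
        return "Int64"
    return "Float64"  # too many digits for a 64-bit integer
```

Under these assumptions, NUMBER(10,2) maps to Float64 because of its scale, while NUMBER(18,0) maps to Int64 and NUMBER(38,0) falls through to Float64.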

Data Source Configuration

BigQuery

feast dbt import manifest.json -e user_id -d bigquery -o features.py

Generates BigQuerySource with the full table path from dbt metadata:

BigQuerySource(
    table="project.dataset.table_name",
    ...
)

Snowflake

feast dbt import manifest.json -e user_id -d snowflake -o features.py

Generates SnowflakeSource with database, schema, and table:

SnowflakeSource(
    database="MY_DB",
    schema="MY_SCHEMA",
    table="TABLE_NAME",
    ...
)

File

feast dbt import manifest.json -e user_id -d file -o features.py

Generates FileSource with a placeholder path:

FileSource(
    path="/data/table_name.parquet",
    ...
)

For file sources, update the generated path to point to your actual data files.

Best Practices

1. Use consistent tagging

Create a standard tagging convention in your dbt project:

# dbt_project.yml
models:
  my_project:
    features:
      +tags: ['feast']  # All models in features/ get the feast tag

2. Document your columns

Column descriptions from schema.yml are preserved in the generated Feast definitions, making your feature catalog self-documenting.

3. Review before committing

Use --dry-run to preview what will be generated:

feast dbt import manifest.json -e user_id -d bigquery --dry-run

4. Version control generated code

Commit the generated Python files to your repository. This allows you to:

Track changes to feature definitions over time
Review dbt-to-Feast mapping in pull requests
Customize generated code if needed

5. Integrate with CI/CD

Add dbt import to your CI pipeline:

# .github/workflows/features.yml
- name: Compile dbt
  run: dbt compile

- name: Generate Feast definitions
  run: |
    feast dbt import target/manifest.json \
      -e user_id -d bigquery -t feast \
      -o feature_repo/features.py

- name: Apply Feast changes
  run: feast apply

Limitations

Manual entity specification: Entity columns are not inferred from the model; pass each one with --entity-column. Every specified column must exist in each imported model.
No incremental updates: Each import generates a complete file. Use version control to track changes.
Column types required: Models without data_type in schema.yml default to String type.
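Because a missing data_type falls back to String without failing, it can help to find untyped columns before importing. A minimal sketch, assuming the usual dbt manifest layout in which each column entry has a data_type key that is empty or null when unset; columns_missing_data_type is a hypothetical helper, not part of Feast.

```python
def columns_missing_data_type(manifest: dict, tag: str) -> dict[str, list[str]]:
    """Map each tagged model to its columns that have no data_type.

    Hypothetical helper; models with every column typed are omitted.
    """
    missing: dict[str, list[str]] = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model" or tag not in node.get("tags", []):
            continue
        untyped = [
            name
            for name, column in node.get("columns", {}).items()
            if not column.get("data_type")  # None or empty string
        ]
        if untyped:
            missing[node["name"]] = untyped
    return missing
```

An empty result means every documented column of every tagged model declares a type; columns absent from schema.yml do not appear in the manifest's columns mapping, so this check cannot see them.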

Troubleshooting

"manifest.json not found"

Run dbt compile or dbt run first to generate the manifest file.

"No models found with tag"

Check that your models have the correct tag in their config:

{{ config(tags=['feast']) }}

"Missing entity column"

Ensure your dbt model includes every column passed with --entity-column. Models missing any of these columns are skipped with a warning.

"Missing timestamp column"

By default, Feast looks for event_timestamp. Use --timestamp-field to specify a different column name.
