Importing Features from dbt
Alpha Feature: The dbt integration is currently in early development and subject to change.
Current Limitations:
Supported data sources: BigQuery, Snowflake, and File-based sources only
Single entity per model
Manual entity column specification required
Breaking changes may occur in future releases.
This guide explains how to use Feast's dbt integration to automatically import dbt models as Feast FeatureViews. This enables you to leverage your existing dbt transformations as feature definitions without manual duplication.
Overview
dbt (data build tool) is a popular tool for transforming data in your warehouse. Many teams already use dbt to create feature tables. Feast's dbt integration allows you to:
Discover dbt models tagged for feature engineering
Import model metadata (columns, types, descriptions) as Feast objects
Generate Python code for Entity, DataSource, and FeatureView definitions
This eliminates the need to manually define Feast objects that mirror your dbt models.
Prerequisites
A dbt project with compiled artifacts (target/manifest.json)
Feast installed with dbt support:
Or install the parser directly:
Quick Start
1. Tag your dbt models
In your dbt project, add a feast tag to models you want to import:
2. Define column types in schema.yml
Feast uses column metadata from your schema.yml to determine feature types:
3. Compile your dbt project
This generates target/manifest.json, which Feast reads to discover your models.
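To make the role of the manifest concrete, here is a minimal sketch of how tagged models could be found in it. It relies only on the standard dbt manifest layout (a top-level nodes mapping whose entries carry resource_type, name, tags, and columns). The helper names load_manifest and tagged_models and the sample model names are hypothetical, and this is not Feast's actual implementation.

```python
import json
from pathlib import Path


def load_manifest(path):
    """Read a dbt manifest.json file into a dict."""
    return json.loads(Path(path).read_text())


def tagged_models(manifest, tag="feast"):
    """Return the model nodes in the manifest that carry the given dbt tag."""
    return [
        node
        for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model" and tag in node.get("tags", [])
    ]


# Tiny in-memory stand-in for a real manifest (hypothetical model names).
sample_manifest = {
    "nodes": {
        "model.demo.driver_features": {
            "resource_type": "model",
            "name": "driver_features",
            "tags": ["feast"],
            "columns": {
                "driver_id": {"name": "driver_id", "data_type": "BIGINT"},
                "avg_rating": {"name": "avg_rating", "data_type": "FLOAT64"},
            },
        },
        "model.demo.staging_orders": {
            "resource_type": "model",
            "name": "staging_orders",
            "tags": [],
            "columns": {},
        },
    }
}

for model in tagged_models(sample_manifest):
    print(model["name"], sorted(model["columns"]))
```

With a real project, load_manifest("target/manifest.json") would replace the in-memory sample.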
4. List available models
Use the Feast CLI to discover tagged models:
Output:
5. Import models as Feast definitions
Generate a Python file with Feast object definitions:
This generates:
CLI Reference
feast dbt list
Discover dbt models available for import.
Arguments:
manifest_path: Path to dbt's manifest.json file
Options:
--tag-filter, -t: Filter models by dbt tag (e.g., feast)
--model, -m: Filter to specific model name(s)
feast dbt import
Import dbt models as Feast object definitions.
Arguments:
manifest_path: Path to dbt's manifest.json file
Options:
--entity-column, -e: Column to use as entity key. Required.
--data-source-type, -d: Data source type (bigquery, snowflake, or file). Default: bigquery
--tag-filter, -t: Filter models by dbt tag. Default: None
--model, -m: Import specific model(s) only. Default: None
--timestamp-field: Timestamp column name. Default: event_timestamp
--ttl-days: Feature TTL in days. Default: 1
--exclude-columns: Columns to exclude from features. Default: None
--no-online: Disable online serving. Default: False
--output, -o: Output Python file path. Default: None (stdout)
--dry-run: Preview without generating code. Default: False
Type Mapping
Feast automatically maps dbt/warehouse column types to Feast types:
dbt/warehouse type: Feast type
STRING, VARCHAR, TEXT: String
INT, INTEGER, BIGINT: Int64
SMALLINT, TINYINT: Int32
FLOAT, REAL: Float32
DOUBLE, FLOAT64: Float64
BOOLEAN, BOOL: Bool
TIMESTAMP, DATETIME: UnixTimestamp
BYTES, BINARY: Bytes
ARRAY<type>: Array(type)
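The mapping above can be sketched as a simple lookup. This is an illustration, not Feast's internal code: Feast type names are kept as plain strings so the snippet runs without Feast installed, the function name map_type is hypothetical, and the fallback to String for unrecognized types is an assumption (the guide only states that columns without data_type default to String).

```python
import re

# Feast type names as plain strings, taken from the mapping listed above.
TYPE_MAP = {
    "STRING": "String", "VARCHAR": "String", "TEXT": "String",
    "INT": "Int64", "INTEGER": "Int64", "BIGINT": "Int64",
    "SMALLINT": "Int32", "TINYINT": "Int32",
    "FLOAT": "Float32", "REAL": "Float32",
    "DOUBLE": "Float64", "FLOAT64": "Float64",
    "BOOLEAN": "Bool", "BOOL": "Bool",
    "TIMESTAMP": "UnixTimestamp", "DATETIME": "UnixTimestamp",
    "BYTES": "Bytes", "BINARY": "Bytes",
}

ARRAY_PATTERN = re.compile(r"^ARRAY<(.+)>$")


def map_type(data_type):
    """Map a warehouse type name to a Feast type name."""
    if not data_type:
        return "String"  # documented default when data_type is missing
    normalized = data_type.strip().upper()
    match = ARRAY_PATTERN.match(normalized)
    if match:
        # Map the element type recursively, e.g. ARRAY<BIGINT> to Array(Int64).
        return f"Array({map_type(match.group(1))})"
    # Assumption: unrecognized types also fall back to String.
    return TYPE_MAP.get(normalized, "String")


print(map_type("varchar"), map_type("ARRAY<BIGINT>"), map_type(None))
```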
Snowflake NUMBER(precision, scale) types are handled specially:
Scale > 0: Float64
Precision <= 9: Int32
Precision <= 18: Int64
Precision > 18: Float64
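Read top to bottom, the NUMBER rules form a short decision chain. The sketch below assumes the rules are checked in the order listed, with scale taking priority over precision; the function name map_snowflake_number and the regular expression are hypothetical and only illustrate the rules.

```python
import re

NUMBER_PATTERN = re.compile(r"^NUMBER\(\s*(\d+)\s*,\s*(\d+)\s*\)$", re.IGNORECASE)


def map_snowflake_number(data_type):
    """Return the Feast type name for a NUMBER(precision, scale) string, or None."""
    match = NUMBER_PATTERN.match(data_type.strip())
    if match is None:
        return None  # not a NUMBER(p, s) type; handle it with the general mapping
    precision, scale = int(match.group(1)), int(match.group(2))
    if scale > 0:
        return "Float64"  # any fractional digits
    if precision <= 9:
        return "Int32"
    if precision <= 18:
        return "Int64"
    return "Float64"  # too wide for a 64-bit integer


for example in ("NUMBER(10,2)", "NUMBER(9,0)", "NUMBER(18,0)", "NUMBER(38,0)"):
    print(example, map_snowflake_number(example))
```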
Data Source Configuration
BigQuery
Generates BigQuerySource with the full table path from dbt metadata:
Snowflake
Generates SnowflakeSource with database, schema, and table:
File
Generates FileSource with a placeholder path:
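As a rough comparison of the three source types, the sketch below derives the arguments each source would need from a dbt model node. It assumes the node's database, schema, and alias (or name) fields identify the table, which is how dbt records relations in the manifest. The function name source_spec and the file path are hypothetical; the placeholder path Feast actually emits may differ.

```python
def source_spec(node, source_type="bigquery"):
    """Describe the data source for a dbt model node as a plain dict."""
    database = node["database"]
    schema = node["schema"]
    # dbt uses `alias` as the relation name when set, otherwise the model name.
    table = node.get("alias") or node["name"]
    if source_type == "bigquery":
        # BigQuerySource takes one fully qualified table path.
        return {"class": "BigQuerySource", "table": f"{database}.{schema}.{table}"}
    if source_type == "snowflake":
        # SnowflakeSource takes database, schema, and table separately.
        return {
            "class": "SnowflakeSource",
            "database": database,
            "schema": schema,
            "table": table,
        }
    if source_type == "file":
        # Hypothetical placeholder path; replace it with the real file location.
        return {"class": "FileSource", "path": f"/data/{table}.parquet"}
    raise ValueError(f"unsupported data source type: {source_type}")


# Hypothetical dbt node metadata.
node = {"name": "driver_features", "database": "my_project", "schema": "analytics"}
print(source_spec(node, "bigquery"))
print(source_spec(node, "snowflake"))
```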
Best Practices
1. Use consistent tagging
Create a standard tagging convention in your dbt project:
2. Document your columns
Column descriptions from schema.yml are preserved in the generated Feast definitions, making your feature catalog self-documenting.
3. Review before committing
Use --dry-run to preview what will be generated:
4. Version control generated code
Commit the generated Python files to your repository. This allows you to:
Track changes to feature definitions over time
Review dbt-to-Feast mapping in pull requests
Customize generated code if needed
5. Integrate with CI/CD
Add dbt import to your CI pipeline:
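For pipelines driven from Python, one option is to assemble the command from the options documented in the CLI Reference and hand it to subprocess. The sketch below only builds the argument list. It assumes manifest_path is passed positionally, as the Arguments listing suggests; the function name build_import_command, the entity column driver_id, and the output file features.py are hypothetical.

```python
import shlex


def build_import_command(manifest_path, entity_column, *, tag=None,
                         data_source_type="bigquery", output=None, dry_run=False):
    """Assemble a `feast dbt import` argument list from the documented options."""
    cmd = [
        "feast", "dbt", "import", manifest_path,
        "--entity-column", entity_column,
        "--data-source-type", data_source_type,
    ]
    if tag:
        cmd += ["--tag-filter", tag]
    if output:
        cmd += ["--output", output]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


cmd = build_import_command(
    "target/manifest.json", "driver_id", tag="feast", output="features.py"
)
print(shlex.join(cmd))
# A CI step could then execute it with: subprocess.run(cmd, check=True)
```

Passing dry_run=True in a pull-request job and writing the file only on the main branch is one way to combine this with the review step above.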
Limitations
Single entity support: Currently supports one entity column per import. For multi-entity models, run multiple imports or manually adjust the generated code.
No incremental updates: Each import generates a complete file. Use version control to track changes.
Column types required: Models without data_type in schema.yml default to String type.
Troubleshooting
"manifest.json not found"
Run dbt compile or dbt run first to generate the manifest file.
"No models found with tag"
Check that your models have the correct tag in their config:
"Missing entity column"
Ensure your dbt model includes the entity column specified with --entity-column. Models missing this column are skipped with a warning.
"Missing timestamp column"
By default, Feast looks for event_timestamp. Use --timestamp-field to specify a different column name.
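The last two checks can be pictured as a pre-import validation over a model's columns. This is a sketch under stated assumptions, not Feast's code: the function name select_feature_columns is hypothetical, treating a missing timestamp column the same way as a missing entity column (skip with a warning) is an assumption, and so is leaving the entity key and timestamp out of the feature list.

```python
import warnings


def select_feature_columns(model, entity_column,
                           timestamp_field="event_timestamp", exclude=()):
    """Return feature column names, or None if the model should be skipped."""
    columns = list(model.get("columns", {}))
    for required in (entity_column, timestamp_field):
        if required not in columns:
            warnings.warn(f"Skipping {model['name']}: missing column '{required}'")
            return None
    # Assumption: the entity key, timestamp, and excluded columns are not features.
    reserved = {entity_column, timestamp_field, *exclude}
    return [name for name in columns if name not in reserved]


# Hypothetical model metadata in the shape of a dbt manifest node.
model = {
    "name": "driver_features",
    "columns": {"driver_id": {}, "event_timestamp": {}, "avg_rating": {}, "debug": {}},
}
print(select_feature_columns(model, "driver_id", exclude=("debug",)))
```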