LogoLogo
master
master
  • Introduction
  • Blog
  • Community & getting help
  • Roadmap
  • Changelog
  • Getting started
    • Quickstart
    • Architecture
      • Overview
      • Language
      • Push vs Pull Model
      • Write Patterns
      • Feature Transformation
      • Feature Serving and Model Inference
      • Role-Based Access Control (RBAC)
    • Concepts
      • Overview
      • Project
      • Data ingestion
      • Entity
      • Feature view
      • Feature retrieval
      • Point-in-time joins
      • [Alpha] Saved dataset
      • Permission
      • Tags
    • Use Cases
    • Components
      • Overview
      • Registry
      • Offline store
      • Online store
      • Feature server
      • Batch Materialization Engine
      • Provider
      • Authorization Manager
      • OpenTelemetry Integration
    • Third party integrations
    • FAQ
  • Tutorials
    • Sample use-case tutorials
      • Driver ranking
      • Fraud detection on GCP
      • Real-time credit scoring on AWS
      • Driver stats on Snowflake
    • Validating historical features with Great Expectations
    • Building streaming features
    • Retrieval Augmented Generation (RAG) with Feast
  • How-to Guides
    • Running Feast with Snowflake/GCP/AWS
      • Install Feast
      • Create a feature repository
      • Deploy a feature store
      • Build a training dataset
      • Load data into the online store
      • Read features from the online store
      • Scaling Feast
      • Structuring Feature Repos
    • Running Feast in production (e.g. on Kubernetes)
    • Customizing Feast
      • Adding a custom batch materialization engine
      • Adding a new offline store
      • Adding a new online store
      • Adding a custom provider
    • Adding or reusing tests
    • Starting Feast servers in TLS(SSL) Mode
  • Reference
    • Codebase Structure
    • Type System
    • Data sources
      • Overview
      • File
      • Snowflake
      • BigQuery
      • Redshift
      • Push
      • Kafka
      • Kinesis
      • Spark (contrib)
      • PostgreSQL (contrib)
      • Trino (contrib)
      • Azure Synapse + Azure SQL (contrib)
      • Couchbase (contrib)
    • Offline stores
      • Overview
      • Dask
      • Snowflake
      • BigQuery
      • Redshift
      • DuckDB
      • Couchbase Columnar (contrib)
      • Spark (contrib)
      • PostgreSQL (contrib)
      • Trino (contrib)
      • Azure Synapse + Azure SQL (contrib)
      • Clickhouse (contrib)
      • Remote Offline
    • Online stores
      • Overview
      • SQLite
      • Snowflake
      • Redis
      • Dragonfly
      • IKV
      • Datastore
      • DynamoDB
      • Bigtable
      • Remote
      • PostgreSQL
      • Cassandra + Astra DB
      • Couchbase
      • MySQL
      • Hazelcast
      • ScyllaDB
      • SingleStore
      • Milvus
    • Registries
      • Local
      • S3
      • GCS
      • SQL
      • Snowflake
    • Providers
      • Local
      • Google Cloud Platform
      • Amazon Web Services
      • Azure
    • Batch Materialization Engines
      • Snowflake
      • AWS Lambda (alpha)
      • Spark (contrib)
    • Feature repository
      • feature_store.yaml
      • .feastignore
    • Feature servers
      • Python feature server
      • [Alpha] Go feature server
      • Offline Feature Server
      • Registry server
    • [Beta] Web UI
    • [Beta] On demand feature view
    • [Alpha] Vector Database
    • [Alpha] Data quality monitoring
    • [Alpha] Streaming feature computation with Denormalized
    • Feast CLI reference
    • Python API reference
    • Usage
  • Project
    • Contribution process
    • Development guide
    • Backwards Compatibility Policy
      • Maintainer Docs
    • Versioning policy
    • Release process
    • Feast 0.9 vs Feast 0.10+
Powered by GitBook
On this page
  • Functionality
  • Functionality Matrix

Was this helpful?

Edit on GitHub
Export as PDF
  1. Reference
  2. Data sources

Overview

PreviousData sourcesNextFile

Last updated 2 months ago

Was this helpful?

Functionality

In Feast, each batch data source is associated with corresponding offline stores. For example, a SnowflakeSource can only be processed by the Snowflake offline store, while a FileSource can be processed by both File and DuckDB offline stores. Otherwise, the primary difference between batch data sources is the set of supported types. Feast has an internal type system, and aims to support eight primitive types (bytes, string, int32, int64, float32, float64, bool, and timestamp) along with the corresponding array types. However, not every batch data source supports all of these types.

For more details on the Feast type system, see .

Functionality Matrix

There are currently four core batch data source implementations: FileSource, BigQuerySource, SnowflakeSource, and RedshiftSource. There are several additional implementations contributed by the Feast community (PostgreSQLSource, SparkSource, and TrinoSource), which are not guaranteed to be stable or to match the functionality of the core implementations. Details for each specific data source can be found .

Below is a matrix indicating which data sources support which types.

File
BigQuery
Snowflake
Redshift
Postgres
Spark
Trino
Couchbase

bytes

yes

yes

yes

yes

yes

yes

yes

yes

string

yes

yes

yes

yes

yes

yes

yes

yes

int32

yes

yes

yes

yes

yes

yes

yes

yes

int64

yes

yes

yes

yes

yes

yes

yes

yes

float32

yes

yes

yes

yes

yes

yes

yes

yes

float64

yes

yes

yes

yes

yes

yes

yes

yes

bool

yes

yes

yes

yes

yes

yes

yes

yes

timestamp

yes

yes

yes

yes

yes

yes

yes

yes

array types

yes

yes

yes

no

yes

yes

no

no

here
here