Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
The list below contains the functionality that contributors are planning to develop for Feast.
We welcome contribution to all items in the roadmap!
Have questions about the roadmap? Go to the Slack channel to ask on #feast-development.
Data Sources
Offline Stores
Online Stores
Feature Engineering
Streaming
Deployments
Feature Serving
Data Quality Management (See RFC)
Feature Discovery and Governance
GitHub Repository: Find the complete Feast codebase on GitHub.
Community Governance Doc: See the governance model of Feast, including who the maintainers are and how decisions are made.
Slack: Feel free to ask questions or say hello! This is the main place where maintainers and contributors brainstorm and where users ask questions or discuss best practices.
Feast users should join #feast-general
or #feast-beginners
to ask questions
Feast developers / contributors should join #feast-development
Mailing list: We have both a user and developer mailing list.
Feast users should join feast-discuss@googlegroups.com group by clicking here.
Feast developers / contributors should join feast-dev@googlegroups.com group by clicking here.
Community Calendar: Includes community calls and design meetings.
Google Folder: This folder is used as a central repository for all Feast resources. For example:
Design proposals in the form of Request for Comments (RFC).
User surveys and meeting minutes.
Slide decks of conferences our contributors have spoken at.
Feast Linux Foundation Wiki: Our LFAI wiki page contains links to resources for contributors and maintainers.
Slack: Need to speak to a human? Come ask a question in our Slack channel (link above).
GitHub Issues: Found a bug or need a feature? Create an issue on GitHub.
StackOverflow: Need to ask a question on how to use Feast? We also monitor and respond to StackOverflow.
We have a user and contributor community call every two weeks (US & EU friendly).
Please join the above Feast user groups in order to see calendar invites to the community calls
Tuesday 10:00 am to 10:30 am PST
Meeting notes (incl recordings): https://bit.ly/feast-notes
We also have a #feast-development
community call every two weeks, where we discuss contributions + brainstorm best practices.
Tuesday 8:00 am to 8:30 am PST
Meeting notes (incl recordings): Feast Development Biweekly
A data source in Feast refers to raw underlying data that users own (e.g. in a table in BigQuery). Feast does not manage any of the raw underlying data but instead, is in charge of loading this data and performing different operations on the data to retrieve or serve features.
Feast uses a time-series data model to represent data. This data model is used to interpret feature data in data sources in order to build training datasets or materialize features into an online store.
Below is an example data source with a single entity column (driver
) and two feature columns (trips_today
, and rating
).
Feast supports primarily time-stamped tabular data as data sources. There are many kinds of possible data sources:
Batch data sources: ideally, these live in data warehouses (BigQuery, Snowflake, Redshift), but can be in data lakes (S3, GCS, etc). Feast supports ingesting and querying data across both.
Stream data sources: Feast does not have native streaming integrations. It does however facilitate making streaming features available in different environments. There are two kinds of sources:
Push sources allow users to push features into Feast, and make it available for training / batch scoring ("offline"), for realtime feature serving ("online") or both.
[Alpha] Stream sources allow users to register metadata from Kafka or Kinesis sources. The onus is on the user to ingest from these sources, though Feast provides some limited helper methods to ingest directly from Kafka / Kinesis topics.
Ingesting from batch sources is only necessary to power real-time models. This is done through materialization. Under the hood, Feast manages an offline store (to scalably generate training data from batch sources) and an online store (to provide low-latency access to features for real-time models).
A key command to use in Feast is the materialize_incremental
command, which fetches the latest values for all entities in the batch source and ingests these values into the online store.
Materialization can be called programmatically or through the CLI:
If the schema
parameter is not specified when defining a data source, Feast attempts to infer the schema of the data source during feast apply
. The way it does this depends on the implementation of the offline store. For the offline stores that ship with Feast out of the box this inference is performed by inspecting the schema of the table in the cloud data warehouse, or if a query is provided to the source, by running the query with a LIMIT
clause and inspecting the result.
Ingesting from stream sources happens either via a Push API or via a contrib processor that leverages an existing Spark context.
(Experimental) Request data sources: This is data that is only available at request time (e.g. from a user action that needs an immediate model prediction response). This is primarily relevant as an input into , which allow light-weight feature engineering and combining features across sources.
To push data into the offline or online stores: see for details.
(experimental) To use a contrib Spark processor to ingest from a topic, see