The Bytewax batch materialization engine provides an execution engine for batch materializing operations (materialize and materialize-incremental).
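For context, these operations are typically triggered through the Feast CLI; a short sketch (the timestamps are placeholders):

```shell
# Materialize features between two timestamps into the online store
feast materialize 2022-01-01T00:00:00 2022-02-01T00:00:00

# Or materialize everything new since the last materialization run, up to now
feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")
```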
In order to use the Bytewax materialization engine, you will need a Kubernetes cluster running version 1.22.10 or greater.
The Bytewax materialization engine loads authentication and cluster information from the kubeconfig file. By default, kubectl looks for a file named config in the $HOME/.kube directory. You can specify other kubeconfig files by setting the KUBECONFIG environment variable.
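The lookup order mirrors kubectl's own behavior. As a minimal illustration (this is not Feast code, just a sketch of the resolution rule):

```python
import os
from pathlib import Path


def resolve_kubeconfig(env=None):
    """Return the kubeconfig path the way kubectl resolves it:
    the KUBECONFIG environment variable wins; otherwise fall back
    to the default $HOME/.kube/config."""
    env = os.environ if env is None else env
    return env.get("KUBECONFIG") or str(Path.home() / ".kube" / "config")
```

For example, `resolve_kubeconfig({"KUBECONFIG": "/tmp/alt-config"})` returns `/tmp/alt-config`, while an empty environment falls back to the default path.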
Bytewax jobs can be configured to access Kubernetes secrets as environment variables in order to access online and offline stores during job runs.
To configure secrets, first create them using kubectl:

```shell
kubectl create secret generic -n bytewax aws-credentials --from-literal=aws-access-key-id='<access key id>' --from-literal=aws-secret-access-key='<secret access key>'
```

Then configure them in the batch_engine section of feature_store.yaml:

```yaml
batch_engine:
  type: bytewax
  namespace: bytewax
  env:
    - name: AWS_ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: aws-credentials
          key: aws-access-key-id
    - name: AWS_SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: aws-credentials
          key: aws-secret-access-key
```

The Bytewax materialization engine is configured through the feature_store.yaml configuration file:

```yaml
batch_engine:
  type: bytewax
  namespace: bytewax
  image: bytewax/bytewax-feast:latest
```

The namespace configuration directive specifies which Kubernetes namespace jobs, services, and configuration maps will be created in.

The image configuration directive specifies which container image to use when running the materialization job. To create a custom image based on this container, run the following command:

```shell
DOCKER_BUILDKIT=1 docker build . -f ./sdk/python/feast/infra/materialization/contrib/bytewax/Dockerfile -t <image tag>
```

Once that image is built and pushed to a registry, it can be specified as a part of the batch engine configuration:

```yaml
batch_engine:
  type: bytewax
  namespace: bytewax
  image: <image tag>
```

The Snowflake batch materialization engine provides a highly scalable and parallel execution engine using a Snowflake Warehouse for batch materialization operations (materialize and materialize-incremental) when using a SnowflakeSource.
The engine requires no additional configuration other than Snowflake's standard login and context details. It leverages custom Python UDFs (deployed automatically for you) to serialize your offline store data correctly for your online serving tables.
When using all three options together, snowflake.offline, snowflake.engine, and snowflake.online, you get unlimited scale and performance combined with governance and data security.
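A feature_store.yaml combining all three might look like the following sketch (connection fields elided; this assumes the online store accepts the same connection details as the offline store and engine):

```yaml
offline_store:
  type: snowflake.offline
  # account, user, password, role, warehouse, database, ...
online_store:
  type: snowflake.online
  # same connection details
batch_engine:
  type: snowflake.engine
  # same connection details
```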
Please see the batch materialization engine documentation for an explanation of batch materialization engines.
An example batch_engine configuration in feature_store.yaml:

```yaml
...
offline_store:
  type: snowflake.offline
  ...
batch_engine:
  type: snowflake.engine
  account: snowflake_deployment.us-east-1
  user: user_login
  password: user_password
  role: sysadmin
  warehouse: demo_wh
  database: FEAST
```
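Since a typo in any of these fields only surfaces when a materialization job launches, a small sanity check can catch mistakes earlier. A hypothetical sketch (validate_engine_config and the field list are not part of Feast) that assumes the batch_engine block has already been parsed into a dict:

```python
# Fields the snowflake.engine batch engine expects, per the example above.
REQUIRED_FIELDS = ("type", "account", "user", "password", "role", "warehouse", "database")


def validate_engine_config(cfg):
    """Return a sorted list of required fields missing from a batch_engine block."""
    return sorted(f for f in REQUIRED_FIELDS if f not in cfg)
```

For example, `validate_engine_config({"type": "snowflake.engine"})` reports every missing connection field, while a complete block yields an empty list.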