Bytewax
The Bytewax batch materialization engine provides an execution engine for batch materializing operations (
materialize
and materialize-incremental
).In order to use the Bytewax materialization engine, you will need a Kubernetes cluster running version 1.22.10 or greater.
The Bytewax materialization engine loads authentication and cluster information from the kubeconfig file. By default, kubectl looks for a file named
config
in the $HOME/.kube directory
. You can specify other kubeconfig files by setting the KUBECONFIG
environment variable.Bytewax jobs can be configured to access Kubernetes secrets as environment variables to access online and offline stores during job runs.
To configure secrets, first create them using
kubectl
:kubectl create secret generic -n bytewax aws-credentials --from-literal=aws-access-key-id='<access key id>' --from-literal=aws-secret-access-key='<secret access key>'
If your Docker registry requires authentication to store/pull containers, you can use this same approach to store your repository access credential and use when running the materialization engine.
Then configure them in the batch_engine section of
feature_store.yaml
:batch_engine:
type: bytewax
namespace: bytewax
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: aws-credentials
key: aws-access-key-id
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: aws-credentials
key: aws-secret-access-key
image_pull_secrets:
- docker-repository-access-secret
The Bytewax materialization engine is configured through the The
feature_store.yaml
configuration file:batch_engine:
type: bytewax
namespace: bytewax
image: bytewax/bytewax-feast:latest
image_pull_secrets:
- my_container_secret
service_account_name: my-k8s-service-account
include_security_context_capabilities: false
annotations:
# example annotation you might include if running on AWS EKS
iam.amazonaws.com/role: arn:aws:iam::<account number>:role/MyBytewaxPlatformRole
resources:
limits:
cpu: 1000m
memory: 2048Mi
requests:
cpu: 500m
memory: 1024Mi
Notes:
- The
namespace
configuration directive specifies which Kubernetes namespace jobs, services and configuration maps will be created in. - The
image_pull_secrets
configuration directive specifies the pre-configured secret to use when pulling the image container from your registry. - The
service_account_name
specifies which Kubernetes service account to run the job under. - The
include_security_context_capabilities
flag indicates whether or not"add": ["NET_BIND_SERVICE"]
and"drop": ["ALL"]
are included in the job & pod security context capabilities. annotations
allows you to include additional Kubernetes annotations to the job. This is particularly useful for IAM roles which grant the running pod access to cloud platform resources (for example).- The
resources
configuration directive sets the standard Kubernetes resource requests for the job containers to utilise when materializing data.
The
image
configuration directive specifies which container image to use when running the materialization job. To create a custom image based on this container, run the following command:DOCKER_BUILDKIT=1 docker build . -f ./sdk/python/feast/infra/materialization/contrib/bytewax/Dockerfile -t <image tag>
Once that image is built and pushed to a registry, it can be specified as a part of the batch engine configuration:
batch_engine:
type: bytewax
namespace: bytewax
image: <image tag>
Last modified 24d ago