6 — Batch & Jobs

This guide covers two related top-level spec fields:

  • spec.batchEngine — override the compute engine used for materialization

  • spec.cronJob — schedule periodic feast materialize-incremental (or any command) as a Kubernetes CronJob


Batch Engine (spec.batchEngine)

By default, Feast runs materialization using the local batch engine (in-process Python). For large feature sets you can point the operator at a Spark, Ray, or other supported engine via a Kubernetes ConfigMap.

ConfigMap format

Create a ConfigMap whose value is a YAML snippet identical to the batch_engine section of feature_store.yaml. Include the type: key and all engine-specific options:

apiVersion: v1
kind: ConfigMap
metadata:
  name: feast-batch-engine
data:
  config: |
    type: spark
    spark_conf:
      spark.master: k8s://https://kubernetes.default.svc
      spark.kubernetes.namespace: feast
      spark.kubernetes.container.image: ghcr.io/feast-dev/feast-spark:latest
      spark.executor.instances: "2"
      spark.executor.memory: 4g
      spark.driver.memory: 2g

Reference the ConfigMap from the CR:
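A minimal sketch of such a reference, under stated assumptions: the apiVersion, the project name, and the configMapRef / configMapKey field names are not given in this guide and are assumed from common operator conventions. Confirm them with kubectl explain featurestore.spec.batchEngine before use.

```yaml
apiVersion: feast.dev/v1          # assumed API version; check your installed CRD
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_project        # hypothetical project name
  batchEngine:
    configMapRef:                 # assumed field name
      name: feast-batch-engine    # the ConfigMap created above
    configMapKey: config          # assumed field name; matches the "config" data key
```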

Engine types

type              Notes
local             Default; in-process Python, no extra infra
spark             Apache Spark; requires a Spark operator or standalone cluster
ray               Ray cluster; requires a Ray operator
bytewax           Bytewax streaming engine
snowflake.engine  Snowflake Snowpark compute

For engine-specific YAML options (Spark conf, Ray address, etc.), see the Compute Engine page of the Feast SDK docs.


Scheduled Materialization (spec.cronJob)

The operator can deploy a Kubernetes CronJob that runs feast materialize-incremental (or any custom command) on a schedule. This is the recommended way to keep your online store fresh without managing an external job scheduler.

CronJob image resolution

The CronJob container image is resolved through the following priority chain:

  1. cronJob.containerConfigs.image in the CR — per-CronJob override

  2. RELATED_IMAGE_CRON_JOB env var on the operator pod — cluster-wide default set by OLM/platform (default: quay.io/openshift/origin-cli:4.17)

Minimal example — nightly materialization

The CronJob runs the default feast materialize-incremental command using the same feature_store.yaml that the operator generated for this FeatureStore.
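A nightly schedule could be sketched as follows. Only schedule is required under cronJob; the apiVersion and project name are assumptions for illustration.

```yaml
apiVersion: feast.dev/v1          # assumed API version; check your installed CRD
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_project        # hypothetical project name
  cronJob:
    schedule: "0 2 * * *"         # every day at 02:00
```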

Custom command

Override the container command to run any Feast CLI command:
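A hedged sketch using containerConfigs.commands from the field reference. It assumes each list entry is one complete command line that is run through a shell, so the date substitution is evaluated at run time; verify this against your operator version. The feast materialize-incremental CLI command takes an end timestamp argument.

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    containerConfigs:
      commands:
        - feast apply
        # Assumes shell evaluation of $(...) at run time.
        - feast materialize-incremental $(date -u +'%Y-%m-%dT%H:%M:%S')
```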

Or run a Python script instead:
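For illustration only: the script path below is hypothetical, and the script must exist wherever the command is executed.

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    containerConfigs:
      commands:
        - python /opt/scripts/materialize.py   # hypothetical script path
```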

Time zone
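The schedule is interpreted in the time zone of the kube-controller-manager unless timeZone is set. A sketch that pins the schedule to an IANA zone (the zone chosen here is only an example):

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    timeZone: "America/New_York"   # IANA time zone name
```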

Concurrency policy
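concurrencyPolicy controls what happens when a run is due while the previous one is still active. In Kubernetes, Allow lets runs overlap, Forbid skips the new run, and Replace cancels the running job and starts the new one. A sketch that prevents overlapping materializations:

```yaml
spec:
  cronJob:
    schedule: "0 * * * *"          # hourly
    concurrencyPolicy: Forbid      # skip a run if the previous one is still active
```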

Job history retention
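The two history limits bound how many finished Jobs are kept for inspection. The values below are illustrative, not recommendations:

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    successfulJobsHistoryLimit: 3  # keep the last 3 successful Jobs
    failedJobsHistoryLimit: 1      # keep the last failed Job
```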

Suspend a CronJob

To temporarily pause scheduled runs without deleting the CronJob:
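A sketch using the suspend field. Setting it back to false (the default) resumes scheduling.

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    suspend: true                  # no new runs are started while this is true
```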

Resource requests for the job pod
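containerConfigs.resources takes a standard Kubernetes ResourceRequirements block. The sizes below are placeholders to adjust for your workload:

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    containerConfigs:
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: "1"
          memory: 2Gi
```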

Environment variables and secrets in the job pod
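A sketch assuming env and envFrom follow the standard Kubernetes container schema. The Secret names, key, and variable name are hypothetical.

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    containerConfigs:
      env:
        - name: OFFLINE_STORE_PASSWORD       # hypothetical variable name
          valueFrom:
            secretKeyRef:
              name: offline-store-creds      # hypothetical Secret
              key: password
      envFrom:
        - secretRef:
            name: feast-job-secrets          # hypothetical Secret
```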

Advanced job spec
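The jobSpec fields listed in the reference below map onto the Kubernetes Job spec. A sketch with illustrative values:

```yaml
spec:
  cronJob:
    schedule: "0 2 * * *"
    startingDeadlineSeconds: 300   # skip the run if it cannot start within 5 minutes
    jobSpec:
      parallelism: 1
      completions: 1
      activeDeadlineSeconds: 3600  # fail the Job after 1 hour
      backoffLimit: 2              # retry a failed pod at most twice
```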

Full cronJob field reference

Field                           Type                  Default                        Description
schedule                        string                -                              Cron expression (required)
timeZone                        string                kube-controller-manager TZ     IANA time zone name
concurrencyPolicy               string                Allow                          Allow / Forbid / Replace
suspend                         bool                  false                          Suspend future runs
startingDeadlineSeconds         int64                 -                              Abort if start missed by this many seconds
successfulJobsHistoryLimit      int32                 -                              Successful job history to keep
failedJobsHistoryLimit          int32                 -                              Failed job history to keep
annotations                     map                   -                              CronJob metadata annotations
jobSpec.parallelism             int32                 1                              Job pod parallelism
jobSpec.completions             int32                 1                              Required completions
jobSpec.activeDeadlineSeconds   int64                 -                              Max job duration
jobSpec.backoffLimit            int32                 -                              Retry limit
containerConfigs.image          string                operator default               CronJob container image (see CronJob image resolution)
containerConfigs.commands       []string              feast materialize-incremental  Override container command
containerConfigs.resources      ResourceRequirements  -                              CPU/memory requests and limits
containerConfigs.env / envFrom  -                     -                              Environment variables


Combining batchEngine + cronJob

Use both together to run scheduled Spark-based materialization:
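A combined sketch, under the same assumptions as the individual examples: the apiVersion, project name, and the configMapRef / configMapKey field names are assumed, and the ConfigMap is the feast-batch-engine one defined earlier in this guide.

```yaml
apiVersion: feast.dev/v1          # assumed API version; check your installed CRD
kind: FeatureStore
metadata:
  name: sample-spark
spec:
  feastProject: my_project        # hypothetical project name
  batchEngine:
    configMapRef:                 # assumed field name
      name: feast-batch-engine
    configMapKey: config          # assumed field name
  cronJob:
    schedule: "0 2 * * *"         # nightly at 02:00
    concurrencyPolicy: Forbid     # avoid overlapping Spark runs
```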

