Scaling Feast

Overview

Feast is designed to be easy to use and understand out of the box, with as few infrastructure dependencies as possible. However, there are components used by default that may not scale well. Since Feast is designed to be modular, it's possible to swap such components with more performant components, at the cost of Feast depending on additional infrastructure.

Scaling Feast Registry

The default Feast registry is a file-based registry. Any changes to the feature repo, or materializing data into the online store, results in a mutation to the registry.

However, there are inherent limitations with a file-based registry, since changing a single field in the registry requires re-writing the whole registry file. With multiple concurrent writers, this presents a risk of data loss, or bottlenecks writes to the registry since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

The recommended solution in this case is to use the SQL based registry, which allows concurrent, transactional, and fine-grained updates to the registry. This registry implementation requires access to an existing database (such as MySQL, Postgres, etc).

Scaling Materialization

The default Feast materialization process is an in-memory process, which pulls data from the offline store before writing it to the online store. However, this process does not scale for large data sets, since it's executed on a single-process.

Feast supports pluggable Compute Engines, that allow the materialization process to be scaled up. Aside from the local process, Feast supports a Lambda-based materialization engine, and a Bytewax-based materialization engine.

Users may also be able to build an engine to scale up materialization using existing infrastructure in their organizations.

Horizontal Scaling with the Feast Operator

When running Feast on Kubernetes with the Feast Operator, you can horizontally scale the FeatureStore deployment using spec.replicas or HPA autoscaling. The FeatureStore CRD implements the Kubernetes scale sub-resource, so you can also use kubectl scale:

kubectl scale featurestore/my-feast --replicas=3

Prerequisites: Horizontal scaling requires DB-backed persistence for all enabled services (online store, offline store, and registry). File-based persistence (SQLite, DuckDB, registry.db) is incompatible with multiple replicas because these backends do not support concurrent access from multiple pods.

Static Replicas

Set a fixed number of replicas via spec.replicas:

apiVersion: feast.dev/v1
kind: FeatureStore
metadata:
  name: sample-scaling
spec:
  feastProject: my_project
  replicas: 3
  services:
    onlineStore:
      persistence:
        store:
          type: postgres
          secretRef:
            name: feast-data-stores
    registry:
      local:
        persistence:
          store:
            type: sql
            secretRef:
              name: feast-data-stores

Autoscaling with HPA

Configure a HorizontalPodAutoscaler to dynamically scale based on metrics. HPA autoscaling is configured under services.scaling.autoscaling and is mutually exclusive with spec.replicas > 1:

apiVersion: feast.dev/v1
kind: FeatureStore
metadata:
  name: sample-autoscaling
spec:
  feastProject: my_project
  services:
    scaling:
      autoscaling:
        minReplicas: 2
        maxReplicas: 10
        metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
    podDisruptionBudgets:
      maxUnavailable: 1
    onlineStore:
      persistence:
        store:
          type: postgres
          secretRef:
            name: feast-data-stores
      server:
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
    registry:
      local:
        persistence:
          store:
            type: sql
            secretRef:
              name: feast-data-stores

When autoscaling is configured, the operator automatically sets the deployment strategy to RollingUpdate (instead of the default Recreate) to ensure zero-downtime scaling, and auto-injects soft pod anti-affinity and zone topology spread constraints. You can override any of these by explicitly setting deploymentStrategy, affinity, or topologySpreadConstraints in the CR.

Validation Rules

The operator enforces the following rules:

spec.replicas > 1 and services.scaling.autoscaling are mutually exclusive -- you cannot set both.
Scaling with replicas > 1 or any autoscaling config is rejected if any enabled service uses file-based persistence.
S3 (s3://) and GCS (gs://) backed registry file persistence is allowed with scaling, since these object stores support concurrent readers.

High Availability

When scaling is enabled (replicas > 1 or autoscaling), the operator provides HA features to improve resilience:

Pod Anti-Affinity — The operator automatically injects a soft (preferredDuringSchedulingIgnoredDuringExecution) pod anti-affinity rule that prefers spreading pods across different nodes. This prevents multiple replicas from being co-located on the same node, improving resilience to node failures. You can override this by providing your own affinity configuration:

spec:
  replicas: 3
  services:
    # Override with custom affinity (e.g. strict anti-affinity)
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              feast.dev/name: my-feast
    # ...

Topology Spread Constraints — The operator automatically injects a soft zone-spread constraint (whenUnsatisfiable: ScheduleAnyway) that distributes pods across availability zones. This is a best-effort spread — if zones are unavailable, pods will still be scheduled. You can override this with explicit constraints or disable it with an empty array:

spec:
  replicas: 3
  services:
    # Override with custom topology spread (e.g. strict zone spreading)
    topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          feast.dev/name: my-feast
    # ...

To disable the auto-injected topology spread:

spec:
  replicas: 3
  services:
    topologySpreadConstraints: []
    # ...

PodDisruptionBudget — You can configure a PDB to limit voluntary disruptions (e.g. during node drains or cluster upgrades). The PDB is only created when scaling is enabled. Exactly one of minAvailable or maxUnavailable must be set:

spec:
  replicas: 3
  services:
    podDisruptionBudgets:
      maxUnavailable: 1       # at most 1 pod unavailable during disruptions
    # -- OR --
    # podDisruptionBudgets:
    #   minAvailable: "50%"   # at least 50% of pods must remain available
    # ...

The PDB is not auto-injected — you must explicitly configure it. This is intentional because a misconfigured PDB (e.g. minAvailable equal to the replica count) can block node drains and cluster upgrades.

Using KEDA (Kubernetes Event-Driven Autoscaling)

KEDA is also supported as an external autoscaler. KEDA should target the FeatureStore's scale sub-resource directly (since it implements the Kubernetes scale API). This is the recommended approach because the operator manages the Deployment's replica count from spec.replicas — targeting the Deployment directly would conflict with the operator's reconciliation.

When using KEDA, do not set scaling.autoscaling or spec.replicas > 1 -- KEDA manages the replica count through the scale sub-resource.

Ensure DB-backed persistence -- The CRD's CEL validation rules automatically enforce DB-backed persistence when KEDA scales spec.replicas above 1 via the scale sub-resource. The operator also automatically switches the deployment strategy to RollingUpdate when replicas > 1.
Configure the FeatureStore with DB-backed persistence:

apiVersion: feast.dev/v1
kind: FeatureStore
metadata:
  name: sample-keda
spec:
  feastProject: my_project
  services:
    onlineStore:
      persistence:
        store:
          type: postgres
          secretRef:
            name: feast-data-stores
    registry:
      local:
        persistence:
          store:
            type: sql
            secretRef:
              name: feast-data-stores

Create a KEDA ScaledObject targeting the FeatureStore resource:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: feast-scaledobject
spec:
  scaleTargetRef:
    apiVersion: feast.dev/v1
    kind: FeatureStore
    name: sample-keda
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      metricName: http_requests_total
      query: sum(rate(http_requests_total{service="feast"}[2m]))
      threshold: "100"

KEDA-created HPAs are not owned by the Feast operator. The operator will not interfere with them, but it also will not clean them up if the FeatureStore CR is deleted. You must manage the KEDA ScaledObject lifecycle independently.

For the full API reference, see the FeatureStore CRD reference.

PreviousRead features from the online store NextStructuring Feature Repos

Last updated 21 days ago

Was this helpful?

Good night

hashtagOverview

hashtagScaling Feast Registry

hashtagScaling Materialization

hashtagHorizontal Scaling with the Feast Operator

hashtagStatic Replicas

hashtagAutoscaling with HPA

hashtagValidation Rules

hashtagHigh Availability

hashtagUsing KEDA (Kubernetes Event-Driven Autoscaling)

Overview

Scaling Feast Registry

Scaling Materialization

Horizontal Scaling with the Feast Operator

Static Replicas

Autoscaling with HPA

Validation Rules

High Availability

Using KEDA (Kubernetes Event-Driven Autoscaling)