3 — Serving & Observability

This guide covers the spec.services.onlineStore.serving block and general server configuration (server block) for all Feast services: per-worker tuning, log levels, Prometheus metrics, offline push batching, and MCP (Model Context Protocol).


Server configuration (server block)

Every deployable Feast service (online store, offline store, registry, UI) accepts a server block under spec.services.<service>. The online and offline stores use ServerConfigs; the registry uses RegistryServerConfigs (adds restAPI / grpc toggles).

Enable a server

An empty server: {} is enough to deploy a service. Without it the component is not deployed (e.g. the offline store runs as a local process, not as a network server):

services:
  onlineStore:
    server: {}        # deploys the online feature server on port 6566
  offlineStore:
    server: {}        # deploys the offline feature server on port 8815
  registry:
    local:
      server: {}      # deploys the registry server on port 6570
  ui: {}              # deploys the Feast UI

Log level

services:
  onlineStore:
    server:
      logLevel: debug    # debug | info | warning | error | critical
  offlineStore:
    server:
      logLevel: info
  registry:
    local:
      server:
        logLevel: warning

Container image and resources

The operator resolves the feature server image through the following priority chain:

  1. server.image in the CR — per-service override, highest priority

  2. RELATED_IMAGE_FEATURE_SERVER env var on the operator pod — cluster-wide default set by OLM/platform

  3. Built-in default: quay.io/feastdev/feature-server:<operator-version>

The same RELATED_IMAGE_FEATURE_SERVER image is used for all server containers (online, offline, registry, UI) and for the init containers (git clone, feast apply). Setting it overrides all of them at once without touching any CR.

The cronJob container uses a separate env var: RELATED_IMAGE_CRON_JOB (default: quay.io/openshift/origin-cli:4.17).

Cluster-wide image override (operator env var). Set RELATED_IMAGE_FEATURE_SERVER on the operator Deployment to point every Feast pod in the cluster at a different image, e.g. one pulled from a private mirror or one pinned by digest:
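As a minimal sketch, the override could be written as a patch fragment for the operator Deployment. Only the RELATED_IMAGE_FEATURE_SERVER variable name comes from this guide; the Deployment name, namespace, container name, and registry host below are placeholders to replace with the values from your installation.

```yaml
# Patch fragment for the operator Deployment (all names are placeholders).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: feast-operator-controller-manager   # assumption: your operator Deployment name
  namespace: feast-operator-system          # assumption: your operator namespace
spec:
  template:
    spec:
      containers:
        - name: manager                      # assumption: operator container name
          env:
            # Becomes the default image for server and init containers.
            - name: RELATED_IMAGE_FEATURE_SERVER
              value: registry.example.com/feastdev/feature-server:<tag>
```

Because this sits second in the priority chain, any FeatureStore that sets server.image keeps its own image.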

Per-service CR override — for a single FeatureStore or to pin one service to a different image than the cluster default:
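A per-service override might look like the following sketch. The image field is the one named in the priority chain above; the resources block is an assumption that the server block accepts standard Kubernetes requests and limits, and the registry host and sizes are illustrative only.

```yaml
services:
  onlineStore:
    server:
      # Highest priority: applies to this service only.
      image: registry.example.com/feastdev/feature-server:<tag>
      resources:              # assumption: standard Kubernetes ResourceRequirements
        requests:
          cpu: 500m
          memory: 512Mi
        limits:
          memory: 1Gi
```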

Worker configuration (gunicorn)

The online and offline servers run on gunicorn. Tune worker count, connections, and request limits:
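The fragment below is a sketch of what such tuning could look like under server.workerConfigs. The workers and registryTtlSec names are used as written in this guide; workerConnections and maxRequests are assumed field names for the connection and request limits, and the placement of registryTtlSec inside workerConfigs is also an assumption. Check the CRD reference for your operator version before copying it.

```yaml
services:
  onlineStore:
    server:
      workerConfigs:
        workers: -1               # worker count
        registryTtlSec: 60        # name as written in this guide; placement assumed
        workerConnections: 1000   # assumed field name: connections per worker
        maxRequests: 1000         # assumed field name: requests served before a worker restarts
```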

Production recommendation: set workers: -1 and registryTtlSec: 60 or higher. See Online Server Performance Tuning for detailed guidance.

Environment variables and secrets

Inject environment variables from Secrets or ConfigMaps into any server:
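A minimal sketch, assuming server.env takes standard Kubernetes EnvVar entries and that an envFrom list is also accepted, as in a Pod spec. The variable, Secret, and ConfigMap names are hypothetical.

```yaml
services:
  onlineStore:
    server:
      env:
        - name: REDIS_PASSWORD            # hypothetical variable
          valueFrom:
            secretKeyRef:
              name: feast-online-store    # hypothetical Secret
              key: password
      envFrom:                            # assumption: bulk import of all keys
        - configMapRef:
            name: feast-server-config     # hypothetical ConfigMap
```

Referencing a single key with secretKeyRef keeps the rest of the Secret out of the container environment, whereas envFrom exposes every key in the referenced object.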

Volume mounts

Mount additional volumes (ConfigMaps, Secrets, PVCs) into the server containers:
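One possible shape, sketched under two assumptions: volumes are declared once at the services level, and each server block lists its own volumeMounts using standard Kubernetes entries. All names and paths are hypothetical.

```yaml
services:
  volumes:                          # assumption: shared volume declarations
    - name: extra-config
      configMap:
        name: feast-extra-config    # hypothetical ConfigMap
  onlineStore:
    server:
      volumeMounts:                 # assumption: standard VolumeMount entries
        - name: extra-config        # must match a declared volume name
          mountPath: /etc/feast/extra
          readOnly: true
```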


TLS

All servers support TLS termination. Provide a Kubernetes Secret containing the TLS certificate and key, and reference it from tls:
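The sketch below pairs a standard kubernetes.io/tls Secret with a FeatureStore fragment that references it. The Secret layout is ordinary Kubernetes; the secretRef field name under tls is an assumption, and the Secret name is hypothetical.

```yaml
# A standard TLS Secret holding the certificate and key.
apiVersion: v1
kind: Secret
metadata:
  name: feast-online-tls            # hypothetical name
type: kubernetes.io/tls
stringData:
  tls.crt: <PEM-encoded certificate>
  tls.key: <PEM-encoded private key>
---
# Fragment of the FeatureStore spec that references the Secret.
services:
  onlineStore:
    server:
      tls:
        secretRef:                  # assumption: field name for the Secret reference
          name: feast-online-tls
```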

For mTLS (mutual TLS), also set a CA certificate ConfigMap:


Prometheus Metrics

When metrics are enabled the feature server starts an HTTP server on port 8000 for Prometheus scraping. The operator automatically adds a containerPort, a Kubernetes Service port, and a ServiceMonitor for Prometheus discovery.

Two paths — use either or both

Path 1: CLI flag (existing, simple)
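For Path 1, a single boolean on the server block is enough. This fragment uses only the server.metrics field named in this guide.

```yaml
services:
  onlineStore:
    server:
      metrics: true     # operator injects the --metrics flag
```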

Path 2: YAML config (new, granular)
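For Path 2, the setting moves to the serving block. The sketch below uses the serving.metrics.enabled and serving.metrics.categories names from this guide; the category key is left as a commented placeholder because valid keys depend on the Feast SDK version in the server image.

```yaml
services:
  onlineStore:
    server: {}
    serving:
      metrics:
        enabled: true               # read from feature_store.yaml; no --metrics flag
        # categories:               # optional per-category toggles
        #   <category-key>: true    # placeholder: use only keys your SDK version recognizes
```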

Both paths expose port 8000 and create a ServiceMonitor. When serving.metrics.enabled is true the Python server reads it from feature_store.yaml directly; no --metrics flag is injected. When server.metrics: true is used, the --metrics flag is injected.

SDK note: MetricsConfig uses extra="forbid" in Pydantic. Only use category keys that are recognized by your Feast SDK version.

Verify monitoring is wired: confirm that the operator created the ServiceMonitor and that the Service exposes port 8000.


Offline Push Batching

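When features are pushed to the online store via /push, each request also triggers a synchronous offline store write. At high push throughput these per-request writes can exhaust memory and cause out-of-memory (OOM) failures. Push batching instead groups the writes into batches that are flushed when they reach batchSize rows or when the flush interval elapses, whichever comes first.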
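A minimal configuration sketch using the serving.offlinePushBatching fields documented in this section. The numeric values are illustrative, not recommendations.

```yaml
services:
  onlineStore:
    server: {}
    serving:
      offlinePushBatching:
        enabled: true
        batchSize: 1000             # illustrative: flush once 1000 rows are buffered
        batchIntervalSeconds: 5     # illustrative: flush at least every 5 seconds
```

With these values a quiet stream is still written within the interval, while a burst is written as soon as the batch fills.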

| Field | Type | Description |
| --- | --- | --- |
| enabled | bool | Activates batching |
| batchSize | int | Max rows per batch; flush when reached |
| batchIntervalSeconds | int | Flush interval even when batch is not full |


MCP (Model Context Protocol)

MCP mounts LLM-agent-compatible tool endpoints alongside the existing REST API on port 6566. The REST API is not replaced — MCP is additive.

The operator writes feature_server.type: mcp into feature_store.yaml only when serving.mcp.enabled: true. Setting enabled: false reverts to type: local.
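A minimal sketch of enabling MCP on the online store, using the serving.mcp field names and default values documented in this section. Only enabled is needed to switch the server type; the other fields are spelled out here for illustration.

```yaml
services:
  onlineStore:
    server: {}
    serving:
      mcp:
        enabled: true                 # operator writes feature_server.type: mcp
        serverName: feast-mcp-server
        serverVersion: "1.0.0"        # quoted so it is always read as a string
        transport: sse                # or http for Streamable HTTP
```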

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | | Must be true; false keeps type: local |
| serverName | string | feast-mcp-server | Name advertised to MCP clients |
| serverVersion | string | 1.0.0 | Version advertised to MCP clients |
| transport | string | sse | sse (SSE-based) or http (Streamable HTTP) |

MCP is mounted at /mcp on port 6566 — no additional Kubernetes Service is created.

Dependency: the feature server image must include feast[mcp] (fastapi-mcp). Without it the server starts normally but MCP routes are not registered.


serving vs server — summary

| Capability | server.* block | serving.* block |
| --- | --- | --- |
| Enable the server | server: {} | |
| Log level | server.logLevel | |
| Workers / gunicorn | server.workerConfigs | |
| TLS | server.tls | |
| Env vars / image | server.env, server.image | |
| Metrics (simple) | server.metrics: true | |
| Metrics (per-category) | | serving.metrics.categories |
| Offline push batching | | serving.offlinePushBatching |
| MCP | | serving.mcp |

