2 — Persistence

The offline store, online store, and registry each need a place to store data. The operator supports two persistence patterns for each:

  • File persistence — a path on a volume (ephemeral or PVC-backed)

  • DB/store persistence — an external database or managed store, wired via a Kubernetes Secret

The pattern is the same for all three services; only the store types differ.


Persistence patterns at a glance

| Pattern | Best for | Data survives pod restart? |
| --- | --- | --- |
| File: emptyDir | Dev / CI | No |
| File: PVC (ref) | Single-node prod or testing | Yes (if PVC is retained) |
| File: PVC (create) | Operator-managed storage | Yes |
| DB store | Production, HA, multi-pod | Yes |


File persistence

Ephemeral (emptyDir)

The default when no persistence block is set. Data lives on the pod's local disk and is lost on restart. Suitable for development only.

services:
  onlineStore:
    server: {}          # no persistence block → emptyDir

PVC — reference an existing PVC

When you already have a PVC provisioned (e.g. by your storage team):
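A minimal sketch of this case, assuming the `persistence.file.pvc.ref` layout used in the operator's sample CRs. The PVC name `online-pvc`, the mount path, and the file name are placeholders; check the field names against the CRD version installed in your cluster.

```yaml
services:
  onlineStore:
    persistence:
      file:
        path: online_store.db        # file name under mountPath (placeholder)
        pvc:
          ref:
            name: online-pvc         # existing PVC (hypothetical name)
          mountPath: /data/online    # where the PVC is mounted in the pod
```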

PVC — let the operator create one

Omitting storageClassName uses the cluster default StorageClass. Omitting create entirely creates a PVC with the operator's built-in defaults (1 Gi, default StorageClass).
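The operator-created case might look like the sketch below. It assumes the `persistence.file.pvc.create` layout from the operator's sample CRs; the StorageClass name, size, and mount path are illustrative values, not defaults.

```yaml
services:
  offlineStore:
    persistence:
      file:
        type: duckdb
        pvc:
          create:
            storageClassName: standard   # omit to use the cluster default StorageClass
            resources:
              requests:
                storage: 5Gi             # requested size (illustrative)
          mountPath: /data/offline       # where the new PVC is mounted in the pod
```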


DB / store persistence

For production, point the operator at an external database. The operator reads connection details from a Kubernetes Secret and writes them into feature_store.yaml.

Secret format

The Secret must contain one key per store component. The key name is the store type (e.g. postgres, sql, redis). The value is a YAML snippet identical to what you would write under the corresponding section in feature_store.yaml, minus the type: key (the operator inserts it from persistence.store.type).
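As an illustration of that format, a Secret holding a `postgres` key and a `sql` key could be sketched as follows. The Secret name `feast-data-stores`, the hostnames, and the credentials are all placeholders; the keys inside each value are the ones the SDK's Postgres online store and SQL registry sections accept.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: feast-data-stores            # hypothetical name
stringData:
  # Key = store type; value = the feature_store.yaml snippet without "type:"
  postgres: |
    host: postgres.feast.svc.cluster.local
    port: 5432
    database: feast
    db_schema: public
    user: feast
    password: change-me              # hard-coded here for illustration only
  sql: |
    path: postgresql+psycopg://feast:change-me@postgres.feast.svc.cluster.local:5432/feast
    cache_ttl_seconds: 60
```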

Reference the Secret from the CR:
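A sketch of the reference, assuming the `persistence.store.secretRef` field from the operator's sample CRs and a Secret named `feast-data-stores` (a hypothetical name) that contains `postgres` and `sql` keys:

```yaml
services:
  onlineStore:
    persistence:
      store:
        type: postgres               # operator reads the "postgres" key
        secretRef:
          name: feast-data-stores
  registry:
    local:
      persistence:
        store:
          type: sql                  # operator reads the "sql" key
          secretRef:
            name: feast-data-stores
```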

Key lookup rule: the operator looks up the Secret key that matches persistence.store.type (e.g. type: postgres → key postgres). Keep all stores in one Secret or split across multiple — both work.

Injecting DB credentials into the server pod

The Secret key values in the example above hard-code passwords. For production, keep credentials in a separate Secret and inject them as environment variables using envFrom, then reference them with ${VAR} substitution in the data-stores Secret value:
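One way this could be wired up is sketched below. The Secret names (`postgres-secret`, `feast-data-stores`), the variable names, and the hostname are assumptions for illustration, and the placement of `envFrom` under `server` follows the operator's sample CRs; verify it against your CRD version.

```yaml
# Credentials only; exposed to the server pod as environment variables
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret              # hypothetical name
stringData:
  POSTGRES_DB: feast
  POSTGRES_USER: feast
  POSTGRES_PASSWORD: change-me
---
# Store configuration; no literal credentials, only ${VAR} references
apiVersion: v1
kind: Secret
metadata:
  name: feast-data-stores            # hypothetical name
stringData:
  postgres: |
    host: postgres.feast.svc.cluster.local
    port: 5432
    database: ${POSTGRES_DB}
    db_schema: public
    user: ${POSTGRES_USER}
    password: ${POSTGRES_PASSWORD}
```

The CR then references both Secrets: one for the store configuration, one for the environment.

```yaml
services:
  onlineStore:
    persistence:
      store:
        type: postgres
        secretRef:
          name: feast-data-stores
    server:
      envFrom:
        - secretRef:
            name: postgres-secret    # supplies the ${VAR} values at runtime
```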


Store types by component

Online store

| type | Secret key |
| --- | --- |
| postgres | postgres |
| cassandra | cassandra |
| hazelcast | hazelcast |
| datastore | datastore |
| dynamodb | dynamodb |
| bigtable | bigtable |

For all store-specific YAML keys (connection options, pool sizes, etc.), see the SDK docs for each store; the Secret value accepts the same keys.

Offline store

| type (file) | Notes |
| --- | --- |
| file | Default pandas-based parquet offline store |
| dask | Dask-based parallel parquet |
| duckdb | DuckDB in-process analytical engine |

For external DB-backed offline stores (BigQuery, Snowflake, Spark, Trino, etc.), use persistence.store.type and a Secret with the matching key. See Offline Stores in the SDK docs.

Registry

| type | Secret key | Notes |
| --- | --- | --- |
| file | (file persistence, no Secret) | SQLite-backed file registry |
| sql | sql | SQLAlchemy URL; supports PostgreSQL, MySQL, SQLite |
| snowflake.registry | snowflake.registry | Snowflake-backed registry |

For sql, the Secret value is a path: (SQLAlchemy URL) plus optional cache_ttl_seconds and sqlalchemy_config_kwargs:
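A sketch of such a value, shown as the `stringData` portion of the Secret. The hostname and the `${POSTGRES_*}` variable names are assumptions; `echo` and `pool_pre_ping` are standard SQLAlchemy engine options used here only as examples of what `sqlalchemy_config_kwargs` can carry.

```yaml
stringData:
  sql: |
    path: postgresql+psycopg://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres.feast.svc.cluster.local:5432/${POSTGRES_DB}
    cache_ttl_seconds: 60            # optional
    sqlalchemy_config_kwargs:        # optional, passed through to the SQLAlchemy engine
      echo: false
      pool_pre_ping: true
```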


Common patterns

Redis online store
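A minimal sketch of this pattern, assuming a Redis instance reachable at the placeholder address below and a Secret named `feast-data-stores` (hypothetical). `connection_string` is the key the SDK's Redis online store uses; the CR field layout follows the operator's sample CRs.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: feast-data-stores            # hypothetical name
stringData:
  redis: |
    connection_string: redis.feast.svc.cluster.local:6379
```

The matching fragment of the CR:

```yaml
services:
  onlineStore:
    persistence:
      store:
        type: redis                  # operator reads the "redis" key
        secretRef:
          name: feast-data-stores
```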

Postgres for both online store and registry
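One possible shape for this pattern: a single Secret carries both the `postgres` and `sql` keys, and credentials arrive through `envFrom` from a separate Secret. All names (`feast-data-stores`, `postgres-secret`, the hostname, the `POSTGRES_*` variables) are placeholders, and the `registry.local.server.envFrom` placement is assumed from the operator's sample CRs.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: feast-data-stores            # hypothetical name
stringData:
  postgres: |                        # online store
    host: postgres.feast.svc.cluster.local
    port: 5432
    database: ${POSTGRES_DB}
    db_schema: public
    user: ${POSTGRES_USER}
    password: ${POSTGRES_PASSWORD}
  sql: |                             # registry
    path: postgresql+psycopg://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres.feast.svc.cluster.local:5432/${POSTGRES_DB}
    cache_ttl_seconds: 60
```

The matching fragment of the CR, with both services reading from the same Secret:

```yaml
services:
  onlineStore:
    persistence:
      store:
        type: postgres
        secretRef:
          name: feast-data-stores
    server:
      envFrom:
        - secretRef:
            name: postgres-secret    # separate credentials Secret (hypothetical)
  registry:
    local:
      persistence:
        store:
          type: sql
          secretRef:
            name: feast-data-stores
      server:
        envFrom:
          - secretRef:
              name: postgres-secret
```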

