OpenTelemetry Integration
The OpenTelemetry integration in Feast provides comprehensive monitoring and observability capabilities for your feature serving infrastructure. This component enables you to track key metrics, traces, and logs from your Feast deployment.
Motivation
Monitoring and observability are critical for production machine learning systems. The OpenTelemetry integration addresses these needs by:
Performance Monitoring: Track CPU and memory usage of feature servers
Operational Insights: Collect metrics to understand system behavior and performance
Troubleshooting: Enable effective debugging through distributed tracing
Resource Optimization: Monitor resource utilization to optimize deployments
Production Readiness: Provide enterprise-grade observability capabilities
Architecture
The OpenTelemetry integration in Feast consists of several components working together; a sketch of a minimal collector pipeline follows the list:
OpenTelemetry Collector: Receives, processes, and exports telemetry data
Prometheus Integration: Enables metrics collection and monitoring
Instrumentation: Automatic Python instrumentation for tracking metrics
Exporters: Components that send telemetry data to monitoring systems
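For illustration, the following is a minimal sketch of an OpenTelemetryCollector resource managed by the OpenTelemetry Operator: it receives OTLP data from the auto-instrumented feature server and re-exposes it as Prometheus metrics. The resource name, namespace, ports, and pipeline are illustrative assumptions rather than values required by Feast, and the exact spec.config format depends on your operator version.

```yaml
# Illustrative OpenTelemetryCollector: OTLP in, Prometheus metrics out.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector        # placeholder name
  namespace: feast            # placeholder namespace
spec:
  mode: deployment
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch: {}
    exporters:
      prometheus:
        endpoint: 0.0.0.0:8889   # scrape target exposed to Prometheus
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [prometheus]
```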
Key Features
Automated Instrumentation: Python auto-instrumentation for comprehensive metric collection
Metric Collection: Track key performance indicators including:
Memory usage
CPU utilization
Request latencies
Feature retrieval statistics
Flexible Configuration: Customizable metric collection and export settings
Kubernetes Integration: Native support for Kubernetes deployments
Prometheus Compatibility: Integration with Prometheus for metrics visualization
Setup and Configuration
To add monitoring to the Feast Feature Server, follow these steps:
1. Deploy Prometheus Operator
Follow the Prometheus Operator documentation to install the operator.
2. Deploy OpenTelemetry Operator
Before installing the OpenTelemetry Operator:
Install cert-manager
Validate that the cert-manager pods are running
Apply the OpenTelemetry Operator manifest.
For additional installation steps, refer to the OpenTelemetry Operator documentation.
3. Configure OpenTelemetry Collector
Add the OpenTelemetry Collector configuration under the metrics section in your values.yaml file:
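The exact keys depend on the version of the Feast feature server chart you are using; the snippet below is a sketch that assumes a metrics block with an otelCollector endpoint and port, using placeholder values that point at the collector Service.

```yaml
# Sketch of the metrics section in values.yaml (key names and values are
# assumptions; check your chart's values for the authoritative schema).
metrics:
  enabled: true
  otelCollector:
    endpoint: "otel-collector.feast.svc.cluster.local"  # placeholder collector Service DNS name
    port: 4318                                          # placeholder OTLP/HTTP port
```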
4. Add Instrumentation Configuration
Add the following annotations and environment variables to your deployment.yaml:
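A sketch of the pod template changes is shown below. The inject-python annotation is the OpenTelemetry Operator's standard way of requesting Python auto-instrumentation; the OTEL_* variables are standard OpenTelemetry SDK settings. The container name and the .Values.metrics.otelCollector references assume the values layout sketched in step 3.

```yaml
# deployment.yaml (pod template excerpt): request Python auto-instrumentation
# and point the SDK at the collector. Values shown are placeholders.
template:
  metadata:
    annotations:
      instrumentation.opentelemetry.io/inject-python: "true"
  spec:
    containers:
      - name: feast-feature-server   # placeholder container name
        env:
          - name: OTEL_SERVICE_NAME
            value: feast-feature-server
          - name: OTEL_EXPORTER_OTLP_ENDPOINT
            value: "http://{{ .Values.metrics.otelCollector.endpoint }}:{{ .Values.metrics.otelCollector.port }}"
```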
5. Add Metric Checks
Gate the metrics-related resources and settings in all manifests and deployment files on whether metrics are enabled:
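As a sketch, a Helm template check of this kind could look as follows, assuming the metrics.enabled value from step 3; apply the same guard around any metrics-only annotations, environment variables, or resources.

```yaml
# Template excerpt: only render the instrumentation annotation when
# metrics are enabled in values.yaml.
{{- if .Values.metrics.enabled }}
annotations:
  instrumentation.opentelemetry.io/inject-python: "true"
{{- end }}
```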
6. Add Required Manifests
Add the following components to your chart (a sketch of the Instrumentation resource follows the list):
Instrumentation
OpenTelemetryCollector
ServiceMonitors
Prometheus Instance
RBAC rules
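As an example, an Instrumentation resource for Python auto-instrumentation could look like the sketch below; the name, namespace, and exporter endpoint are illustrative and should match your collector deployment. The other manifests (OpenTelemetryCollector, ServiceMonitors, Prometheus instance, and RBAC rules) follow the usual OpenTelemetry Operator and Prometheus Operator conventions.

```yaml
# Illustrative Instrumentation resource consumed by the
# instrumentation.opentelemetry.io/inject-python annotation.
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: feast-instrumentation   # placeholder name
  namespace: feast              # placeholder namespace
spec:
  exporter:
    endpoint: http://otel-collector.feast.svc.cluster.local:4318  # placeholder OTLP/HTTP endpoint
  propagators:
    - tracecontext
    - baggage
  python:
    env:
      - name: OTEL_METRICS_EXPORTER
        value: otlp
```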
7. Deploy Feast
Deploy Feast with metrics enabled, for example by setting metrics.enabled=true in your Helm values; an example configuration is shown in the Usage section below.
Usage
To enable OpenTelemetry monitoring in your Feast deployment:
Set metrics.enabled=true in your Helm values
Configure the OpenTelemetry Collector endpoint
Deploy with proper annotations and environment variables
Example configuration:
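A minimal sketch of such a configuration, assuming the values layout from step 3 (the key names are assumptions; the instrumentation annotation and OTEL_* environment variables are added in the deployment template as shown in step 4):

```yaml
# Example Helm values enabling metrics (placeholder endpoint and port).
metrics:
  enabled: true
  otelCollector:
    endpoint: "otel-collector.feast.svc.cluster.local"
    port: 4318
```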
Monitoring
Once configured, you can monitor various metrics including:
feast_feature_server_memory_usage: Memory utilization of the feature server
feast_feature_server_cpu_usage: CPU usage statistics
Additional custom metrics based on your configuration
These metrics can be visualized using Prometheus and other compatible monitoring tools.
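For example, a ServiceMonitor of roughly the following shape lets a Prometheus Operator instance scrape the collector's Prometheus endpoint, after which the feast_feature_server_* metrics become queryable; the names, labels, and port are illustrative and must match your collector Service.

```yaml
# Illustrative ServiceMonitor targeting the collector's metrics Service.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: feast-otel-collector    # placeholder name
  namespace: feast              # placeholder namespace
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: otel-collector   # placeholder label on the collector Service
  endpoints:
    - port: prometheus          # placeholder port name exposing the Prometheus exporter
      interval: 30s
```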