Feast and Spark
Configuring Feast to use Spark for ingestion.
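Whichever option you choose below, Feast picks the Spark backend through its job-service launcher setting. As a minimal sketch (the `FEAST_SPARK_LAUNCHER` variable name follows Feast 0.9-era configuration and should be checked against your Feast version):

```shell
# Select which backend Feast uses to launch Spark ingestion jobs.
# Assumed accepted values, per Feast 0.9-era configuration: k8s, dataproc, emr.
export FEAST_SPARK_LAUNCHER=k8s
```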
Option 1. Use Kubernetes Operator for Apache Spark

Add the spark-operator Helm chart repository and install the operator:
helm repo add spark-operator \
https://googlecloudplatform.github.io/spark-on-k8s-operator
helm install my-release spark-operator/spark-operator \
  --set serviceAccounts.spark.name=spark

Then apply a Role and RoleBinding so that Feast can manage SparkApplication resources in its namespace:

cat <<EOF | kubectl apply -f -
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: use-spark-operator
namespace: default # replace if using different namespace
rules:
- apiGroups: ["sparkoperator.k8s.io"]
resources: ["sparkapplications"]
verbs: ["create", "delete", "deletecollection", "get", "list", "update", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: use-spark-operator
namespace: default # replace if using different namespace
roleRef:
kind: Role
name: use-spark-operator
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: default
EOF

Option 2. Use GCP and Dataproc
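With Dataproc, Feast submits ingestion jobs to an existing cluster instead of running the Spark operator. A hedged sketch of the job-service settings involved (variable names and values here follow Feast 0.9-era configuration and are placeholders; verify them against your Feast version):

```shell
# Point Feast at an existing Dataproc cluster (all values are placeholders).
export FEAST_SPARK_LAUNCHER=dataproc
export FEAST_DATAPROC_CLUSTER_NAME=my-feast-cluster
export FEAST_DATAPROC_PROJECT=my-gcp-project
export FEAST_DATAPROC_REGION=us-central1
# GCS bucket where Feast stages job artifacts and intermediate data.
export FEAST_SPARK_STAGING_LOCATION=gs://my-feast-bucket/staging
```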
Option 3. Use AWS and EMR
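The EMR launcher works analogously, submitting jobs to an existing EMR cluster. A sketch under the same assumptions (Feast 0.9-era variable names, placeholder values; verify against your Feast version):

```shell
# Point Feast at an existing EMR cluster (all values are placeholders).
export FEAST_SPARK_LAUNCHER=emr
export FEAST_EMR_CLUSTER_ID=j-XXXXXXXXXXXXX
export FEAST_EMR_REGION=us-east-1
# S3 bucket where Feast stages job artifacts and intermediate data.
export FEAST_SPARK_STAGING_LOCATION=s3://my-feast-bucket/staging
```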