Redshift

Description

The Redshift offline store provides support for reading RedshiftSources.
  • Redshift tables and views are allowed as sources.
  • All joins happen within Redshift.
  • Entity dataframes can be provided either as a SQL query or as a Pandas dataframe. Pandas dataframes are uploaded to Redshift so that the join can run there.
  • A RedshiftRetrievalJob is returned when calling get_historical_features().
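The two entity dataframe forms described above can be sketched roughly as follows. This is a minimal illustration, not the canonical usage: the table name `driver_entities`, the `driver_id` column, the feature reference `driver_hourly_stats:conv_rate`, and the helper names are all hypothetical, and the keyword arguments of get_historical_features() may differ between Feast versions. Feast is imported inside the function so that the module loads even where Feast is not installed.

```python
from datetime import datetime, timezone

# Hypothetical names: adjust the table, columns, and feature references
# to match your own feature repository.
ENTITY_SQL = """
SELECT driver_id, event_timestamp
FROM driver_entities
WHERE event_timestamp >= '2021-04-01'
"""

FEATURES = ["driver_hourly_stats:conv_rate"]


def entity_rows():
    """Rows for a Pandas entity dataframe: an entity key plus event_timestamp."""
    return [
        {"driver_id": 1001, "event_timestamp": datetime(2021, 4, 12, 10, tzinfo=timezone.utc)},
        {"driver_id": 1002, "event_timestamp": datetime(2021, 4, 12, 8, tzinfo=timezone.utc)},
    ]


def fetch_training_df(entity_df, repo_path="."):
    """Run the historical join in Redshift and return the result as a Pandas dataframe.

    entity_df may be a SQL string (evaluated inside Redshift) or a Pandas
    dataframe (uploaded to Redshift before the join).
    """
    from feast import FeatureStore  # lazy import; assumes Feast with AWS extras is installed

    store = FeatureStore(repo_path=repo_path)
    # With the Redshift offline store, this returns a RedshiftRetrievalJob.
    job = store.get_historical_features(entity_df=entity_df, features=FEATURES)
    return job.to_df()
```

Calling `fetch_training_df(ENTITY_SQL)` would use the SQL form, while `fetch_training_df(pandas.DataFrame(entity_rows()))` would use the Pandas form; in both cases the join itself runs inside Redshift.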

Example

feature_store.yaml
    project: my_feature_repo
    registry: data/registry.db
    provider: aws
    offline_store:
      type: redshift
      region: us-west-2
      cluster_id: feast-cluster
      database: feast-database
      user: redshift-user
      s3_staging_location: s3://feast-bucket/redshift
      iam_role: arn:aws:iam::123456789012:role/redshift_s3_access_role
Configuration options are available here.

Permissions

Feast requires the following permissions to execute commands against the Redshift offline store:
| Command | Permissions | Resources |
| --- | --- | --- |
| Apply | redshift-data:DescribeTable, redshift:GetClusterCredentials | arn:aws:redshift:<region>:<account_id>:dbuser:<redshift_cluster_id>/<redshift_username>, arn:aws:redshift:<region>:<account_id>:dbname:<redshift_cluster_id>/<redshift_database_name>, arn:aws:redshift:<region>:<account_id>:cluster:<redshift_cluster_id> |
| Materialize | redshift-data:ExecuteStatement | arn:aws:redshift:<region>:<account_id>:cluster:<redshift_cluster_id> |
| Materialize | redshift-data:DescribeStatement | * |
| Materialize | s3:ListBucket, s3:GetObject, s3:DeleteObject | arn:aws:s3:::<bucket_name>, arn:aws:s3:::<bucket_name>/* |
| Get Historical Features | redshift-data:ExecuteStatement, redshift:GetClusterCredentials | arn:aws:redshift:<region>:<account_id>:dbuser:<redshift_cluster_id>/<redshift_username>, arn:aws:redshift:<region>:<account_id>:dbname:<redshift_cluster_id>/<redshift_database_name>, arn:aws:redshift:<region>:<account_id>:cluster:<redshift_cluster_id> |
| Get Historical Features | redshift-data:DescribeStatement | * |
| Get Historical Features | s3:ListBucket, s3:GetObject, s3:PutObject, s3:DeleteObject | arn:aws:s3:::<bucket_name>, arn:aws:s3:::<bucket_name>/* |
The following inline policy can be used to grant Feast the necessary permissions:
    {
      "Statement": [
        {
          "Action": [
            "s3:ListBucket",
            "s3:PutObject",
            "s3:GetObject",
            "s3:DeleteObject"
          ],
          "Effect": "Allow",
          "Resource": [
            "arn:aws:s3:::<bucket_name>/*",
            "arn:aws:s3:::<bucket_name>"
          ]
        },
        {
          "Action": [
            "redshift-data:DescribeTable",
            "redshift:GetClusterCredentials",
            "redshift-data:ExecuteStatement"
          ],
          "Effect": "Allow",
          "Resource": [
            "arn:aws:redshift:<region>:<account_id>:dbuser:<redshift_cluster_id>/<redshift_username>",
            "arn:aws:redshift:<region>:<account_id>:dbname:<redshift_cluster_id>/<redshift_database_name>",
            "arn:aws:redshift:<region>:<account_id>:cluster:<redshift_cluster_id>"
          ]
        },
        {
          "Action": [
            "redshift-data:DescribeStatement"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ],
      "Version": "2012-10-17"
    }
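Filling in the placeholders by hand is error-prone, so a small script can generate the policy document instead. The following is a sketch under stated assumptions: the function name `feast_redshift_policy` and its parameters are hypothetical and not part of Feast, and the example values are simply those from the feature_store.yaml example earlier on this page.

```python
import json


def feast_redshift_policy(region, account_id, cluster_id, user, database, bucket):
    """Build the inline IAM policy for Feast with the placeholders filled in."""
    prefix = f"arn:aws:redshift:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # S3 access to the staging bucket and the objects inside it.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
            },
            {
                # Redshift access scoped to one user, database, and cluster.
                "Effect": "Allow",
                "Action": [
                    "redshift-data:DescribeTable",
                    "redshift:GetClusterCredentials",
                    "redshift-data:ExecuteStatement",
                ],
                "Resource": [
                    f"{prefix}:dbuser:{cluster_id}/{user}",
                    f"{prefix}:dbname:{cluster_id}/{database}",
                    f"{prefix}:cluster:{cluster_id}",
                ],
            },
            {
                # DescribeStatement is granted on all resources, as in the table.
                "Effect": "Allow",
                "Action": ["redshift-data:DescribeStatement"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    policy = feast_redshift_policy(
        region="us-west-2",
        account_id="123456789012",
        cluster_id="feast-cluster",
        user="redshift-user",
        database="feast-database",
        bucket="feast-bucket",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the IAM console or passed to whatever tooling you use to attach inline policies.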
In addition, the Redshift offline store requires an IAM role that Redshift itself uses to interact with S3. Specifically, Redshift uses this role to run UNLOAD and COPY commands. Once created, the role must be configured in the feature_store.yaml file as iam_role under offline_store.
The following inline policy can be used to grant Redshift the necessary permissions to access S3:
    {
      "Statement": [
        {
          "Action": "s3:*",
          "Effect": "Allow",
          "Resource": [
            "arn:aws:s3:::feast-integration-tests",
            "arn:aws:s3:::feast-integration-tests/*"
          ]
        }
      ],
      "Version": "2012-10-17"
    }
The following trust relationship is also required, so that Redshift, and only Redshift, can assume this role:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "redshift.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
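Both documents for the Redshift role can be generated from the staging location in feature_store.yaml. The sketch below assumes that the role only needs access to the bucket named in s3_staging_location, which this page does not state explicitly; the helper names `staging_bucket` and `redshift_role_documents` are hypothetical.

```python
import json
from urllib.parse import urlparse


def staging_bucket(s3_staging_location):
    """Extract the bucket name from an s3://bucket/prefix URI."""
    parsed = urlparse(s3_staging_location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_staging_location!r}")
    return parsed.netloc


def redshift_role_documents(s3_staging_location):
    """Return (access_policy, trust_policy) for the role Redshift assumes."""
    bucket = staging_bucket(s3_staging_location)
    access_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumption: access is limited to the staging bucket.
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Only the Redshift service principal may assume the role.
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    return access_policy, trust_policy


if __name__ == "__main__":
    access, trust = redshift_role_documents("s3://feast-bucket/redshift")
    print(json.dumps(access, indent=2))
    print(json.dumps(trust, indent=2))
```

Deriving the bucket from s3_staging_location keeps the role policy and the Feast configuration from drifting apart, at the cost of assuming a single staging bucket.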