Adding or reusing tests

Overview

This guide will go over:
1. how Feast tests are set up
2. how to extend the test suite to test new functionality
3. how to use the existing test suite to test a new custom offline / online store

Test suite overview

Let's inspect the test setup as is:
```
$ tree
.
├── e2e
│   └── test_universal_e2e.py
├── feature_repos
│   ├── repo_configuration.py
│   └── universal
│       ├── data_source_creator.py
│       ├── data_sources
│       │   ├── bigquery.py
│       │   ├── file.py
│       │   └── redshift.py
│       ├── entities.py
│       └── feature_views.py
├── offline_store
│   ├── test_s3_custom_endpoint.py
│   └── test_universal_historical_retrieval.py
├── online_store
│   ├── test_e2e_local.py
│   ├── test_feature_service_read.py
│   ├── test_online_retrieval.py
│   └── test_universal_online.py
├── registration
│   ├── test_cli.py
│   ├── test_cli_apply_duplicated_featureview_names.py
│   ├── test_cli_chdir.py
│   ├── test_feature_service_apply.py
│   ├── test_feature_store.py
│   ├── test_inference.py
│   ├── test_registry.py
│   ├── test_universal_odfv_feature_inference.py
│   └── test_universal_types.py
└── scaffolding
    ├── test_init.py
    ├── test_partial_apply.py
    ├── test_repo_config.py
    └── test_repo_operations.py

8 directories, 27 files
```
feature_repos contains the setup files for most tests in the test suite and defines pytest fixtures for the others. Crucially, it parametrizes tests across different offline stores, online stores, etc., and abstracts away store-specific implementation details, so individual tests don't need to reimplement setup steps such as uploading dataframes to a specific store.
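To make the parametrization idea concrete, here is a small illustrative sketch (not Feast's actual code, and the store names are placeholders) of how one test body can be fanned out across every offline / online store combination:

```python
# Illustrative only: one test body fanned out across store combinations.
# Feast's real list of combinations lives in feature_repos/repo_configuration.py.
import itertools

OFFLINE_STORES = ["file", "bigquery", "redshift"]   # hypothetical
ONLINE_STORES = ["sqlite", "redis"]                 # hypothetical

def store_combinations():
    """Every (offline, online) pair a single test body will run against."""
    return list(itertools.product(OFFLINE_STORES, ONLINE_STORES))

# With pytest, the same idea is typically expressed as a parametrized fixture,
# so any test that takes the fixture runs once per combination:
#
#   @pytest.fixture(params=store_combinations())
#   def environment(request):
#       offline, online = request.param
#       ...build a FeatureStore wired to those stores...
```

With the three offline and two online stores above, a single test function would run six times, once per pair.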

Understanding an example test

Let's look at a sample test using the universal repo:
```python
@pytest.mark.integration
@pytest.mark.parametrize("full_feature_names", [True, False], ids=lambda v: str(v))
def test_historical_features(environment, universal_data_sources, full_feature_names):
    store = environment.feature_store

    (entities, datasets, data_sources) = universal_data_sources
    feature_views = construct_universal_feature_views(data_sources)

    customer_df, driver_df, orders_df, global_df, entity_df = (
        datasets["customer"],
        datasets["driver"],
        datasets["orders"],
        datasets["global"],
        datasets["entity"],
    )

    # ... more test code

    customer_fv, driver_fv, driver_odfv, order_fv, global_fv = (
        feature_views["customer"],
        feature_views["driver"],
        feature_views["driver_odfv"],
        feature_views["order"],
        feature_views["global"],
    )

    feature_service = FeatureService(
        "convrate_plus100",
        features=[
            feature_views["driver"][["conv_rate"]],
            feature_views["driver_odfv"],
        ],
    )

    feast_objects = []
    feast_objects.extend(
        [
            customer_fv,
            driver_fv,
            driver_odfv,
            order_fv,
            global_fv,
            driver(),
            customer(),
            feature_service,
        ]
    )
    store.apply(feast_objects)

    # ... more test code

    job_from_df = store.get_historical_features(
        entity_df=entity_df_with_request_data,
        features=[
            "driver_stats:conv_rate",
            "driver_stats:avg_daily_trips",
            "customer_profile:current_balance",
            "customer_profile:avg_passenger_count",
            "customer_profile:lifetime_trip_count",
            "conv_rate_plus_100:conv_rate_plus_100",
            "conv_rate_plus_100:conv_rate_plus_val_to_add",
            "order:order_is_success",
            "global_stats:num_rides",
            "global_stats:avg_ride_length",
        ],
        full_feature_names=full_feature_names,
    )
    actual_df_from_df_entities = job_from_df.to_df()

    # ... more test code

    assert_frame_equal(
        expected_df, actual_df_from_df_entities, check_dtype=False,
    )

    # ... more test code
```
The key fixtures are environment and universal_data_sources, which are defined in the feature_repos directory. By default, they pull in a standard dataset with driver and customer entities, a standard set of feature views, and feature values. Simply by taking environment as a parameter, the test is automatically parametrized across the configured offline / online store combinations.
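The full_feature_names parameter changes the column names of the retrieved dataframe. A small illustrative helper (not part of Feast) shows the mapping from a feature reference to the expected column name, assuming the double-underscore prefix convention:

```python
def feature_column(feature_ref: str, full_feature_names: bool) -> str:
    """Map a feature reference like "driver_stats:conv_rate" to the column
    name expected in the retrieved dataframe. With full_feature_names=True,
    the feature view name is prefixed with a double underscore."""
    view, feature = feature_ref.split(":")
    return f"{view}__{feature}" if full_feature_names else feature
```

This is why the test above parametrizes on full_feature_names: the assertions must hold under both naming schemes.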

Writing a new test or reusing existing tests

To:
- Include a new offline store:
  - Extend data_source_creator.py for your offline store.
  - In repo_configuration.py, add a new IntegrationTestRepoConfig or two (depending on how many online stores you want to test against).
  - Run the full test suite with make test-python-integration.
- Include a new online store:
  - In repo_configuration.py, add a new config that maps to a serialized version of the configuration you would need in feature_store.yaml to set up the online store.
  - In repo_configuration.py, add new IntegrationTestRepoConfigs for the offline stores you want to test against.
  - Run the full test suite with make test-python-integration.
- Add a new test to an existing test file:
  - Use the same function signatures as an existing test (e.g. take environment as an argument) to include the relevant test fixtures.
  - We prefer to expand what an individual test covers, because of the cost of standing up offline / online stores.
- Use custom data in a new test:
  - This pattern is used in several places, such as test_universal_types.py:
```python
@pytest.mark.integration
def your_test(environment: Environment):
    df = ...  # construct your custom dataframe here
    data_source = environment.data_source_creator.create_data_source(
        df,
        destination_name=environment.feature_store.project,
    )
    your_fv = driver_feature_view(data_source)
    entity = driver(value_type=ValueType.UNKNOWN)
    environment.feature_store.apply([your_fv, entity])

    # ... run test

    environment.data_source_creator.teardown()
```
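The "include a new offline store" steps above center on extending data_source_creator.py. As a hedged, self-contained sketch of the shape such a class takes (using a stand-in base class and a fictional "MyWarehouse" store, rather than Feast's real abstract base and API):

```python
from typing import Any, List

class DataSourceCreator:
    """Stand-in for Feast's abstract base in
    feature_repos/universal/data_source_creator.py."""
    def create_data_source(self, df: Any, destination_name: str, **kwargs) -> Any: ...
    def create_offline_store_config(self) -> Any: ...
    def teardown(self) -> None: ...

class MyWarehouseDataSourceCreator(DataSourceCreator):
    """Hypothetical creator for a fictional 'MyWarehouse' offline store."""

    def __init__(self, project_name: str):
        self.project_name = project_name
        self.tables: List[str] = []  # track tables so teardown can drop them

    def create_data_source(self, df, destination_name, **kwargs):
        table = f"{self.project_name}_{destination_name}"
        # Real code would upload `df` to the warehouse and return a Feast
        # DataSource pointing at `table`; here we just record the table name.
        self.tables.append(table)
        return table

    def teardown(self):
        # Real code would drop every table created during the test run.
        self.tables.clear()
```

The important design point, visible in the test above, is that the creator owns both setup (create_data_source) and cleanup (teardown), so parametrized tests stay store-agnostic.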