This document provides a high-level introduction to Feast (Feature Store), explaining its purpose as a machine learning feature platform and its core architecture. Detailed information about specific components is covered in the subsequent pages of this wiki.
Feast (Feature Store) is an open-source machine learning feature store that manages the infrastructure for productionizing analytic data for model training and online inference. It provides a unified data access layer that abstracts feature storage from feature retrieval.
Key Capabilities:
Primary Use Cases:
Sources: README.md26-38 docs/getting-started/quickstart.md1-16 sdk/python/feast/feature_store.py104-107
FeatureStore Class Architecture
The FeatureStore class at sdk/python/feast/feature_store.py104-192 is the primary entry point for all Feast operations. It orchestrates interactions between the registry, provider, offline stores, and online stores.
Core Components:
- RepoConfig: Configuration loaded from feature_store.yaml containing project name, registry settings, and offline/online store configs
- BaseRegistry: Metadata store that persists feature definitions (entities, feature views, feature services)
- Provider: Abstraction layer that delegates operations to offline/online stores
- Path: File system path to the feature repository

Initialization Flow:
1. Loads feature_store.yaml or accepts a RepoConfig object sdk/python/feast/feature_store.py120-157
2. Selects the registry implementation based on registry_type sdk/python/feast/feature_store.py159-180
3. Obtains the provider via get_provider() sdk/python/feast/feature_store.py182

Sources: sdk/python/feast/feature_store.py104-207
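The initialization flow above can be sketched in plain Python. This is a minimal illustration, not Feast's implementation: every class and function name here (SketchRepoConfig, SketchFeatureStore, stub_load_config, get_sketch_provider, and the registry stand-ins) is hypothetical, and the config loader is a stub that does not parse YAML.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SketchRepoConfig:
    """Hypothetical stand-in for RepoConfig with only the fields used here."""
    project: str
    provider: str = "local"
    registry_type: str = "file"


class SketchFileRegistry:
    def __init__(self, config: SketchRepoConfig) -> None:
        self.config = config


class SketchSqlRegistry:
    def __init__(self, config: SketchRepoConfig) -> None:
        self.config = config


class SketchProvider:
    def __init__(self, config: SketchRepoConfig) -> None:
        self.config = config


# Hypothetical lookup table from registry_type to a registry class.
SKETCH_REGISTRY_FOR_TYPE = {"file": SketchFileRegistry, "sql": SketchSqlRegistry}


def stub_load_config(repo_path: Path) -> SketchRepoConfig:
    # A real loader would parse repo_path / "feature_store.yaml". This stub only
    # derives a project name from the directory name, so no YAML parser is needed.
    return SketchRepoConfig(project=repo_path.name or "default")


def get_sketch_provider(config: SketchRepoConfig) -> SketchProvider:
    # Stand-in for a provider factory keyed on config.provider.
    return SketchProvider(config)


class SketchFeatureStore:
    def __init__(
        self,
        repo_path: Optional[str] = None,
        config: Optional[SketchRepoConfig] = None,
    ) -> None:
        # Step 1: use the given config object, or load one from the repo path.
        self.repo_path = Path(repo_path) if repo_path is not None else Path.cwd()
        self.config = config if config is not None else stub_load_config(self.repo_path)

        # Step 2: choose the registry implementation from registry_type.
        try:
            registry_cls = SKETCH_REGISTRY_FOR_TYPE[self.config.registry_type]
        except KeyError:
            raise ValueError(
                f"unknown registry_type: {self.config.registry_type!r}"
            ) from None
        self.registry = registry_cls(self.config)

        # Step 3: obtain the provider from the factory.
        self.provider = get_sketch_provider(self.config)
```

Passing `config=SketchRepoConfig(project="demo", registry_type="sql")` yields a store whose `registry` is a `SketchSqlRegistry`; an unrecognized `registry_type` raises `ValueError` before any provider is created.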
Configuration System Architecture
The RepoConfig class at sdk/python/feast/repo_config.py194-558 manages all Feast configuration, loaded from feature_store.yaml. It uses Pydantic for validation and supports multiple configuration patterns.
Key Configuration Properties:
| Property | Type | Purpose |
|---|---|---|
| project | str | Unique project identifier for multi-tenant deployments |
| provider | str | Provider type: local, gcp, aws, azure |
| registry | RegistryConfig | Registry backend configuration |
| offline_store | OfflineStoreConfig | Offline store type and connection details |
| online_store | OnlineStoreConfig | Online store type and connection details |
| batch_engine | ComputeEngineConfig | Materialization engine configuration |
Registry Configuration Types sdk/python/feast/repo_config.py136-184:
- cache_ttl_seconds for caching

Type Resolution sdk/python/feast/repo_config.py39-107:
- REGISTRY_CLASS_FOR_TYPE: Maps string types to registry implementation classes
- OFFLINE_STORE_CLASS_FOR_TYPE: Maps offline store types (e.g., "bigquery" → "feast.infra.offline_stores.bigquery.BigQueryOfflineStore")
- ONLINE_STORE_CLASS_FOR_TYPE: Maps online store types (e.g., "redis" → "feast.infra.online_stores.redis.RedisOnlineStore")

Sources: sdk/python/feast/repo_config.py194-558 sdk/python/feast/repo_config.py39-122
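The type-resolution maps above pair a short type string with a dotted class path. A minimal sketch of how such a map can be resolved to a class at runtime follows. The names SKETCH_CLASS_FOR_TYPE and resolve_class are hypothetical, standard-library classes stand in for store implementations so the example runs without Feast installed, and the fallback that accepts a full dotted path is an assumption of this sketch rather than confirmed Feast behavior.

```python
import importlib
from typing import Dict

# Hypothetical map from a short type name to a dotted class path.
# Standard-library classes are used as stand-ins for store classes.
SKETCH_CLASS_FOR_TYPE: Dict[str, str] = {
    "ordered": "collections.OrderedDict",
    "counter": "collections.Counter",
}


def resolve_class(type_name: str, mapping: Dict[str, str]) -> type:
    """Resolve a short type name (or a full dotted path) to a class object."""
    # Assumption of this sketch: names missing from the map are treated
    # as full dotted paths.
    dotted_path = mapping.get(type_name, type_name)
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"cannot resolve type {type_name!r}")
    # Import lazily so that only the selected backend's module is loaded.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


counter_cls = resolve_class("counter", SKETCH_CLASS_FOR_TYPE)
deque_cls = resolve_class("collections.deque", SKETCH_CLASS_FOR_TYPE)
```

Storing dotted paths instead of class objects keeps optional dependencies from being imported until a backend is actually selected.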
Feature Data Flow Through Feast
Phase 1: Development sdk/python/feast/repo_operations.py114-220
- The parse_repo() function discovers feature definitions by importing Python modules

Phase 2: Registration sdk/python/feast/feature_store.py862-1070
feast apply invokes FeatureStore.apply(), which:
- Infers features and entities via update_feature_views_with_inferred_features_and_entities() sdk/python/feast/feature_store.py624-635
- Applies the diff to the registry via apply_diff_to_registry() sdk/python/feast/feature_store.py845-846
- Updates infrastructure via Provider.update_infra() sdk/python/feast/infra/provider.py69-92

Phase 3: Training Data Generation sdk/python/feast/feature_store.py1440-1599
- get_historical_features() accepts an entity DataFrame with timestamps
- Delegates to Provider.get_historical_features(), which calls the offline store implementation

Phase 4: Materialization sdk/python/feast/feature_store.py2041-2135
- materialize() or materialize_incremental() loads features from offline to online stores
- Calls Provider.materialize_single_feature_view() for each feature view
- Uses OfflineStore.pull_latest_from_table_or_query() to read latest feature values sdk/python/feast/infra/offline_stores/bigquery.py127-183
- Writes the values via Provider.online_write_batch() sdk/python/feast/infra/provider.py124-147

Phase 5: Online Serving sdk/python/feast/feature_store.py1615-1799
- get_online_features() retrieves features for real-time inference
- Delegates to Provider.get_online_features() sdk/python/feast/infra/provider.py308-320
- Returns an OnlineResponse with feature values and metadata

Phase 6: Streaming Ingestion sdk/python/feast/feature_store.py1944-2014
- write_to_online_store() accepts streaming data (Kafka/Kinesis) or direct pushes
- Writes the data via Provider.online_write_batch()

Sources: sdk/python/feast/feature_store.py862-1070 sdk/python/feast/feature_store.py1440-1599 sdk/python/feast/feature_store.py2041-2135 sdk/python/feast/feature_store.py1615-1799 sdk/python/feast/repo_operations.py114-220
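Phases 3 to 5 can be illustrated with a toy model in plain Python. This is a sketch under simplifying assumptions, not Feast code: one feature, in-memory data, no TTL handling, and hypothetical function names (point_in_time_value, historical_features, materialize, online_features). It shows the point-in-time rule used for training data (only values at or before the entity timestamp are visible) and how materialization copies the latest value per entity into a key-value store for serving.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, datetime, float]  # (entity_id, event_timestamp, value)

# Offline store stand-in: an append-only log of feature values.
OFFLINE_ROWS: List[Row] = [
    ("driver_1", datetime(2024, 1, 1), 0.50),
    ("driver_1", datetime(2024, 1, 3), 0.70),
    ("driver_2", datetime(2024, 1, 2), 0.90),
]


def point_in_time_value(
    rows: List[Row], entity_id: str, as_of: datetime
) -> Optional[float]:
    """Latest value for entity_id with event_timestamp <= as_of."""
    candidates = [(ts, v) for e, ts, v in rows if e == entity_id and ts <= as_of]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


def historical_features(
    rows: List[Row], entity_rows: List[Tuple[str, datetime]]
) -> List[Tuple[str, datetime, Optional[float]]]:
    """Training data: join each (entity, timestamp) to the value known at that time."""
    return [(e, ts, point_in_time_value(rows, e, ts)) for e, ts in entity_rows]


def materialize(rows: List[Row], online: Dict[str, float], end: datetime) -> None:
    """Copy the latest value per entity (up to `end`) into the online store."""
    for entity_id in {e for e, _, _ in rows}:
        value = point_in_time_value(rows, entity_id, end)
        if value is not None:
            online[entity_id] = value


def online_features(
    online: Dict[str, float], entity_ids: List[str]
) -> Dict[str, Optional[float]]:
    """Serving: key lookups only, no scan of historical data."""
    return {e: online.get(e) for e in entity_ids}


training = historical_features(
    OFFLINE_ROWS,
    [("driver_1", datetime(2024, 1, 2)), ("driver_2", datetime(2024, 1, 1))],
)
online_store: Dict[str, float] = {}
materialize(OFFLINE_ROWS, online_store, end=datetime(2024, 1, 4))
served = online_features(online_store, ["driver_1", "driver_3"])
```

Here `training` holds 0.50 for driver_1 (the January 3 value is in the future relative to January 2) and `None` for driver_2, while `served` returns 0.70 for driver_1 and `None` for the unknown driver_3.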
Dual Storage System
Feast maintains two storage layers:
Offline Stores (Historical Feature Values)
Supported Offline Stores sdk/python/feast/repo_config.py91-107:
| Store Type | Class | Source / Notes |
|---|---|---|
| BigQuery | BigQueryOfflineStore | bigquery.py125-341 |
| Snowflake | SnowflakeOfflineStore | snowflake.py |
| Redshift | RedshiftOfflineStore | redshift.py95-183 |
| File/Dask | DaskOfflineStore | Local parquet files |
| Spark | SparkOfflineStore | Distributed processing |
Key Methods:
- pull_latest_from_table_or_query(): Retrieves latest feature values per entity sdk/python/feast/infra/offline_stores/bigquery.py127-183
- get_historical_features(): Performs point-in-time joins for training data sdk/python/feast/infra/offline_stores/bigquery.py235-340

Online Stores (Real-Time Feature Values)
Supported Online Stores sdk/python/feast/repo_config.py68-89:
| Store Type | Class | Source / Notes |
|---|---|---|
| SQLite | SqliteOnlineStore | sqlite.py93-567 |
| Redis | RedisOnlineStore | In-memory cache |
| DynamoDB | DynamoDBOnlineStore | AWS managed NoSQL |
| Snowflake | SnowflakeOnlineStore | Hybrid tables |
Key Methods:
- online_write_batch(): Writes feature values to online store sdk/python/feast/infra/online_stores/sqlite.py253-312
- online_read(): Retrieves feature values by entity key sdk/python/feast/infra/online_stores/sqlite.py314-357
- update(): Creates/updates online store tables sdk/python/feast/infra/online_stores/sqlite.py159-227

Provider Abstraction sdk/python/feast/infra/passthrough_provider.py58-495
- PassthroughProvider delegates all storage operations to the configured offline and online stores
- Implements the Provider interface defined in sdk/python/feast/infra/provider.py49-474
- offline_store, online_store, batch_engine sdk/python/feast/infra/passthrough_provider.py69-129

Sources: sdk/python/feast/infra/offline_stores/offline_store.py73-338 sdk/python/feast/infra/online_stores/online_store.py29-380 sdk/python/feast/infra/offline_stores/bigquery.py125-341 sdk/python/feast/infra/online_stores/sqlite.py93-567 sdk/python/feast/infra/passthrough_provider.py58-129 sdk/python/feast/repo_config.py68-107
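The delegation pattern described above can be sketched as follows. This is an illustration of the pass-through idea only: the class names, the simplified method signatures, and the choice to create the store on first access are assumptions of this sketch and do not reproduce Feast's actual interfaces.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional

FeatureRow = Dict[str, float]


class SketchOnlineStore(ABC):
    """Hypothetical, simplified online store interface."""

    @abstractmethod
    def online_write_batch(self, table: str, rows: Dict[str, FeatureRow]) -> None: ...

    @abstractmethod
    def online_read(self, table: str, entity_keys: List[str]) -> List[Optional[FeatureRow]]: ...


class InMemoryOnlineStore(SketchOnlineStore):
    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, FeatureRow]] = {}

    def online_write_batch(self, table: str, rows: Dict[str, FeatureRow]) -> None:
        # Later writes for the same entity key overwrite earlier ones.
        self._tables.setdefault(table, {}).update(rows)

    def online_read(self, table: str, entity_keys: List[str]) -> List[Optional[FeatureRow]]:
        stored = self._tables.get(table, {})
        return [stored.get(key) for key in entity_keys]


class SketchPassthroughProvider:
    """Holds no storage logic of its own; every call is forwarded to the store."""

    def __init__(self, online_store_factory: Callable[[], SketchOnlineStore]) -> None:
        self._factory = online_store_factory
        self._online_store: Optional[SketchOnlineStore] = None

    @property
    def online_store(self) -> SketchOnlineStore:
        # Assumption of this sketch: the store is created on first access.
        if self._online_store is None:
            self._online_store = self._factory()
        return self._online_store

    def online_write_batch(self, table: str, rows: Dict[str, FeatureRow]) -> None:
        self.online_store.online_write_batch(table, rows)

    def online_read(self, table: str, entity_keys: List[str]) -> List[Optional[FeatureRow]]:
        return self.online_store.online_read(table, entity_keys)


provider = SketchPassthroughProvider(InMemoryOnlineStore)
provider.online_write_batch("driver_stats", {"driver_1": {"rating": 4.8}})
result = provider.online_read("driver_stats", ["driver_1", "driver_9"])
```

Because the provider only forwards calls, swapping the backend means passing a different factory; callers of the provider do not change.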
Registry Architecture
The registry is Feast's metadata store, persisting all feature definitions. The BaseRegistry abstract class defines the interface at sdk/python/feast/infra/registry/base_registry.py
Registry Contents:
Registry Implementations:
File Registry sdk/python/feast/infra/registry/registry.py
RegistryStore:
- path: data/registry.db
- path: s3://bucket/registry.db
- path: gs://bucket/registry.db
- cache_ttl_seconds (default 600s)

SQL Registry sdk/python/feast/infra/registry/sql.py
- path: postgresql+psycopg://user:pass@host/db

Snowflake Registry sdk/python/feast/infra/registry/snowflake.py
- registry_type: snowflake.registry

Remote Registry sdk/python/feast/infra/registry/remote.py
- registry_type: remote

Caching Strategy:
- cache_ttl_seconds: Time before cache refresh sdk/python/feast/repo_config.py150-154
- cache_mode: sync (immediate refresh) or thread (background refresh) sdk/python/feast/repo_config.py156-160
- FeatureStore.refresh_registry() sdk/python/feast/feature_store.py208-223

Registry Selection in FeatureStore sdk/python/feast/feature_store.py159-180:
Sources: sdk/python/feast/infra/registry/base_registry.py sdk/python/feast/infra/registry/registry.py sdk/python/feast/infra/registry/sql.py sdk/python/feast/feature_store.py159-180 sdk/python/feast/repo_config.py136-184
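The caching behavior described in this section (a TTL after which the registry is reloaded, plus an explicit refresh) can be sketched as below. CachedRegistrySketch and its methods are hypothetical names; only the synchronous case is modeled, and the background-thread mode is left out. The clock is injectable so that expiry can be exercised without sleeping.

```python
import time
from typing import Callable, Dict, Optional


class CachedRegistrySketch:
    """Caches the result of `load` and reloads it once the TTL has elapsed."""

    def __init__(
        self,
        load: Callable[[], Dict],
        cache_ttl_seconds: float = 600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = cache_ttl_seconds
        self._clock = clock
        self._cached: Optional[Dict] = None
        self._loaded_at: Optional[float] = None

    def refresh(self) -> None:
        # Explicit refresh: reload now and restart the TTL window.
        self._cached = self._load()
        self._loaded_at = self._clock()

    def get(self) -> Dict:
        # Synchronous mode: the caller that hits an expired cache pays for the reload.
        expired = (
            self._loaded_at is None
            or self._clock() - self._loaded_at >= self._ttl
        )
        if expired:
            self.refresh()
        return self._cached


# Usage with a fake clock and a loader that counts its invocations.
fake_now = [0.0]
load_count = [0]


def fake_load() -> Dict:
    load_count[0] += 1
    return {"version": load_count[0]}


registry = CachedRegistrySketch(fake_load, cache_ttl_seconds=600, clock=lambda: fake_now[0])
first = registry.get()        # first access loads
fake_now[0] = 599.0
second = registry.get()       # still inside the TTL window, served from cache
fake_now[0] = 600.0
third = registry.get()        # TTL elapsed, reloaded
```

A longer TTL reduces load on the registry backend at the cost of serving staler feature definitions; an explicit `refresh()` covers the case where a change must be visible immediately.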
Local Development:
- Online store: type: sqlite, path: data/online_store.db sdk/python/feast/infra/online_stores/sqlite.py93-567
- registry: data/registry.db docs/getting-started/quickstart.md106-118

Cloud Deployment:
Kubernetes Deployment docs/how-to-guides/running-feast-in-production.md1-10:
Multi-Environment Setup docs/how-to-guides/running-feast-in-production.md44-49:
- feast plan and feast apply docs/how-to-guides/running-feast-in-production.md39-42

Scaling Considerations docs/how-to-guides/scaling-feast.md1-5:
Sources: docs/getting-started/quickstart.md106-118 docs/how-to-guides/running-feast-in-production.md1-86 docs/how-to-guides/scaling-feast.md1-5 infra/feast-operator/README.md1-7 sdk/python/feast/infra/online_stores/sqlite.py93-567 sdk/python/feast/repo_config.py39-107
The Feast CLI provides commands for managing the feature store lifecycle. Key commands are invoked through sdk/python/feast/repo_operations.py:
Core Commands:
| Command | Function | Implementation |
|---|---|---|
| feast init | Bootstrap new feature repository | repo_operations.py491-579 |
| feast apply | Register features and update infrastructure | repo_operations.py432-458 |
| feast plan | Preview changes (dry-run) | repo_operations.py223-246 |
| feast materialize | Load features to online store | feature_store.py2041-2135 |
| feast materialize-incremental | Incremental materialization | feature_store.py2137-2229 |
| feast serve | Start feature server | feature_server.py |
| feast ui | Start web UI | ui_server.py |
Apply Workflow sdk/python/feast/repo_operations.py432-458:
1. parse_repo() repo_operations.py114-220
2. diff_between() repo_operations.py354-359
3. store.apply() repo_operations.py384-390

Sources: sdk/python/feast/repo_operations.py114-579 sdk/python/feast/feature_store.py2041-2229
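The parse, diff, apply sequence can be sketched with plain dictionaries. This is a toy model under stated assumptions: feature definitions are represented as name-to-spec dicts, and SketchDiff, diff_between_sketch, and apply_sketch are hypothetical helpers, not the functions in repo_operations.py. Computing the diff without applying it corresponds to a dry run in the spirit of feast plan.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Definitions = Dict[str, dict]  # object name -> its specification


@dataclass
class SketchDiff:
    to_add: List[str] = field(default_factory=list)
    to_update: List[str] = field(default_factory=list)
    to_delete: List[str] = field(default_factory=list)


def diff_between_sketch(current: Definitions, desired: Definitions) -> SketchDiff:
    """Compare what is registered with what the repository now declares."""
    diff = SketchDiff()
    for name, spec in desired.items():
        if name not in current:
            diff.to_add.append(name)
        elif current[name] != spec:
            diff.to_update.append(name)
    diff.to_delete = [name for name in current if name not in desired]
    return diff


def apply_sketch(registry: Definitions, desired: Definitions, diff: SketchDiff) -> None:
    """Mutate the registry so that it matches the desired definitions."""
    for name in diff.to_add + diff.to_update:
        registry[name] = desired[name]
    for name in diff.to_delete:
        del registry[name]


# State already registered versus definitions parsed from the repository.
registered: Definitions = {
    "driver_stats": {"ttl_days": 1},
    "old_view": {"ttl_days": 7},
}
parsed: Definitions = {
    "driver_stats": {"ttl_days": 2},
    "customer_stats": {"ttl_days": 1},
}

plan = diff_between_sketch(registered, parsed)  # dry run: nothing is changed yet
apply_sketch(registered, parsed, plan)          # apply: registry now matches the repo
```

Separating the diff from the apply step lets the same comparison back both a preview command and the real update, so what is previewed is what gets applied.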
Feast provides a complete feature store platform with:
- FeatureStore class managing all operations
- RepoConfig supporting multiple storage backends

For specific implementation details, see the subsequent pages in this wiki covering individual components and subsystems.
Sources: README.md26-244 sdk/python/feast/feature_store.py104-207 sdk/python/feast/repo_config.py194-558 docs/getting-started/quickstart.md1-206