A comprehensive benchmarking framework for Online Analytical Processing (OLAP) database management systems. OLAPBench provides standardized benchmarks and automated testing capabilities across multiple database systems including DuckDB, PostgreSQL, ClickHouse, Hyper, Umbra, MonetDB, SingleStore, SQL Server, and more.
- Overview
- Features
- Supported Database Systems
- Supported Benchmarks
- Installation
- Quick Start
- Configuration
- Running Benchmarks
- Benchmark Results
- Project Structure
- Adding New Systems
- Adding New Benchmarks
- Examples
- Contributing
OLAPBench is designed to provide fair, reproducible, and comprehensive performance comparisons between different OLAP database systems. It automates the entire benchmarking process from data generation and database setup to query execution and result collection.
The framework supports:
- Multiple benchmark suites including TPC-H, TPC-DS, ClickBench, Join Order Benchmark (JOB), Star Schema Benchmark (SSB), and StackOverflow
- Containerized database deployments for consistent testing environments
- Query plan analysis and performance profiling
- Automated result collection and CSV export
- Timeout handling and error recovery
- Version comparison across different database releases
- 🚀 Automated Benchmarking: Complete automation from data generation to result analysis
- 📊 Multiple Metrics: Query execution time, compilation time, memory usage, and result validation
- 🐳 Docker Integration: Containerized database deployments for reproducible environments
- 📈 Query Plan Analysis: Capture and analyze query execution plans
- 🔄 Retry Logic: Automatic retry on failures with container cleanup
- 📝 Comprehensive Logging: Detailed logging with configurable verbosity levels
- 🎯 Selective Testing: Include/exclude specific queries or systems
- 📋 Result Export: CSV export for further analysis and visualization
| Database | Version Support | Container | Notes |
|---|---|---|---|
| DuckDB | 0.7.0+ | ✅ | In-memory and persistent modes |
| PostgreSQL | 12.0+ | ✅ | Full SQL support |
| ClickHouse | Latest | ✅ | Columnar storage optimized |
| Hyper | Latest | ✅ | High-performance engine |
| Umbra | Latest | ✅ | Research prototype |
| MonetDB | Latest | ✅ | Column-store pioneer |
| SingleStore | Latest | ✅ | Distributed SQL |
| SQL Server | Latest | ✅ | Microsoft's enterprise RDBMS |
| CedarDB | Latest | ✅ | Modern analytical engine |
| Apollo | Latest | ✅ | SQL Server variant |
**TPC-H**
- Purpose: Ad-hoc query performance on business data
- Scale Factors: 0.01, 0.1, 1, 10, 100+ GB
- Queries: 22 complex analytical queries
- Data: Order/lineitem, customer, supplier, part, nation, region tables
**TPC-DS**
- Purpose: Decision support system performance
- Scale Factors: Configurable from MB to TB
- Queries: 99 complex analytical queries
- Data: Retail business model with complex relationships
**ClickBench**
- Purpose: Real-world web analytics workload
- Data Source: Yandex.Metrica web analytics dataset
- Queries: 43 queries covering typical analytical patterns
- Size: ~100GB of real web traffic data
**Join Order Benchmark (JOB)**
- Purpose: Complex join performance evaluation
- Data Source: Internet Movie Database (IMDb)
- Queries: 113 queries with complex join patterns
- Focus: Multi-table joins and query optimization
**Star Schema Benchmark (SSB)**
- Purpose: Star schema performance (derived from TPC-H)
- Schema: Simplified star schema design
- Queries: 13 queries testing different analytical patterns
- Use Case: Data warehouse workloads
**StackOverflow**
- Purpose: Real-world forum data analysis
- Data Source: StackOverflow database dump
- Queries: Analytical queries on posts, users, votes, comments
- Size: Multi-GB real dataset
- Python 3.7+
- Docker (for containerized database systems)
- Git (for cloning the repository)
- Virtual Environment support
- Clone the repository:

```bash
git clone https://github.com/SQL-Storm/OLAPBench.git
cd OLAPBench
```

- Run the setup script:

```bash
./setup.sh
```

This will create a Python virtual environment in `.venv/`, install all required dependencies from `requirements.txt`, and set up the benchmarking environment.

- Activate the virtual environment:

```bash
source .venv/bin/activate
```

- Run a simple test:

```bash
./benchmark.sh test/duckdb.benchmark.yaml
```

- Run with verbose output:

```bash
./benchmark.sh --verbose test/postgres.benchmark.yaml
```

- Clear previous results and run:

```bash
./benchmark.sh --clear test/clickhouse.benchmark.yaml
```
Alternatively, run the benchmark runner directly with Python:

```bash
# Activate virtual environment
source .venv/bin/activate

# Run benchmark
python benchmark.py -j test/duckdb.benchmark.yaml --verbose
```

Benchmarks are defined using YAML configuration files. Here's a complete example:
title: "DuckDB Performance Test"
repetitions: 3 # Number of query repetitions
warmup: 1 # Warmup runs before measurement
timeout: 300 # Query timeout in seconds
global_timeout: 1800 # Total benchmark timeout in seconds
fetch_result: true # Whether to fetch and validate results
fetch_result_limit: 1000 # Limit rows fetched for validation
output: "results/duckdb/" # Output directory for results
systems:
- title: "DuckDB ${version}" # System title (supports templating)
dbms: duckdb # Database system identifier
disabled: false # Whether to skip this system
parameter: # Parameter matrix for testing
version:
- "0.9.0"
- "1.0.0"
- "latest"
buffer_size: # Optional: memory configuration
- "1GB"
- "4GB"
settings: # Database-specific settings
max_memory: "8GB"
threads: 4
benchmarks:
- name: tpch # Benchmark identifier
scale: 1 # Scale factor (1 = 1GB for TPC-H)
included_queries: # Optional: specific queries to run
- "1"
- "6"
- "14"
excluded_queries: # Optional: queries to skip
- "2"
- name: clickbench
# No scale needed for ClickBenchThe framework supports parameter matrices for testing multiple configurations:
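Before a long run, a configuration like this can be sanity-checked against the JSON schema shipped in `schemas/` (see the project structure below). A minimal sketch, assuming PyYAML and `jsonschema` are installed and that `schemas/benchmark.schema.json` describes this format; the framework may also perform its own validation:

```python
import json

import yaml  # PyYAML
from jsonschema import ValidationError, validate

# Load the benchmark definition (path taken from the quick-start examples).
with open("test/duckdb.benchmark.yaml") as f:
    config = yaml.safe_load(f)

# Load the schema; the path is assumed from the repository layout.
with open("schemas/benchmark.schema.json") as f:
    schema = json.load(f)

try:
    validate(instance=config, schema=schema)
    print("Configuration is valid.")
except ValidationError as err:
    print(f"Invalid configuration: {err.message}")
```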
The framework supports parameter matrices for testing multiple configurations:

```yaml
systems:
  - title: "PostgreSQL ${version} (${shared_buffers})"
    dbms: postgres
    parameter:
      version: ["13.0", "14.0", "15.0", "16.0"]
      shared_buffers: ["1GB", "2GB", "4GB"]
    settings:
      shared_buffers: "${shared_buffers}"
      max_connections: 100
```

This creates 12 different test configurations (4 versions × 3 buffer sizes).
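Conceptually, the expansion is the cross product of all parameter lists, with each combination substituted into the templated title and settings. A short sketch of that idea (illustration only, not the framework's internal code):

```python
from itertools import product

# Parameter lists from the PostgreSQL example above.
parameter = {
    "version": ["13.0", "14.0", "15.0", "16.0"],
    "shared_buffers": ["1GB", "2GB", "4GB"],
}

# Cross product: 4 versions x 3 buffer sizes = 12 configurations.
keys = list(parameter)
configs = [dict(zip(keys, values)) for values in product(*parameter.values())]

for cfg in configs:
    # Titles are templated the same way as "${version}" / "${shared_buffers}" above.
    print("PostgreSQL {version} ({shared_buffers})".format(**cfg))

print(len(configs))  # 12
```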
```
./benchmark.sh [OPTIONS] <config.yaml> [benchmark_names...]

Options:
  --clear              Clear previous results before running
  --launch             Only launch databases without running queries
  --verbose/-v         Enable verbose logging
  --veryverbose/-vv    Enable very verbose logging
  --noretry            Don't retry on failure

Examples:
  ./benchmark.sh test/duckdb.benchmark.yaml
  ./benchmark.sh --clear --verbose production/tpch.benchmark.yaml
  ./benchmark.sh config.yaml tpch ssb    # Run only TPC-H and SSB
```

```
python benchmark.py [OPTIONS]

Options:
  -j/--json FILE        Benchmark configuration file (required)
  -v/--verbose          Verbose output
  -vv/--very-verbose    Very verbose output
  --db DIR              Database storage directory (default: ./db)
  --data DIR            Data storage directory (default: ./data)
  --env FILE            Environment variables file
  --clear               Clear previous results
  --launch              Launch databases only
```

Create a `.env` file or use `--env` to specify database connection parameters:
```bash
# Database connection settings
POSTGRES_PASSWORD=mypassword
CLICKHOUSE_PASSWORD=mypassword

# Resource limits
MAX_MEMORY=16GB
MAX_THREADS=8

# Docker settings
DOCKER_MEMORY_LIMIT=32GB
```
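The file holds plain `KEY=VALUE` lines. How OLAPBench consumes them is determined by its `--env` handling; the sketch below only illustrates parsing such a file into environment variables (the helper `load_env_file` is made up for this example):

```python
import os


def load_env_file(path: str) -> None:
    """Parse simple KEY=VALUE lines into os.environ, skipping blanks and comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()


load_env_file(".env")
print(os.environ.get("MAX_MEMORY"))  # "16GB"
```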
Results are organized in the specified output directory:

```
results/
├── duckdb/
│ ├── tpch_sf1.csv # Main results CSV
│ ├── tpch_sf1_plans/ # Query plans (if enabled)
│ │ ├── query_1_plan.json
│ │ └── query_6_plan.xml
│ └── logs/ # Detailed logs
│ ├── duckdb_1.0.0.log
│ └── benchmark.log
└── postgres/
    └── tpch_sf1.csv
```
The main results are exported to CSV with the following columns:
| Column | Description |
|---|---|
| `title` | System configuration title |
| `query` | Query identifier |
| `state` | Execution state (success/error/timeout/oom/fatal) |
| `client_total` | End-to-end execution times (JSON array) |
| `total` | Database-reported total times |
| `execution` | Query execution times |
| `compilation` | Query compilation times |
| `rows` | Number of rows returned |
| `message` | Error message (if applicable) |
```python
import json

import pandas as pd

# Load results
df = pd.read_csv('results/duckdb/tpch_sf1.csv')

# Calculate median execution times (client_total is a JSON array of run times)
df['median_time'] = df['client_total'].apply(
    lambda x: pd.Series(json.loads(x)).median()
)

# Group by system
summary = df.groupby('title')['median_time'].agg(['mean', 'sum', 'count'])
print(summary)
```
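The `state` and `message` columns (see the table above) are useful for spotting failures and timeouts. A second sketch on the same CSV:

```python
import pandas as pd

df = pd.read_csv('results/duckdb/tpch_sf1.csv')

# Count execution states (success/error/timeout/oom/fatal) per system.
state_counts = df.pivot_table(
    index='title', columns='state', values='query',
    aggfunc='count', fill_value=0
)
print(state_counts)

# List the queries that did not succeed, together with their error messages.
failed = df.loc[df['state'] != 'success', ['title', 'query', 'state', 'message']]
print(failed.to_string(index=False))
```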
```
OLAPBench/
├── benchmark.py               # Main benchmark runner
├── benchmark.sh # Shell wrapper script
├── setup.sh # Environment setup
├── requirements.txt # Python dependencies
├── test.py # Test runner
├── benchmarks/ # Benchmark implementations
│ ├── benchmark.py # Base benchmark class
│ ├── tpch/ # TPC-H benchmark
│ │ ├── tpch.py
│ │ ├── tpch.dbschema.json
│ │ ├── dbgen.sh
│ │ └── queries/ # Query files
│ ├── clickbench/ # ClickBench benchmark
│ ├── tpcds/ # TPC-DS benchmark
│ ├── job/ # Join Order Benchmark
│ ├── ssb/ # Star Schema Benchmark
│ └── stackoverflow/ # StackOverflow benchmark
├── dbms/ # Database system implementations
│ ├── dbms.py # Base DBMS class
│ ├── duckdb.py # DuckDB implementation
│ ├── postgres.py # PostgreSQL implementation
│ ├── clickhouse.py # ClickHouse implementation
│ └── ... # Other database systems
├── docker/ # Docker configurations
│ ├── duckdb/
│ │ ├── Dockerfile
│ │ └── server.py
│ ├── postgres/
│ └── ...
├── queryplan/ # Query plan analysis
│ ├── queryplan.py
│ ├── parsers/ # Database-specific parsers
│ └── encoder/ # Plan serialization
├── util/ # Utility modules
│ ├── logger.py # Logging framework
│ ├── formatter.py # Output formatting
│ ├── process.py # Process management
│ └── ...
├── schemas/ # JSON schemas
├── test/ # Test configurations
└── data/                      # Generated benchmark data
```
⚠️ Important: New database systems must be registered in `dbms/dbms.py` in the `database_systems()` function to be recognized by the framework.
To add support for a new database system, you need to create the implementation and register it in the framework:
- Create DBMS implementation (`dbms/newsystem.py`):
```python
from dbms.dbms import DBMS, Result, DBMSDescription


class NewSystem(DBMS):
    def __init__(self, benchmark, db_dir, data_dir, params, settings):
        super().__init__(benchmark, db_dir, data_dir, params, settings)

    @property
    def name(self) -> str:
        return "newsystem"

    def _connect(self, port: int):
        # Implement connection logic
        pass

    def _create_schema(self):
        # Implement schema creation
        pass

    def _load_data(self):
        # Implement data loading
        pass

    def _execute_query(self, query: str) -> Result:
        # Implement query execution
        pass


class NewSystemDescription(DBMSDescription):
    @staticmethod
    def get_name() -> str:
        return "newsystem"

    @staticmethod
    def instantiate(benchmark, db_dir, data_dir, params, settings):
        return NewSystem(benchmark, db_dir, data_dir, params, settings)
```

- Create Docker configuration (`docker/newsystem/`):

```dockerfile
FROM newsystem:latest
COPY entrypoint.sh /entrypoint.sh
EXPOSE 5432
ENTRYPOINT ["/entrypoint.sh"]
```

- Register the new system in `dbms/dbms.py`:
Add the import and include it in the `database_systems()` function:
```python
def database_systems() -> Dict[str, DBMSDescription]:
    from dbms import apollo, cedardb, clickhouse, duckdb, hyper, monetdb, postgres, singlestore, sqlserver, umbra, umbradev, newsystem

    dbms_list = [
        apollo.ApolloDescription, cedardb.CedarDBDescription, clickhouse.ClickHouseDescription,
        duckdb.DuckDBDescription, hyper.HyperDescription, monetdb.MonetDBDescription,
        postgres.PostgresDescription, singlestore.SingleStoreDescription, sqlserver.SQLServerDescription,
        umbra.UmbraDescription, umbradev.UmbraDevDescription, newsystem.NewSystemDescription
    ]
    return {dbms.get_name(): dbms for dbms in dbms_list}
```

- Update schema (`schemas/benchmark.schema.json`):
```json
{
  "dbms": {
    "enum": [..., "newsystem"]
  }
}
```
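Most of the per-system work happens in `_execute_query`. The sketch below is a standalone illustration of the logic it typically needs (execute the statement, time it, fetch a bounded number of rows, classify failures), using `sqlite3` as a stand-in connection and a plain dict in place of the framework's `Result` type from `dbms/dbms.py`:

```python
import sqlite3
import time


def run_query(connection: sqlite3.Connection, query: str, fetch_limit: int = 1000) -> dict:
    """Execute a query, measure wall-clock time, and capture rows or the error.

    Illustration only: a real implementation returns the Result type defined in
    dbms/dbms.py instead of a plain dict.
    """
    start = time.perf_counter()
    try:
        cursor = connection.execute(query)
        rows = cursor.fetchmany(fetch_limit)
        elapsed = time.perf_counter() - start
        return {"state": "success", "client_total": elapsed, "rows": len(rows)}
    except sqlite3.Error as err:
        elapsed = time.perf_counter() - start
        return {"state": "error", "client_total": elapsed, "message": str(err)}


# Stand-in usage with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nation (n_nationkey INTEGER, n_name TEXT)")
print(run_query(conn, "SELECT count(*) FROM nation"))
print(run_query(conn, "SELECT * FROM missing_table"))
```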
⚠️ Important: New benchmarks must be registered in `benchmarks/benchmark.py` in the `benchmarks()` function to be recognized by the framework.
To add a new benchmark, you need to create the implementation and register it in the framework:
- Create benchmark implementation (`benchmarks/newbench/newbench.py`):
```python
from benchmarks.benchmark import Benchmark, BenchmarkDescription


class NewBench(Benchmark):
    @property
    def name(self) -> str:
        return "newbench"

    @property
    def description(self) -> str:
        return "New Benchmark Description"

    def dbgen(self):
        # Implement data generation
        pass


class NewBenchDescription(BenchmarkDescription):
    @staticmethod
    def get_name() -> str:
        return "newbench"

    @staticmethod
    def instantiate(base_dir: str, args: dict):
        return NewBench(base_dir, args)
```

- Create schema definition (`benchmarks/newbench/newbench.dbschema.json`):
```json
{
  "file_ending": "csv",
  "delimiter": ",",
  "format": "text",
  "tables": [
    {
      "name": "table1",
      "columns": [
        {"name": "id", "type": "INTEGER PRIMARY KEY"},
        {"name": "name", "type": "VARCHAR(255)"}
      ]
    }
  ]
}
```

- Add queries (`benchmarks/newbench/queries/`):
  - Create SQL files: `1.sql`, `2.sql`, etc.
  - Each file contains one query.
- Register the new benchmark in `benchmarks/benchmark.py`:

Add the import and include it in the `benchmarks()` function:
```python
def benchmarks() -> dict[str, BenchmarkDescription]:
    from benchmarks.clickbench import clickbench
    from benchmarks.job import job
    from benchmarks.ssb import ssb
    from benchmarks.tpcds import tpcds
    from benchmarks.tpch import tpch
    from benchmarks.stackoverflow import stackoverflow
    from benchmarks.newbench import newbench

    benchmark_list = [
        clickbench.ClickBenchDescription,
        job.JOBDescription,
        ssb.SSBDescription,
        stackoverflow.StackOverflowDescription,
        tpcds.TPCDSDescription,
        tpch.TPCHDescription,
        newbench.NewBenchDescription
    ]
    return {benchmark.get_name(): benchmark for benchmark in benchmark_list}
```

- Create data generation script (`benchmarks/newbench/dbgen.sh`):
```bash
#!/bin/bash

# Generate benchmark data
echo "Generating NewBench data..."
```
The `test/` directory contains example configurations for most supported systems. You can use these as templates:

- `test/duckdb.benchmark.yaml` - DuckDB with version matrix
- `test/postgres.benchmark.yaml` - PostgreSQL with multiple versions
- `test/clickhouse.benchmark.yaml` - ClickHouse configuration
- `test/hyper.benchmark.yaml` - Hyper configuration
- `test/singlestore.benchmark.yaml` - SingleStore configuration
- `test/sqlserver.benchmark.yaml` - SQL Server configuration
- `test/umbra.benchmark.yaml` - Umbra configuration
- `test/umbradev.benchmark.yaml` - Umbra development version
- `test/apollo.benchmark.yaml` - Apollo configuration
Note: Some supported systems (MonetDB, CedarDB) don't have example test configurations yet.
Create a custom benchmark configuration:
title: "Custom Performance Comparison"
repetitions: 5
warmup: 2
timeout: 600
global_timeout: 3600
output: "results/custom/"
systems:
- title: "DuckDB Latest"
dbms: duckdb
parameter:
version: ["latest"]
- title: "PostgreSQL 16"
dbms: postgres
parameter:
version: ["16.0"]
settings:
shared_buffers: "2GB"
work_mem: "256MB"
benchmarks:
- name: tpch
scale: 1
included_queries: ["1", "3", "5", "6", "10"]
- name: clickbench
included_queries: ["0", "1", "2", "3", "4"]Run the custom benchmark:
./benchmark.sh --verbose custom.benchmark.yaml
```

Contributions are welcome! Please:
- Fork the repository
- Create a feature branch: `git checkout -b feature/new-feature`
- Add tests for new functionality
- Follow code style guidelines
- Submit a pull request
To set up a development environment:

```bash
# Clone and setup
git clone https://github.com/SQL-Storm/OLAPBench.git
cd OLAPBench
./setup.sh
source .venv/bin/activate

# Run tests
python test.py

# Run specific test
./benchmark.sh test/duckdb.benchmark.yaml
```

- Follow Python PEP 8 style guidelines
- Add docstrings for public methods
- Include type hints where appropriate
- Write unit tests for new functionality
- Update documentation for new features
- SQLStorm: Query performance analysis framework - https://github.com/SQL-Storm/SQLStorm
- TPC Benchmarks: Official TPC benchmark specifications - http://www.tpc.org/
- ClickBench: ClickHouse benchmark suite - https://github.com/ClickHouse/ClickBench
This project is licensed under the MIT License - see the LICENSE file for details.
- TPC Council for benchmark specifications
- Database system developers and communities
- Academic research on query performance analysis
- Open source contributors and maintainers
For more examples and advanced configurations, see the SQLStorm examples directory.