ForgeQ is a distributed job queue system in Go designed to handle asynchronous workloads reliably using concurrent workers, retry semantics, and fault-tolerant execution.
It demonstrates production-grade patterns such as worker pools, lease-based job execution, graceful shutdown, and at-least-once delivery.
Modern backend systems often need to perform tasks that should not block user requests:
- sending emails
- processing images or videos
- generating reports
- syncing data between services
- running scheduled or delayed jobs
ForgeQ provides a durable job queue and worker execution system, allowing you to enqueue tasks and process them asynchronously with strong guarantees.
- At-least-once execution – jobs are never lost, but may run more than once
- Durable persistence – jobs are stored in Postgres
- Crash recovery – in-progress jobs are safely retried if a worker dies
- Bounded concurrency – worker pools prevent overload
- Graceful shutdown – no jobs are dropped during termination
Blocking Work in APIs
Instead of making users wait for slow operations:
- API enqueues a job
- responds immediately
- background workers process the task
Reliability & Retries
Failures happen:
- network errors
- third-party downtime
- crashes
ForgeQ ensures:
- automatic retries with backoff
- failure tracking
- dead-letter handling
Concurrency & Scalability
Handling many tasks simultaneously is hard.
ForgeQ provides:
- concurrent worker pools (goroutines)
- backpressure control
- horizontal scaling (multiple workers)
Scheduling & Delayed Execution
Need something to run later?
ForgeQ supports:
- delayed jobs (run_at)
- scheduled execution
- retry scheduling
Fault Tolerance
Systems crash. ForgeQ is built for it:
- lease-based job locking
- heartbeat mechanism
- recovery of stuck jobs
Job
A unit of work.
```json
{
  "type": "send_email",
  "payload": {
    "email": "user@example.com"
  }
}
```

Queue
Logical grouping of jobs:
- emails
- reports
- payments
Worker
A process that:
- pulls jobs
- executes them concurrently
- reports results
Dispatcher
Responsible for:
- selecting available jobs
- assigning them to workers
Scheduler
Handles:
- delayed jobs
- retry scheduling
Lease
Prevents duplicate execution:
- job is "leased" to a worker
- expires if worker crashes
- recovered safely
Enqueue
Client sends a job:
POST /jobs
Job is stored in Postgres as pending.
Scheduling
If run_at is in the future:
- job waits
- scheduler promotes it when ready
Dispatching
Dispatcher continuously:
- scans for available jobs
- locks (leases) them
- sends them to workers
Execution
Workers:
- run jobs concurrently using goroutines
- execute registered handlers
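A minimal worker-pool shape for this step: a fixed number of goroutines drain one channel and run a handler per payload. This is a sketch of the pattern, not ForgeQ's exact API; `runPool` and its signature are illustrative.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runPool drains the given payloads with n concurrent workers
// and returns the handler results.
func runPool(n int, payloads []string, handle func(string) string) []string {
	jobs := make(chan string)
	results := make(chan string, len(payloads))
	var wg sync.WaitGroup

	for i := 0; i < n; i++ { // bounded concurrency: n workers
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				results <- handle(p)
			}
		}()
	}

	for _, p := range payloads {
		jobs <- p
	}
	close(jobs) // no more work; workers exit as the channel drains
	wg.Wait()
	close(results)

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // completion order is nondeterministic
	return out
}

func main() {
	send := func(email string) string { return "sent:" + email }
	fmt.Println(runPool(3, []string{"a@x.com", "b@x.com"}, send))
	// [sent:a@x.com sent:b@x.com]
}
```

The pool size caps concurrency regardless of queue depth, which is what gives the "bounded concurrency" guarantee listed earlier.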
Success
If job succeeds:
- marked as completed
- result stored/logged
Failure & Retry
If job fails:
- attempts incremented
- next retry scheduled (exponential backoff)
- eventually marked dead if max retries exceeded
Recovery
If a worker crashes:
- lease expires
- job is re-queued safely
Shutdown
On SIGINT / SIGTERM:
- stop accepting new jobs
- finish in-flight work
- gracefully exit
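The three shutdown steps can be sketched with `signal.NotifyContext`, a closed channel, and a `WaitGroup`: the signal context gates intake, closing the jobs channel lets workers drain in-flight work, and `Wait` blocks exit until they finish. The job values and loop shape are illustrative.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

func main() {
	// ctx is cancelled when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	jobs := make(chan int, 4)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for j := range jobs { // exits cleanly once jobs is closed
			fmt.Println("finished job", j)
		}
	}()

	// Accept work only until a signal arrives.
enqueue:
	for _, j := range []int{1, 2} {
		select {
		case <-ctx.Done():
			break enqueue // stop accepting new jobs
		case jobs <- j:
		}
	}

	close(jobs) // no new work; workers drain what's in flight
	wg.Wait()   // finish in-flight work
	fmt.Println("graceful exit")
}
```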
```
        +------------+
        |   Client   |
        +------------+
              |
              v
        +------------+
        | API Server |
        +------------+
              |
              v
        +------------+
        |  Postgres  |
        +------------+
           /  |   \
          v   v    v
+---------+ +----------+ +----------+
|Scheduler| |Dispatcher| | Recovery |
+---------+ +----------+ +----------+
                 |
                 v
         +-------------+
         | Worker Pool |
         +-------------+
```
- API receives job requests and stores them in Postgres
- Scheduler promotes delayed and retryable jobs
- Dispatcher leases jobs and sends them to workers
- Workers execute jobs concurrently
- Results are persisted and metrics updated
- Recovery loop reclaims expired jobs
Each component runs independently in its own goroutines, coordinating via shared storage and signaling.
ForgeQ is built around Go's concurrency primitives:
- Worker pools implemented with goroutines
- Channels used for job dispatching and coordination
- context.Context for cancellation and timeouts
- sync.WaitGroup for graceful shutdown
- Atomic counters for metrics and tracking
The system avoids shared-memory contention by favoring message passing and clear ownership of state.
ForgeQ is designed to handle failures gracefully:
- Worker crashes – jobs are recovered via lease expiration
- Transient errors – retried with exponential backoff
- Permanent failures – moved to a dead state
- Long-running jobs – protected by a heartbeat mechanism
This ensures the system remains reliable under real-world conditions.
Features
- Concurrent worker pool using goroutines
- Durable job storage (Postgres)
- Retry with exponential backoff
- Delayed and scheduled jobs
- Lease-based job execution (no duplicates)
- Heartbeats for long-running jobs
- Graceful shutdown via OS signals
- Metrics (Prometheus-ready)
- CLI for job submission and worker control
Use Cases
- Email sending system
- Video/image processing pipeline
- Data synchronization jobs
- Scheduled billing tasks
- Report generation
- Web scraping / crawling
At-least-once delivery
Why Postgres?
Postgres was chosen for the MVP because:
- strong consistency guarantees
- transactional safety for job state transitions
- simpler deployment compared to distributed queues
- sufficient performance for moderate workloads
Tradeoff:
- lower throughput compared to systems like Kafka or Redis-based queues
Lease-based locking is used instead of in-memory locking:
- safe across multiple workers
- crash-resistant
- scalable
All workers use context.Context:
- clean shutdown
- no goroutine leaks
Tech Stack
- Go (Golang)
- Postgres
- pgx
- chi (HTTP router)
- Prometheus (metrics)
- Docker
```sh
# start server
forgeq server

# start workers
forgeq worker --concurrency 10

# enqueue job
forgeq enqueue --type send_email --payload '{"email":"test@example.com"}'
```

ForgeQ is designed to demonstrate:
- advanced concurrency in Go
- real-world backend system design
- fault tolerance and recovery
- production-grade patterns used in distributed systems
This project focuses on:
- correctness over shortcuts
- explicit concurrency patterns
- clean architecture
- production-like behavior
Future Work
- Multi-node coordination using distributed leases
- Sharded queues for higher throughput
- Exactly-once execution (idempotency strategies)
- Web dashboard and observability UI