ForgeQ

ForgeQ is a distributed job queue system in Go designed to handle asynchronous workloads reliably using concurrent workers, retry semantics, and fault-tolerant execution.

It demonstrates production-grade patterns such as worker pools, lease-based job execution, graceful shutdown, and at-least-once delivery.

🚀 Overview

Modern backend systems often need to perform tasks that should not block user requests:

sending emails
processing images or videos
generating reports
syncing data between services
running scheduled or delayed jobs

ForgeQ provides a durable job queue and worker execution system: you enqueue tasks, and workers process them asynchronously with the guarantees listed below.

✅ Guarantees

At-least-once execution — jobs are never lost, but may run more than once
Durable persistence — jobs are stored in Postgres
Crash recovery — in-progress jobs are safely retried if a worker dies
Bounded concurrency — worker pools prevent overload
Graceful shutdown — no jobs are dropped during termination

🎯 What Problem Does ForgeQ Solve?

Blocking Work in APIs

Instead of making users wait for slow operations:
- API enqueues a job
- responds immediately
- background workers process the task

Reliability & Retries

Failures happen:
- network errors
- third-party downtime
- crashes
ForgeQ ensures:
- automatic retries with backoff
- failure tracking
- dead-letter handling

Concurrency & Scalability

Handling many tasks simultaneously is hard.

ForgeQ provides:
- concurrent worker pools (goroutines)
- backpressure control
- horizontal scaling (multiple workers)

Scheduling & Delayed Execution

Need something to run later?

ForgeQ supports:
- delayed jobs (run_at)
- scheduled execution
- retry scheduling

Fault Tolerance

Systems crash. ForgeQ is built for it:
- lease-based job locking
- heartbeat mechanism
- recovery of stuck jobs

🧠 Core Concepts

Job

A unit of work.

{
  "type": "send_email",
  "payload": {
    "email": "user@example.com"
  }
}
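
As a rough illustration, the JSON above could map onto a Go type like the one below. The `Job` and `EmailPayload` structs and the `decodeJob` helper are assumptions made for this sketch, not ForgeQ's actual types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Job mirrors the JSON shape shown above. The struct and its field names are
// assumptions for illustration.
type Job struct {
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload"` // decoded later by the handler
}

// EmailPayload is a hypothetical payload for the "send_email" job type.
type EmailPayload struct {
	Email string `json:"email"`
}

// decodeJob parses a raw job and rejects jobs that have no type.
func decodeJob(raw []byte) (Job, error) {
	var j Job
	if err := json.Unmarshal(raw, &j); err != nil {
		return Job{}, fmt.Errorf("decode job: %w", err)
	}
	if j.Type == "" {
		return Job{}, errors.New("decode job: missing type")
	}
	return j, nil
}

func main() {
	raw := []byte(`{"type":"send_email","payload":{"email":"user@example.com"}}`)
	job, err := decodeJob(raw)
	if err != nil {
		panic(err)
	}
	var p EmailPayload
	if err := json.Unmarshal(job.Payload, &p); err != nil {
		panic(err)
	}
	fmt.Println(job.Type, p.Email)
}
```

Keeping the payload as `json.RawMessage` lets each handler decode only the shape it expects.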

Queue

Logical grouping of jobs:

emails
reports
payments

Worker

A process that:

pulls jobs
executes them concurrently
reports results

Dispatcher

Responsible for:

selecting available jobs
assigning them to workers

Scheduler

Handles:

delayed jobs
retry scheduling

Lease (Locking Mechanism)

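Prevents two workers from running the same job at the same time:

a job is “leased” to one worker for a limited time
the lease expires if the worker crashes or stops renewing it
the job is then recovered and can be leased again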

⚙️ How ForgeQ Works (Workflow)

Enqueue

Client sends a job:
```
POST /jobs
```
Job is stored in Postgres as pending.

Scheduling

If run_at is in the future:
- job waits
- scheduler promotes it when ready

Dispatching

Dispatcher continuously:
- scans for available jobs
- locks (leases) them
- sends them to workers

Execution

Workers:
- run jobs concurrently using goroutines
- execute registered handlers

Success

If job succeeds:
- marked as completed
- result stored/logged

Failure & Retry

If job fails:
- attempts incremented
- next retry scheduled (exponential backoff)
- eventually marked dead if max retries exceeded

Recovery

If a worker crashes:
- lease expires
- job is re-queued safely

Shutdown

On SIGINT / SIGTERM:
- stop accepting new jobs
- finish in-flight work
- gracefully exit

Architecture

                 +------------+
                 |   Client   |
                 +------------+
                       |
                       v
                 +------------+
                 | API Server |
                 +------------+
                       |
                       v
                 +------------+
                 |  Postgres  |
                 +------------+
             /         |          \
         v             v             v
   +-----------+ +------------+ +----------+
   | Scheduler | | Dispatcher | | Recovery |
   +-----------+ +------------+ +----------+
                       |
                       v
                +-------------+
                | Worker Pool |
                +-------------+

Flow

API receives job requests and stores them in Postgres
Scheduler promotes delayed and retryable jobs
Dispatcher leases jobs and sends them to workers
Workers execute jobs concurrently
Results are persisted and metrics updated
Recovery loop reclaims expired jobs

Each component runs independently in its own goroutines and is coordinated through shared storage and signaling.

⚡ Concurrency Model

ForgeQ is built around Go’s concurrency primitives:

Worker pools implemented with goroutines
Channels used for job dispatching and coordination
context.Context for cancellation and timeouts
sync.WaitGroup for graceful shutdown
Atomic counters for metrics and tracking

The system avoids shared-memory contention by favoring message passing and clear ownership of state.

🧯 Failure Handling

ForgeQ is designed to handle failures gracefully:

Worker crashes → jobs are recovered via lease expiration
Transient errors → retried with exponential backoff
Permanent failures → moved to dead state
Long-running jobs → protected via heartbeat mechanism

This ensures the system remains reliable under real-world conditions.

🔥 Key Features

Concurrent worker pool using goroutines
Durable job storage (Postgres)
Retry with exponential backoff
Delayed and scheduled jobs
Lease-based job execution (no two workers run the same job while its lease is valid)
Heartbeats for long-running jobs
Graceful shutdown via OS signals
Metrics (Prometheus-ready)
CLI for job submission and worker control

📦 Example Use Cases

Email sending system
Video/image processing pipeline
Data synchronization jobs
Scheduled billing tasks
Report generation
Web scraping / crawling

⚖️ Design Decisions

At-least-once delivery

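Jobs may run more than once in rare cases, for example when a worker finishes a job but crashes before recording the result and the lease then expires. Accepting this simplifies recovery and ensures no job is lost, but it means handlers should be idempotent.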

Why Postgres?

Postgres was chosen for the MVP because:

strong consistency guarantees
transactional safety for job state transitions
simpler deployment compared to distributed queues
sufficient performance for moderate workloads

Tradeoff:

lower throughput compared to systems like Kafka or Redis-based queues

Lease-based locking

Leases are kept in shared storage rather than in an in-memory lock, which makes them:

safe across multiple workers
crash-resistant
scalable

Context-based cancellation

All workers accept a context.Context, which allows:

clean shutdown
no goroutine leaks

🛠 Tech Stack

Go (Golang)
Postgres
pgx
chi (HTTP router)
Prometheus (metrics)
Docker

🧪 Example Flow

# start server
forgeq server

# start workers
forgeq worker --concurrency 10

# enqueue job
forgeq enqueue --type send_email --payload '{"email":"test@example.com"}'

💬 Summary

ForgeQ is designed to demonstrate:

advanced concurrency in Go
real-world backend system design
fault tolerance and recovery
production-grade patterns used in distributed systems

🧠 Author Notes

This project focuses on:

correctness over shortcuts
explicit concurrency patterns
clean architecture
production-like behavior

🌐 Future Work

Multi-node coordination using distributed leases
Sharded queues for higher throughput
Exactly-once execution (idempotency strategies)
Web dashboard and observability UI

📄 License

MIT
