Skip to content

andreikurilo/forgeq

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ForgeQ

ForgeQ is a distributed job queue system in Go designed to handle asynchronous workloads reliably using concurrent workers, retry semantics, and fault-tolerant execution.

It demonstrates production-grade patterns such as worker pools, lease-based job execution, graceful shutdown, and at-least-once delivery.


πŸš€ Overview

Modern backend systems often need to perform tasks that should not block user requests:

  • sending emails
  • processing images or videos
  • generating reports
  • syncing data between services
  • running scheduled or delayed jobs

ForgeQ provides a durable job queue and worker execution system, allowing you to enqueue tasks and process them asynchronously with strong guarantees.

βœ… Guarantees

  • At-least-once execution β€” jobs are never lost, but may run more than once
  • Durable persistence β€” jobs are stored in Postgres
  • Crash recovery β€” in-progress jobs are safely retried if a worker dies
  • Bounded concurrency β€” worker pools prevent overload
  • Graceful shutdown β€” no jobs are dropped during termination

🎯 What Problem Does ForgeQ Solve?

  1. Blocking Work in APIs

    Instead of making users wait for slow operations:

    • API enqueues a job
    • responds immediately
    • background workers process the task
  2. Reliability & Retries Failures happen:

    • network errors
    • third-party downtime
    • crashes

    ForgeQ ensures:

    • automatic retries with backoff
    • failure tracking
    • dead-letter handling
  3. Concurrency & Scalability

    Handling many tasks simultaneously is hard.

    ForgeQ provides:

    • concurrent worker pools (goroutines)
    • backpressure control
    • horizontal scaling (multiple workers)
  4. Scheduling & Delayed Execution

    Need something to run later?

    ForgeQ supports:

    • delayed jobs (run_at)
    • scheduled execution
    • retry scheduling
  5. Fault Tolerance

    Systems crash. ForgeQ is built for it:

    • lease-based job locking
    • heartbeat mechanism
    • recovery of stuck jobs

🧠 Core Concepts

Job

A unit of work.

{
  "type": "send_email",
  "payload": {
    "email": "user@example.com"
  }
}

Queue

Logical grouping of jobs:

  • emails
  • reports
  • payments

Worker

A process that:

  • pulls jobs
  • executes them concurrently
  • reports results

Dispatcher

Responsible for:

  • selecting available jobs
  • assigning them to workers

Scheduler

Handles:

  • delayed jobs
  • retry scheduling

Lease (Locking Mechanism)

Prevents duplicate execution:

  • job is β€œleased” to a worker
  • expires if worker crashes
  • recovered safely

βš™οΈ How ForgeQ Works (Workflow)

  1. Enqueue

    Client sends a job:

    POST /jobs

    Job is stored in Postgres as pending.

  2. Scheduling

    If run_at is in the future:

    • job waits
    • scheduler promotes it when ready
  3. Dispatching

    Dispatcher continuously:

    • scans for available jobs
    • locks (leases) them
    • sends them to workers
  4. Execution

    Workers:

    • run jobs concurrently using goroutines
    • execute registered handlers
  5. Success

    If job succeeds:

    • marked as completed
    • result stored/logged
  6. Failure & Retry

    If job fails:

    • attempts incremented
    • next retry scheduled (exponential backoff)
    • eventually marked dead if max retries exceeded
  7. Recovery

    If a worker crashes:

    • lease expires
    • job is re-queued safely
  8. Shutdown

    On SIGINT / SIGTERM:

    • stop accepting new jobs
    • finish in-flight work
    • gracefully exit

Architecture

          +------------+
          |   Client   |
          +------------+
                 |
                 v
          +------------+
          | API Server |
          +------------+
                 |
                 v
          +------------+
          |  Postgres  |
          +------------+
           /     |     \
          v      v      v
   +---------+ +--------+ +---------+
   |Scheduler| |Dispatcher| |Recovery|
   +---------+ +--------+ +---------+
                       |
                       v
                +-------------+
                |  Worker Pool |
                +-------------+

Flow

  1. API receives job requests and stores them in Postgres
  2. Scheduler promotes delayed and retryable jobs
  3. Dispatcher leases jobs and sends them to workers
  4. Workers execute jobs concurrently
  5. Results are persisted and metrics updated
  6. Recovery loop reclaims expired jobs

Each component runs independently using goroutines and coordinated via shared storage and signaling.


⚑ Concurrency Model

ForgeQ is built around Go’s concurrency primitives:

  • Worker pools implemented with goroutines
  • Channels used for job dispatching and coordination
  • context.Context for cancellation and timeouts
  • sync.WaitGroup for graceful shutdown
  • Atomic counters for metrics and tracking

The system avoids shared-memory contention by favoring message passing and clear ownership of state.


🧯 Failure Handling

ForgeQ is designed to handle failures gracefully:

  • Worker crashes β†’ jobs are recovered via lease expiration
  • Transient errors β†’ retried with exponential backoff
  • Permanent failures β†’ moved to dead state
  • Long-running jobs β†’ protected via heartbeat mechanism

This ensures the system remains reliable under real-world conditions.


πŸ”₯ Key Features

  • Concurrent worker pool using goroutines
  • Durable job storage (Postgres)
  • Retry with exponential backoff
  • Delayed and scheduled jobs
  • Lease-based job execution (no duplicates)
  • Heartbeats for long-running jobs
  • Graceful shutdown via OS signals
  • Metrics (Prometheus-ready)
  • CLI for job submission and worker control

πŸ“¦ Example Use Cases

  • Email sending system
  • Video/image processing pipeline
  • Data synchronization jobs
  • Scheduled billing tasks
  • Report generation
  • Web scraping / crawling

βš–οΈ Design Decisions

At-least-once delivery

Jobs may run more than once in rare cases. This improves reliability and simplifies recovery.

Why Postgres?

Postgres was chosen for the MVP because:

  • strong consistency guarantees
  • transactional safety for job state transitions
  • simpler deployment compared to distributed queues
  • sufficient performance for moderate workloads

Tradeoff:

  • lower throughput compared to systems like Kafka or Redis-based queues

Lease-based locking

Instead of in-memory locking:

  • safe across multiple workers
  • crash-resistant
  • scalable

Context-based cancellation

All workers use context.Context:

  • clean shutdown
  • no goroutine leaks

πŸ›  Tech Stack

  • Go (Golang)
  • Postgres
  • pgx
  • chi (HTTP router)
  • Prometheus (metrics)
  • Docker

πŸ§ͺ Example Flow

# start server
forgeq server

# start workers
forgeq worker --concurrency 10

# enqueue job
forgeq enqueue --type send_email --payload '{"email":"test@example.com"}'

πŸ’¬ Summary

ForgeQ is designed to demonstrate:

  • advanced concurrency in Go
  • real-world backend system design
  • fault tolerance and recovery
  • production-grade patterns used in distributed systems

🧠 Author Notes

This project focuses on:

  • correctness over shortcuts
  • explicit concurrency patterns
  • clean architecture
  • production-like behavior

🌐 Future Work

  • Multi-node coordination using distributed leases
  • Sharded queues for higher throughput
  • Exactly-once execution (idempotency strategies)
  • Web dashboard and observability UI

πŸ“„ License

MIT

About

ForgeQ is a distributed background job processing system written in Go. It enables applications to offload heavy, asynchronous, and retryable work into a reliable, concurrent execution pipeline.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages