
Pawan Singh Kapkoti

Data engineer. I break things, read source code, and ship fixes upstream.

MSc Data Analytics. Building pipelines and dev tools on the side. I believe compliance shouldn't mean spreadsheets and AI shouldn't require the cloud. Yorkshire, UK.

Find me around the web

Portfolio · LinkedIn · Email


Dev tools on PyPI

sql-sop — SQL linter on PyPI. 23 rules, 78 tests, libCST-based injection scanner, pre-commit + GitHub Action. 195+ monthly downloads. pip install sql-sop

pr-sop — PR governance checker on PyPI. 3 config-driven checks (CHANGELOG drift, version consistency, pre-commit rev pins), 29 tests, pydantic v2 config, runs as CLI, pre-commit hook, or GitHub Action. First external consumer: sql-sop itself. pip install pr-sop


Data pipelines

Production Analytics Pipeline — Incremental ETL from fish production ERP. 15K+ rows/day, FastAPI (11 endpoints) + Next.js + Power BI, Prefect orchestration, Docker + OpenTofu. 53 tests.

UK Crime Pipeline — Police UK API → PostgreSQL + BigQuery. 99,675 records, 6 dbt marts, 65 tests, Polars ingestion, SLO monitoring. streamlit · looker studio · hugging face


AI / on-prem apps

OpsMind — On-prem AI for manufacturing. NL-to-SQL in 5s, LangGraph agent, MCP server architecture, pgvector + ChromaDB RAG, Gemma 3 12B via Ollama. Golden-set eval harness with failure-mode taxonomy. docs

Manufacturing Compliance Dashboard — BRC/HACCP food safety compliance. MCP server exposes 5 compliance tools for LLM agents, NL query interface for auditors, z-score anomaly detection, Four Golden Signals /metrics endpoint. live

SQL Ops Reviewer — GitHub Action that auto-reviews .sql files in PRs using local AI. Catches injection risks, performance anti-patterns, style violations. One YAML file to set up, runs on the CI runner, zero API keys.

MediAsk — Health Q&A platform for factory workers. NHS-verified guidance, Gemini responses, voice input, 18 languages. Flask + PostgreSQL, Dockerised. live


Open source

drt — Triage Collaborator. Shipped multi-sync orchestration (drt run --all, --select tag:, --threads N) with a thread-safe StateManager and 11 parallel-dispatch tests. Plus 5 destination connectors, the official connector tutorial, Docker support, and pre-commit hooks — all merged.

sql-sop — Maintainer. Review and merge community PRs (W011 union-without-all, W012 group-by-ordinal, W005-template adoption), publish to PyPI, maintain governance + security policy, triage issues. First-PR-wins soft-assignment policy in place.

pr-sop — Creator and maintainer. Shipped v0.1.0 (initial three checks), v0.1.1 (fix for third-party rev: pin false positives), and v0.1.2 (fix for CI-merge-commit tag lookup) to PyPI in 24 hours. Full governance, security, contributing, and code-of-conduct documents published.


Signature upstream PRs

Merged contributions into projects I use every day.


I learn tools by reading their source: reverse-engineer the architecture, find the gap, ship the fix.

drt · pandas · ChromaDB · pgcli · ollama · superset · plotly · fpdf2


Stack

Python, SQL, dbt, PostgreSQL, BigQuery, FastAPI, Streamlit, Prefect, LangGraph, Ollama, Docker, Polars, pandas, Pydantic, pytest, GitHub Actions

Pinned

  1. uk-crime-pipeline Public

    End-to-end pipeline: Police UK API to PostgreSQL + BigQuery. dbt staging/marts, 65 tests, 3 CI/CD workflows, Looker Studio + Streamlit dashboards.

    Python

  2. OpsMind Public

    On-prem AI query tool for manufacturing. NL-to-SQL in 5 seconds. LangGraph agent, pgvector + ChromaDB RAG, Gemma 3 12B via Ollama. 19 tables, read-only.

    Python 1

  3. forthepeople-uk Public

    UK citizen transparency platform. Free council-level dashboards: weather, population, housing, crime, health, schools, elections, benefits.

    Python

  4. Hackathon-mediask Public

    MediAsk — health Q&A platform for factory workers. Flask, PostgreSQL, Gemini AI, Docker. Live on Render.

    Python

  5. manufacturing-compliance-dashboard Public

    BRC/HACCP food safety dashboard. Batch traceability, temperature monitoring, allergen matrix, weight variance. Streamlit + Sentry.

    Python

  6. sql-guard Public

    Fast rule-based SQL linter on PyPI (sql-sop). 24 rules, 81 tests, 0.08s scans. Pre-commit hook + GitHub Action. 195+ monthly downloads.

    Python 3