Server

server.yml

2,500+ workflow runs

server : remove /api endpoints Server #31586: Pull request #22165 opened by ggerganov

Queued gg/server-remove-api


common: improve GGUF quantization tag regex Server #31585: Pull request #22164 opened by v1b3coder

Action required v1b3coder:fix/gguf-tag-regex


Tensor-parallel: Fix delayed AllReduce on Gemma-4 MoE Server #31583: Pull request #22129 synchronize by gaugarg-nv

Queued gaugarg-nv:gemma4_perf


server: Allow continue in thinking (reasoning prefill) Server #31582: Pull request #22162 synchronize by roj234

Action required roj234:thinking_prefill


server: Allow continue in thinking (reasoning prefill) Server #31581: Pull request #22162 synchronize by roj234

Action required roj234:thinking_prefill


server: Allow continue in thinking (reasoning prefill) Server #31580: Pull request #22162 opened by roj234

Action required roj234:thinking_prefill


TP: fix 0-sized tensor slices, AllReduce fallback Server #31579: Pull request #21808 synchronize by JohannesGaessler

In progress JohannesGaessler:tp-fix-0-slice


webui: Server tools Server #31578: Pull request #21237 synchronize by allozaur

Queued allozaur:allozaur/20677-webui-server-tools


mtmd: refactor mtmd_decode_use_mrope Server #31577: Pull request #22161 opened by ngxson

1h 4m 31s ngxson:xsn/mtmd_refactor_mrope


rpc : implement event and async backend APIs Server #31576: Pull request #18626 synchronize by rgerganov

1h 4m 35s rgerganov:rpc-async


webui: Server tools Server #31575: Pull request #21237 synchronize by allozaur

43m 12s allozaur:allozaur/20677-webui-server-tools


mtmd, llama : Update HunyuanVL vision-language model support Server #31574: Pull request #22037 synchronize by wendadawen

Action required ManaEstras:pr2-hunyuanvl


webui: Server tools Server #31573: Pull request #21237 synchronize by allozaur

55m 46s allozaur:allozaur/20677-webui-server-tools


webui: Server tools Server #31572: Pull request #21237 synchronize by allozaur

12m 22s allozaur:allozaur/20677-webui-server-tools


Make llama_token_to_piece return INT32_MIN for invalid tokens Server #31569: Pull request #22121 synchronize by julmb

Action required julmb:bounds


Support for DeepseekV32ForCausalLM with DeepSeek Sparse Attention (DSA) Server #31568: Pull request #21149 synchronize by fairydreaming

2h 41m 32s fairydreaming:deepseek-dsa


fix: GLM-DSA crash in llama-tokenize when using vocab_only (#22102) Server #31567: Commit 81df3f7 pushed by ggerganov

3h 22m 3s master


sycl: scalar SWAR byte-subtract in Q6_K MMVQ dot product Server #31566: Pull request #22156 opened by aicss-genai

Action required aicss-genai:aicss-genai/sycl-bmg-upstream-pr-7


ggml-cuda: flush legacy pool on OOM and retry Server #31565: Pull request #22155 opened by leonardHONG

Action required leonardHONG:cuda-pool-leg-oom-retry


server: Support chat_template_kwargs for /v1/messages Server #31564: Pull request #22154 opened by Soreepeong

In progress Soreepeong:anthropic-oai-conv-chat-template-kwargs


sycl: add GGML_SYCL_USE_ASYNC_MEM_OP env toggle Server #31563: Pull request #22153 opened by aicss-genai

Action required aicss-genai:aicss-genai/sycl-bmg-upstream-pr-6


sycl: Q5_K reorder MMVQ/dequant + Q8_0 reorder MMVQ path Server #31562: Pull request #22152 opened by aicss-genai

Action required aicss-genai:aicss-genai/sycl-bmg-upstream-pr-5


sycl: route small f32 matmuls to oneMKL, bypass oneDNN Server #31561: Pull request #22150 opened by aicss-genai

Action required aicss-genai:aicss-genai/sycl-bmg-upstream-pr-4


spec: save the dynamic/static ngram cache file Server #31560: Pull request #22055 synchronize by petersid2022

Action required petersid2022:self-speculation-save-cache


sycl: add FILL, CUMSUM, DIAG, SOLVE_TRI, SSM_SCAN, GATED_DELTA_NET Server #31559: Pull request #22149 opened by aicss-genai

Action required aicss-genai:aicss-genai/sycl-bmg-upstream-pr-3


Actions: ggml-org/llama.cpp