Tags · 32bitmicro/llama.cpp

b5663

compare-llama-bench: add option to plot (ggml-org#14169)

* compare llama-bench: add option to plot

* Address review comments: convert case + add type hints

* Add matplotlib to requirements

* fix tests

* Improve comment and fix assert condition for test

* Add back default test_name, add --plot_log_scale

* use log_scale regardless of x_values

Jun 14, 2025
2e42be4

b5361

llama-bench : add defrag-thold, check for invalid ranges (ggml-org#13487)

May 12, 2025
cf0a43b

b4359

llama-run : improve progress bar (ggml-org#10821)

Set default width to whatever the terminal is. Also fixed a small bug around
default n_gpu_layers value.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
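
The idea of sizing the progress bar to the terminal can be illustrated with a minimal Python sketch. This is not the llama-run implementation (which is not shown here); the function name `render_progress_bar` and the 80-column fallback are assumptions made for illustration only.

```python
import shutil
from typing import Optional


def render_progress_bar(fraction: float, width: Optional[int] = None) -> str:
    """Return a one-line progress bar that occupies exactly `width` columns.

    Hypothetical helper: when `width` is None, the current terminal width
    is used, falling back to 80 columns if it cannot be detected.
    """
    if width is None:
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    # Clamp the fraction to the range [0, 1].
    fraction = min(max(fraction, 0.0), 1.0)
    suffix = f" {fraction * 100:3.0f}%"
    # Reserve two columns for the brackets and the rest for the suffix.
    inner = max(width - len(suffix) - 2, 0)
    filled = int(inner * fraction)
    return "[" + "#" * filled + " " * (inner - filled) + "]" + suffix


if __name__ == "__main__":
    print(render_progress_bar(0.42))
```

Passing an explicit `width` keeps the function testable without a real terminal; leaving it as `None` adapts to whatever the terminal reports at call time.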

Dec 19, 2024
7909e85

b4358

ggml : fix arm build (ggml-org#10890)

* ggml: GGML_NATIVE uses -mcpu=native on ARM

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* ggml: Show detected features with GGML_NATIVE

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* remove msvc support, add GGML_CPU_ARM_ARCH option

* disable llamafile in android example

* march -> mcpu, skip adding feature macros

ggml-ci

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Co-authored-by: Adrien Gallouët <angt@huggingface.co>

Dec 18, 2024
9177484

b4357

tts : add OuteTTS support (ggml-org#10784)

* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : be explicit about the pooling type in the tests

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* llama : add OuteTTS support (wip)

* wip

* extract features

* first conv

* group norm

* resnet conv

* resnet

* attn

* pos net

* layer norm

* convnext

* head

* hann window

* fix n_embd + remove llama.cpp hacks

* compute hann window

* fft

* spectrum processing

* clean-up

* tts : receive input text and generate codes

* clip : fix new conv name

* tts : minor fix

* tts : add header + minor fixes

ggml-ci

* tts : add mathematical constant

ggml-ci

* tts : fix sampling + cut initial noise

* tts : fixes

* tts : update default samplers

ggml-ci

* tts : text pre-processing

* tts : outetts-voc -> wavtokenizer-dec

* tts : remove hardcoded constants

ggml-ci

* tts : fix tensor shapes

* llama : refactor wavtokenizer tensors

ggml-ci

* cont

ggml-ci

* cont [no ci]

* llama : update WavTokenizer to non-causal attn

* llama : handle no-vocab detokenization

* tts : add Python example for OuteTTS (wip)

* tts : extend python example to generate spectrogram

ggml-ci

* server : fix rebase artifacts

* tts : enable "return_tokens" in Python example

ggml-ci

* tts : minor fixes

* common : support HF download for vocoder
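
Several of the steps above mention a Hann window. As a rough illustration of what such a window is, here is a minimal Python sketch of the standard formula w[n] = 0.5 * (1 - cos(2πn / N)). It is not taken from the commit; the function name `hann_window` and the `periodic` flag are assumptions, and the variant actually used by the TTS code is not stated here.

```python
import math


def hann_window(length: int, periodic: bool = True) -> list[float]:
    """Return Hann window coefficients (hypothetical helper).

    periodic=True divides by `length` (common for FFT-based analysis);
    periodic=False divides by `length - 1`, giving a symmetric window.
    """
    if length <= 0:
        return []
    if length == 1:
        return [1.0]
    denom = length if periodic else length - 1
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / denom)) for n in range(length)]


if __name__ == "__main__":
    print(hann_window(8))
```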

Dec 18, 2024
0bf2d10

b4354

server : add "tokens" output (ggml-org#10853)

* server : add "tokens" output

ggml-ci

* server : update readme

ggml-ci

* server : return tokens ids only if requested

ggml-ci

* tests : improve "tokens" type check

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : remove "tokens" from the OAI endpoint

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

Dec 18, 2024
0e70ba6

b4353

server : (embeddings) using same format for "input" and "content" (ggml-org#10872)

* server : (embeddings) using same format for "input" and "content"

* fix test case

* handle empty input case

* fix test

Dec 18, 2024
4682887

b4351

Revert "llama : add Falcon3 support (ggml-org#10864)" (ggml-org#10876)

This reverts commit 382bc7f.

Dec 18, 2024
4da69d1

b4350

Use model->gguf_kv for loading the template instead of using the C API. (ggml-org#10868)

* Bump model_template to 16384 bytes to support larger chat templates.

* Use `model->gguf_kv` for efficiency.

Dec 17, 2024
d62b532

b4349

tests: add tests for GGUF (ggml-org#10830)

Dec 17, 2024
081b29b
