
Tags: 32bitmicro/llama.cpp

b5663

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
compare-llama-bench: add option to plot (ggml-org#14169)

* compare llama-bench: add option to plot

* Address review comments: convert case + add type hints

* Add matplotlib to requirements

* fix tests

* Improve comment and fix assert condition for test

* Add back default test_name, add --plot_log_scale

* use log_scale regardless of x_values
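The plotting option above charts how two llama-bench runs compare. As a rough illustration of the comparison step (this is a hypothetical helper, not code from `compare-llama-bench.py` itself), the core computation is a per-test speedup ratio between a baseline run and a candidate run:

```python
# Illustrative sketch: given two llama-bench result sets keyed by test name
# (e.g. "pp512", "tg128") with tokens/second values, compute the
# compare/baseline speedup ratios that a plot would chart.

def speedup_ratios(baseline: dict[str, float], compare: dict[str, float]) -> dict[str, float]:
    """Map each test present in both runs to compare t/s divided by baseline t/s."""
    return {
        name: compare[name] / baseline[name]
        for name in baseline
        if name in compare and baseline[name] > 0
    }

base = {"pp512": 100.0, "tg128": 20.0}
new = {"pp512": 125.0, "tg128": 22.0}
ratios = speedup_ratios(base, new)
```

A log-scale x-axis (the `--plot_log_scale` flag above) helps when the swept parameter, such as batch size, spans several orders of magnitude.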

b5361

llama-bench : add defrag-thold, check for invalid ranges (ggml-org#13487)

b4359

llama-run : improve progress bar (ggml-org#10821)

Set the default width to the terminal's width. Also fixed a small bug around
the default n_gpu_layers value.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
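The idea behind the change above — size the progress bar to the current terminal rather than a fixed width — can be sketched in a few lines. This is a minimal stand-alone illustration (llama-run itself is C++), using Python's standard terminal-size query with a fallback for non-tty environments:

```python
import shutil

def progress_bar(fraction: float, width: int = 0) -> str:
    """Render a [####----] bar sized to `width`, defaulting to the terminal width."""
    cols = width or shutil.get_terminal_size(fallback=(80, 24)).columns
    inner = max(cols - 2, 1)                      # leave room for the [ ] brackets
    filled = int(inner * max(0.0, min(1.0, fraction)))
    return "[" + "#" * filled + "-" * (inner - filled) + "]"

bar = progress_bar(0.5, width=12)
```

The `fallback` argument mirrors the robustness concern in the commit: when the terminal size cannot be detected (pipes, CI), a sane default is used instead.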

b4358

ggml : fix arm build (ggml-org#10890)

* ggml: GGML_NATIVE uses -mcpu=native on ARM

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* ggml: Show detected features with GGML_NATIVE

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* remove msvc support, add GGML_CPU_ARM_ARCH option

* disable llamafile in android example

* march -> mcpu, skip adding feature macros

ggml-ci

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Co-authored-by: Adrien Gallouët <angt@huggingface.co>

b4357

tts : add OuteTTS support (ggml-org#10784)

* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : be explicit about the pooling type in the tests

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* llama : add OuteTTS support (wip)

* wip

* extract features

* first conv

* group norm

* resnet conv

* resnet

* attn

* pos net

* layer norm

* convnext

* head

* hann window

* fix n_embd + remove llama.cpp hacks

* compute hann window

* fft

* spectrum processing

* clean-up

* tts : receive input text and generate codes

* clip : fix new conv name

* tts : minor fix

* tts : add header + minor fixes

ggml-ci

* tts : add mathematical constant

ggml-ci

* tts : fix sampling + cut initial noise

* tts : fixes

* tts : update default samplers

ggml-ci

* tts : text pre-processing

* tts : outetts-voc -> wavtokenizer-dec

* tts : remove hardcoded constants

ggml-ci

* tts : fix tensor shapes

* llama : refactor wavtokenizer tensors

ggml-ci

* cont

ggml-ci

* cont [no ci]

* llama : update WavTokenizer to non-causal attn

* llama : handle no-vocab detokenization

* tts : add Python example for OuteTTS (wip)

* tts : extend python example to generate spectrogram

ggml-ci

* server : fix rebase artifacts

* tts : enable "return_tokens" in Python example

ggml-ci

* tts : minor fixes

* common : support HF download for vocoder
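Among the vocoder building blocks listed above is a Hann window. As a reference for that step, here is the standard symmetric Hann window in plain Python (the actual implementation is C++ inside llama.cpp, and may use the periodic variant instead):

```python
import math

def hann_window(n: int) -> list[float]:
    """Symmetric Hann window: w[k] = 0.5 * (1 - cos(2*pi*k / (n - 1)))."""
    if n == 1:
        return [1.0]
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * k / (n - 1))) for k in range(n)]

w = hann_window(5)
```

The window tapers to zero at both ends and peaks at 1.0 in the middle, which suppresses spectral leakage before the FFT / spectrum-processing steps that follow it in the list above.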

b4354

server : add "tokens" output (ggml-org#10853)

* server : add "tokens" output

ggml-ci

* server : update readme

ggml-ci

* server : return tokens ids only if requested

ggml-ci

* tests : improve "tokens" type check

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : remove "tokens" from the OAI endpoint

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
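Per the bullets above, the server returns token ids only when the client asks for them, and the field was kept off the OpenAI-compatible endpoint. An offline sketch of the request/response shapes (field names follow this commit's bullets and the `return_tokens` flag mentioned in the b4357 entry; consult the server README for the authoritative schema):

```python
import json

# Request: opt in to token ids alongside the generated text.
request = {"prompt": "Hello", "n_predict": 8, "return_tokens": True}
body = json.dumps(request)

# A response shaped as the feature describes carries a "tokens" array;
# the ids below are placeholders, not real vocabulary entries.
sample_response = json.loads('{"content": "Hello there", "tokens": [9906, 1070]}')
token_ids = sample_response.get("tokens", [])
```

Using `.get("tokens", [])` keeps the client working against servers (or requests) where the field is absent, matching the opt-in design.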

b4353

server : (embeddings) using same format for "input" and "content" (ggml-org#10872)

* server : (embeddings) using same format for "input" and "content"

* fix test case

* handle empty input case

* fix test
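The unification above means the OpenAI-style `"input"` field accepts the same shapes as the native `"content"` field: a single string or a list. A hypothetical normalization helper illustrates the handling, including the empty-input case the bullets mention:

```python
# Both request shapes are valid after this change:
single = {"input": "hello world"}
batch = {"input": ["hello", "world"]}

def normalize_input(payload: dict) -> list[str]:
    """Coerce "input" to a list of strings, as a server might do internally."""
    value = payload.get("input", [])
    return [value] if isinstance(value, str) else list(value)

a = normalize_input(single)
b = normalize_input(batch)
empty = normalize_input({})
```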

b4351

Revert "llama : add Falcon3 support (ggml-org#10864)" (ggml-org#10876)

This reverts commit 382bc7f.

b4350

Use model->gguf_kv for loading the template instead of using the C API. (ggml-org#10868)

* Bump model_template to 16384 bytes to support larger chat templates.

* Use `model->gguf_kv` for efficiency.

b4349

tests: add tests for GGUF (ggml-org#10830)