-
Notifications
You must be signed in to change notification settings - Fork 6.3k
Add dedicated logging thread #21547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add dedicated logging thread #21547
Conversation
231a809 to
b6e13bf
Compare
b6e13bf to
8d39212
Compare
|
@cubic-dev-ai review this PR |
@stelfrag I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 9 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/libnetdata/log/nd_log-queue.c">
<violation number="1" location="src/libnetdata/log/nd_log-queue.c:362">
P1: `nd_log_queue_shutdown()` can hang indefinitely when the command queue is full because no shutdown signal is sent, but the thread is still joined. Ensure a shutdown request is issued even when the queue is full to avoid deadlock.</violation>
</file>
<file name="src/libnetdata/log/nd_log.c">
<violation number="1" location="src/libnetdata/log/nd_log.c:312">
P2: Async logging can drop messages when select_output remaps a non-file method (e.g., journald not initialized) to stderr. The async path ignores the fallback FILE* and later writes via nd_log.sources[source].fp, which is NULL for NDLM_JOURNAL/ETW/WEL sources, so the message is lost. Skip async when the FILE* doesn’t match the source’s fp (or persist the fallback fp in the queue entry).</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
|
@cubic-dev-ai review this PR |
@stelfrag I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No issues found across 9 files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request implements asynchronous logging with a dedicated background thread to improve performance by offloading I/O operations from hot paths.
Changes:
- Introduces a ring buffer-based command queue (4096 entries) processed by a "LOGGER" thread
- Routes non-critical logs (below ALERT priority) through async path while keeping ALERT+ synchronous for crash safety
- Implements safe log rotation via REOPEN command handled entirely in the logger thread to avoid FILE* races
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/libnetdata/log/nd_log-queue.h | Defines queue structures, opcodes, and public API for async logging |
| src/libnetdata/log/nd_log-queue.c | Implements ring buffer, logger thread event loop, and queue management |
| src/libnetdata/log/nd_log.c | Integrates async logging path with sync fallback and routing logic |
| src/libnetdata/log/nd_log.h | Adds nd_log_shutdown() to public API |
| src/libnetdata/log/nd_log-init.c | Initializes async queue on startup and handles log file reopening |
| src/libnetdata/log/nd_log-internals.h | Adds nd_logger_journal_format() for journal formatting |
| src/libnetdata/log/nd_log-to-systemd-journal.c | Refactors journal formatting into reusable function |
| src/daemon/daemon-shutdown.c | Calls nd_log_shutdown() to flush logs before exit |
| CMakeLists.txt | Adds nd_log-queue.c and nd_log-queue.h to build |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@cubic-dev-ai review this PR |
@stelfrag I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 9 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/libnetdata/log/nd_log-queue.h">
<violation number="1" location="src/libnetdata/log/nd_log-queue.h:76">
P3: The comment claims a boolean return value, but nd_log_queue_init returns void. Update the comment (or change the signature) to avoid misleading callers about failure handling.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
- Introduce `nd_log_queue` for handling log entries asynchronously via background threads. - Implement queue initialization, enqueue, and shutdown functions. - Add thread-safe mechanisms for processing and flushing log entries. - Integrate async logging with the existing logging infrastructure, falling back to synchronous if the queue is unavailable. - Update `CMakeLists.txt` to include new files: `nd_log-queue.c` and `nd_log-queue.h`.
- Introduce a `shutdown_acknowledged` flag with supporting mutex and condition variable for safe thread shutdown. - Ensure logger thread signals shutdown acknowledgment before exiting. - Update cleanup process to properly destroy new synchronization primitives.
… async logging - Store `journal_direct`, `journal_libsystemd`, and `syslog` initialization states during enqueue to ensure consistent behavior during logging. - Add fallback mechanisms for logging to `stderr` when target destinations are not initialized. - Introduce a flush operation before log destination reinitialization to avoid losing pending log messages. - Refactor logging queue thread signature and improve error handling for asynchronous journal and syslog writes.
…lized send logic - Introduce `nd_logger_journal_format` for consistent journal native protocol (KEY=VALUE\n) formatting. - Update to use `journal_direct_send` for improved handling and fallback mechanisms. - Simplify `nd_logger_journal_direct` by separating formatting and sending responsibilities.
- Replace the legacy `nd_log_queue` structure with the `nd_log_event_loop` and `nd_log_cmd_pool` architecture for cleaner and more efficient log entry processing. - Transition from a queue-based logging system to a command-driven event loop using uv_async callbacks. - Introduce reusable command pool with support for opcodes (e.g., `ND_LOG_OP_ENTRY`, `ND_LOG_OP_FLUSH`, `ND_LOG_OP_SHUTDOWN`) to enable more flexible event-driven log operations. - Simplify logging thread to focus on processing commands in an event loop, improving modularity and reducing synchronization complexity. - Update public API to reflect the new architecture, including changes to queue management, shutdown logic, and entry enqueueing.
- Remove unused initialization fields (`journal_direct`, `journal_libsystemd`) from log queue entries to reduce enqueue-time complexity. - Replace raw buffer handling with `CLEAN_BUFFER` to streamline memory management and auto-free resources on return. - Adapt async file logging to look up `FILE*` at write time, enabling proper log rotation handling. - Deprecate journal async logging support due to sync path optimization. - Introduce timed shutdown wait with fallback to prevent indefinite hangs during logger thread termination.
- Prevent async logging for fallback `FILE*` mismatches to avoid message loss. - Add atomic flag signaling for immediate shutdown when the queue is full.
…R` remapping to `NDLM_FILE` during `select_output`. - Add atomic queue depth decrement during entry processing. - Prevent memory leaks by freeing allocated buffers when enqueueing is skipped.
…ocated log messages. - Add source index bounds check to prevent out-of-bounds access during log entry processing. - Optimize command pool cleanup by iterating between active indices instead of full range. - Refactor initialization function to ensure thread-safe single execution using `FUNCTION_RUN_ONCE`. - Enhance shutdown logic to safely handle completion destruction after thread termination. - Use CAS loop for accurate high-water mark updates in concurrent scenarios. - Adjust async signaling to prevent race conditions during shutdown.
8d160f0 to
b1b9ab6
Compare
Summary
Summary by cubic
Add asynchronous logging with a dedicated background thread to offload I/O from hot paths and reduce contention. Adds safe flush and shutdown to prevent log loss during reopen and process exit.
New Features
Migration
Written for commit b1b9ab6. Summary will update on new commits.