logging: write container output synchronously so final output is not truncated #5009
Open
AkihiroSuda wants to merge 1 commit into
Logging tests such as TestLogsWithoutNewlineOrEOF and
TestLogsAfterRestartingContainer flake under gomodjail: a container's final
output, in particular a line emitted right before exit with no trailing
newline, is sometimes dropped from the log. Running

    nerdctl run --name c alpine printf "'Hello World!\nThere is no newline'"
    nerdctl logs -f c

would intermittently print only "'Hello World!" instead of the full output.
Two independent problems, both reproduced locally under gomodjail:
1. getContainerWait hung for short-lived containers. The logging process is
started by containerd while it sets up the container's IO, before the task
is created, so the first con.Task() returns NotFound and the code retried
forever "waiting for the task to start". For a fast container, however, the
task may already have exited and been removed before the logger ever sees
it, so the task never appeared and the logger blocked forever while holding
the logger lock. The logger now concludes that the container has exited once
the task is missing and the container has been observed producing output.
2. The container's final chunk was lost to teardown. On exit containerd closes
the stdio FIFOs and tears the logging process down almost immediately. The
old path read the FIFO, copied it through an io.Pipe and a bufio splitter,
and handed each line to the driver over a buffered channel; a trailing
chunk with no newline was held in the splitter until EOF and then raced the
teardown across several goroutines, so it was frequently lost. The logger
now reads each FIFO directly and, for drivers that can write synchronously
(json-file, via the new SyncDriver interface), writes each entry inline from
the reading goroutine and flushes a trailing no-newline fragment as soon as
it is read. Streaming drivers keep using the buffered channel so a slow
driver cannot block the container.
The viewer also does a final read of the JSON log file when it receives the
stop signal, so entries flushed just before exit are not missed.
Verified locally with the gomodjail-packed binary: 250+ iterations of the
failing printf case, the restart (doubled-output) case, multi-line output and
follow-on-running-container all pass with no truncation.
Fixes containerd#5006
Assisted-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>