From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Petr Mladek <pmladek@suse.com>
Cc: "Ricardo Cañuelo" <ricardo.canuelo@collabora.com>,
"Michal Hocko" <mhocko@suse.com>,
akpm@linux-foundation.org, kernel@collabora.com, hch@lst.de,
guro@fb.com, rientjes@google.com, mcgrof@kernel.org,
keescook@chromium.org, yzaikin@google.com, linux-mm@kvack.org,
"Sergey Senozhatsky" <sergey.senozhatsky@gmail.com>,
"Steven Rostedt" <rostedt@goodmis.org>
Subject: Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
Date: Tue, 13 Oct 2020 19:46:32 +0900
Message-ID: <9cb10e17-ac04-9f7d-2138-cc044e2b080b@i-love.sakura.ne.jp>
In-Reply-To: <20201013090259.GC26155@alley>

On 2020/10/13 18:02, Petr Mladek wrote:
> On Tue 2020-10-13 09:40:27, Tetsuo Handa wrote:
>> On 2020/10/13 0:41, Michal Hocko wrote:
>>>> What about introducing some feedback from the printk code?
>>>>
>>>> static u64 printk_last_report_seq;
>>>>
>>>> if (consoles_seen(printk_last_report_seq)) {
>>>> 	dump_header();
>>>> 	printk_last_report_seq = printk_get_last_seq();
>>>> }
>>>>
>>>> In other words, it would skip the massive report when the consoles
>>>> had not been able to see the previous one.
>>>
>>> I am pretty sure this has been discussed in the past but maybe we really
>>> want to make ratelimit work reasonably also for larger sections
>>> instead. The current implementation only really works if the rate-limited
>>> operation is negligible wrt the interval. Can we have a ratelimit
>>> alternative with a scope effect (effectively lock-like semantics)?
>>>
>>> if (rate_limit_begin(&oom_rs)) {
>>> 	dump_header();
>>> 	rate_limit_end(&oom_rs);
>>> }
>>>
>>> rate_limit_begin would act like a trylock with an additional constraint on
>>> the period/cadence based on the values marked by rate_limit_end.
>>>
>>
>> Here are some of the past discussions.
>>
>> https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@i-love.sakura.ne.jp
>> https://lkml.kernel.org/r/20190830103504.GA28313@dhcp22.suse.cz
>> https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@i-love.sakura.ne.jp
>>
>> Michal Hocko complained about different OOM domains, and now just ignores it...
>
> How is this related to this discussion, please? AFAIK, we are
> discussing how to tune the values of the existing ratelimiting.

dump_tasks() is one of the functions called from dump_header().
Since Michal wants OOM domains to be recognized when ratelimiting dump_tasks(),
the ratelimit for dump_header() is also expected to recognize OOM domains.

>
>> Proper ratelimiting for OOM messages had better not count on asynchronous printk().
>
> I am a bit confused. AFAIK, you wanted to print OOM messages
> asynchronously in the past. The lockless printk ringbuffer is on
> its way into 5.10. Handling consoles in kthreads will be the next
> step of the printk rework.

What I'm proposing is synchronously printing OOM messages from a different
thread, because one dump_tasks() call can generate thousands of lines, which
may significantly delay the arrival of non-OOM messages at consoles (or even
cause them to be dropped because logbuf is full). I don't want to enqueue too
many OOM-related messages to logbuf, even after printk() becomes completely
asynchronous.

>
> OK, the current state is that printk() is semi-synchronous. It does
> console_trylock(). The console is handled immediately when it
> succeeds. Otherwise it expects that the current console_lock owner
> would do the job.
>
> Tuning ratelimits is not trivial for a particular system. It would
> be better to have some autotuning. If the printk is synchronous,
> we could measure how long the printing took. If it is asynchronous,
> we could check whether the last report has been already flushed or
> not. We could then decide whether to print the new report.

Checking whether the last report has already been flushed needs to recognize
OOM domains.

>
> What is the desired behavior, please?
>
> Could you please provide some examples how you would tune ratelimit
> when printing all messages to the console takes X ms and OOM
> happens every Y ms?

My proposal is to decide whether to print the new report based on
whether all OOM candidates already dumped for that OOM domain have been
flushed to consoles. There is no X or Y to tune.
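
To make the idea concrete, here is a rough user-space sketch in C. It is only
a model under stated assumptions, not kernel code and not a patch: struct
oom_domain_report, emit_line(), consoles_flush_up_to(), oom_report_flushed()
and oom_dump_candidates() are all hypothetical names, and the two sequence
counters merely stand in for whatever printk would expose. Each OOM domain
remembers where its last dump ended, and a new dump for that domain is
skipped until consoles have caught up to that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-OOM-domain bookkeeping; not an existing kernel structure. */
struct oom_domain_report {
	uint64_t last_report_seq;	/* log position just past the last dump */
	bool reported;			/* has this domain dumped anything yet? */
};

/* Stand-ins for printk state (assumptions made for this model only). */
static uint64_t log_next_seq;		/* next record to be stored in the log */
static uint64_t console_seen_seq;	/* consoles printed everything below this */

/* Model of storing one line in the log buffer. */
static void emit_line(const char *msg)
{
	(void)msg;
	log_next_seq++;
}

/* Model of consoles catching up to a given log position. */
static void consoles_flush_up_to(uint64_t seq)
{
	assert(seq <= log_next_seq);
	if (seq > console_seen_seq)
		console_seen_seq = seq;
}

/* True if the previous dump for this domain has fully reached consoles. */
static bool oom_report_flushed(const struct oom_domain_report *d)
{
	return !d->reported || console_seen_seq >= d->last_report_seq;
}

/*
 * Dump nr_tasks candidate lines for one domain, unless the previous dump
 * for the same domain is still pending. Returns true if a dump was emitted.
 */
static bool oom_dump_candidates(struct oom_domain_report *d, int nr_tasks)
{
	if (!oom_report_flushed(d))
		return false;

	for (int i = 0; i < nr_tasks; i++)
		emit_line("candidate task");

	d->last_report_seq = log_next_seq;
	d->reported = true;
	return true;
}
```

Because the decision is made per domain, a pending dump for one domain does
not suppress the first dump for another, and no interval or burst value has
to be chosen.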