From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org, htejun@gmail.com
Cc: cl@linux.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
torvalds@linux-foundation.org, rientjes@google.com,
oleg@redhat.com, kwalker@redhat.com, akpm@linux-foundation.org,
hannes@cmpxchg.org, vdavydov@parallels.com, skozina@redhat.com,
mgorman@suse.de, riel@redhat.com
Subject: Re: [PATCH] mm,vmscan: Use accurate values for zone_reclaimable()checks
Date: Tue, 27 Oct 2015 20:07:38 +0900
Message-ID: <201510272007.HHI18717.MOOtJQHSVFOLFF@I-love.SAKURA.ne.jp>
In-Reply-To: <20151027091603.GB9891@dhcp22.suse.cz>
Michal Hocko wrote:
> > On Fri, Oct 23, 2015 at 01:11:45PM +0200, Michal Hocko wrote:
> > > > The problem here is not lack
> > > > of execution resource but concurrency management misunderstanding the
> > > > situation.
> > >
> > > And this sounds like a bug to me.
> >
> > I don't know. It can be argued either way, the other direction being a
> > kernel thread going RUNNING non-stop is buggy. Given how this has
> > been a complete non-issue for all the years, I'm not sure how useful
> > plugging this is.
>
> Well, I guess we haven't noticed because this is a pathological case. It
> also triggers OOM livelocks which were not reported in the past either.
> You do not reach this state normally unless you really _want_ to kill your
> machine.
I don't think we can say this is a pathological case. Customers' servers
might have hit this state. We have no code for warning about this state.
>
> And vmstat is not the only instance. E.g. sysrq oom trigger is known
> to stay behind in similar cases. It should be changed to a dedicated
> WQ_MEM_RECLAIM wq and it would require runnable item guarantee as well.
>
Well, this seems to be the cause of SysRq-f being unresponsive...
http://lkml.kernel.org/r/201411231349.CAG78628.VFQFOtOSFJMOLH@I-love.SAKURA.ne.jp
Picking up from http://lkml.kernel.org/r/201506112212.JAG26531.FLSVFMOQJOtOHF@I-love.SAKURA.ne.jp
----------
[  515.536393] Showing busy workqueues and worker pools:
[  515.538185] workqueue events: flags=0x0
[  515.539758]   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=8/256
[  515.541872]     pending: vmpressure_work_fn, console_callback, vmstat_update, flush_to_ldisc, push_to_pool, moom_callback, sysrq_reinject_alt_sysrq, fb_deferred_io_work
[  515.546684] workqueue events_power_efficient: flags=0x80
[  515.548589]   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=2/256
[  515.550829]     pending: neigh_periodic_work, check_lifetime
[  515.552884] workqueue events_freezable_power_: flags=0x84
[  515.554742]   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=1/256
[  515.556846]     in-flight: 3837:disk_events_workfn
[  515.558665] workqueue writeback: flags=0x4e
[  515.560291]   pwq 16: cpus=0-7 flags=0x4 nice=0 active=2/256
[  515.562271]     in-flight: 3812:bdi_writeback_workfn bdi_writeback_workfn
[  515.564544] workqueue xfs-data/sda1: flags=0xc
[  515.566265]   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=4/256
[  515.568359]     in-flight: 374(RESCUER):xfs_end_io, 3759:xfs_end_io, 26:xfs_end_io, 3836:xfs_end_io
[  515.571018]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
[  515.573113]     in-flight: 179:xfs_end_io
[  515.574782] pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=4 idle: 3790 237 3820
[  515.577230] pool 6: cpus=3 node=0 flags=0x0 nice=0 workers=5 manager: 219
[  515.579488] pool 16: cpus=0-7 flags=0x4 nice=0 workers=3 idle: 356 357
----------
Do we want an immediate execution guarantee not only for vmstat_update and
moom_callback but also for vmstat_shepherd and console_callback?
> > > Don't we have some IO related paths which would suffer from the same
> > > problem. I haven't checked all the WQ_MEM_RECLAIM users but from the
> > > name I would expect they _do_ participate in the reclaim and so they
> > > should be able to make a progress. Now if your new IMMEDIATE flag will
> >
> > Seriously, nobody goes full-on RUNNING.
>
> Looping with cond_resched seems like general pattern in the kernel when
> there is no clear source to wait for. We have io_schedule when we know
> we should wait for IO (in case of congestion) but this is not necessarily
> the case - as you can see here. What should we wait for? A short nap
> without actually waiting on anything sounds like a dirty workaround to
> me.
Can't we have a waitqueue like
http://lkml.kernel.org/r/201510142121.IDE86954.SOVOFFQOFMJHtL@I-love.SAKURA.ne.jp ?