From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>, Roman Gushchin <guro@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry().
Date: Wed, 22 Aug 2018 06:07:40 +0900 [thread overview]
Message-ID: <804b50cb-0b17-201a-790b-18604396f826@i-love.sakura.ne.jp> (raw)
In-Reply-To: <20180803061653.GB27245@dhcp22.suse.cz>
On 2018/08/03 15:16, Michal Hocko wrote:
> On Fri 03-08-18 07:05:54, Tetsuo Handa wrote:
>> On 2018/07/31 14:09, Michal Hocko wrote:
>>> On Tue 31-07-18 06:01:48, Tetsuo Handa wrote:
>>>> On 2018/07/31 4:10, Michal Hocko wrote:
>>>>> Since should_reclaim_retry() should be a natural reschedule point,
>>>>> let's do the short sleep for PF_WQ_WORKER threads unconditionally in
>>>>> order to guarantee that other pending work items are started. This will
>>>>> workaround this problem and it is less fragile than hunting down when
>>>>> the sleep is missed. E.g. we used to have a sleeping point in the oom
>>>>> path but this has been removed recently because it caused other issues.
>>>>> Having a single sleeping point is more robust.
>>>>
>>>> linux.git has not removed the sleeping point in the OOM path yet. Since removing the
>>>> sleeping point in the OOM path can mitigate CVE-2016-10723, please do so immediately.
>>>
>>> is this an {Acked,Reviewed,Tested}-by?
>>>
>>> I will send the patch to Andrew if the patch is ok.
>>>
>>>> (And that change will conflict with Roman's cgroup aware OOM killer patchset. But it
>>>> should be easy to rebase.)
>>>
>>> That is still a WIP so I wouldn't lose sleep over it.
>>>
>>
>> Now that Roman's cgroup aware OOM killer patchset will be dropped from linux-next.git,
>> linux-next.git will get the sleeping point removed. Please send this patch to linux-next.git.
>
> I still haven't heard any explicit confirmation that the patch works for
> your workload. Should I beg for it? Or do you simply not want to have
> your stamp on the patch? If so, I can live with that, but this game of
> hide and seek is not really a lot of fun.
>
I noticed that the patch has not been sent to linux-next.git yet.
Please send it to linux-next.git without my stamp on the patch.
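
For reference, the kind of change being discussed can be sketched as the diff
below. This is a hypothetical illustration of the approach (an unconditional
short sleep for PF_WQ_WORKER threads at should_reclaim_retry()), not the
actual patch; the hunk context and comment wording are mine:

```diff
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ (hypothetical hunk inside should_reclaim_retry())
 	/*
 	 * ... existing retry-decision logic ...
 	 */
+	/*
+	 * A PF_WQ_WORKER thread looping in the allocator without ever
+	 * sleeping is never seen as idle by the workqueue concurrency
+	 * management, so other pending work items on the same pool (e.g.
+	 * drain_local_pages_wq, vmstat_update above) are never started.
+	 * A short uninterruptible sleep, rather than cond_resched(),
+	 * guarantees the pool can make forward progress.
+	 */
+	if (current->flags & PF_WQ_WORKER)
+		schedule_timeout_uninterruptible(1);
+	else
+		cond_resched();
```

The point of sleeping (instead of merely rescheduling) is that workqueue
concurrency management only considers spawning a new worker when the current
worker actually goes to sleep; a busy-looping worker starves the whole pool,
which is exactly what the "workqueue lockup" reports below show.
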
[ 44.863590] Out of memory: Kill process 1071 (a.out) score 865 or sacrifice child
[ 44.867666] Killed process 1817 (a.out) total-vm:5244kB, anon-rss:1040kB, file-rss:0kB, shmem-rss:0kB
[ 44.872176] oom_reaper: reaped process 1817 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 91.698761] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 48s!
[ 91.702313] Showing busy workqueues and worker pools:
[ 91.705011] workqueue events: flags=0x0
[ 91.707482] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=8/256
[ 91.710524] pending: vmpressure_work_fn, vmw_fb_dirty_flush [vmwgfx], e1000_watchdog [e1000], vmstat_shepherd, free_work, mmdrop_async_fn, mmdrop_async_fn, check_corruption
[ 91.717439] workqueue events_freezable: flags=0x4
[ 91.720161] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 91.723304] pending: vmballoon_work [vmw_balloon]
[ 91.726167] workqueue events_power_efficient: flags=0x80
[ 91.729139] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
[ 91.732253] pending: fb_flashcursor, gc_worker [nf_conntrack], neigh_periodic_work, neigh_periodic_work
[ 91.736471] workqueue events_freezable_power_: flags=0x84
[ 91.739546] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 91.742696] in-flight: 2097:disk_events_workfn
[ 91.745517] workqueue mm_percpu_wq: flags=0x8
[ 91.748069] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
[ 91.751182] pending: drain_local_pages_wq BAR(1830), vmstat_update
[ 91.754661] workqueue mpt_poll_0: flags=0x8
[ 91.757161] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 91.759958] pending: mpt_fault_reset_work [mptbase]
[ 91.762696] workqueue xfs-data/sda1: flags=0xc
[ 91.765353] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256
[ 91.768248] pending: xfs_end_io [xfs], xfs_end_io [xfs], xfs_end_io [xfs]
[ 91.771589] workqueue xfs-cil/sda1: flags=0xc
[ 91.774009] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 91.776800] pending: xlog_cil_push_work [xfs] BAR(703)
[ 91.779464] workqueue xfs-reclaim/sda1: flags=0xc
[ 91.782017] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 91.784599] pending: xfs_reclaim_worker [xfs]
[ 91.786930] workqueue xfs-sync/sda1: flags=0x4
[ 91.789289] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 91.792075] pending: xfs_log_worker [xfs]
[ 91.794213] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=48s workers=4 idle: 52 13 5
[ 121.906640] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 78s!
[ 121.909572] Showing busy workqueues and worker pools:
[ 121.911703] workqueue events: flags=0x0
[ 121.913531] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=8/256
[ 121.915873] pending: vmpressure_work_fn, vmw_fb_dirty_flush [vmwgfx], e1000_watchdog [e1000], vmstat_shepherd, free_work, mmdrop_async_fn, mmdrop_async_fn, check_corruption
[ 121.921962] workqueue events_freezable: flags=0x4
[ 121.924336] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 121.926941] pending: vmballoon_work [vmw_balloon]
[ 121.929226] workqueue events_power_efficient: flags=0x80
[ 121.931554] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
[ 121.933983] pending: fb_flashcursor, gc_worker [nf_conntrack], neigh_periodic_work, neigh_periodic_work
[ 121.937415] workqueue events_freezable_power_: flags=0x84
[ 121.939719] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 121.942166] in-flight: 2097:disk_events_workfn
[ 121.944385] workqueue mm_percpu_wq: flags=0x8
[ 121.946468] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
[ 121.948975] pending: drain_local_pages_wq BAR(1830), vmstat_update
[ 121.951808] workqueue mpt_poll_0: flags=0x8
[ 121.953864] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 121.956245] pending: mpt_fault_reset_work [mptbase]
[ 121.958505] workqueue xfs-data/sda1: flags=0xc
[ 121.960514] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256
[ 121.962896] pending: xfs_end_io [xfs], xfs_end_io [xfs], xfs_end_io [xfs]
[ 121.965682] workqueue xfs-cil/sda1: flags=0xc
[ 121.967639] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 121.970004] pending: xlog_cil_push_work [xfs] BAR(703)
[ 121.972285] workqueue xfs-reclaim/sda1: flags=0xc
[ 121.974339] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 121.976779] pending: xfs_reclaim_worker [xfs]
[ 121.978926] workqueue xfs-sync/sda1: flags=0x4
[ 121.980997] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 121.983364] pending: xfs_log_worker [xfs]
[ 121.985326] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=78s workers=4 idle: 52 13 5
[ 147.872620] sysrq: SysRq : Terminate All Tasks
Thread overview: 35+ messages
2018-07-26 11:06 Tetsuo Handa
2018-07-26 11:39 ` Michal Hocko
2018-07-27 15:47 ` Tetsuo Handa
2018-07-30 9:32 ` Michal Hocko
2018-07-30 14:34 ` Tetsuo Handa
2018-07-30 14:46 ` Michal Hocko
2018-07-30 14:54 ` Tejun Heo
2018-07-30 15:25 ` Tetsuo Handa
2018-07-30 15:44 ` Tejun Heo
2018-07-30 18:51 ` Michal Hocko
2018-07-30 19:10 ` Michal Hocko
2018-07-30 21:01 ` Tetsuo Handa
2018-07-31 5:09 ` Michal Hocko
2018-07-31 10:47 ` Tetsuo Handa
2018-07-31 11:15 ` Michal Hocko
2018-07-31 11:30 ` Tetsuo Handa
2018-07-31 11:55 ` Michal Hocko
2018-08-02 22:05 ` Tetsuo Handa
2018-08-03 6:16 ` Michal Hocko
2018-08-21 21:07 ` Tetsuo Handa [this message]
2018-08-22 7:32 ` Michal Hocko
2018-08-23 20:06 ` David Rientjes
2018-08-23 21:00 ` Tetsuo Handa
2018-08-23 22:45 ` David Rientjes
2018-08-24 0:31 ` Tetsuo Handa
2018-09-05 13:20 ` Tetsuo Handa
2018-09-05 13:40 ` Michal Hocko
2018-09-05 13:53 ` Tetsuo Handa
2018-09-05 14:04 ` Michal Hocko
2018-09-06 1:00 ` Tetsuo Handa
2018-09-06 5:57 ` Michal Hocko
2018-09-06 6:22 ` Tetsuo Handa
2018-09-06 7:03 ` Tetsuo Handa
2018-07-30 19:14 ` Tejun Heo
2018-08-27 13:51 Michal Hocko