From: Michal Hocko <mhocko@kernel.org>
To: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Roman Gushchin <guro@fb.com>, Shakeel Butt <shakeelb@google.com>,
	Chris Down <chris@chrisdown.name>,
	Yang Shi <yang.shi@linux.alibaba.com>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Konstantin Khorenko <khorenko@virtuozzo.com>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>
Subject: Re: [PATCH] mm: fix hanging shrinker management on long do_shrink_slab
Date: Wed, 4 Dec 2019 09:35:14 +0100
Message-ID: <20191204083514.GC25242@dhcp22.suse.cz>
In-Reply-To: <20191129214541.3110-1-ptikhomirov@virtuozzo.com>

On Sat 30-11-19 00:45:41, Pavel Tikhomirov wrote:
> We have a problem that shrinker_rwsem can be held for a long time for
> read in shrink_slab, at the same time any process which is trying to
> manage shrinkers hangs.
> 
> The shrinker_rwsem is taken in shrink_slab while traversing shrinker_list.
> It tries to shrink something on NFS (hard), but the NFS server is already
> dead at this moment and the RPC will never succeed. Generally any shrinker
> can take significant time to do_shrink_slab, so it's a bad idea to hold
> the list lock here.

Yes, this is a known problem and people have already tried to address it
in the past. Have you checked the previous attempts? There was an SRCU-based one:
http://lkml.kernel.org/r/153365347929.19074.12509495712735843805.stgit@localhost.localdomain
I believe there were others as well (this is the only one I had in my notes).
Please make sure to Cc Dave Chinner when posting the next version, because
he had some concerns about the change in behavior.

> We have a similar problem in shrink_slab_memcg, except that we are
> traversing shrinker_map+shrinker_idr there.
> 
> The idea of the patch is to inc a refcount to the chosen shrinker so it
> won't disappear and release shrinker_rwsem while we are in
> do_shrink_slab, after that we will reacquire shrinker_rwsem, dec
> the refcount and continue the traversal.

The reference count part makes sense to me. The role of RCU needs a better
explanation. Also, do you have any reason not to use a completion for
the final step? Open-coding essentially the same concept sounds a bit
awkward to me.

> We also need a wait_queue so that unregister_shrinker can wait for the
> refcnt to become zero. Only after that can we safely remove the
> shrinker from list and idr, and free the shrinker.
[...]
>   crash> bt ...
>   PID: 18739  TASK: ...  CPU: 3   COMMAND: "bash"
>    #0 [...] __schedule at ...
>    #1 [...] schedule at ...
>    #2 [...] rpc_wait_bit_killable at ... [sunrpc]
>    #3 [...] __wait_on_bit at ...
>    #4 [...] out_of_line_wait_on_bit at ...
>    #5 [...] _nfs4_proc_delegreturn at ... [nfsv4]
>    #6 [...] nfs4_proc_delegreturn at ... [nfsv4]
>    #7 [...] nfs_do_return_delegation at ... [nfsv4]
>    #8 [...] nfs4_evict_inode at ... [nfsv4]
>    #9 [...] evict at ...
>   #10 [...] dispose_list at ...
>   #11 [...] prune_icache_sb at ...
>   #12 [...] super_cache_scan at ...
>   #13 [...] do_shrink_slab at ...

Are the NFS people aware of this? Because this is simply not acceptable
behavior. Memory reclaim cannot be blocked indefinitely or for a long
time. There must be a way to simply give up if the underlying inode
cannot be reclaimed.

I still have to think about the proposed solution. It sounds a bit
overcomplicated to me.
-- 
Michal Hocko
SUSE Labs


Thread overview: 15+ messages
2019-11-29 21:45 Pavel Tikhomirov
2019-12-02 16:36 ` Andrey Ryabinin
2019-12-03  0:13   ` Shakeel Butt
2019-12-03 11:03     ` Kirill Tkhai
2019-12-05  3:13   ` Pavel Tikhomirov
2019-12-06  2:09   ` Dave Chinner
2019-12-06 10:09     ` Michal Hocko
2019-12-06 17:11     ` Shakeel Butt
2019-12-10  1:20       ` Dave Chinner
2019-12-19 10:35         ` Pavel Tikhomirov
2019-12-03 11:38 ` Kirill Tkhai
2019-12-05  3:29   ` Pavel Tikhomirov
2019-12-04  8:35 ` Michal Hocko [this message]
2019-12-19 10:20   ` Pavel Tikhomirov
2022-10-18  3:48 ` Zhang Tianci
