From: Minchan Kim <minchan@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Chris Fries <cfries@google.com>
Subject: Re: [PATCH] mm: workingset: fix NULL ptr dereference
Date: Tue, 10 Apr 2018 20:19:31 +0900
Message-ID: <20180410111931.GA5113@rodete-laptop-imager.corp.google.com>
In-Reply-To: <20180410102845.3ixg2lbnumqn2o6z@quack2.suse.cz>
On Tue, Apr 10, 2018 at 12:28:45PM +0200, Jan Kara wrote:
> On Tue 10-04-18 11:32:41, Michal Hocko wrote:
> > On Tue 10-04-18 10:55:31, Jan Kara wrote:
> > > On Tue 10-04-18 10:22:43, Michal Hocko wrote:
> > > > On Mon 09-04-18 10:58:15, Minchan Kim wrote:
> > > > > Recently, I got a report like below.
> > > > >
> > > > > [ 7858.792946] [<ffffff80086f4de0>] __list_del_entry+0x30/0xd0
> > > > > [ 7858.792951] [<ffffff8008362018>] list_lru_del+0xac/0x1ac
> > > > > [ 7858.792957] [<ffffff800830f04c>] page_cache_tree_insert+0xd8/0x110
> > > > > [ 7858.792962] [<ffffff8008310188>] __add_to_page_cache_locked+0xf8/0x4e0
> > > > > [ 7858.792967] [<ffffff800830ff34>] add_to_page_cache_lru+0x50/0x1ac
> > > > > [ 7858.792972] [<ffffff800830fdd0>] pagecache_get_page+0x468/0x57c
> > > > > [ 7858.792979] [<ffffff80085d081c>] __get_node_page+0x84/0x764
> > > > > [ 7858.792986] [<ffffff800859cd94>] f2fs_iget+0x264/0xdc8
> > > > > [ 7858.792991] [<ffffff800859ee00>] f2fs_lookup+0x3b4/0x660
> > > > > [ 7858.792998] [<ffffff80083d2540>] lookup_slow+0x1e4/0x348
> > > > > [ 7858.793003] [<ffffff80083d0eb8>] walk_component+0x21c/0x320
> > > > > [ 7858.793008] [<ffffff80083d0010>] path_lookupat+0x90/0x1bc
> > > > > [ 7858.793013] [<ffffff80083cfe6c>] filename_lookup+0x8c/0x1a0
> > > > > [ 7858.793018] [<ffffff80083c52d0>] vfs_fstatat+0x84/0x10c
> > > > > [ 7858.793023] [<ffffff80083c5b00>] SyS_newfstatat+0x28/0x64
> > > > >
> > > > > > The v4.9 kernel already has d3798ae8c6f3 ("mm: filemap: don't
> > > > > > plant shadow entries without radix tree node"), so I thought
> > > > > > it should be okay. While googling, I found that others have
> > > > > > reported the same problem, and I think the current kernel still has it.
> > > > >
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1431567
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1420335
> > > > >
> > > > > > The shadow entry handling of the radix tree relies on the initial
> > > > > > state: node->private_list of an allocated node must be list_empty.
> > > > > > Currently, it's initialized in the SLAB constructor, which means
> > > > > > a radix tree node is initialized only when *slub allocates a
> > > > > > new page*, not *a new object*. So, if some FS or subsystem passes
> > > > > > a gfp_mask with __GFP_ZERO, the slub allocator will memset blindly.
> > > > > > That means an allocated node can have !list_empty(node->private_list).
> > > > > > It ends up as a NULL dereference in workingset_update_node because
> > > > > > the list_empty check fails.
> > > > >
> > > > > This patch should fix it.
> > > > >
> > > > > Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
> > > > > Reported-by: Chris Fries <cfries@google.com>
> > > > > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > > > > Cc: Jan Kara <jack@suse.cz>
> > > > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > >
> > > > Regardless of whether it makes sense to use __GFP_ZERO from the upper
> > > > layer or not, it is subtle as hell to rely on the pre-existing state
> > > > for a newly allocated object. So yes this makes perfect sense.
> > > >
> > > > Do we want CC: stable?
> > > > Acked-by: Michal Hocko <mhocko@suse.com>
Thanks, Michal.
> > >
> > > Well, for hot allocations we do rely on previous state a lot. After all
> > > that's what slab constructor was created for. Whether radix tree node
> > > allocation is such a hot path is a question for debate, I agree.
> >
> > I really doubt that LIST_INIT is something to notice for the radix tree
> > allocation.
>
> I agree with that.
I totally agree with Michal's opinion. I don't want to play semantic
games here when we can make the API work with a simple one-line change
without any performance loss.
As I stated in the description, there were other reports hitting this bug,
and I believe it has gone unfixed for a long time. Out-of-tree filesystems
and other out-of-tree radix tree users could be affected by this bug once
they use __GFP_ZERO, intentionally or by chance. MM didn't give them any
guidance. Let's keep it simple unless we lose something big.
>
> > So I would rather have safe code than rely on the previous state which is
> > really subtle.
>
> And I agree on subtlety part here as well. But even with LIST_INIT we'll be
> relying on some fields being 0 / NULL so you cannot really say that with
> LIST_INIT we won't be relying on previous state. And fully memsetting
> radix_tree_node on allocation *would* IMO have effect on the performance.
There is also a memset in radix_tree_node_rcu_free.
I think that to really benefit from the slab constructor, the object
should be back in its initial state by the time it is freed, so that
the next allocation doesn't need to do anything.
From this perspective, I think radix_tree_node's constructor is pointless.
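To make the constructor point concrete, here is a minimal userspace sketch,
not kernel code. It assumes a toy cache that behaves the way this thread
describes: the constructor runs once when the slab is set up, not on every
allocation, and a zeroing flag makes the allocator memset the object. All
names (toy_cache, toy_node, TOY_ZERO, toy_node_alloc_safe, ...) are made up
for illustration; only the list_head layout is modeled on the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal list head, modeled on the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same test as the kernel's list_empty(): head points to itself. */
static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical stand-in for radix_tree_node; only the relevant field. */
struct toy_node {
	unsigned long payload;
	struct list_head private_list;
};

#define TOY_ZERO	0x1u	/* hypothetical flag in the role of __GFP_ZERO */
#define TOY_SLAB_OBJS	4

/* A single "slab" holding a fixed number of objects. */
struct toy_cache {
	struct toy_node objs[TOY_SLAB_OBJS];
	bool in_use[TOY_SLAB_OBJS];
	void (*ctor)(struct toy_node *);
};

static void toy_node_ctor(struct toy_node *n)
{
	n->payload = 0;
	init_list_head(&n->private_list);
}

/* The constructor runs here, once per object, when the slab is created. */
static void toy_cache_init(struct toy_cache *c, void (*ctor)(struct toy_node *))
{
	memset(c, 0, sizeof(*c));
	c->ctor = ctor;
	for (size_t i = 0; i < TOY_SLAB_OBJS; i++)
		c->ctor(&c->objs[i]);
}

/* Allocation never calls the constructor; TOY_ZERO wipes the object. */
static struct toy_node *toy_cache_alloc(struct toy_cache *c, unsigned int flags)
{
	for (size_t i = 0; i < TOY_SLAB_OBJS; i++) {
		if (c->in_use[i])
			continue;
		c->in_use[i] = true;
		if (flags & TOY_ZERO)
			memset(&c->objs[i], 0, sizeof(c->objs[i]));
		return &c->objs[i];
	}
	return NULL;
}

/* The caller is expected to hand the object back in constructed state. */
static void toy_cache_free(struct toy_cache *c, struct toy_node *n)
{
	c->in_use[n - c->objs] = false;
}

/* Re-initializing the list head on every allocation removes the dependency
 * on whatever state the allocator left behind. */
static struct toy_node *toy_node_alloc_safe(struct toy_cache *c, unsigned int flags)
{
	struct toy_node *n = toy_cache_alloc(c, flags);

	if (n)
		init_list_head(&n->private_list);
	return n;
}
```

With this toy model, toy_cache_alloc(&c, TOY_ZERO) returns a node whose
private_list.next is NULL, so list_is_empty() reports false even though
nothing was ever added to the list. toy_node_alloc_safe() costs two pointer
stores per allocation and gives an empty list regardless of the flags.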
> So I'm not convinced LIST_INIT buys us much. It deals with __GFP_ZERO
> problem but not much else.
Jan, so what is your stance on this patch?
If you're okay with it, I would really like to go with my original patch.
Michal has already given his Acked-by.
Thanks.