From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	 linux-m68k@lists.linux-m68k.org,
	 Anshuman Khandual <anshuman.khandual@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 william.kucharski@oracle.com,
	Mike Kravetz <mike.kravetz@oracle.com>,
	 Vlastimil Babka <vbabka@suse.cz>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	schmitzmic@gmail.com,  Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Roman Gushchin <guro@fb.com>,
	Muchun Song <songmuchun@bytedance.com>,
	 Wei Xu <weixugc@google.com>, Greg Thelen <gthelen@google.com>,
	 David Rientjes <rientjes@google.com>,
	Paul Turner <pjt@google.com>, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v3 1/9] mm: add overflow and underflow checks for page->_refcount
Date: Wed, 26 Jan 2022 17:40:47 -0500
Message-ID: <CA+CK2bAvGjieaXRcHqfhfPp0uogfLOmCtbE_9w3ULFbM+ZuHNg@mail.gmail.com>
In-Reply-To: <YfGkxtQd0KE8YNXt@casper.infradead.org>

On Wed, Jan 26, 2022 at 2:45 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jan 26, 2022 at 02:22:26PM -0500, Pasha Tatashin wrote:
> > On Wed, Jan 26, 2022 at 1:59 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Wed, Jan 26, 2022 at 06:34:21PM +0000, Pasha Tatashin wrote:
> > > > The problems with page->_refcount are hard to debug, because usually
> > > > when they are detected, the damage has occurred a long time ago. Yet,
> > > > the problems with invalid page refcount may be catastrophic and lead to
> > > > memory corruptions.
> > > >
> > > > Reduce the scope of when the _refcount problems manifest themselves by
> > > > adding checks for underflows and overflows into functions that modify
> > > > _refcount.
> > >
> > > If you're chasing a bug like this, presumably you turn on page
> > > tracepoints.  So could we reduce the cost of this by putting the
> > > VM_BUG_ON_PAGE parts into __page_ref_mod() et al?  Yes, we'd need to
> > > change the arguments to those functions to pass in old & new, but that
> > > should be a cheap change compared to embedding the VM_BUG_ON_PAGE.
> >
> > This is not only about chasing a bug. This is also about preventing
> > the memory corruption and information leaks that are caused by
> > ref_count bugs.
> > Several months ago a memory corruption bug was discovered by accident:
> > an engineer was studying a process core from a production system and
> > noticed that some memory did not look like it belonged to the original
> > process. We tried to manually reproduce that bug but failed. However,
> > later analysis by our team explained that the problem occurred due to
> > a ref_count bug in Linux, and the bug itself was root-caused and fixed
> > (mentioned in the cover letter).  This work would have prevented
> > similar ref_count bugs from leading to memory corruption.
>
> But the VM_BUG_ON_PAGE tells us next to nothing useful.  To take
> your first example [1] as the kind of thing you say this is going to
> help fix:
>
> 1. Page p is allocated by thread a (refcount 1)
> 2. Thread b gets mistaken pointer to p

Thread b gets a mistaken pointer to p because of a bug in the kernel.
Different types of bugs can lead to such scenarios, and it is
probably not feasible to prevent all of them. However, one such
scenario is that we lost control of ref_count, and the page was then
incorrectly remapped or even copied (perhaps migrated) into another
address space.

While studying the logs of the machine on which the double mapping
occurred, we noticed that ref_count had underflowed. This was the
smoking gun for the problem, and that is why we concentrated our
search for the root cause of the memory leak around places where
ref_count can be incorrectly modified.

This patch series ensures that once we get to a situation where
ref_count for some reason becomes negative, we panic immediately, as
there is a possibility that a leak can occur.
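
As a rough user-space illustration of the kind of check meant here (this
is not the code from the patch; struct fake_page and the fake_page_ref_*
helpers are made-up names, and nr is assumed to be positive), the add and
sub paths could look at the value returned by the atomic operation and
flag the result when it wraps or goes negative:

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	atomic_int refcount;
};

/*
 * Add nr (assumed > 0) references. Returns false when the counter was
 * already negative or when old + nr does not fit in an int. This is the
 * point where a debug build would presumably panic.
 */
static bool fake_page_ref_add(struct fake_page *page, int nr)
{
	int old = atomic_fetch_add(&page->refcount, nr);

	return old >= 0 && nr <= INT_MAX - old;
}

/*
 * Drop nr (assumed > 0) references. Returns false when the new value
 * would be negative, i.e. more references were dropped than were held.
 */
static bool fake_page_ref_sub(struct fake_page *page, int nr)
{
	int old = atomic_fetch_sub(&page->refcount, nr);

	return old >= nr;
}
```

The helpers return a bool instead of panicking only so that the sketch
can be exercised outside the kernel.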

The second benefit of this series is that it makes the ref_count
changes contiguous: with this series we never reset the value to 0;
instead we only operate using offsets and add/sub operations. This
helps with tracing the history of ref_count via tracepoints.
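
One way to read "contiguous" here, sketched under my own assumptions
(struct ref_event and ref_history_is_contiguous() are hypothetical and do
not correspond to the existing tracepoint format): if every change is an
add or sub, each trace record can carry the value before and after the
change, and a later record must start where the previous one ended. A
blind reset to a fixed value has no observed old value to record, so the
chain cannot be verified across it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trace record: the counter value before and after one change. */
struct ref_event {
	int old_val;
	int new_val;
};

/*
 * Returns true when every event starts from the value the previous event
 * ended on, i.e. the recorded history has no gaps.
 */
static bool ref_history_is_contiguous(const struct ref_event *ev, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (ev[i].old_val != ev[i - 1].new_val)
			return false;
	}
	return true;
}
```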

> 3. Thread b calls put_page(), __put_page(), page goes to memory
>    allocator.
> 4. Thread c calls alloc_page(), also gets page p (refcount 1 again).
> 5. Thread a calls put_page(), __put_page()
> 6. Thread c calls put_page() and gets a VM_BUG_ON_PAGE.
>
> How do we find thread b's involvement?  I don't think we can even see
> thread a's involvement in all of this!  All we know is a backtrace
> pointing to thread c, who is a completely innocent bystander.  I think
> you have to enable page tracepoints to have any shot at finding thread
> b's involvement.

You are right, we do not get to see the other threads' involvement; we
only get a panic closer to the damage and hopefully before a leak occurs.
Again, this is just one of the mitigation techniques. Another one is
the page table check [2].

[2] https://lore.kernel.org/all/20211221154650.1047963-1-pasha.tatashin@soleen.com
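
For reference, the six-step sequence quoted in this message can be
replayed in a single-threaded toy model (all names below are
hypothetical, and a plain int stands in for the atomic counter). Under
these assumptions the underflow check fires only at step 6, in thread c,
which matches the point that the backtrace alone does not identify
thread b:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model: one page with a plain int refcount. */
struct fake_page {
	int refcount;
};

/* Model of alloc_page(): the allocator hands the page out with refcount 1. */
static void fake_alloc(struct fake_page *page)
{
	page->refcount = 1;
}

/*
 * Model of put_page() with an underflow check. Returns false when the
 * count goes negative, which is where the check would fire.
 */
static bool fake_put(struct fake_page *page, const char *who)
{
	page->refcount--;
	if (page->refcount < 0) {
		printf("underflow detected in thread %s\n", who);
		return false;
	}
	return true;
}

/* Replays steps 1-6; returns the step at which the check fires, 0 if none. */
static int replay_scenario(void)
{
	struct fake_page p;

	/* 1: thread a allocates p. 2: thread b gets a stray pointer to p. */
	fake_alloc(&p);
	/* 3: b drops a reference it never held; p goes back to the allocator. */
	if (!fake_put(&p, "b"))
		return 3;
	/* 4: thread c is handed the same page. */
	fake_alloc(&p);
	/* 5: a drops its reference; p is freed again while c still uses it. */
	if (!fake_put(&p, "a"))
		return 5;
	/* 6: c drops its reference and the count goes from 0 to -1. */
	if (!fake_put(&p, "c"))
		return 6;
	return 0;
}
```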
>
> [1] https://lore.kernel.org/stable/20211122171825.1582436-1-gthelen@google.com/


