linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Jann Horn <jannh@google.com>, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, willy@infradead.org, hughd@google.com,
	lorenzo.stoakes@oracle.com, joel@joelfernandes.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm/mremap: Fix move_normal_pmd/retract_page_tables race
Date: Tue, 8 Oct 2024 09:50:14 +0200	[thread overview]
Message-ID: <316dadcb-b199-4b94-8d07-94b40bf534df@redhat.com> (raw)
In-Reply-To: <20241007-move_normal_pmd-vs-collapse-fix-2-v1-1-5ead9631f2ea@google.com>

On 07.10.24 23:42, Jann Horn wrote:
> In mremap(), move_page_tables() looks at the type of the PMD entry and the
> specified address range to figure out by which method the next chunk of
> page table entries should be moved.
> At that point, the mmap_lock is held in write mode, but no rmap locks are
> held yet. For PMD entries that point to page tables and are fully covered
> by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called,
> which first takes rmap locks, then does move_normal_pmd().
> move_normal_pmd() takes the necessary page table locks at source and
> destination, then moves an entire page table from the source to the
> destination.
> 
> The problem is: The rmap locks, which protect against concurrent page table
> removal by retract_page_tables() in the THP code, are only taken after the
> PMD entry has been read and it has been decided how to move it.
> So we can race as follows (with two processes that have mappings of the
> same tmpfs file that is stored on a tmpfs mount with huge=advise); note
> that process A accesses page tables through the MM while process B does it
> through the file rmap:
> 
> 
> process A                      process B
> =========                      =========
> mremap
>    mremap_to
>      move_vma
>        move_page_tables
>          get_old_pmd
>          alloc_new_pmd
>                        *** PREEMPT ***
>                                 madvise(MADV_COLLAPSE)
>                                   do_madvise
>                                     madvise_walk_vmas
>                                       madvise_vma_behavior
>                                         madvise_collapse
>                                           hpage_collapse_scan_file
>                                             collapse_file
>                                               retract_page_tables
>                                                 i_mmap_lock_read(mapping)
>                                                 pmdp_collapse_flush
>                                                 i_mmap_unlock_read(mapping)
>          move_pgt_entry(NORMAL_PMD, ...)
>            take_rmap_locks
>            move_normal_pmd
>            drop_rmap_locks
> 
> When this happens, move_normal_pmd() can end up creating bogus PMD entries
> in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`.
> The effect depends on arch-specific and machine-specific details; on x86,
> you can end up with physical page 0 mapped as a page table, which is likely
> exploitable for user->kernel privilege escalation.
> 
> 
> Fix the race by letting process B recheck that the PMD still points to a
> page table after the rmap locks have been taken. Otherwise, we bail and let
> the caller fall back to the PTE-level copying path, which will then bail
> immediately at the pmd_none() check.
> 
> Bug reachability: Reaching this bug requires that you can create shmem/file
> THP mappings - anonymous THP uses different code that doesn't zap stuff
> under rmap locks. File THP is gated on an experimental config flag
> (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need shmem
> THP to hit this bug. As far as I know, getting shmem THP normally requires
> that you can mount your own tmpfs with the right mount flags, which would
> require creating your own user+mount namespace; though I don't know if some
> distros maybe enable shmem THP by default or something like that.
> 
> Bug impact: This issue can likely be used for user->kernel privilege
> escalation when it is reachable.
> 
> Cc: stable@vger.kernel.org
> Fixes: 1d65b771bc08 ("mm/khugepaged: retract_page_tables() without mmap or vma lock")
> Closes: https://project-zero.issues.chromium.org/371047675
> Co-developed-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
> @David: please confirm we can add your Signed-off-by to this patch after
> the Co-developed-by.

Sure, thanks for sending this out!

Signed-off-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



  parent reply	other threads:[~2024-10-08  7:50 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-07 21:42 Jann Horn
2024-10-08  3:53 ` Qi Zheng
2024-10-08  7:52   ` David Hildenbrand
2024-10-08  7:58     ` Qi Zheng
2024-10-08  8:21       ` David Hildenbrand
2024-10-08  7:50 ` David Hildenbrand [this message]
2024-10-08 23:58 ` Joel Fernandes
2024-10-09  0:49   ` Jann Horn
2024-10-09  1:46     ` Joel Fernandes
2024-10-09 14:44 ` Lorenzo Stoakes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=316dadcb-b199-4b94-8d07-94b40bf534df@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=stable@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox