Re: [PATCH] mm: hugetlb: fix UAF in hugetlb_handle_userfault


From: Mike Kravetz <mike.kravetz@oracle.com>
To: Liu Shixin <liushixin2@huawei.com>
Cc: Liu Zixian <liuzixian4@huawei.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH] mm: hugetlb: fix UAF in hugetlb_handle_userfault
Date: Wed, 21 Sep 2022 16:57:39 -0700
Message-ID: <Yyuk83B4VHh+pbFp@monkey>
In-Reply-To: <YytOYH1MSo5cNoB6@monkey>

On 09/21/22 10:48, Mike Kravetz wrote:
> On 09/21/22 16:34, Liu Shixin wrote:
> > The vma_lock and hugetlb_fault_mutex are dropped before handling
> > userfault and reacquire them again after handle_userfault(), but
> > reacquire the vma_lock could lead to UAF[1] due to the following
> > race,
> > 
> > hugetlb_fault
> >   hugetlb_no_page
> >     /*unlock vma_lock */
> >     hugetlb_handle_userfault
> >       handle_userfault
> >         /* unlock mm->mmap_lock*/
> >                                            vm_mmap_pgoff
> >                                              do_mmap
> >                                                mmap_region
> >                                                  munmap_vma_range
> >                                                    /* clean old vma */
> >         /* lock vma_lock again  <--- UAF */
> >     /* unlock vma_lock */
> > 
> > Since the vma_lock will unlock immediately after hugetlb_handle_userfault(),
> > let's drop the unneeded lock and unlock in hugetlb_handle_userfault() to fix
> > the issue.
> 
> Thank you very much!
> 
> When I saw this report, the obvious fix was to do something like what you have
> done below.  That looks fine with a few minor comments.
> 
> One question I have not yet answered is, "Does this same issue apply to
> follow_hugetlb_page()?".  I believe it does.  follow_hugetlb_page calls
> hugetlb_fault which could result in the fault being processed by userfaultfd.
> If we experience the race above, then the associated vma could no longer be
> valid when returning from hugetlb_fault.  follow_hugetlb_page and callers
> have a flag (locked) to deal with dropping mmap lock.  However, I am not sure
> if it is handled correctly WRT userfaultfd.  I think this needs to be answered
> before fixing.  And, if the follow_hugetlb_page code needs to be fixed it
> should be done at the same time.
> 

To at least verify this code path, I added userfaultfd handling to the gup_test
program in kernel selftests.  When doing a basic gup test on a hugetlb page in
a userfaultfd-registered range, I hit this warning:

[ 6939.867796] FAULT_FLAG_ALLOW_RETRY missing 1
[ 6939.871503] CPU: 2 PID: 5720 Comm: gup_test Not tainted 6.0.0-rc6-next-20220921+ #72
[ 6939.874562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 6939.877707] Call Trace:
[ 6939.878745]  <TASK>
[ 6939.879779]  dump_stack_lvl+0x6c/0x9f
[ 6939.881199]  handle_userfault.cold+0x14/0x1e
[ 6939.882830]  ? find_held_lock+0x2b/0x80
[ 6939.884370]  ? __mutex_unlock_slowpath+0x45/0x280
[ 6939.886145]  hugetlb_handle_userfault+0x90/0xf0
[ 6939.887936]  hugetlb_fault+0xb7e/0xda0
[ 6939.889409]  ? vprintk_emit+0x118/0x3a0
[ 6939.890903]  ? _printk+0x58/0x73
[ 6939.892279]  follow_hugetlb_page.cold+0x59/0x145
[ 6939.894116]  __get_user_pages+0x146/0x750
[ 6939.895580]  __gup_longterm_locked+0x3e9/0x680
[ 6939.897023]  ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0
[ 6939.898939]  ? lockdep_hardirqs_on+0x7d/0x100
[ 6939.901243]  gup_test_ioctl+0x320/0x6e0
[ 6939.902202]  __x64_sys_ioctl+0x87/0xc0
[ 6939.903220]  do_syscall_64+0x38/0x90
[ 6939.904233]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 6939.905423] RIP: 0033:0x7fbb53830f7b

This is because userfaultfd is expecting FAULT_FLAG_ALLOW_RETRY which is not
set in this path.
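
For readers following along, here is a minimal userspace sketch of the kind
of entry check that produces the warning above.  It is a model, not the
kernel source: uffd_may_wait() is a hypothetical name, the flag values are
assumed to mirror enum fault_flag in include/linux/mm_types.h, and the
printed value is assumed to be the raw flags word (in which case the "1" in
the log would correspond to FAULT_FLAG_WRITE alone).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed to mirror enum fault_flag in include/linux/mm_types.h. */
#define FAULT_FLAG_WRITE        (1u << 0)
#define FAULT_FLAG_ALLOW_RETRY  (1u << 2)
#define FAULT_FLAG_RETRY_NOWAIT (1u << 3)

/*
 * Hypothetical model of the check at the top of handle_userfault().
 * Returns true if the fault is allowed to sleep waiting for the
 * userfaultfd monitor, false if the caller did not permit a retry.
 */
static bool uffd_may_wait(unsigned int flags)
{
	if (!(flags & FAULT_FLAG_ALLOW_RETRY)) {
		/* NOWAIT without ALLOW_RETRY would be a caller bug. */
		assert(!(flags & FAULT_FLAG_RETRY_NOWAIT));
		fprintf(stderr, "FAULT_FLAG_ALLOW_RETRY missing %x\n", flags);
		return false;
	}
	return true;
}
```

Under these assumptions, a write fault without the retry flag is refused,
while the same fault with FAULT_FLAG_ALLOW_RETRY set is allowed to wait.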

Adding John, Peter and David on Cc: as they are much more fluent in all the
fault and FOLL combinations and might have immediate suggestions.  It is going
to take me a little while to figure out:
1) How to make sure we get the right flags passed to handle_userfault.
2) How to modify follow_hugetlb_page, as userfaultfd can certainly drop
   mmap_lock, so we cannot assume the vma still exists upon return.
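
The concern in point 2 can be illustrated with a small userspace model.
This is only a sketch of the general "re-look-up after the lock was dropped"
pattern, not a proposed patch: struct mm, struct vma, find_vma_model(),
fault_model() and fault_and_revalidate() are all hypothetical stand-ins, and
a plain bool stands in for mmap_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vma { unsigned long start, end; };

struct mm {
	struct vma *vmas;	/* current mappings */
	size_t nr_vmas;
	bool locked;		/* stands in for mmap_lock */
};

static struct vma *find_vma_model(struct mm *mm, unsigned long addr)
{
	for (size_t i = 0; i < mm->nr_vmas; i++)
		if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
			return &mm->vmas[i];
	return NULL;
}

/*
 * Fault stand-in.  With drop_lock set it releases the lock, as a
 * userfaultfd wait would, and returns true to request a retry.  With
 * racing_munmap also set, all mappings vanish while the lock is dropped.
 */
static bool fault_model(struct mm *mm, bool drop_lock, bool racing_munmap)
{
	assert(mm->locked);
	if (!drop_lock)
		return false;
	mm->locked = false;
	if (racing_munmap)
		mm->nr_vmas = 0;
	return true;
}

/* Returns the vma to continue with, or NULL if addr is no longer mapped. */
static struct vma *fault_and_revalidate(struct mm *mm, unsigned long addr,
					bool drop_lock, bool racing_munmap)
{
	struct vma *vma = find_vma_model(mm, addr);

	if (!vma)
		return NULL;
	if (fault_model(mm, drop_lock, racing_munmap)) {
		/* Lock was dropped: the old pointer must not be reused. */
		mm->locked = true;		/* re-take the lock */
		vma = find_vma_model(mm, addr);	/* look it up again */
	}
	return vma;
}
```

The point of the model is that the caller never dereferences the vma pointer
it held before the fault once the fault reports that the lock was dropped.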

-- 
Mike Kravetz
