From: Muchun Song <songmuchun@bytedance.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
ak@linux.intel.com,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [External] Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers
Date: Wed, 26 Aug 2020 10:47:08 +0800
Message-ID: <CAMZfGtWj5_Uh2KFAy4DGc0vzrNm4+Nge7rOBDAFQhh2aN7wOqA@mail.gmail.com>
In-Reply-To: <231ec1f1-fe7a-c48a-2427-1311360d4b9b@oracle.com>
On Wed, Aug 26, 2020 at 8:03 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 8/24/20 8:01 PM, Muchun Song wrote:
> > On Tue, Aug 25, 2020 at 5:21 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> >>
> >> I too am looking at this now and do not completely understand the race.
> >> It could be that:
> >>
> >> hugetlb_sysctl_handler_common
> >> ...
> >> table->data = &tmp;
> >>
> >> and, do_proc_doulongvec_minmax()
> >> ...
> >> return __do_proc_doulongvec_minmax(table->data, table, write, ...
> >> with __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, ...
> >> ...
> >> i = (unsigned long *) data;
> >> ...
> >> *i = val;
> >>
> >> So, __do_proc_doulongvec_minmax can be dereferencing and writing to the pointer
> >> in one thread when hugetlb_sysctl_handler_common is setting it in another?
> >
> > Yes, you are right.
> >
> >>
> >> Another confusing part of the message is the stack trace which includes
> >> ...
> >> ? set_max_huge_pages+0x3da/0x4f0
> >> ? alloc_pool_huge_page+0x150/0x150
> >>
> >> which are 'downstream' from these routines. I don't understand why these
> >> are in the trace.
> >
> > I am also confused. But this issue can easily be reproduced by having more
> > than one thread write to `/proc/sys/vm/nr_hugepages` concurrently. With this
> > patch applied, the issue can no longer be reproduced.
>
> There certainly is an issue here as one thread can modify data in another.
> However, I am having a hard time seeing what causes the 'kernel NULL pointer
> dereference'.
If you write 0 to '/proc/sys/vm/nr_hugepages', you get:
  kernel NULL pointer dereference, address: 0000000000000000
If you write 1024 to '/proc/sys/vm/nr_hugepages', you get:
  kernel NULL pointer dereference, address: 0000000000000400
The faulting address is exactly the value written to
'/proc/sys/vm/nr_hugepages': the stray "*i = val" store lands in the
other task's (already released) stack frame, and when a slot there is
later used as a pointer, it now holds val.
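To make that concrete, here is a tiny userspace illustration (not
kernel code; the names are made up): if the stray store overwrites a
pointer-sized slot in the victim's frame, the value written becomes
the address that later faults.

#include <stdio.h>

/* Stand-in for a slot in CPU1's (reclaimed) stack frame that is
 * later used as a pointer. */
struct frame {
        unsigned long *ptr;
};

int main(void)
{
        struct frame f;
        unsigned long val = 1024;       /* "echo 1024 > nr_hugepages" */

        /* The stray "*i = val" store, landing on the pointer slot. */
        *(unsigned long *)&f.ptr = val;

        /* Dereferencing f.ptr now would fault at exactly this address: */
        printf("fault address would be %p\n", (void *)f.ptr);
        /* prints 0x400; with val == 0 it would be a NULL dereference */
        return 0;
}
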
>
> I tried to reproduce the issue myself but was unsuccessful. I have 16 threads
> writing to /proc/sys/vm/nr_hugepages in an infinite loop. After several hours
> running, I did not hit the issue. Just curious, what architecture is the
> system? any special config or compiler options?
>
> If you can easily reproduce, can you post the detailed oops message?
>
> The 'NULL pointer' seems strange because after the first assignment to
> table->data the value should never be NULL. Certainly it can be modified
> by another thread, but I can not see how it can be NULL. At the beginning
> of __do_proc_doulongvec_minmax, there is a check for NULL pointer with:
CPU0:                                   CPU1:
proc_sys_write
  hugetlb_sysctl_handler                proc_sys_call_handler
    hugetlb_sysctl_handler_common         hugetlb_sysctl_handler
      table->data = &tmp;                   hugetlb_sysctl_handler_common
                                              table->data = &tmp;
    proc_doulongvec_minmax
      do_proc_doulongvec_minmax           sysctl_head_finish
        __do_proc_doulongvec_minmax
          i = table->data;
          *i = val; // corrupt CPU1 stack

If val is 0, you will see the NULL pointer dereference.
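For illustration, here is a minimal userspace model of the same
pattern (pthreads instead of the sysctl machinery; all names are
hypothetical, and the race is intentional). Each thread publishes its
own stack variable through a shared pointer, so one thread's write can
land in the other thread's frame. Compile with: cc -pthread race.c

#include <pthread.h>
#include <stdio.h>

/* Shared between all writers, like the static ctl_table. */
static struct {
        unsigned long *data;
} ctl;

static void *handler(void *arg)
{
        unsigned long tmp = 0;          /* like 'tmp' on the handler's stack */
        unsigned long val = (unsigned long)arg;

        ctl.data = &tmp;                /* table->data = &tmp;  (racy) */
        /* The other thread may repoint ctl.data right here ... */
        *ctl.data = val;                /* *i = val;  may hit the other stack */

        if (tmp != val)                 /* our store went elsewhere, or a
                                           foreign store landed on us */
                printf("corrupted: tmp=%lu, expected %lu\n", tmp, val);
        return NULL;
}

int main(void)
{
        for (int i = 0; i < 100000; i++) {
                pthread_t t1, t2;

                pthread_create(&t1, NULL, handler, (void *)1UL);
                pthread_create(&t2, NULL, handler, (void *)2UL);
                pthread_join(t1, NULL);
                pthread_join(t2, NULL);
        }
        return 0;
}
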
>
> if (!data || !table->maxlen || !*lenp || (*ppos && !write)) {
> *lenp = 0;
> return 0;
> }
>
> I looked at the code my compiler produced for __do_proc_doulongvec_minmax.
> It appears to use the same value/register for the pointer throughout the
> routine. IOW, I do not see how the pointer can be NULL for the assignment
> when the routine does:
>
> *i = val;
>
> Again, your analysis/patch points out a real issue. I just want to get
> a better understanding to make sure there is not another issue causing
> the NULL pointer dereference.
Below is my test script. I run 8 instances of it in parallel. In my
qemu VM, the panic is easy to trigger. Thanks.
#!/bin/sh
# Toggle nr_hugepages in a tight loop; run several instances of this
# script concurrently to widen the race window.
while :
do
    echo 128 > /proc/sys/vm/nr_hugepages
    echo 0 > /proc/sys/vm/nr_hugepages
done
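
FWIW, one way to avoid sharing the pointer at all (an illustration
only, with a made-up helper name, not necessarily what the patch
does) is to never publish the on-stack address through the shared
ctl_table, e.g. by working on a private copy:

/* Sketch: duplicate the ctl_table on the caller's stack so that
 * table->data is never shared between concurrent writers. */
static int hugetlb_proc_doulongvec(struct ctl_table *table, int write,
                                   void *buffer, size_t *length,
                                   loff_t *ppos, unsigned long *out)
{
        struct ctl_table dup_table = *table;    /* private copy */

        dup_table.data = out;                   /* points at our own stack */
        return proc_doulongvec_minmax(&dup_table, write, buffer,
                                      length, ppos);
}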
> --
> Mike Kravetz
--
Yours,
Muchun