linux-mm.kvack.org archive mirror
From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, Andrew Morton <akpm@linux-foundation.org>
Cc: ak@linux.intel.com,
	Linux Memory Management List <linux-mm@kvack.org>,
	 LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers
Date: Tue, 25 Aug 2020 10:42:58 +0800
Message-ID: <CAMZfGtUCeD_zLr+VY--3fcCuB+_P14tSPPeeSR6aG5KnBU36Gg@mail.gmail.com>
In-Reply-To: <20200822095328.61306-1-songmuchun@bytedance.com>

Hi Andrew and Mike,

On Sat, Aug 22, 2020 at 5:53 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> There is a race between the assignment of `table->data` and the write
> through the `table->data` pointer in __do_proc_doulongvec_minmax().
> Fix this by duplicating `table` and updating only the duplicate. Also
> introduce a helper, proc_hugetlb_doulongvec_minmax(), to simplify the
> code.

Sorry, I did not explain in enough detail how the race happens. Here is
the sequence:

CPU0:                                     CPU1:
                                          proc_sys_write
hugetlb_sysctl_handler                      proc_sys_call_handler
hugetlb_sysctl_handler_common                 hugetlb_sysctl_handler
  table->data = &tmp;                           hugetlb_sysctl_handler_common
                                                  table->data = &tmp;
    proc_doulongvec_minmax
      do_proc_doulongvec_minmax             sysctl_head_finish
        __do_proc_doulongvec_minmax
          i = table->data;
          *i = val;     // corrupt CPU1 stack
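
The interleaving above can be replayed deterministically in userspace.
The following is only a minimal sketch, not kernel code: struct
fake_ctl_table, handler_publish(), parser_store() and replay_race() are
hypothetical stand-ins for struct ctl_table, the `table->data = &tmp`
assignment, the store in __do_proc_doulongvec_minmax() and the two
racing writers. It runs the steps in the order shown in the diagram on
a single thread, so it shows where the store lands rather than
reproducing the real concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctl_table. */
struct fake_ctl_table {
	void *data;
	size_t maxlen;
};

/* Models "table->data = &tmp" in the handler: mutates the SHARED table. */
static void handler_publish(struct fake_ctl_table *table, unsigned long *tmp)
{
	table->data = tmp;
	table->maxlen = sizeof(*tmp);
}

/* Models "i = table->data; *i = val;" in the parsing helper. */
static void parser_store(struct fake_ctl_table *table, unsigned long val)
{
	unsigned long *i = table->data;

	*i = val;
}

/*
 * Replays the diagram's ordering. Returns 1 if the value written on
 * behalf of "CPU0" landed in the local variable that belongs to "CPU1".
 */
int replay_race(void)
{
	struct fake_ctl_table shared = { 0 };
	unsigned long cpu0_tmp = 0;	/* CPU0's stack slot */
	unsigned long cpu1_tmp = 0;	/* CPU1's stack slot */

	handler_publish(&shared, &cpu0_tmp);	/* CPU0: table->data = &tmp */
	handler_publish(&shared, &cpu1_tmp);	/* CPU1: table->data = &tmp */
	parser_store(&shared, 42);		/* CPU0: *i = val */

	return cpu0_tmp == 0 && cpu1_tmp == 42;
}
```

In the kernel the situation is worse than in this sketch, because by
the time CPU0 performs the store, CPU1 may already have returned and
its stack slot may be reused for something else.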

>
> The following oops was seen:
>
>     BUG: kernel NULL pointer dereference, address: 0000000000000000
>     #PF: supervisor instruction fetch in kernel mode
>     #PF: error_code(0x0010) - not-present page
>     Code: Bad RIP value.

The "Bad RIP value" here indicates that the stack frame was corrupted by
another CPU.

>     ...
>     Call Trace:
>      ? set_max_huge_pages+0x3da/0x4f0
>      ? alloc_pool_huge_page+0x150/0x150
>      ? proc_doulongvec_minmax+0x46/0x60
>      ? hugetlb_sysctl_handler_common+0x1c7/0x200
>      ? nr_hugepages_store+0x20/0x20
>      ? copy_fd_bitmaps+0x170/0x170
>      ? hugetlb_sysctl_handler+0x1e/0x20
>      ? proc_sys_call_handler+0x2f1/0x300
>      ? unregister_sysctl_table+0xb0/0xb0
>      ? __fd_install+0x78/0x100
>      ? proc_sys_write+0x14/0x20
>      ? __vfs_write+0x4d/0x90
>      ? vfs_write+0xef/0x240
>      ? ksys_write+0xc0/0x160
>      ? __ia32_sys_read+0x50/0x50
>      ? __close_fd+0x129/0x150
>      ? __x64_sys_write+0x43/0x50
>      ? do_syscall_64+0x6c/0x200
>      ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/hugetlb.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a301c2d672bf..818d6125af49 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3454,6 +3454,23 @@ static unsigned int allowed_mems_nr(struct hstate *h)
>  }
>
>  #ifdef CONFIG_SYSCTL
> +static int proc_hugetlb_doulongvec_minmax(struct ctl_table *table, int write,
> +                                         void *buffer, size_t *length,
> +                                         loff_t *ppos, unsigned long *out)
> +{
> +       struct ctl_table dup_table;
> +
> +       /*
> +        * In order to avoid races with __do_proc_doulongvec_minmax(), we
> +        * can duplicate the @table and alter the duplicate of it.
> +        */
> +       dup_table = *table;
> +       dup_table.data = out;
> +       dup_table.maxlen = sizeof(unsigned long);
> +
> +       return proc_doulongvec_minmax(&dup_table, write, buffer, length, ppos);
> +}
> +
>  static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>                          struct ctl_table *table, int write,
>                          void *buffer, size_t *length, loff_t *ppos)
> @@ -3465,9 +3482,8 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>         if (!hugepages_supported())
>                 return -EOPNOTSUPP;
>
> -       table->data = &tmp;
> -       table->maxlen = sizeof(unsigned long);
> -       ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
> +       ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
> +                                            &tmp);
>         if (ret)
>                 goto out;
>
> @@ -3510,9 +3526,8 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
>         if (write && hstate_is_gigantic(h))
>                 return -EINVAL;
>
> -       table->data = &tmp;
> -       table->maxlen = sizeof(unsigned long);
> -       ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
> +       ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
> +                                            &tmp);
>         if (ret)
>                 goto out;
>
> --
> 2.11.0
>
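
The effect of the helper in the patch can be illustrated with the same
kind of userspace model. Again this is only a sketch under assumptions:
struct fake_ctl_table, fake_doulongvec(), fake_hugetlb_doulongvec() and
replay_fixed() are hypothetical names that mirror struct ctl_table,
proc_doulongvec_minmax(), proc_hugetlb_doulongvec_minmax() and two
callers. Each caller copies the shared table into a local duplicate and
redirects only the duplicate, so the shared table is never modified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctl_table. */
struct fake_ctl_table {
	void *data;
	size_t maxlen;
};

/* Models the store done by the generic parsing helper. */
static void fake_doulongvec(struct fake_ctl_table *table, unsigned long val)
{
	unsigned long *i = table->data;

	*i = val;
}

/* Mirrors the patch's helper: work on a private copy of the table. */
static void fake_hugetlb_doulongvec(const struct fake_ctl_table *table,
				    unsigned long val, unsigned long *out)
{
	struct fake_ctl_table dup_table = *table;

	dup_table.data = out;
	dup_table.maxlen = sizeof(unsigned long);

	fake_doulongvec(&dup_table, val);
}

/*
 * Two callers use the same shared table. Returns 1 if each value landed
 * in its own caller's local and the shared table was left untouched.
 */
int replay_fixed(void)
{
	unsigned long backing = 7;
	struct fake_ctl_table shared = { &backing, sizeof(backing) };
	unsigned long cpu0_tmp = 0;
	unsigned long cpu1_tmp = 0;

	fake_hugetlb_doulongvec(&shared, 42, &cpu0_tmp);	/* CPU0 */
	fake_hugetlb_doulongvec(&shared, 99, &cpu1_tmp);	/* CPU1 */

	return cpu0_tmp == 42 && cpu1_tmp == 99 &&
	       shared.data == &backing && backing == 7;
}
```

Because every write goes through a per-call duplicate, the order in
which the two callers run no longer matters in this model.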


-- 
Yours,
Muchun


