From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, Andrew Morton <akpm@linux-foundation.org>
Cc: ak@linux.intel.com,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers
Date: Tue, 25 Aug 2020 10:42:58 +0800
Message-ID: <CAMZfGtUCeD_zLr+VY--3fcCuB+_P14tSPPeeSR6aG5KnBU36Gg@mail.gmail.com>
In-Reply-To: <20200822095328.61306-1-songmuchun@bytedance.com>
Hi Andrew and Mike,
On Sat, Aug 22, 2020 at 5:53 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> There is a race between the assignment of `table->data` and write value
> to the pointer of `table->data` in the __do_proc_doulongvec_minmax().
> Fix this by duplicating the `table`, and only update the duplicate of
> it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to
> simplify the code.
Sorry, I should have given more detail about how the race happens:

CPU0:                                 CPU1:
                                      proc_sys_write
hugetlb_sysctl_handler                  proc_sys_call_handler
hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
  table->data = &tmp;                       hugetlb_sysctl_handler_common
                                              table->data = &tmp;
    proc_doulongvec_minmax
      do_proc_doulongvec_minmax           sysctl_head_finish
        __do_proc_doulongvec_minmax
          i = table->data;
          *i = val;  // corrupt CPU1 stack

>
> The following oops was seen:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> Code: Bad RIP value.
The "Bad RIP value" here shows that the stack frame was corrupted by
another writer.
> ...
> Call Trace:
> ? set_max_huge_pages+0x3da/0x4f0
> ? alloc_pool_huge_page+0x150/0x150
> ? proc_doulongvec_minmax+0x46/0x60
> ? hugetlb_sysctl_handler_common+0x1c7/0x200
> ? nr_hugepages_store+0x20/0x20
> ? copy_fd_bitmaps+0x170/0x170
> ? hugetlb_sysctl_handler+0x1e/0x20
> ? proc_sys_call_handler+0x2f1/0x300
> ? unregister_sysctl_table+0xb0/0xb0
> ? __fd_install+0x78/0x100
> ? proc_sys_write+0x14/0x20
> ? __vfs_write+0x4d/0x90
> ? vfs_write+0xef/0x240
> ? ksys_write+0xc0/0x160
> ? __ia32_sys_read+0x50/0x50
> ? __close_fd+0x129/0x150
> ? __x64_sys_write+0x43/0x50
> ? do_syscall_64+0x6c/0x200
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
> mm/hugetlb.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a301c2d672bf..818d6125af49 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3454,6 +3454,23 @@ static unsigned int allowed_mems_nr(struct hstate *h)
>  }
>
>  #ifdef CONFIG_SYSCTL
> +static int proc_hugetlb_doulongvec_minmax(struct ctl_table *table, int write,
> +					  void *buffer, size_t *length,
> +					  loff_t *ppos, unsigned long *out)
> +{
> +	struct ctl_table dup_table;
> +
> +	/*
> +	 * In order to avoid races with __do_proc_doulongvec_minmax(), we
> +	 * can duplicate the @table and alter the duplicate of it.
> +	 */
> +	dup_table = *table;
> +	dup_table.data = out;
> +	dup_table.maxlen = sizeof(unsigned long);
> +
> +	return proc_doulongvec_minmax(&dup_table, write, buffer, length, ppos);
> +}
> +
>  static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>  			 struct ctl_table *table, int write,
>  			 void *buffer, size_t *length, loff_t *ppos)
> @@ -3465,9 +3482,8 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>  	if (!hugepages_supported())
>  		return -EOPNOTSUPP;
>
> -	table->data = &tmp;
> -	table->maxlen = sizeof(unsigned long);
> -	ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
> +	ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
> +					     &tmp);
>  	if (ret)
>  		goto out;
>
> @@ -3510,9 +3526,8 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
>  	if (write && hstate_is_gigantic(h))
>  		return -EINVAL;
>
> -	table->data = &tmp;
> -	table->maxlen = sizeof(unsigned long);
> -	ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
> +	ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
> +					     &tmp);
>  	if (ret)
>  		goto out;
>
> --
> 2.11.0
>
--
Yours,
Muchun