From: Muchun Song
To: mike.kravetz@oracle.com, akpm@linux-foundation.org
Cc: ak@linux.intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers
Date: Fri, 28 Aug 2020 11:11:46 +0800
Message-Id: <20200828031146.43035-1-songmuchun@bytedance.com>

There is a race between the assignment of `table->data` in one thread
and the write through the `table->data` pointer in
__do_proc_doulongvec_minmax() on another thread.

    CPU0:                                 CPU1:
                                          proc_sys_write
    hugetlb_sysctl_handler                  proc_sys_call_handler
    hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
      table->data = &tmp;                       hugetlb_sysctl_handler_common
                                                  table->data = &tmp;
        proc_doulongvec_minmax
          do_proc_doulongvec_minmax           sysctl_head_finish
            __do_proc_doulongvec_minmax         unuse_table
              i = table->data;
              *i = val;  // corrupt CPU1's stack

Fix this by duplicating the `table` and updating only the duplicate.
Also introduce a helper, proc_hugetlb_doulongvec_minmax(), to simplify
the code.

The following oops was seen:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    Code: Bad RIP value.
    ...
    Call Trace:
     ? set_max_huge_pages+0x3da/0x4f0
     ? alloc_pool_huge_page+0x150/0x150
     ? proc_doulongvec_minmax+0x46/0x60
     ? hugetlb_sysctl_handler_common+0x1c7/0x200
     ? nr_hugepages_store+0x20/0x20
     ? copy_fd_bitmaps+0x170/0x170
     ? hugetlb_sysctl_handler+0x1e/0x20
     ? proc_sys_call_handler+0x2f1/0x300
     ? unregister_sysctl_table+0xb0/0xb0
     ? __fd_install+0x78/0x100
     ? proc_sys_write+0x14/0x20
     ? __vfs_write+0x4d/0x90
     ? vfs_write+0xef/0x240
     ? ksys_write+0xc0/0x160
     ? __ia32_sys_read+0x50/0x50
     ? __close_fd+0x129/0x150
     ? __x64_sys_write+0x43/0x50
     ? do_syscall_64+0x6c/0x200
     ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
Signed-off-by: Muchun Song
---
changelogs in v2:
 1. Add more details about how the race happened to the commit message.
 2. Remove unnecessary assignment of table->maxlen.

 mm/hugetlb.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a301c2d672bf..4c2a2620eeed 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3454,6 +3454,22 @@ static unsigned int allowed_mems_nr(struct hstate *h)
 }
 
 #ifdef CONFIG_SYSCTL
+static int proc_hugetlb_doulongvec_minmax(struct ctl_table *table, int write,
+					  void *buffer, size_t *length,
+					  loff_t *ppos, unsigned long *out)
+{
+	struct ctl_table dup_table;
+
+	/*
+	 * In order to avoid races with __do_proc_doulongvec_minmax(), we
+	 * can duplicate the @table and alter the duplicate of it.
+	 */
+	dup_table = *table;
+	dup_table.data = out;
+
+	return proc_doulongvec_minmax(&dup_table, write, buffer, length, ppos);
+}
+
 static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 			 struct ctl_table *table, int write,
 			 void *buffer, size_t *length, loff_t *ppos)
@@ -3465,9 +3481,8 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 	if (!hugepages_supported())
 		return -EOPNOTSUPP;
 
-	table->data = &tmp;
-	table->maxlen = sizeof(unsigned long);
-	ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
+	ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
+					     &tmp);
 	if (ret)
 		goto out;
 
@@ -3510,9 +3525,8 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
 	if (write && hstate_is_gigantic(h))
 		return -EINVAL;
 
-	table->data = &tmp;
-	table->maxlen = sizeof(unsigned long);
-	ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
+	ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
+					     &tmp);
 	if (ret)
 		goto out;
 
-- 
2.11.0
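
For readers following along outside the kernel tree, the
duplicate-the-table idea used by this patch can be sketched in userspace
C. This is only an illustration under stated assumptions: struct
demo_table, demo_store_ulong() and demo_store_via_copy() are
hypothetical names, not kernel APIs, and demo_store_ulong() merely
stands in for proc_doulongvec_minmax().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ctl_table: one descriptor shared by all callers. */
struct demo_table {
	const char *name;
	void *data;	/* where a parsed value gets stored */
	size_t maxlen;	/* size of the object behind data */
};

/* Hypothetical stand-in for proc_doulongvec_minmax(): store val through table->data. */
int demo_store_ulong(const struct demo_table *table, unsigned long val)
{
	if (!table->data || table->maxlen < sizeof(unsigned long))
		return -1;
	*(unsigned long *)table->data = val;
	return 0;
}

/*
 * Same shape as the helper added by the patch: copy the shared descriptor
 * onto the caller's stack and redirect only the copy, so the shared
 * descriptor is never written.
 */
int demo_store_via_copy(const struct demo_table *shared,
			unsigned long val, unsigned long *out)
{
	struct demo_table dup;

	assert(out != NULL);
	dup = *shared;
	dup.data = out;

	return demo_store_ulong(&dup, val);
}
```

Because each caller passes its own `out` and the shared descriptor is
only read, one caller can no longer pick up a pointer into another
caller's stack, which is the corruption shown in the CPU0/CPU1 diagram
of the commit message.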