From: Gabriel Ryan
Date: Tue, 2 Aug 2022 07:44:12 -0400
Subject: Re: Race in mm/ksm.c
To: Kefeng Wang
Cc: abhishek.shah@columbia.edu, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org

Confirmed.

Thanks for the quick response and apologies for the delay!

Best,

Gabe

On Thu, Jul 21, 2022 at 9:57 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:

>
> On 2022/7/21 23:58, Abhishek Shah wrote:
>
> Dear Kernel Maintainers,
>
> We found a race in mm/ksm.c. During the execution of the function
> *__ksm_enter*, which uses the variable *ksm_run* to decide the list
> insertion point, *ksm_run* can be concurrently modified in the function
> *run_store*. We thought this could be undesirable since "KSM pages in
> newly forked mms can be missed" (see the comment here:
> https://elixir.bootlin.com/linux/v5.18-rc5/source/mm/ksm.c#L2498). We
> would also like your thoughts on the security impact given it is a TOCTOU
> bug.
>
> We provide more details below, including the trace and reproducing
> test cases.
>
>
> Hello, could the following changes avoid the data-race issue?
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 54f78c9eecae..f072753cbb3a 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -2497,6 +2497,7 @@ int __ksm_enter(struct mm_struct *mm)
>  {
>         struct mm_slot *mm_slot;
>         int needs_wakeup;
> +       bool ksm_run_unmerge;
>
>         mm_slot = alloc_mm_slot();
>         if (!mm_slot)
> @@ -2505,6 +2506,10 @@ int __ksm_enter(struct mm_struct *mm)
>         /* Check ksm_run too?  Would need tighter locking */
>         needs_wakeup = list_empty(&ksm_mm_head.mm_list);
>
> +       mutex_lock(&ksm_thread_mutex);
> +       ksm_run_unmerge = !!(ksm_run & KSM_RUN_UNMERGE);
> +       mutex_unlock(&ksm_thread_mutex);
> +
>         spin_lock(&ksm_mmlist_lock);
>         insert_to_mm_slots_hash(mm, mm_slot);
>         /*
> @@ -2517,7 +2522,7 @@ int __ksm_enter(struct mm_struct *mm)
>          * scanning cursor, otherwise KSM pages in newly forked mms will be
>          * missed: then we might as well insert at the end of the list.
>          */
> -       if (ksm_run & KSM_RUN_UNMERGE)
> +       if (ksm_run_unmerge)
>                 list_add_tail(&mm_slot->mm_list, &ksm_mm_head.mm_list);
>         else
>                 list_add_tail(&mm_slot->mm_list, &ksm_scan.mm_slot->mm_list);
>
>
>
>
> *Trace*
> BUG: KCSAN: data-race in __ksm_enter / run_store
> write to 0xffffffff881edae0 of 8 bytes by task 6542 on cpu 0:
>  run_store+0x19a/0x2d0 mm/ksm.c:2897
>  kobj_attr_store+0x44/0x60 lib/kobject.c:824
>  sysfs_kf_write+0x16f/0x1a0 fs/sysfs/file.c:136
>  kernfs_fop_write_iter+0x2ae/0x370 fs/kernfs/file.c:291
>  call_write_iter include/linux/fs.h:2050 [inline]
>  new_sync_write fs/read_write.c:504 [inline]
>  vfs_write+0x779/0x900 fs/read_write.c:591
>  ksys_write+0xde/0x190 fs/read_write.c:644
>  __do_sys_write fs/read_write.c:656 [inline]
>  __se_sys_write fs/read_write.c:653 [inline]
>  __x64_sys_write+0x43/0x50 fs/read_write.c:653
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> read to 0xffffffff881edae0 of 8 bytes by task 6541 on cpu 1:
>  __ksm_enter+0x114/0x260 mm/ksm.c:2501
>  ksm_madvise+0x291/0x350 mm/ksm.c:2451
>  madvise_vma_behavior mm/madvise.c:1039 [inline]
>  madvise_walk_vmas mm/madvise.c:1221 [inline]
>  do_madvise+0x656/0xeb0 mm/madvise.c:1399
>  __do_sys_madvise mm/madvise.c:1412 [inline]
>  __se_sys_madvise mm/madvise.c:1410 [inline]
>  __x64_sys_madvise+0x64/0x70 mm/madvise.c:1410
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 6541 Comm: syz-executor2-n Not tainted 5.18.0-rc5+ #107
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> ---------------------
> *Inputs*
> Input CPU 0:
> r0 = openat$sysctl(0xffffff9c, &(0x7f0000000100)='/sys/kernel/mm/ksm/run\x00', 0x1, 0x0)
> write$sysctl(r0, &(0x7f0000000000)='2\x00', 0x2)
>
> Input CPU 1:
> madvise(&(0x7f0000ffc000/0x4000)=nil, 0x4000, 0xc)
> mlock2(&(0x7f0000ffe000/0x2000)=nil, 0x2000, 0x0)
> madvise(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x12)
> clone(0x0, 0x0, 0x0, 0x0, 0x0)
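
For anyone following the thread, the pattern used in the proposed patch (read the shared flag once under a mutex, then branch only on the local copy) can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: a pthread mutex stands in for ksm_thread_mutex, writers are assumed to update the flag under that same mutex, and the names run_flags, RUN_MERGE, RUN_UNMERGE, store_run_flags(), snapshot_unmerge(), choose_insert_point() and demo() are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for ksm_run and its flag bits. */
#define RUN_MERGE   1UL
#define RUN_UNMERGE 2UL

/* Hypothetical insertion choices mirroring the two list_add_tail() calls. */
enum insert_point { INSERT_AT_TAIL, INSERT_BEHIND_CURSOR };

static unsigned long run_flags;
static pthread_mutex_t run_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer side (assumption): every update of run_flags holds run_mutex. */
void store_run_flags(unsigned long flags)
{
    pthread_mutex_lock(&run_mutex);
    run_flags = flags;
    pthread_mutex_unlock(&run_mutex);
}

/* Reader side: take a single snapshot under the mutex. */
bool snapshot_unmerge(void)
{
    bool unmerge;

    pthread_mutex_lock(&run_mutex);
    unmerge = !!(run_flags & RUN_UNMERGE);
    pthread_mutex_unlock(&run_mutex);
    return unmerge;
}

/* Branch on the local snapshot only; run_flags is not read again. */
enum insert_point choose_insert_point(void)
{
    return snapshot_unmerge() ? INSERT_AT_TAIL : INSERT_BEHIND_CURSOR;
}

/* Usage example: the decision follows the most recent store. */
void demo(void)
{
    store_run_flags(RUN_UNMERGE);
    assert(choose_insert_point() == INSERT_AT_TAIL);

    store_run_flags(RUN_MERGE);
    assert(choose_insert_point() == INSERT_BEHIND_CURSOR);
}
```

Note that the snapshot can already be stale by the time it is used, because the mutex is dropped before the decision is made. The sketch removes the unsynchronized read, but it does not by itself close the window between the check and the list insertion.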