From: Nhat Pham
Date: Wed, 12 Feb 2025 17:41:37 -0800
Subject: Re: [PATCH] mm: Fix possible NULL pointer dereference in __swap_duplicate
To: gaoxu
Cc: Andrew Morton, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", Suren Baghdasaryan, Barry Song <21cnbao@gmail.com>, yipengxiang
References: <44655569e3a1419f800952004f07e714@honor.com>
In-Reply-To: <44655569e3a1419f800952004f07e714@honor.com>

On Tue, Feb 11, 2025 at 7:14 PM gaoxu wrote:
>
> swp_swap_info() may return null; it is necessary to check the return value
> to avoid NULL pointer dereference. The code for other calls to
> swp_swap_info() includes checks, and __swap_duplicate() should also
> include checks.
>
> The reason why swp_swap_info() returns NULL is unclear; it may be due to
> CPU cache issues or DDR bit flips. The probability of this issue is very
> small, and the stack info we encountered is as follows:
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000058
> [RB/E]rb_sreason_str_set: sreason_str set null_pointer
> Mem abort info:
> ESR = 0x0000000096000005
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x05: level 1 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 39-bit VAs, pgdp=00000008a80e5000
> [0000000000000058] pgd=0000000000000000, p4d=0000000000000000,
> pud=0000000000000000
> Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> Skip md ftrace buffer dump for: 0x1609e0
> ...
> pc : swap_duplicate+0x44/0x164
> lr : copy_page_range+0x508/0x1e78
> sp : ffffffc0f2a699e0
> x29: ffffffc0f2a699e0 x28: ffffff8a5b28d388 x27: ffffff8b06603388
> x26: ffffffdf7291fe70 x25: 0000000000000006 x24: 0000000000100073
> x23: 00000000002d2d2f x22: 0000000000000008 x21: 0000000000000000
> x20: 00000000002d2d2f x19: 18000000002d2d2f x18: ffffffdf726faec0
> x17: 0000000000000000 x16: 0010000000000001 x15: 0040000000000001
> x14: 0400000000000001 x13: ff7ffffffffffb7f x12: ffeffffffffffbff
> x11: ffffff8a5c7e1898 x10: 0000000000000018 x9 : 0000000000000006
> x8 : 1800000000000000 x7 : 0000000000000000 x6 : ffffff8057c01f10
> x5 : 000000000000a318 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000006daf200000 x1 : 0000000000000001 x0 : 18000000002d2d2f
> Call trace:
> swap_duplicate+0x44/0x164
> copy_page_range+0x508/0x1e78
> copy_process+0x1278/0x21cc
> kernel_clone+0x90/0x438
> __arm64_sys_clone+0x5c/0x8c
> invoke_syscall+0x58/0x110
> do_el0_svc+0x8c/0xe0
> el0_svc+0x38/0x9c
> el0t_64_sync_handler+0x44/0xec
> el0t_64_sync+0x1a8/0x1ac
> Code: 9139c35a 71006f3f 54000568 f8797b55 (f9402ea8)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops: Fatal exception
> SMP: stopping secondary CPUs
>
> The patch seems to only provide a workaround, but there are no more
> effective software solutions to handle the bit flips problem. This path
> will change the issue from a system crash to a process exception, thereby
> reducing the impact on the entire machine.
>
> Signed-off-by: gao xu

Yeah this smells like a bug. A bit strange though - I have eyeballed the
code, and we (should have?) locked the PTE before resolving it into the
swap entry format. Which should have been enough to prevent the swap
entry from being unmapped and freed up. Which should have been enough
to prevent swapoff...? (are you even doing concurrent swapoff?)

Can you provide more context? What kernel version is this, what kind of
workload is this, any reproducer, etc.?
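
For reference, the kind of check the quoted patch description asks for can be sketched in plain userspace C. This is only an illustration, not the kernel code and not the patch itself: `swap_info_model`, `swap_info_table`, `lookup_swap_info()`, `swap_duplicate_model()` and `MAX_SWAPFILES_MODEL` are all hypothetical stand-ins, and the only assumption carried over from the thread is that the lookup may return NULL and the caller must test for that before dereferencing.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical table size for this model; not the kernel's value. */
#define MAX_SWAPFILES_MODEL 4

/* Simplified stand-in for a per-swap-device descriptor. */
struct swap_info_model {
    unsigned char *swap_map;  /* per-slot reference counts */
    unsigned long max;        /* number of slots in swap_map */
};

static struct swap_info_model *swap_info_table[MAX_SWAPFILES_MODEL];

/* Returns NULL for an out-of-range type or an unpopulated table slot. */
static struct swap_info_model *lookup_swap_info(unsigned int type)
{
    if (type >= MAX_SWAPFILES_MODEL)
        return NULL;
    return swap_info_table[type];
}

/*
 * Bump the reference count of one slot. A failed lookup becomes an
 * error return instead of a NULL dereference.
 */
static int swap_duplicate_model(unsigned int type, unsigned long offset)
{
    struct swap_info_model *si = lookup_swap_info(type);

    if (!si)
        return -EINVAL;   /* bogus entry: fail the caller, do not crash */
    if (offset >= si->max)
        return -EINVAL;
    if (si->swap_map[offset] == 0)
        return -ENOENT;   /* slot is not in use */

    si->swap_map[offset]++;
    return 0;
}
```

A check like this only changes how the failure surfaces (an error to the caller rather than an oops); it says nothing about why the entry was bogus in the first place, which is why the questions above still matter.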