From: Barry Song <21cnbao@gmail.com>
Date: Fri, 14 Feb 2025 12:07:18 +1300
Subject: Re: [PATCH] mm: Fix possible NULL pointer dereference in __swap_duplicate
To: gaoxu
Cc: Nhat Pham, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Suren Baghdasaryan, yipengxiang
References: <44655569e3a1419f800952004f07e714@honor.com>

On Thu, Feb 13, 2025 at 9:52 PM gaoxu wrote:
>
> >
> > On Tue, Feb 11, 2025 at 7:14 PM
> > gaoxu wrote:
> > >
> > > swp_swap_info() may return null; it is necessary to check the return
> > > value to avoid NULL pointer dereference. The code for other calls to
> > > swp_swap_info() includes checks, and __swap_duplicate() should also
> > > include checks.
> > >
> > > The reason why swp_swap_info() returns NULL is unclear; it may be due
> > > to CPU cache issues or DDR bit flips. The probability of this issue is
> > > very small, and the stack info we encountered is as follows：
> > > Unable to handle kernel NULL pointer dereference at virtual address
> > > 0000000000000058
> > > [RB/E]rb_sreason_str_set: sreason_str set null_pointer Mem abort info:
> > > ESR = 0x0000000096000005
> > > EC = 0x25: DABT (current EL), IL = 32 bits
> > > SET = 0, FnV = 0
> > > EA = 0, S1PTW = 0
> > > FSC = 0x05: level 1 translation fault Data abort info:
> > > ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> > > CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > > GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages,
> > > 39-bit VAs, pgdp=00000008a80e5000 [0000000000000058]
> > > pgd=0000000000000000, p4d=0000000000000000,
> > > pud=0000000000000000
> > > Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP Skip md ftrace
> > > buffer dump for: 0x1609e0 ...
> > > pc : swap_duplicate+0x44/0x164
> > > lr : copy_page_range+0x508/0x1e78
> > > sp : ffffffc0f2a699e0
> > > x29: ffffffc0f2a699e0 x28: ffffff8a5b28d388 x27: ffffff8b06603388
> > > x26: ffffffdf7291fe70 x25: 0000000000000006 x24: 0000000000100073
> > > x23: 00000000002d2d2f x22: 0000000000000008 x21: 0000000000000000
> > > x20: 00000000002d2d2f x19: 18000000002d2d2f x18: ffffffdf726faec0
> > > x17: 0000000000000000 x16: 0010000000000001 x15: 0040000000000001
> > > x14: 0400000000000001 x13: ff7ffffffffffb7f x12: ffeffffffffffbff
> > > x11: ffffff8a5c7e1898 x10: 0000000000000018 x9 : 0000000000000006
> > > x8 : 1800000000000000 x7 : 0000000000000000 x6 : ffffff8057c01f10
> > > x5 : 000000000000a318 x4 : 0000000000000000 x3 : 0000000000000000
> > > x2 : 0000006daf200000 x1 : 0000000000000001 x0 : 18000000002d2d2f Call
> > > trace:
> > > swap_duplicate+0x44/0x164
> > > copy_page_range+0x508/0x1e78
> > > copy_process+0x1278/0x21cc
> > > kernel_clone+0x90/0x438
> > > __arm64_sys_clone+0x5c/0x8c
> > > invoke_syscall+0x58/0x110
> > > do_el0_svc+0x8c/0xe0
> > > el0_svc+0x38/0x9c
> > > el0t_64_sync_handler+0x44/0xec
> > > el0t_64_sync+0x1a8/0x1ac
> > > Code: 9139c35a 71006f3f 54000568 f8797b55 (f9402ea8) ---[ end trace
> > > 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal
> > > exception
> > > SMP: stopping secondary CPUs
> > >
> > > The patch seems to only provide a workaround, but there are no more
> > > effective software solutions to handle the bit flips problem. This
> > > path will change the issue from a system crash to a process exception,
> > > thereby reducing the impact on the entire machine.
> > >
> > > Signed-off-by: gao xu
> >
> > Yeah this smells like a bug. A bit strange though - I have eyeballed the code, and
> > we (should have?) locked the PTE before resolving it into the swap entry format.
> > Which should have been enough to prevent the swap entry from being
> > unmapped and freed up.
> > Which should have been enough to prevent swapoff...?
> >
> > (are you even doing concurrent swapoff?)
> No, the swapoff operation was not executed.
> >
> > Can you provide more context? What kernel version is this, what kind of
> > workload is this, any reproducer, etc.?
> kernel version is linux 6.6, Android15 - linux6.6.30.
>
> The issues encountered by mobile users during usage.
> The system load should not be high, as there is no info related to low
> memory found in the logs.
> The probability of this issue occurring is very low and irregular.
> We cannot reproduce the problem during stress testing in the laboratory.
>
> I found someone reporting a similar issue on the web, see:
> https://lkml.indiana.edu/hypermail/linux/kernel/2406.0/02380.html
> https://forum.proxmox.com/threads/get_swap_device-bad-swap-file-entry.155581/
> https://forums.unraid.net/topic/145497-server-crashes-with-repeated-get_swap_device-bad-swap-file-entry-3ffffffffffff/

It might be a non-swap entry mistakenly passed to swap functions. I remember
fixing a similar issue in the Android Common Kernel 6.6:
https://android.googlesource.com/kernel/common/+/119351fe20bc73b71c6
where a migration entry is mistakenly passed to swap APIs.

In any case, we need to identify and fix the actual bug.

>
> >

Thanks
Barry
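
For illustration only, here is a minimal user-space sketch in C of the two points raised in this thread: a lookup that can return NULL for a type with no registered device, and a caller that checks the result before dereferencing it. Everything named sketch_* is hypothetical and is not kernel code. The 58-bit type shift is an assumption about how type and offset are packed into the entry value on a 64-bit kernel, and the table size of 8 is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the type sits above bit 58, the offset below it. */
#define SKETCH_TYPE_SHIFT  58
/* Arbitrary table size chosen for this sketch. */
#define SKETCH_MAX_DEVICES 8

/* Hypothetical stand-in for a per-device descriptor. */
struct sketch_swap_info {
    uint64_t max; /* number of valid offsets on this device */
};

/* Slots stay NULL until a device is registered. */
static struct sketch_swap_info *sketch_table[SKETCH_MAX_DEVICES];

static unsigned int sketch_type(uint64_t val)
{
    return (unsigned int)(val >> SKETCH_TYPE_SHIFT);
}

static uint64_t sketch_offset(uint64_t val)
{
    return val & ((UINT64_C(1) << SKETCH_TYPE_SHIFT) - 1);
}

/* Returns NULL when the type is out of range or has no device. */
static struct sketch_swap_info *sketch_lookup(uint64_t val)
{
    unsigned int type = sketch_type(val);

    if (type >= SKETCH_MAX_DEVICES)
        return NULL;
    return sketch_table[type];
}

/* Checks the lookup result before touching the descriptor. */
static int sketch_duplicate(uint64_t val)
{
    struct sketch_swap_info *si = sketch_lookup(val);

    if (si == NULL)
        return -1; /* error instead of a NULL dereference */
    if (sketch_offset(val) >= si->max)
        return -1; /* offset beyond the device */
    return 0;
}
```

Taking the x0 value from the oops quoted above purely as sample input, 0x18000000002d2d2f splits into type 6 and offset 0x2d2d2f under this assumed layout. With only slot 0 registered, sketch_duplicate() returns -1 for that value rather than dereferencing NULL. A check like this only contains the failure; it does not explain how such a value got there.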