From: "Huang, Ying"
To: xialonglong
Cc: "Wangkefeng (OS Kernel Lab)", chenwandun
Subject: Re: 【BUG】NULL pointer dereference at __lookup_swap_cgroup
References: <25f28e73-5fc6-6e7f-3d41-a5970537fb8b@huawei.com> <87fse3homz.fsf@yhuang6-desk2.ccr.corp.intel.com> <2e132246-96be-a281-78f4-8310f75a0ed8@huawei.com>
Date: Wed, 30 Nov 2022 10:37:20 +0800
In-Reply-To: <2e132246-96be-a281-78f4-8310f75a0ed8@huawei.com> (xialonglong's message of "Wed, 30 Nov 2022 09:31:56 +0800")
Message-ID: <87y1rtb21r.fsf@yhuang6-desk2.ccr.corp.intel.com>
Sender: owner-linux-mm@kvack.org

xialonglong writes:

> Thank you very much for your reply :)
> Inspired by your reply we successfully reproduced the bug.
>
> The test steps:
> 1. swapon /dev/zram0
> 2. add some memory pressure by stress-ng
> 3. call swapoff /dev/zram0 in the do_swap_page function (this
>    changed the source code)
> 4. the bug occurred in the same place.
>
> After testing, this patch solves the bug.
> Finally, there is a small question. Why did linux 5.10 revert this patch
> (2799e77529c2)?

2799e77529c2 was merged in v5.14.
Best Regards,
Huang, Ying

> We found that to fix this bug, the following patches may be required:
> efa33fc7f6e mm/shmem: fix shmem_swapin() race with swapoff
> 5c046235a826 mm/swap: remove confusing checking for non_swap_entry()
> in swap_ra_info()
> 2799e77529c2 swap: fix do_swap_page() race with swapoff
> 63d8620ecf93 mm/swapfile: use percpu_ref to serialize against
> concurrent swapoff
> It seems that the whole patchset is needed except commit 5c046235a826
> ("mm/swap: remove confusing checking for non_swap_entry() in
> swap_ra_info()")
>
> Best Regards,
> Xia, longlong
>
> On 2022/11/28 9:08, Huang, Ying wrote:
>> Hi,
>>
>> xialonglong writes:
>>
>>> A panic occurred on linux 5.10. We hit it only once, and there seem
>>> to be no special changes between 5.10 and upstream about swap_cgroup.
>>>
>>> The test is based on QEMU with 64GB memory, one 2GB zram device as
>>> swap area.
>>> The test steps:
>>> 1. swapoff -a
>>> 2. add some memory pressure by stress-ng
>>> 3. while (2 minutes) {
>>>        swapoff /dev/zram0
>>>        swapon /dev/zram0
>>>        sleep 3
>>>    }
>>> 4. swapon -a
>>>
>>> Preliminary analysis showed that the swap entry points to a swap area
>>> which has already been swapped off. There are no other obvious clues,
>>> and we are still trying to reproduce it.
>> We have a patch as follows to fix a similar issue,
>>
>> 2799e77529c2a25492a4395db93996e3dacd762d
>> Author:     Miaohe Lin
>> AuthorDate: Mon Jun 28 19:36:50 2021 -0700
>> Commit:     Linus Torvalds
>> CommitDate: Tue Jun 29 10:53:49 2021 -0700
>>
>> swap: fix do_swap_page() race with swapoff
>>
>> When I was investigating the swap code, I found the below possible race
>> window:
>>
>> CPU 1                                         CPU 2
>> -----                                         -----
>> do_swap_page
>>   if (data_race(si->flags & SWP_SYNCHRONOUS_IO)
>>   swap_readpage
>>     if (data_race(sis->flags & SWP_FS_OPS)) {
>>                                               swapoff
>>                                                 ..
>>                                                 p->swap_file = NULL;
>>                                                 ..
>>     struct file *swap_file = sis->swap_file;
>>     struct address_space *mapping = swap_file->f_mapping;[oops!]
>>
>> Note that for the pages that are swapped in through swap cache, this isn't
>> an issue. Because the page is locked, and the swap entry will be marked
>> with SWAP_HAS_CACHE, so swapoff() can not proceed until the page has been
>> unlocked.
>>
>> Fix this race by using get/put_swap_device() to guard against concurrent
>> swapoff.
>>
>> Can you check whether that can fix your issue?
>>
>> Best Regards,
>> Huang, Ying
>>
>>> Any known issue about this feature, or any advice, will be appreciated.
>>>
>>> Here is the panic log:
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 0000000000000740
>>> Mem abort info:
>>>   ESR = 0x96000004
>>>   EC = 0x25: DABT (current EL), IL = 32 bits
>>>   SET = 0, FnV = 0
>>>   EA = 0, S1PTW = 0
>>> Data abort info:
>>>   ISV = 0, ISS = 0x00000004
>>>   CM = 0, WnR = 0
>>> user pgtable: 4k pages, 48-bit VAs, pgdp=000000010ae6e000
>>> pgd=0000000000000000, p4d=0000000000000000
>>> Internal error: Oops: 96000004 [#1] SMP
>>> Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>>> pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
>>> pc : lookup_swap_cgroup_id+0x38/0x50
>>> lr : mem_cgroup_charge+0x9c/0x424
>>> sp : ffff800102f63bc0
>>> x29: ffff800102f63bc0 x28: ffff0000d0d64d00
>>> x27: 0000000000000000 x26: 0000000000000007
>>> x25: ffff0000018c86a8 x24: ffff0000018c8640
>>> x23: 0000000000000cc0 x22: 0000000000000001
>>> x21: 0000000000000001 x20: ffff800102f63d28
>>> x19: fffffe000373cb40 x18: 0000000000000000
>>> x17: 0000000000000000 x16: ffff8001004715a4
>>> x15: 00000000ffffffff x14: 0000000000003000
>>> x13: 00000000ffffffff x12: 0000000000000040
>>> x11: ffff0000c0403478 x10: ffff0000c040347a
>>> x9 : ffff8001003e957c x8 : 000000000009dddd
>>> x7 : 0000000000000600 x6 : 00000000000000e8
>>> x5 : 0000020000200000 x4 : ffff000000000000
>>> x3 : ffff800101f4c030 x2 : 0000000000000000
>>> x1 : 00000000000001e4 x0 : 0000000000000000
>>>
>>> Call trace:
>>> lookup_swap_cgroup_id+0x38/0x50
>>> do_swap_page+0xa64/0xc04
>>> handle_pte_fault+0x1c8/0x214
>>> __handle_mm_fault+0x1b0/0x380
>>> handle_mm_fault+0xf4/0x284
>>> do_page_fault+0x188/0x474
>>> do_translation_fault+0xb8/0xe4
>>> do_mem_abort+0x48/0xb0
>>> el0_da+0x44/0x80
>>> el0_sync_handler+0x88/0xb4
>>> el0_sync+0x160/0x180
>>>
>>> mov   x9, x30
>>> nop
>>> lsr   x2, x0, #58              // SWP_TYPE_SHIFT == 58, x2 = swp_type
>>> adrp  x1, 0xffff800101f4c000
>>> add   x3, x1, #0x30            // x3 == swap_cgroup_ctrl
>>> ubfx  x6, x0, #11, #47
>>> add   x2, x2, x2, lsl #1
>>> ubfiz x1, x0, #1, #11
>>> mov   x5, #0x200000            // #2097152
>>> mov   x4, #0xffff000000000000  // #-281474976710656
>>> movk  x5, #0x200, lsl #32
>>> hint  #0x19
>>> ldr   x0, [x3,x2,lsl #3]       // x3=ffff800101f4c030, x0 = 0
>>> hint  #0x1d
>>> ldr   x0, [x0,x6,lsl #3]       // x0 = 0 + 0xe8 * 8 == 0x740
>>> add   x0, x0, x5
>>> lsr   x0, x0, #6
>>> add   x0, x1, x0, lsl #12
>>> ldrh  w0, [x0,x4]
>>> ret
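
The register annotations in the listing above can be re-checked in userspace. What follows is only a sketch, not kernel code: struct decoded, entry_from_regs() and decode() are hypothetical helpers, and the constants (a type shift of 58, 2048 two-byte swap_cgroup entries per 4K page, 8-byte map pointers) are assumptions read off the disassembly and the "4k pages" line in the log. It rebuilds the swap entry from the dumped x1/x2/x6 values and computes the address that would be loaded if the per-device map pointer were NULL.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants, read off the disassembly and the "4k pages" log line. */
#define SWP_TYPE_SHIFT 58u   /* lsr x2, x0, #58 */
#define SC_PER_PAGE    2048u /* 4096-byte page / 2-byte swap_cgroup entry */
#define MAP_SLOT_SIZE  8u    /* ldr x0, [x0,x6,lsl #3]: 8-byte pointers */

/* Hypothetical helper type: the parts of a swap entry the lookup uses. */
struct decoded {
    uint64_t type;       /* which swap device (swp_type) */
    uint64_t offset;     /* swap offset within that device */
    uint64_t map_index;  /* offset / SC_PER_PAGE, the value held in x6 */
    uint64_t fault_addr; /* address loaded from when the map pointer is NULL */
};

/* Rebuild the swap entry (x0 on function entry) from the dumped registers. */
static uint64_t entry_from_regs(uint64_t x1, uint64_t x2, uint64_t x6)
{
    assert(x2 % 3 == 0);       /* add x2, x2, x2, lsl #1 => x2 = type * 3 */
    uint64_t type = x2 / 3;
    uint64_t low = x1 >> 1;    /* ubfiz x1, x0, #1, #11 => (offset % 2048) << 1 */
    return (type << SWP_TYPE_SHIFT) | (x6 << 11) | low;
}

/* Split an entry the way the disassembled lookup does. */
static struct decoded decode(uint64_t entry)
{
    struct decoded d;
    d.type = entry >> SWP_TYPE_SHIFT;
    d.offset = entry & ((UINT64_C(1) << SWP_TYPE_SHIFT) - 1);
    d.map_index = d.offset / SC_PER_PAGE;
    d.fault_addr = d.map_index * MAP_SLOT_SIZE; /* NULL base + index * 8 */
    return d;
}
```

With the dumped values x1 = 0x1e4, x2 = 0 and x6 = 0xe8, entry_from_regs() gives 0x740f2, and decode() reports type 0, map index 0xe8 and a load from 0x740. That matches the faulting address in the log and is consistent with the reporter's reading that the map pointer for swap type 0 was already NULL when the lookup ran.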