From: Miaohe Lin
Subject: [Question]: Decision-making swapoff process to fix problem feasibility
To: Linux-MM, linux-kernel
CC: Huang Ying, Matthew Wilcox, Suren Baghdasaryan, Stephen Rothwell,
 NeilBrown, HORIGUCHI NAOYA, Minchan Kim, David Howells, David Hildenbrand,
 Alistair Popple, Andrew Morton, "liubo (AW)", Miaohe Lin
Message-ID: <8a8d3614-9081-3fff-ebca-011deffc4605@huawei.com>
Date: Fri, 13 May 2022 17:40:01 +0800

Background:

When a swap partition is enabled with the swapon command, the kernel
creates and initializes a swap_info_struct and saves it in the global
swap_info array. When the swap partition is no longer needed, it is
disabled with the swapoff command. However, if the disk is removed after
swapon, a later swapoff on that disk fails, and its swap_info_struct
stays in the kernel and cannot be freed.

Example:

[root@localhost ~]# swapon -s
[root@localhost ~]# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0  1.1T  0 disk
├─sda1        8:1    0  600M  0 part /boot/efi
├─sda2        8:2    0    1G  0 part /boot
└─sda3        8:3    0  1.1T  0 part
  ├─root    253:0    0   70G  0 lvm  /
  ├─swap    253:1    0    4G  0 lvm
  └─home    253:2    0    1T  0 lvm  /home
nvme0n1     259:0    0  3.6T  0 disk
└─nvme0n1p1 259:5    0   60G  0 part
[root@localhost ~]# swapon /dev/nvme0n1p1
[root@localhost ~]# swapon -s
Filename        Type       Size      Used  Priority
/dev/nvme0n1p1  partition  62914556  0     -2
[root@localhost ~]# echo 1 > /sys/bus/pci/devices/0000:d8:00.0/remove
[root@localhost ~]# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0  1.1T  0 disk
├─sda1        8:1    0  600M  0 part /boot/efi
├─sda2        8:2    0    1G  0 part /boot
└─sda3        8:3    0  1.1T  0 part
  ├─root    253:0    0   70G  0 lvm  /
  ├─swap    253:1    0    4G  0 lvm
  └─home    253:2    0    1T  0 lvm  /home
[root@localhost ~]# swapon -s
Filename        Type       Size      Used  Priority
/dev/nvme0n1p1  partition  62914556  0     -2
[root@localhost ~]# swapoff /dev/nvme0n1p1
swapoff: /dev/nvme0n1p1: swapoff failed: No such file or directory
[root@localhost ~]# swapoff -a
[root@localhost ~]# swapon -s
Filename        Type       Size      Used  Priority
/dev/nvme0n1p1  partition  62914556  0     -2

Reason:

swapoff looks up the device by opening its path, as shown below. Because
the device has already been unplugged, opening the "victim" file fails and
sys_swapoff returns the error immediately. As a result, the stale
swap_info_struct is never released.

sys_swapoff
	pathname = getname(specialfile);
	if (IS_ERR(pathname))
		return PTR_ERR(pathname);

	victim = file_open_name(pathname, O_RDWR|O_LARGEFILE, 0);
	err = PTR_ERR(victim);
	if (IS_ERR(victim))
		goto out;

Possible Solution:

To solve this, add a traversal of swap_avail_heads (available swap
partitions) to the swapoff and swapon paths, find any swap_info_struct
whose disk partition has been unplugged, and release its resources.

swapoff/swapon process:

	...
	spin_lock(&swap_lock);
	plist_for_each_entry(p, &swap_active_head, list) {
		if (p->flags & SWP_WRITEOK) {
			swap_file = p->swap_file;
			swap_name = d_path(&swap_file->f_path, tmp, PAGE_SIZE);
			/* d_path() can return an ERR_PTR, e.g. -ENAMETOOLONG */
			if (IS_ERR(swap_name))
				continue;
			/* d_path() appends " (deleted)" to unlinked paths */
			if (strstr(swap_name, "deleted")) {
				found = 1;
				break;
			}
		}
	}
	spin_unlock(&swap_lock);
	...
	/* do the resource release process */

The check for unavailable swap areas is also added to the swapon path
because swapoff is only run when the user invokes it, so its timing cannot
be controlled. The system supports swapon on multiple disks, so a stale
swap area can be cleaned up whenever another swap area is enabled.

Any suggestions or comments would be much appreciated.

Thanks a lot in advance! :)
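
P.S. One possible refinement of the strstr() test above, as a rough user-space sketch only: strstr() matches any path that merely contains "deleted" (for example a device node named /dev/deleted_disk), whereas d_path() marks an unlinked path by appending the suffix " (deleted)". The helper name path_is_deleted() and the DELETED_SUFFIX macro are hypothetical, introduced here for illustration, and are not existing kernel APIs. A file whose real name ends in " (deleted)" would still be misdetected, so this narrows the false positives rather than removing them.

```c
#include <stdbool.h>
#include <string.h>

/* Suffix that d_path() appends when the dentry has been unlinked. */
#define DELETED_SUFFIX " (deleted)"

/*
 * Hypothetical helper (not a kernel API): return true only if `path`
 * ends with " (deleted)", so that a live path which merely contains
 * the word "deleted" is not mistaken for a removed device.
 */
static bool path_is_deleted(const char *path)
{
	size_t len = strlen(path);
	size_t slen = strlen(DELETED_SUFFIX);

	/* A path no longer than the suffix has no name in front of it. */
	if (len <= slen)
		return false;

	/* Compare only the tail of the string against the suffix. */
	return strcmp(path + len - slen, DELETED_SUFFIX) == 0;
}
```

In the loop above, such a helper would take the place of the strstr() call, applied to swap_name after the d_path() result has been checked.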