From: Miaohe Lin <linmiaohe@huawei.com>
To: Linux-MM <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Cc: Huang Ying <ying.huang@intel.com>,
	Matthew Wilcox <willy@infradead.org>, <vbabka@suse.cz>,
	Suren Baghdasaryan <surenb@google.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>, <peterx@redhat.com>,
	NeilBrown <neilb@suse.de>,
	HORIGUCHI NAOYA <naoya.horiguchi@nec.com>,
	Minchan Kim <minchan@kernel.org>,
	David Howells <dhowells@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Alistair Popple <apopple@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"liubo (AW)" <liubo254@huawei.com>,
	Miaohe Lin <linmiaohe@huawei.com>
Subject: [Question]: Decision-making swapoff process to fix problem feasibility
Date: Fri, 13 May 2022 17:40:01 +0800
Message-ID: <8a8d3614-9081-3fff-ebca-011deffc4605@huawei.com>

Background:
	When a swap partition is enabled with the swapon command, the
	kernel allocates and initializes a swap_info_struct for it and
	stores it in the global swap_info array. When the swap partition
	is no longer needed, it is disabled with the swapoff command.

	However, if the disk is removed after swapon, a later swapoff on
	that disk fails, and its swap_info_struct stays in the kernel
	and cannot be cleared.

Example:
	[root@localhost ~]# swapon -s
	[root@localhost ~]# lsblk
	NAME             MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
	sda                8:0    0  1.1T  0 disk
	├─sda1             8:1    0  600M  0 part /boot/efi
	├─sda2             8:2    0    1G  0 part /boot
	└─sda3             8:3    0  1.1T  0 part
	  ├─root 253:0    0   70G  0 lvm  /
	  ├─swap 253:1    0    4G  0 lvm
	  └─home 253:2    0    1T  0 lvm  /home
	nvme0n1          259:0    0  3.6T  0 disk
	└─nvme0n1p1      259:5    0   60G  0 part
	[root@localhost ~]# swapon /dev/nvme0n1p1
	[root@localhost ~]# swapon -s
	Filename                                Type            Size    Used    Priority
	/dev/nvme0n1p1                          partition       62914556        0       -2
	[root@localhost ~]# echo 1 > /sys/bus/pci/devices/0000:d8:00.0/remove
	[root@localhost ~]# lsblk
	NAME             MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
	sda                8:0    0  1.1T  0 disk
	├─sda1             8:1    0  600M  0 part /boot/efi
	├─sda2             8:2    0    1G  0 part /boot
	└─sda3             8:3    0  1.1T  0 part
	  ├─root 253:0    0   70G  0 lvm  /
	  ├─swap 253:1    0    4G  0 lvm
	  └─home 253:2    0    1T  0 lvm  /home
	[root@localhost ~]# swapon -s
	Filename                                Type            Size    Used    Priority
	/dev/nvme0n1p1                          partition       62914556        0       -2
	[root@localhost ~]# swapoff /dev/nvme0n1p1
	swapoff: /dev/nvme0n1p1: swapoff failed: No such file or directory
	[root@localhost ~]# swapoff -a
	[root@localhost ~]# swapon -s
	Filename                                Type            Size    Used    Priority
	/dev/nvme0n1p1                          partition       62914556        0       -2
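
The stale entry can also be spotted from userspace. The following is only a
minimal sketch, not part of the proposal: it parses one data row in the
format shown above (the same layout as /proc/swaps) and checks whether the
listed path still resolves. The struct and the function names
parse_swap_row() and backing_node_missing() are hypothetical and exist only
for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* One data row of /proc/swaps. Field names are ours, not the kernel's. */
struct swap_row {
	char path[256];
	char type[32];
	unsigned long long size_kb;	/* "Size" column, in KiB */
	unsigned long long used_kb;	/* "Used" column, in KiB */
	int prio;			/* "Priority" column */
};

/* Parse one row; returns 0 on success, -1 for the header or a bad row. */
int parse_swap_row(const char *line, struct swap_row *row)
{
	int n = sscanf(line, "%255s %31s %llu %llu %d",
		       row->path, row->type,
		       &row->size_kb, &row->used_kb, &row->prio);

	return n == 5 ? 0 : -1;
}

/* Returns 1 if the path no longer resolves (ENOENT), 0 otherwise. */
int backing_node_missing(const char *path)
{
	struct stat st;

	return stat(path, &st) != 0 && errno == ENOENT;
}
```

A row that parses successfully but whose path is reported missing
corresponds to the situation in the transcript: the swap area is still
listed, yet swapoff can no longer open the device by name.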
	
Reason:
	swapoff looks up the device by pathname, as shown below. Because the
	device has already been unplugged, opening "victim" fails and the
	syscall returns the error immediately (ENOENT in the example above),
	so the stale swap_info_struct is never released.

	sys_swapoff():

	pathname = getname(specialfile);
	if (IS_ERR(pathname))
		return PTR_ERR(pathname);

	victim = file_open_name(pathname, O_RDWR|O_LARGEFILE, 0);
	err = PTR_ERR(victim);
	if (IS_ERR(victim))
		goto out;

Possible Solution:
	To solve this problem, we could traverse swap_avail_heads
	(the available swap partitions) in both the swapoff and swapon
	paths, find any swap_info_struct whose disk partition has been
	unplugged, and release its resources.

	swapoff/swapon process:
	
	...
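	spin_lock(&swap_lock);
	plist_for_each_entry(p, &swap_active_head, list) {
		if (!(p->flags & SWP_WRITEOK))
			continue;

		/* Resolve the path this swap area was enabled with. */
		swap_file = p->swap_file;
		swap_name = d_path(&swap_file->f_path, tmp, PAGE_SIZE);
		if (IS_ERR(swap_name))
			continue;

		/* d_path() marks an unlinked dentry with "(deleted)". */
		if (strstr(swap_name, "deleted")) {
			found = 1;
			break;
		}
	}
	spin_unlock(&swap_lock);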
	...
	/* do the resource release process */
	
	The check for unavailable swap areas is also added to the swapon
	path because swapoff is run by the user, so its timing cannot be
	controlled. The system supports swapon on multiple disks, so an
	unavailable swap area can be cleaned up whenever another one is
	enabled.
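
One detail of the check may need tightening: strstr(swap_name, "deleted")
matches any path that merely contains that word, for example a device
node called /dev/deleted_disk. A minimal sketch of a stricter test follows,
written as plain C so it can be tried in userspace. It assumes the marker
is the " (deleted)" suffix that d_path() appends for unlinked dentries;
path_has_deleted_suffix() is a hypothetical helper name, not an existing
kernel function.

```c
#include <stddef.h>
#include <string.h>

#define DELETED_SUFFIX " (deleted)"

/* Returns 1 only if `path` ends with " (deleted)", 0 otherwise. */
int path_has_deleted_suffix(const char *path)
{
	size_t len = strlen(path);
	size_t slen = sizeof(DELETED_SUFFIX) - 1;

	/* A path consisting of nothing but the suffix is not a match. */
	if (len <= slen)
		return 0;

	return strcmp(path + len - slen, DELETED_SUFFIX) == 0;
}
```

Even this stays ambiguous for a file whose real name ends in " (deleted)".
Inside the kernel it may be cleaner to skip string matching and test the
dentry state directly, for instance with d_unlinked(), but whether that
covers a hot-removed device node is an assumption that would need to be
verified.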

Any suggestions and comments would be greatly appreciated. Thanks a lot in advance! :)

