Date: Fri, 30 May 2025 10:50:33 +0800
From: Baoquan He
To: Kemeng Shi
Cc: akpm@linux-foundation.org, kasong@tencent.com, hannes@cmpxchg.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] mm: swap: correctly use maxpages in swapon syscall to avoid potensial deadloop
References: <20250522122554.12209-1-shikemeng@huaweicloud.com>
	<20250522122554.12209-3-shikemeng@huaweicloud.com>
In-Reply-To: <20250522122554.12209-3-shikemeng@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 05/22/25 at 08:25pm, Kemeng Shi wrote:
> We use maxpages from read_swap_header() to initialize swap_info_struct,
> however the maxpages might be reduced in setup_swap_extents() and the
> si->max is assigned with the reduced maxpages from the
> setup_swap_extents().
> Obviously, this could lead to memory waste as we allocated memory based on
> larger maxpages, besides, this could lead to a potensial deadloop as
                                                 ^ typo: s/potensial/potential/
> following:
> 1) When calling setup_clusters() with larger maxpages, unavailable pages
> within range [si->max, larger maxpages) are not accounted with
> inc_cluster_info_page(). As a result, these pages are assumed available
> but can not be allocated. The cluster contains these pages can be moved
> to frag_clusters list after it's all available pages were allocated.
> 2) When the cluster mentioned in 1) is the only cluster in frag_clusters
> list, cluster_alloc_swap_entry() assume order 0 allocation will never
> failed and will enter a deadloop by keep trying to allocate page from the
> only cluster in frag_clusters which contains no actually available page.
>
> Call setup_swap_extents() to get the final maxpages before swap_info_struct
> initialization to fix the issue.
>
> Fixes: 661383c6111a3 ("mm: swap: relaim the cached parts that got scanned")
> Signed-off-by: Kemeng Shi
> ---
>  mm/swapfile.c | 47 ++++++++++++++++++++---------------------------
>  1 file changed, 20 insertions(+), 27 deletions(-)

Reviewed-by: Baoquan He

>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 75b69213c2e7..a82f4ebefca3 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3141,43 +3141,30 @@ static unsigned long read_swap_header(struct swap_info_struct *si,
>  	return maxpages;
>  }
>
> -static int setup_swap_map_and_extents(struct swap_info_struct *si,
> -					union swap_header *swap_header,
> -					unsigned char *swap_map,
> -					unsigned long maxpages,
> -					sector_t *span)
> +static int setup_swap_map(struct swap_info_struct *si,
> +			  union swap_header *swap_header,
> +			  unsigned char *swap_map,
> +			  unsigned long maxpages)
>  {
> -	unsigned int nr_good_pages;
>  	unsigned long i;
> -	int nr_extents;
> -
> -	nr_good_pages = maxpages - 1;	/* omit header page */
>
> +	swap_map[0] = SWAP_MAP_BAD; /* omit header page */
>  	for (i = 0; i < swap_header->info.nr_badpages; i++) {
>  		unsigned int page_nr = swap_header->info.badpages[i];
>  		if (page_nr == 0 || page_nr > swap_header->info.last_page)
>  			return -EINVAL;
>  		if (page_nr < maxpages) {
>  			swap_map[page_nr] = SWAP_MAP_BAD;
> -			nr_good_pages--;
> +			si->pages--;
>  		}
>  	}
>
> -	if (nr_good_pages) {
> -		swap_map[0] = SWAP_MAP_BAD;
> -		si->max = maxpages;
> -		si->pages = nr_good_pages;
> -		nr_extents = setup_swap_extents(si, span);
> -		if (nr_extents < 0)
> -			return nr_extents;
> -		nr_good_pages = si->pages;
> -	}
> -	if (!nr_good_pages) {
> +	if (!si->pages) {
>  		pr_warn("Empty swap-file\n");
>  		return -EINVAL;
>  	}
>
> -	return nr_extents;
> +	return 0;
>  }
>
>  #define SWAP_CLUSTER_INFO_COLS \
> @@ -3217,7 +3204,7 @@ static struct swap_cluster_info *setup_clusters(struct swap_info_struct *si,
>  	 * Mark unusable pages as unavailable. The clusters aren't
>  	 * marked free yet, so no list operations are involved yet.
>  	 *
> -	 * See setup_swap_map_and_extents(): header page, bad pages,
> +	 * See setup_swap_map(): header page, bad pages,
>  	 * and the EOF part of the last cluster.
>  	 */
>  	inc_cluster_info_page(si, cluster_info, 0);
> @@ -3354,6 +3341,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>  		goto bad_swap_unlock_inode;
>  	}
>
> +	si->max = maxpages;
> +	si->pages = maxpages - 1;
> +	nr_extents = setup_swap_extents(si, &span);
> +	if (nr_extents < 0) {
> +		error = nr_extents;
> +		goto bad_swap_unlock_inode;
> +	}
> +	maxpages = si->max;
> +
>  	/* OK, set up the swap map and apply the bad block list */
>  	swap_map = vzalloc(maxpages);
>  	if (!swap_map) {
> @@ -3365,12 +3361,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>  	if (error)
>  		goto bad_swap_unlock_inode;
>
> -	nr_extents = setup_swap_map_and_extents(si, swap_header, swap_map,
> -						maxpages, &span);
> -	if (unlikely(nr_extents < 0)) {
> -		error = nr_extents;
> +	error = setup_swap_map(si, swap_header, swap_map, maxpages);
> +	if (error)
>  		goto bad_swap_unlock_inode;
> -	}
>
>  	/*
>  	 * Use kvmalloc_array instead of bitmap_zalloc as the allocation order might
> --
> 2.30.0
>
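
For illustration only, the ordering problem described in the commit message
can be sketched as a small userspace toy model. This is not kernel code:
struct toy_si, toy_setup_extents(), toy_phantom_slots(), toy_old_order(),
toy_new_order() and TOY_MAP_BAD are all hypothetical names made up for this
sketch, and the "extent setup" is reduced to clamping max to an assumed
backing-store size. It only shows why sizing the map before the final
maxpages is known leaves slots in [si->max, larger maxpages) that look free
but can never be allocated.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAP_BAD 0x3f	/* illustrative marker, stands in for SWAP_MAP_BAD */

/* Hypothetical, heavily reduced stand-in for swap_info_struct. */
struct toy_si {
	unsigned long max;	/* number of page slots really backed */
	unsigned long pages;	/* usable pages, header page excluded */
};

/*
 * Hypothetical stand-in for setup_swap_extents(): if the backing store is
 * smaller than the header claimed, shrink si->max and si->pages.
 */
static void toy_setup_extents(struct toy_si *si, unsigned long backing_pages)
{
	assert(backing_pages >= 1);
	if (backing_pages < si->max) {
		si->max = backing_pages;
		si->pages = si->max - 1;
	}
}

/* Count map slots at or beyond si->max that still look free (zero). */
static unsigned long toy_phantom_slots(const unsigned char *map,
				       unsigned long map_len,
				       const struct toy_si *si)
{
	unsigned long i, n = 0;

	for (i = si->max; i < map_len; i++)
		if (map[i] == 0)
			n++;
	return n;
}

/* Old ordering: size the map from the header value, shrink afterwards. */
static unsigned long toy_old_order(unsigned long header_maxpages,
				   unsigned long backing_pages)
{
	struct toy_si si = { .max = header_maxpages,
			     .pages = header_maxpages - 1 };
	unsigned char *map = calloc(header_maxpages, 1);
	unsigned long phantom;

	if (!map)
		abort();
	map[0] = TOY_MAP_BAD;			/* header page */
	toy_setup_extents(&si, backing_pages);	/* may shrink si.max too late */
	phantom = toy_phantom_slots(map, header_maxpages, &si);
	free(map);
	return phantom;
}

/* New ordering: learn the final max first, then size the map from it. */
static unsigned long toy_new_order(unsigned long header_maxpages,
				   unsigned long backing_pages)
{
	struct toy_si si = { .max = header_maxpages,
			     .pages = header_maxpages - 1 };
	unsigned long maxpages, phantom;
	unsigned char *map;

	toy_setup_extents(&si, backing_pages);
	maxpages = si.max;			/* final value */
	map = calloc(maxpages, 1);
	if (!map)
		abort();
	map[0] = TOY_MAP_BAD;			/* header page */
	phantom = toy_phantom_slots(map, maxpages, &si);
	free(map);
	return phantom;
}
```

With a header claiming 1024 pages and an assumed backing store of 1000
pages, toy_old_order(1024, 1000) counts 24 slots that look free but lie
beyond si.max, while toy_new_order(1024, 1000) counts none.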