Date: Sat, 19 Jul 2025 17:34:55 -0700
From: Andrew Morton
To: Kemeng Shi
Cc: kasong@tencent.com, nphamcs@gmail.com, bhe@redhat.com, baohua@kernel.org, chrisl@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: swap: correctly use maxpages in swapon syscall to avoid potential deadloop
Message-Id: <20250719173455.35f8082916a76a416764d32d@linux-foundation.org>
In-Reply-To: <20250718065139.61989-1-shikemeng@huaweicloud.com>
References: <20250718065139.61989-1-shikemeng@huaweicloud.com>
X-Mailer: Sylpheed 3.8.0beta1 (GTK+ 2.24.33; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Fri, 18 Jul 2025 14:51:39 +0800 Kemeng Shi wrote:

> We use maxpages from read_swap_header() to initialize swap_info_struct.
> However, maxpages might be reduced in setup_swap_extents(), and si->max
> is assigned the reduced maxpages from setup_swap_extents().
>
> Obviously, this could lead to memory waste, as we allocate memory based
> on the larger maxpages. It could also lead to a potential deadloop as
> follows:
>
> 1) When calling setup_clusters() with the larger maxpages, unavailable
> pages within the range [si->max, larger maxpages) are not accounted for
> with inc_cluster_info_page(). As a result, these pages are assumed to
> be available but cannot be allocated. The cluster containing these
> pages can be moved to the frag_clusters list after all of its available
> pages have been allocated.
>
> 2) When the cluster mentioned in 1) is the only cluster in the
> frag_clusters list, cluster_alloc_swap_entry() assumes that an order 0
> allocation will never fail and enters a deadloop, repeatedly trying to
> allocate a page from the only cluster in frag_clusters, which contains
> no actually available pages.
>
> Call setup_swap_extents() to get the final maxpages before
> swap_info_struct initialization to fix the issue.
>
> After this change, span will include badblocks and will become a larger
> value, which I think is the correct value.
> In summary, there are two kinds of swapfile_activate operations:
> 1. Filesystem style: Treat all blocks as logically contiguous and find
> usable physical extents in the logical range. In this way, si->pages
> will be the actual usable physical blocks and span will be "1 +
> highest_block - lowest_block".
> 2. Block device style: Treat all blocks as physically contiguous, with
> only one single extent added. In this way, si->pages will be
> si->max and span will be "si->pages - 1".
> Actually, si->pages and si->max are only used in block device style
> and the span value is set from si->pages. As a result, the span value in
> block device style will become a larger value, as you mentioned.
>
> I think the larger value is correct based on:
> 1. The span value in filesystem style is "1 + highest_block -
> lowest_block", which is the range covering all possible physical blocks
> including the badblocks.
> 2. For block device style, si->pages is the actual usable block
> number and is already in pr_info. The original span value before
> this patch also refers to the usable block number, which is redundant
> in pr_info.
>
> Link: https://lkml.kernel.org/r/20250522122554.12209-3-shikemeng@huaweicloud.com
> Fixes: 661383c6111a ("mm: swap: relaim the cached parts that got scanned")
> Signed-off-by: Kemeng Shi
> Reviewed-by: Baoquan He
> ---
> v1->v2:
> -Fix typo
> -Add description of behavior change of "span" in git log

I queued this change:

> -Ensure si->pages == si->max - 1 after setup_swap_extents()

as a -fix against the v1 patch and updated the base patch's changelog,
thanks.

--- a/mm/swapfile.c~mm-swap-correctly-use-maxpages-in-swapon-syscall-to-avoid-potensial-deadloop-fix
+++ a/mm/swapfile.c
@@ -3357,6 +3357,12 @@ SYSCALL_DEFINE2(swapon, const char __use
 		error = nr_extents;
 		goto bad_swap_unlock_inode;
 	}
+	if (si->pages != si->max - 1) {
+		pr_err("swap:%u != (max:%u - 1)\n", si->pages, si->max);
+		error = -EINVAL;
+		goto bad_swap_unlock_inode;
+	}
+
 	maxpages = si->max;
 
 	/* OK, set up the swap map and apply the bad block list */
_
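
For illustration only, the check above and the accounting gap described in the changelog can be modelled in user space. This is a sketch under stated assumptions, not kernel code: struct swap_geometry, check_swap_geometry() and phantom_slots() are hypothetical names made up for this example, and the model assumes (as the check implies) that exactly one slot below si->max is never counted as usable.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the two swap_info_struct fields that the
 * added check reads. The real structure has many more members.
 */
struct swap_geometry {
	unsigned int pages;	/* usable pages */
	unsigned int max;	/* upper bound on page offsets */
};

/*
 * Models the added check: returns 0 when pages == max - 1,
 * -EINVAL otherwise (the kernel code jumps to bad_swap_unlock_inode).
 */
static int check_swap_geometry(const struct swap_geometry *si)
{
	if (si->pages != si->max - 1) {
		fprintf(stderr, "swap:%u != (max:%u - 1)\n",
			si->pages, si->max);
		return -EINVAL;
	}
	return 0;
}

/*
 * Number of slots in [si_max, header_maxpages): per the changelog,
 * these were assumed available but could never be allocated when the
 * clusters were set up with the larger, pre-reduction maxpages.
 */
static unsigned int phantom_slots(unsigned int si_max,
				  unsigned int header_maxpages)
{
	return header_maxpages > si_max ? header_maxpages - si_max : 0;
}
```

With the fix, setup_swap_extents() runs first, so the value passed on as maxpages equals si->max and phantom_slots() in this model is always 0.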