From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45371CF9C5B for ; Tue, 24 Sep 2024 03:25:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B42B36B0082; Mon, 23 Sep 2024 23:25:25 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AF1B26B0083; Mon, 23 Sep 2024 23:25:25 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9DFA86B0085; Mon, 23 Sep 2024 23:25:25 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 807986B0082 for ; Mon, 23 Sep 2024 23:25:25 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 27D21C179A for ; Tue, 24 Sep 2024 03:25:25 +0000 (UTC) X-FDA: 82598191410.19.FBC3E35 Received: from out30-124.freemail.mail.aliyun.com (out30-124.freemail.mail.aliyun.com [115.124.30.124]) by imf22.hostedemail.com (Postfix) with ESMTP id AE2DBC000B for ; Tue, 24 Sep 2024 03:25:19 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux.alibaba.com header.s=default header.b=CZiTPjcx; spf=pass (imf22.hostedemail.com: domain of baolin.wang@linux.alibaba.com designates 115.124.30.124 as permitted sender) smtp.mailfrom=baolin.wang@linux.alibaba.com; dmarc=pass (policy=none) header.from=linux.alibaba.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727148263; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=gy4Xw5HcATj18wlOUR8GQnhv7MlnSmGkVGDSvlC2JMo=; b=S90cwH6Ffk66YSy+Wf8L8rBYJdTk5AZqikIMwitUQwFmpEO+BzdRKqr6Rvdvx0aVMRxIl3 FBqpwyrYYZExXN4+l7Qx8ABhJNxiKuR5+2TklguGhgvm011q9032IBXwX1Rshi3XkP5oWf FjE8oiz77aU907oK75RTk99KdLEYg5Y= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=linux.alibaba.com header.s=default header.b=CZiTPjcx; spf=pass (imf22.hostedemail.com: domain of baolin.wang@linux.alibaba.com designates 115.124.30.124 as permitted sender) smtp.mailfrom=baolin.wang@linux.alibaba.com; dmarc=pass (policy=none) header.from=linux.alibaba.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727148263; a=rsa-sha256; cv=none; b=SSLYTNxyFzdzMJBjPDUBuROkcgE4o2ZUK3cGyJQsGaa3GcBYP/2kV+bDFiAgdWzdbLz3xA wqVLPt5AjRT+k3rTsDtywg2zPrlZeBWQr3YXdDVlmVhCRt0sB+EPMS66xekDqRhsODRBfq YPc6YXhn0KZT3rbKy+7KDR81hNfyleg= DKIM-Signature:v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1727148310; h=Message-ID:Date:MIME-Version:Subject:To:From:Content-Type; bh=gy4Xw5HcATj18wlOUR8GQnhv7MlnSmGkVGDSvlC2JMo=; b=CZiTPjcx3w9wzDTUCOhEtOB6ETtqAyq1EJQtxqZVlwOgpQJEyK+zbOAu4gsbCPqSz67sBK2XWqz0WOrcFlwmRFhydcCJVHLMqGcio4wDkrHv8rLXrcBFiaAFaK6HnK9hq+R9Sl4VIoQvIHJKVzrzlqQrfva75NBQDCSvBvMr/qc= Received: from 30.74.144.117(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0WFeKxuu_1727148308) by smtp.aliyun-inc.com; Tue, 24 Sep 2024 11:25:09 +0800 Message-ID: <9a110f20-42ad-468b-96c6-683e162452a9@linux.alibaba.com> Date: Tue, 24 Sep 2024 11:25:08 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH 0/2] remove SWAP_MAP_SHMEM To: Yosry Ahmed Cc: Nhat Pham , akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com, shakeel.butt@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com, chrisl@kernel.org, david@redhat.com, kasong@tencent.com, willy@infradead.org, viro@zeniv.linux.org.uk, baohua@kernel.org, chengming.zhou@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org References: <20240923231142.4155415-1-nphamcs@gmail.com> <4d38c65d-760c-43a5-bb47-8e0235c13a51@linux.alibaba.com> From: Baolin Wang In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam03 X-Rspam-User: X-Rspamd-Queue-Id: AE2DBC000B X-Stat-Signature: btt5yx1oft5ethmp3taapxfz78hg8yks X-HE-Tag: 1727148319-721248 X-HE-Meta: U2FsdGVkX19XecCUkv/5Po5ww6xmD6Uhm1JRRT4hG6BTyZ3+Yaf8WSq5eTnti32LbKE7oWRvvAX0kaBefVbYo9P2WbD2z4Ow7305QWjOw2fXHBz34cmkeXkt8/jJWHVGxB27TUs3qD1IxT2IL95RV+HeYE9VfmQj0aKLNkVwkVA0CD9z/QTW/ukeTCz9fs/goyEe/mayMa7uRuFMVnQucV1vvMKlpiwo6d85IKHkrls86CdxeOP76lOT6ysOT6ZqkpR5Rv/ng+nKdIa1uae7bFsfUgcwqfoK1Km4H5p0PNtfVvXShjIGy/QpDbGvEoscR1Ru4EAQrFcPMewC60iSzupqLpzuCWWgdw2FXsSyXuyfW9w6d913jyYMmHMuB1XhQhGJ8HI90k7AFhb4jRUc9Qagx0KWnB605gaxgGF/OEicKWRIXe5sFZCB1WkSyvWXjwq/w31o9VpElJ/YhhhQgePIScMt0hpvAXC2f4IHyQijrPsY9TMQtCb2cpOehUNptv70hyH233yC4PF6L0oTC0iFJp222CHVpTfo4RV4NAptpTIynjwCP5MQx+8FTGr5eI6fkcvjihIevAnkMxcQ0ITKcEodCtFOrD59E3pT/5wppREG6iAXktUTSwJqqCxWMlDlxX3ojkamzMiovVyeuF5rJ05qyCEZOOpFuRf8PibIdfr/qY+lw5O8S2FtkP7lMrHRikTbZtUj8R0pgIs3UD9wvhwoQWkFFLFh9olgU0UXHUuSy4YN2nylKhqJYkSxn1U1N9upeO2XF+4Kz+PinsRyVujdROKVrWyv53sL7ElOYeJM0V5wk4mZs37GF8OqfyycEH7q7CD/rNMWLjrZFAemba3dAB+yiJidan6+hP/Rwh1j24uQ5xtEQtF79xLnr4t+Fin49YYuDnbge6s3OHPRUjMLHGEb6/+ETJCQftba484GEyUxsujSZW43Tuvfr9Ip5hIGk+p6XKkqIla rZGpQxiA PJwDNatF0F5ixpRtdqKSGa1GCj98c+GmZvaww/QmiQOi7+dyEaW/DGGU3+XN+lRSPl6/M+sw5MZAfjo5fny1vMFeafOzYblFBSkKMCV5xjJ1u4kWLP+klxQ4q8peRr8DlH9MXtrylBAD+LW602C/eU6WMt7fjNoeb/UA4MaXXc/qiaYCJg/IUBZmRbmVcuMVpu+7g6wRq+zzm17Yj5e23aZn3vmjxN848TS9Dd4H3lQHZK7pnCUoGYT5X7zRT+rl9wx+fLrbWjwScMENJwBLC2IEMWwyKNGgOHihDMFzz5ESVB2emPncRoJQrfEMq/2N+nB5581YBRbYjW9XCYl9sGGKiKFM5WyRohdfU X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2024/9/24 10:15, Yosry Ahmed wrote: > On Mon, Sep 23, 2024 at 6:55 PM Baolin Wang > wrote: >> >> >> >> On 2024/9/24 07:11, Nhat Pham wrote: >>> The SWAP_MAP_SHMEM state was originally introduced in the commit >>> aaa468653b4a ("swap_info: note SWAP_MAP_SHMEM"), to quickly determine if a >>> swap entry belongs to shmem during swapoff. >>> >>> However, swapoff has since been rewritten drastically in the commit >>> b56a2d8af914 ("mm: rid swapoff of quadratic complexity"). Now >>> having swap count == SWAP_MAP_SHMEM value is basically the same as having >>> swap count == 1, and swap_shmem_alloc() behaves analogously to >>> swap_duplicate() >>> >>> This RFC proposes the removal of this state and the associated helper to >>> simplify the state machine (both mentally and code-wise). We will also >>> have an extra state/special value that can be repurposed (for swap entries >>> that never gets re-duplicated). >>> >>> Another motivation (albeit a bit premature at the moment) is the new swap >>> abstraction I am currently working on, that would allow for swap/zswap >>> decoupling, swapoff optimization, etc. The fewer states and swap API >>> functions there are, the simpler the conversion will be. >>> >>> I am sending this series first as an RFC, just in case I missed something >>> or misunderstood this state, or if someone has a swap optimization in mind >>> for shmem that would require this special state. >> >> The idea makes sense to me. I did a quick test with shmem mTHP, and >> encountered the following warning which is triggered by >> 'VM_WARN_ON(usage == 1 && nr > 1)' in __swap_duplicate(). > > Apparently __swap_duplicate() does not currently handle increasing the > swap count for multiple swap entries by 1 (i.e. usage == 1) because it > does not handle rolling back count increases when > swap_count_continued() fails. > > I guess this voids my Reviewed-by until we sort this out. Technically > swap_count_continued() won't ever be called for shmem because we only > ever increment the count by 1, but there is no way to know this in > __swap_duplicate() without SWAP_HAS_SHMEM. Agreed. An easy solution might be to add a new boolean parameter to indicate whether the SHMEM swap entry count is increasing? diff --git a/mm/swapfile.c b/mm/swapfile.c index cebc244ee60f..21f1eec2c30a 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3607,7 +3607,7 @@ void si_swapinfo(struct sysinfo *val) * - swap-cache reference is requested but the entry is not used. -> ENOENT * - swap-mapped reference requested but needs continued swap count. -> ENOMEM */ -static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) +static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr, bool shmem) { struct swap_info_struct *si; struct swap_cluster_info *ci; @@ -3620,7 +3620,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) offset = swp_offset(entry); VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER); - VM_WARN_ON(usage == 1 && nr > 1); + VM_WARN_ON(usage == 1 && nr > 1 && !shmem); ci = lock_cluster_or_swap_info(si, offset); err = 0; @@ -3661,7 +3661,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) has_cache = SWAP_HAS_CACHE; else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) count += usage; - else if (swap_count_continued(si, offset + i, count)) + else if (!shmem && swap_count_continued(si, offset + i, count)) count = COUNT_CONTINUED; else { /*