From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 13 Mar 2024 09:09:59 +0000
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [RFC PATCH v3 3/5] mm: swap: make should_try_to_free_swap() support large-folio
Content-Language: en-GB
To: Chuanhua Han, Barry Song <21cnbao@gmail.com>, akpm@linux-foundation.org,
 linux-mm@kvack.org
Cc: chengming.zhou@linux.dev, chrisl@kernel.org, david@redhat.com,
 hannes@cmpxchg.org, kasong@tencent.com, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, mhocko@suse.com, nphamcs@gmail.com,
 shy828301@gmail.com, steven.price@arm.com,
 surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org,
 xiang@kernel.org, ying.huang@intel.com, yosryahmed@google.com,
 yuzhao@google.com, Barry Song
References: <20240304081348.197341-1-21cnbao@gmail.com>
 <20240304081348.197341-4-21cnbao@gmail.com>
 <24dc6251-8582-790f-bbd3-465deed946f5@oppo.com>
From: Ryan Roberts
In-Reply-To: <24dc6251-8582-790f-bbd3-465deed946f5@oppo.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 13/03/2024 02:21, Chuanhua Han wrote:
> hi, Ryan Roberts
>
> On 2024/3/12 20:34, Ryan Roberts wrote:
>> On 04/03/2024 08:13, Barry Song wrote:
>>> From: Chuanhua Han
>>>
>>> should_try_to_free_swap() works with the assumption that swap-in is
>>> always done at normal page granularity, i.e. folio_nr_pages = 1. To
>>> support large folio swap-in, this patch removes that assumption.
>>>
>>> Signed-off-by: Chuanhua Han
>>> Co-developed-by: Barry Song
>>> Signed-off-by: Barry Song
>>> Acked-by: Chris Li
>>> ---
>>>  mm/memory.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index abd4f33d62c9..e0d34d705e07 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -3837,7 +3837,7 @@ static inline bool should_try_to_free_swap(struct folio *folio,
>>>  	 * reference only in case it's likely that we'll be the exlusive user.
>>>  	 */
>>>  	return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) &&
>>> -		folio_ref_count(folio) == 2;
>>> +		folio_ref_count(folio) == (1 + folio_nr_pages(folio));
>> I don't think this is correct; one reference has just been added to the
>> folio in do_swap_page(), either by getting it from the swapcache
>> (swap_cache_get_folio()) or by allocating it.
>> If it came from the swapcache, it could be a large folio, because we
>> swapped out a large folio and never removed it from the swapcache. But in
>> that case, others may have partially mapped it, so the refcount could
>> legitimately equal the number of pages while still not being exclusively
>> mapped.
>>
>> I'm guessing this logic is trying to estimate when we are likely to be the
>> exclusive user, so that we can remove the folio from the swapcache
>> (releasing a ref) and then reuse it rather than CoW it? The main CoW path
>> currently CoWs page-by-page even for large folios, and with Barry's recent
>> patch, even the last page gets copied. So I'm not sure what this change is
>> really trying to achieve?
>>
> First, if it is a large folio in the swap cache, then its refcount is at
> least folio_nr_pages(folio):

Ahh! Sorry, I had it backwards - I was thinking there would be 1 ref for the
swap cache, and that you were assuming 1 ref per page taken by do_swap_page().
I understand now. On this basis:

Reviewed-by: Ryan Roberts

>
> For example, in the add_to_swap_cache() path:
>
> int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
>                       gfp_t gfp, void **shadowp)
> {
>         struct address_space *address_space = swap_address_space(entry);
>         pgoff_t idx = swp_offset(entry);
>         XA_STATE_ORDER(xas, &address_space->i_pages, idx,
>                        folio_order(folio));
>         unsigned long i, nr = folio_nr_pages(folio);    <---
>         void *old;
>         ...
>         folio_ref_add(folio, nr);                       <---
>         folio_set_swapcache(folio);
>         ...
> }
>
> Then in the do_swap_page() path:
>
>         if (should_try_to_free_swap(folio, vma, vmf->flags))
>                 folio_free_swap(folio);
>
> It also indicates that only a folio in the swap cache will call
> folio_free_swap() to delete it from the swap cache, so I feel this patch is
> necessary!? 😁
>
>>>  }
>>>
>>>  static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
>
> Thanks,
>
> Chuanhua
>
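
To make the accounting agreed above concrete, here is a minimal userspace
sketch (plain C, not kernel code; struct folio_model and the 16-page example
are illustrative assumptions, not real mm/ structures) of why an
otherwise-unmapped large folio coming out of the swap cache is expected to
hold exactly 1 + folio_nr_pages(folio) references: the swap cache contributes
one reference per subpage via folio_ref_add(folio, nr) in add_to_swap_cache(),
and do_swap_page() holds one more.

/*
 * Minimal userspace model of the refcount check discussed above.
 * Not kernel code: struct folio_model and the 16-page example are
 * illustrative assumptions only.
 */
#include <stdbool.h>
#include <stdio.h>

struct folio_model {
	unsigned long nr_pages;   /* stands in for folio_nr_pages() */
	unsigned long ref_count;  /* stands in for folio_ref_count() */
	bool is_ksm;              /* stands in for folio_test_ksm() */
};

/* Mirrors the check in should_try_to_free_swap() after this patch. */
static bool likely_exclusive(const struct folio_model *f, bool write_fault)
{
	return write_fault && !f->is_ksm &&
	       f->ref_count == 1 + f->nr_pages;
}

int main(void)
{
	/* Hypothetical 16-page (64 KiB with 4 KiB pages) large folio. */
	struct folio_model f = { .nr_pages = 16, .ref_count = 0, .is_ksm = false };

	f.ref_count += f.nr_pages; /* add_to_swap_cache(): folio_ref_add(folio, nr) */
	f.ref_count += 1;          /* do_swap_page(): ref held by the fault path */

	printf("likely exclusive: %s\n", likely_exclusive(&f, true) ? "yes" : "no");
	return 0;
}

Compiled with a plain cc, this prints "likely exclusive: yes" for the modelled
folio; add one more reference (e.g. a second mapper) and the check correctly
fails.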