Message-ID: <2fe5ce7e-9c5c-4df4-b4fc-9fd3d9b2dccb@arm.com>
Date: Fri, 3 Nov 2023 13:57:49 +0000
Subject: Re: [PATCH v3 4/4] mm: swap: Swap-out small-sized THP without splitting
From: Steven Price <steven.price@arm.com>
To: Ryan Roberts, Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, david@redhat.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, mhocko@suse.com, shy828301@gmail.com,
 wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org,
 ying.huang@intel.com, yuzhao@google.com
In-Reply-To: <6641a14b-e3fb-4e9e-bb95-b0306827294b@arm.com>
References: <73aad98e-de4c-4021-af3c-db67e06cdb70@arm.com>
 <20231102223643.7733-1-v-songbaohua@oppo.com>
 <6641a14b-e3fb-4e9e-bb95-b0306827294b@arm.com>
On 03/11/2023 11:31, Ryan Roberts wrote:
> On 02/11/2023 22:36, Barry Song wrote:
>>> But, yes, it would be nice to fix that! And if I've understood the problem
>>> correctly, it doesn't sound like it should be too hard? Is this something you
>>> are volunteering for?? :)
>>
>> Unfortunately I don't currently have real hardware with MTE that can run the
>> latest kernel, but I have written an RFC; it would be nice to get someone to
>> test it. Let me figure out if we can find someone :-)
>
> OK, let me know if you find someone. Otherwise I can have a hunt around to see
> if I can test it.
>
>> [RFC PATCH] arm64: mm: swap: save and restore mte tags for large folios
>>
>> This patch makes MTE tag saving and restoring support large folios, so
>> we no longer need to split them into base pages for swapping on ARM64
>> SoCs with MTE.
>>
>> ---
>>  arch/arm64/include/asm/pgtable.h | 21 ++++-----------------
>>  arch/arm64/mm/mteswap.c          | 20 ++++++++++++++++++++
>>  2 files changed, 24 insertions(+), 17 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 7f7d9b1df4e5..b12783dca00a 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -45,12 +45,6 @@
>>  	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
>>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>  
>> -static inline bool arch_thp_swp_supported(void)
>> -{
>> -	return !system_supports_mte();
>> -}
>> -#define arch_thp_swp_supported arch_thp_swp_supported
>
> IIRC, arm64 was the only arch implementing this, so perhaps it should be ripped
> out from the core code now?
>
>> -
>>  /*
>>   * Outside of a few very special situations (e.g. hibernation), we always
>>   * use broadcast TLB invalidation instructions, therefore a spurious page
>> @@ -1028,12 +1022,8 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
>>  #ifdef CONFIG_ARM64_MTE
>>  
>>  #define __HAVE_ARCH_PREPARE_TO_SWAP
>> -static inline int arch_prepare_to_swap(struct page *page)
>> -{
>> -	if (system_supports_mte())
>> -		return mte_save_tags(page);
>> -	return 0;
>> -}
>> +#define arch_prepare_to_swap arch_prepare_to_swap
>> +extern int arch_prepare_to_swap(struct page *page);
>
> I think it would be better to modify this API to take a folio explicitly. The
> caller already has the folio.
>
>> #define __HAVE_ARCH_SWAP_INVALIDATE
>> static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
>> @@ -1049,11 +1039,8 @@ static inline void arch_swap_invalidate_area(int type)
>>  }
>>  
>>  #define __HAVE_ARCH_SWAP_RESTORE
>> -static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>> -{
>> -	if (system_supports_mte())
>> -		mte_restore_tags(entry, &folio->page);
>> -}
>> +#define arch_swap_restore arch_swap_restore
>> +extern void arch_swap_restore(swp_entry_t entry, struct folio *folio);
>>  
>>  #endif /* CONFIG_ARM64_MTE */
>>  
>> diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
>> index a31833e3ddc5..e5637e931e4f 100644
>> --- a/arch/arm64/mm/mteswap.c
>> +++ b/arch/arm64/mm/mteswap.c
>> @@ -83,3 +83,23 @@ void mte_invalidate_tags_area(int type)
>>  	}
>>  	xa_unlock(&mte_pages);
>>  }
>> +
>> +int arch_prepare_to_swap(struct page *page)
>> +{
>> +	if (system_supports_mte()) {
>> +		struct folio *folio = page_folio(page);
>> +		long i, nr = folio_nr_pages(folio);
>> +		for (i = 0; i < nr; i++)
>> +			return mte_save_tags(folio_page(folio, i));
>
> This will return after saving the first page of the folio! You will need to add
> each page in a loop, and if you get an error at any point, you will need to
> remove the pages that you already added successfully, by calling
> arch_swap_invalidate_page() as far as I can see. Steven, can you confirm?

Yes, that's right. mte_save_tags() needs to allocate memory, so it can fail;
if it does fail, arch_prepare_to_swap() would need to put things back how they
were with calls to mte_invalidate_tags() (although I think you'd actually want
to refactor that to create a function which takes a struct page *).
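To illustrate the shape being asked for, here is a minimal standalone sketch of
the save-with-rollback loop. The names save_tags()/invalidate_tags() are
stand-ins for mte_save_tags()/mte_invalidate_tags(), and pages are reduced to
plain indices; this is an assumption-laden simulation of the pattern, not the
real arm64 code:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PAGES 4

/* Tracks which pages currently have tags "saved" in the stub backend. */
static bool saved[NR_PAGES];
/* Page index at which save_tags() is forced to fail; -1 means never. */
static int fail_at = -1;

/* Stand-in for mte_save_tags(): can fail (it allocates memory in the kernel). */
static int save_tags(int page)
{
	if (page == fail_at)
		return -1;
	saved[page] = true;
	return 0;
}

/* Stand-in for mte_invalidate_tags(): undoes a successful save. */
static void invalidate_tags(int page)
{
	saved[page] = false;
}

/*
 * Save tags for every page of an nr-page folio. On failure, roll back
 * the pages already saved so no partial state is left behind.
 */
static int prepare_to_swap(int nr)
{
	int i, j, ret;

	for (i = 0; i < nr; i++) {
		ret = save_tags(i);
		if (ret) {
			for (j = 0; j < i; j++)
				invalidate_tags(j);
			return ret;
		}
	}
	return 0;
}
```

In the real function the loop would walk folio_page(folio, i), and the rollback
would invalidate the tags of only the pages already saved successfully.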
Steve

>> +	}
>> +	return 0;
>> +}
>> +
>> +void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>> +{
>> +	if (system_supports_mte()) {
>> +		long i, nr = folio_nr_pages(folio);
>> +		for (i = 0; i < nr; i++)
>> +			mte_restore_tags(entry, folio_page(folio, i));
>
> swap-in currently doesn't support large folios - everything is a single-page
> folio. So this isn't technically needed. But from the API POV, it seems
> reasonable to make this change - except your implementation is broken. You are
> currently setting every page in the folio to use the same tags as the first
> page. You need to increment the swap entry for each page.
>
> Thanks,
> Ryan
>
>
>> +	}
>> +}
>
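For the restore side, the per-page swap-entry increment Ryan describes can be
sketched the same way. Here swap entries are modeled as plain long offsets and
restore_tags() stands in for mte_restore_tags(); these names and the flattened
entry type are assumptions for illustration only:

```c
#include <assert.h>

#define MAX_PAGES 8

/* Records which swap offset each page's tags were "restored" from. */
static long restored_from[MAX_PAGES];

/* Stand-in for mte_restore_tags(entry, page); entry modeled as an offset. */
static void restore_tags(long entry, int page)
{
	restored_from[page] = entry;
}

/*
 * Restore tags for each page of an nr-page folio, advancing the swap
 * entry in step with the page index instead of reusing the first
 * entry for every page (which would give all pages the same tags).
 */
static void swap_restore(long first_entry, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		restore_tags(first_entry + i, i);
}
```

In the kernel the increment would act on the entry's swap offset rather than
on a raw long, e.g. rebuilding each page's entry from swp_offset(entry) + i.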