From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E87CC30653 for ; Thu, 4 Jul 2024 13:44:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B30636B0085; Thu, 4 Jul 2024 09:44:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id ADF896B0088; Thu, 4 Jul 2024 09:44:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9804D6B008A; Thu, 4 Jul 2024 09:44:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 728F06B0085 for ; Thu, 4 Jul 2024 09:44:31 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id F31321A1349 for ; Thu, 4 Jul 2024 13:44:30 +0000 (UTC) X-FDA: 82302189900.28.9C0AA12 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf22.hostedemail.com (Postfix) with ESMTP id 4096DC0011 for ; Thu, 4 Jul 2024 13:44:29 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=arm.com (policy=none); spf=pass (imf22.hostedemail.com: domain of cmarinas@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=cmarinas@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1720100650; a=rsa-sha256; cv=none; b=sCznq/8dnTwL5ojuojLzW8R1/yUU+x4sLV/qCMs+bbEBWy6baXx744vLfSMySrKr3q/c/g gWtkCNXM0jhIoAcjig914M/fN3Vp+nU9f4lqWSZ7pY287qKdKn2xwT/a3jOeGjkwSZKM6X LSqK+e5u9IEU8XAcmPwRgzi3leolnY8= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=arm.com (policy=none); spf=pass (imf22.hostedemail.com: domain of cmarinas@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=cmarinas@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1720100650; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=k+3e/q19CvBNQvIAiedCGCsnMwBSPVJEPI0XS+Eoxd8=; b=uPV0OhDPFZXWyJLAIU0s22E18An9OB0W5q73f+jUwsR40IKFNImntaJ0MYdf03XXt4DhOo yZNTntccNFu331L+NR9l/qRY9C6h1pzbCXVeb3z0M+0jKJQ5T7t2pvzZeCaGX9YZ8HFQqp puUTL1V7jQYRGy49FXtIyV5/Y+jjsjg= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 10782626F5; Thu, 4 Jul 2024 13:44:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5024DC3277B; Thu, 4 Jul 2024 13:44:26 +0000 (UTC) Date: Thu, 4 Jul 2024 14:44:23 +0100 From: Catalin Marinas To: Yang Shi Cc: muchun.song@linux.dev, will@kernel.org, akpm@linux-foundation.org, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] hugetlbfs: add MTE support Message-ID: References: <20240625233717.2769975-1-yang@os.amperecomputing.com> <7a4a60af-e471-484b-a4a3-ed31daaca30b@os.amperecomputing.com> <546bf8d4-3680-4af3-8d4d-af2d7c192d04@os.amperecomputing.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <546bf8d4-3680-4af3-8d4d-af2d7c192d04@os.amperecomputing.com> X-Rspamd-Queue-Id: 4096DC0011 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: 7cj5bp5jm5niz11fskwty9d7jw7ccifr X-HE-Tag: 1720100669-723466 X-HE-Meta: U2FsdGVkX1+AFkXAFMWZIz8dCgwmZUEoELRtza1TYsY4aX2KU+ATPQCjFV/aKUUkYwZBNvfzkx6sB2mwf4TlpkKMe3q81sfN6v6QP79EsUOZW4EdAmac0U0oS5gVSkQJlywJLVO3PukAB6RSpOSzhdUPM0Jnxb5q71+kJ5c/3amxTcs6FQ8LD9XqK8b4XXiGYVRZLoJWZk9koH4oCism7IfbY3MNugpkJbPXmNhpSk7wD7KXyy/tS4jAweRkrQqx1bxGV09dA81S8UdN/DR4LriinCr8NosVU3MMm58PD/KxVepEx7JPJgRVE3H5dxTrvfVpmHaIJ285wFYydmgqIQb6EF7WOMvRndOro9KXfAgYzewJJ/YYYb0qFHZZmD8G1Yuz6FZ3u9/EdLdzZTwYdJOZJq/2IGYHaoWLKunduZwFYbrjOWjrbkJmLO3d4zIGroHPbbTlPCB/d65AOdRgWTGEslKpqZShR+JsJpp4hSqPdtmjQCENbZxW/1wO9ST6YeXb8yTSMNPLW7VJV4/PxoCcNdjFOLq0WaEsUk5yNr21OXZHDpIy+0+CCpujjWbGdpHDUdC5gLrHGdtq6kiviclBiTQjzmhLVOYA3dADwN29GaBfJ+07GIzd/FRnMqVVuWb+JFBXFtOvNbJujVAbG+4SXywl5cffDOqRHO3rUIJxRMsArnPuZcx8ya9DK1b0otKP6nuuxj7othb2gvc+hs+3yeWM1Wid0mGA9nIhTlXAAPrHqcjnqBuFjTjj3oyM3qYVMgioreojlFfRYlR3GtSXveB0d0w4lpI0wpFRUBpHzj3h9387D9XFR72R9am1hRZGxKWU0376xDvcFP3tae2QCMkUENl/lUSzlrC34iFNoXLgqpq4TYXpt4VkdgyyzZXVTshjJqaaM0xzZhqZkLN2W7hocTVB9N1A7muQYUg9A1/v7sdFWyK0O/woM5TJHjWwEy5w88nuOwP7ZXy AcIeyeoX VhzaGN+esim993dtZICO31oYGsFJPafkuXP3nUGCHBVs5xYZUfGz6fVh61sMDX0qM7s+ArdJGHnwnOBi0lAAlmC8jqQ0gOsFKXqY+nPdRbL5EqbwlA0YeG0XBMgwINOphkJkEo/187VQ6xulgLwJSe67foQWez+YDNMU43ppZxxJHIGCH0FT5Zm78ycSyQC/hwE5/EfrMmg373so8Xt8QV6i+7a5vzJtiBkvgT0cAshNoRSJXuPqVtWglLZTNndllbzfaYIYHlrkdtifHgG80tr1bruwX5jgotHWSoPrtOz2gK5nCFN4W8+ni1kIctozOK4uqE13idrRozVRmHbNMLJSOcUSOeqyGv8lYb/JETLZ9RpcNk1aU1WmUUw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Jul 02, 2024 at 05:15:12PM -0700, Yang Shi wrote: > On 7/2/24 5:04 PM, Yang Shi wrote: > > On 7/2/24 5:34 AM, Catalin Marinas wrote: > > > Last time I checked, about a year ago, this was not sufficient. One > > > issue is that there's no arch_clear_hugetlb_flags() implemented by your > > > patch, leaving PG_arch_{2,3} set on a page. The other issue was that I > > > initially tried to do this only on the head page but this did not go > > > well with the folio_copy() -> copy_highpage() which expects the > > > PG_arch_* flags on each individual page. The alternative was for > > > arch_clear_hugetlb_flags() to iterate over all the pages in a folio. > > > > Thanks for pointing this out. I did miss this point. I took a quick look > > at when the PG_ flags are set. IIUC, it is set by post_alloc_hook() for > > order-0 anonymous folio (clearing page and tags) and set_ptes() for > > others (just clear tags), for example, THP and hugetlb. > > > > I can see THP does set the PG_mte_tagged flag for each sub pages. But it > > seems it does not do it for hugetlb if I read the code correctly. The > > call path is: > > > > hugetlb_fault() -> > >   hugetlb_no_page-> > >     set_huge_pte_at -> > >       __set_ptes() -> > >         __sync_cache_and_tags() -> > > > > > > The __set_ptes() is called in a loop: > > > > if (!pte_present(pte)) { > >         for (i = 0; i < ncontig; i++, ptep++, addr += pgsize) > >             __set_ptes(mm, addr, ptep, pte, 1); > >         return; > >     } > > > > The ncontig and pgsize are returned by num_contig_ptes(). For example, > > 2M hugetlb, ncontig is 1 and pgsize is 2M IIUC. So it means actually > > just the head page has PG_mte_tagged set. If so the copy_highpage() will > > just copy the old head page's flag to the new head page, and the tag. > > All the sub pages don't have PG_mte_tagged set. > > > > Is it expected behavior? I'm supposed we need tags for every sub pages > > too, right? > > We should need something like the below to have tags and page flag set up > for each sub page: > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c > index 3f09ac73cce3..528164deef27 100644 > --- a/arch/arm64/mm/hugetlbpage.c > +++ b/arch/arm64/mm/hugetlbpage.c > @@ -228,9 +228,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned > long addr, >         int ncontig; >         unsigned long pfn, dpfn; >         pgprot_t hugeprot; > +       unsigned long nr = sz >> PAGE_SHIFT; > >         ncontig = num_contig_ptes(sz, &pgsize); > > +       __sync_cache_and_tags(pte, nr); > + >         if (!pte_present(pte)) { >                 for (i = 0; i < ncontig; i++, ptep++, addr += pgsize) >                         __set_ptes(mm, addr, ptep, pte, 1); We only need this for the last loop when we have a present pte. I'd also avoid calling __set_ptes(...., 1) if we call the __sync_cache_and_tags() here already. Basically just unroll __set_ptes() in set_huge_pte_at() and call __set_pte() directly. It might be better to convert those page flag checks to only happen on the head page. My stashed changes from over a year ago (before we had more folio conversions) below. However, as I mentioned, I got stuck on folio_copy() which also does a cond_resched() between copy_highpage(). ---------8<-------------------------------- diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index f5bcb0dc6267..a84ad0e46f12 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -49,7 +49,9 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte, return; if (try_page_mte_tagging(page)) { - mte_clear_page_tags(page_address(page)); + unsigned long i, nr_pages = compound_nr(page); + for (i = 0; i < nr_pages; i++) + mte_clear_page_tags(page_address(page + i)); set_page_mte_tagged(page); } } @@ -57,22 +59,23 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte, void mte_sync_tags(pte_t old_pte, pte_t pte) { struct page *page = pte_page(pte); - long i, nr_pages = compound_nr(page); - bool check_swap = nr_pages == 1; + bool check_swap = !PageCompound(page); bool pte_is_tagged = pte_tagged(pte); /* Early out if there's nothing to do */ if (!check_swap && !pte_is_tagged) return; + /* + * HugeTLB pages are always fully mapped, so only setting head page's + * PG_mte_* flags is enough. + */ + if (PageHuge(page)) + page = compound_head(page); + /* if PG_mte_tagged is set, tags have already been initialised */ - for (i = 0; i < nr_pages; i++, page++) { - if (!page_mte_tagged(page)) { - mte_sync_page_tags(page, old_pte, check_swap, - pte_is_tagged); - set_page_mte_tagged(page); - } - } + if (!page_mte_tagged(page)) + mte_sync_page_tags(page, old_pte, check_swap, pte_is_tagged); /* ensure the tags are visible before the PTE is set */ smp_wmb(); diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 5626ddb540ce..692198d8c0d2 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -1079,7 +1079,7 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, /* uaccess failed, don't leave stale tags */ if (num_tags != MTE_GRANULES_PER_PAGE) - mte_clear_page_tags(page); + mte_clear_page_tags(page_address(page)); set_page_mte_tagged(page); kvm_release_pfn_dirty(pfn); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 31d7fa4c7c14..b531b1d75cda 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1173,11 +1173,10 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, if (!kvm_has_mte(kvm)) return; - for (i = 0; i < nr_pages; i++, page++) { - if (try_page_mte_tagging(page)) { - mte_clear_page_tags(page_address(page)); - set_page_mte_tagged(page); - } + if (try_page_mte_tagging(page)) { + for (i = 0; i < nr_pages; i++) + mte_clear_page_tags(page_address(page + i)); + set_page_mte_tagged(page); } } -- Catalin