From: Ryan Roberts <ryan.roberts@arm.com>
To: James Houghton <jthoughton@google.com>,
Steve Capper <steve.capper@arm.com>,
Will Deacon <will@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
Muchun Song <songmuchun@bytedance.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] arm64: hugetlb: Fix page fault loop for sw-dirty/hw-clean contiguous PTEs
Date: Tue, 5 Dec 2023 14:43:01 +0000 [thread overview]
Message-ID: <70dcdad7-5952-48ce-a9b9-042cfea59a5d@arm.com> (raw)
In-Reply-To: <20231204172646.2541916-1-jthoughton@google.com>
On 04/12/2023 17:26, James Houghton wrote:
> It is currently possible for a userspace application to enter a page
> fault loop when using HugeTLB pages implemented with contiguous PTEs
> when HAFDBS is not available. This happens because:
> 1. The kernel may sometimes write PTEs that are sw-dirty but hw-clean
> (PTE_DIRTY | PTE_RDONLY | PTE_WRITE).
Hi James,
Do you know how this happens?
AFAIK, this is the set of valid bit combinations, and
PTE_RDONLY|PTE_WRITE|PTE_DIRTY is not one of them. Perhaps the real solution is
to understand how this is happening and prevent it?
/*
* PTE bits configuration in the presence of hardware Dirty Bit Management
* (PTE_WRITE == PTE_DBM):
*
* Dirty Writable | PTE_RDONLY PTE_WRITE PTE_DIRTY (sw)
* 0 0 | 1 0 0
* 0 1 | 1 1 0
* 1 0 | 1 0 1
* 1 1 | 0 1 x
*
* When hardware DBM is not present, the sofware PTE_DIRTY bit is updated via
* the page fault mechanism. Checking the dirty status of a pte becomes:
*
* PTE_DIRTY || (PTE_WRITE && !PTE_RDONLY)
*/
Thanks,
Ryan
> 2. If, during a write, the CPU uses a sw-dirty, hw-clean PTE in handling
> the memory access on a system without HAFDBS, we will get a page
> fault.
> 3. HugeTLB will check if it needs to update the dirty bits on the PTE.
> For contiguous PTEs, it will check to see if the pgprot bits need
> updating. In this case, HugeTLB wants to write a sequence of
> sw-dirty, hw-dirty PTEs, but it finds that all the PTEs it is about
> to overwrite are all pte_dirty() (pte_sw_dirty() => pte_dirty()),
> so it thinks no update is necessary.
>
> Please see this[1] reproducer.
>
> I think (though I may be wrong) that both step (1) and step (3) are
> buggy.
>
> The first patch in this series fixes step (3); instead of checking if
> pte_dirty is matching in __cont_access_flags_changed, check pte_hw_dirty
> and pte_sw_dirty separately.
>
> The second patch in this series makes step (1) less likely to occur.
> Without this patch, we can get the kernel to write a sw-dirty, hw-clean
> PTE with the following steps (showing the relevant VMA flags and pgprot
> bits):
> i. Create a valid, writable contiguous PTE.
> VMA vmflags: VM_SHARED | VM_READ | VM_WRITE
> VMA pgprot bits: PTE_RDONLY | PTE_WRITE
> PTE pgprot bits: PTE_DIRTY | PTE_WRITE
> ii. mprotect the VMA to PROT_NONE.
> VMA vmflags: VM_SHARED
> VMA pgprot bits: PTE_RDONLY
> PTE pgprot bits: PTE_DIRTY | PTE_RDONLY
> iii. mprotect the VMA back to PROT_READ | PROT_WRITE.
> VMA vmflags: VM_SHARED | VM_READ | VM_WRITE
> VMA pgprot bits: PTE_RDONLY | PTE_WRITE
> PTE pgprot bits: PTE_DIRTY | PTE_WRITE | PTE_RDONLY
>
> Applying either one of the two patches in this patchset will fix the
> particular issue with HugeTLB pages implemented with contiguous PTEs.
> It's possible that only one of these patches should be taken, or that
> the right fix is something else entirely.
>
> [1]: https://gist.github.com/48ca/11d1e466deee032cb35aa8c2280f93b0
>
> James Houghton (2):
> arm64: hugetlb: Distinguish between hw and sw dirtiness in
> __cont_access_flags_changed
> arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify
>
> arch/arm64/include/asm/pgtable.h | 6 ++++++
> arch/arm64/mm/hugetlbpage.c | 5 ++++-
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
>
> base-commit: 645a9a454fdb7e698a63a275edca6a17ef97afc4
next prev parent reply other threads:[~2023-12-05 14:43 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-04 17:26 James Houghton
2023-12-04 17:26 ` [PATCH 1/2] arm64: hugetlb: Distinguish between hw and sw dirtiness in __cont_access_flags_changed James Houghton
2023-12-04 17:26 ` [PATCH 2/2] arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify James Houghton
2023-12-06 10:26 ` Ryan Roberts
2023-12-11 18:42 ` Will Deacon
2023-12-11 19:01 ` James Houghton
2023-12-05 14:43 ` Ryan Roberts [this message]
2023-12-05 17:54 ` [PATCH 0/2] arm64: hugetlb: Fix page fault loop for sw-dirty/hw-clean contiguous PTEs James Houghton
2023-12-06 10:24 ` Ryan Roberts
2023-12-06 21:01 ` James Houghton
2023-12-12 17:22 ` (subset) " Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=70dcdad7-5952-48ce-a9b9-042cfea59a5d@arm.com \
--to=ryan.roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=catalin.marinas@arm.com \
--cc=jthoughton@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=songmuchun@bytedance.com \
--cc=steve.capper@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox