linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: James Houghton <jthoughton@google.com>
To: Steve Capper <steve.capper@arm.com>,
	Will Deacon <will@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Muchun Song <songmuchun@bytedance.com>,
	 Anshuman Khandual <anshuman.khandual@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	 Ryan Roberts <ryan.roberts@arm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 James Houghton <jthoughton@google.com>
Subject: [PATCH 0/2] arm64: hugetlb: Fix page fault loop for sw-dirty/hw-clean contiguous PTEs
Date: Mon,  4 Dec 2023 17:26:44 +0000	[thread overview]
Message-ID: <20231204172646.2541916-1-jthoughton@google.com> (raw)

It is currently possible for a userspace application to enter a page
fault loop when using HugeTLB pages implemented with contiguous PTEs
when HAFDBS is not available. This happens because:
1. The kernel may sometimes write PTEs that are sw-dirty but hw-clean
   (PTE_DIRTY | PTE_RDONLY | PTE_WRITE).
2. If, during a write, the CPU uses a sw-dirty, hw-clean PTE in handling
   the memory access on a system without HAFDBS, we will get a page
   fault.
3. HugeTLB will check if it needs to update the dirty bits on the PTE.
   For contiguous PTEs, it will check to see if the pgprot bits need
   updating. In this case, HugeTLB wants to write a sequence of
   sw-dirty, hw-dirty PTEs, but it finds that all the PTEs it is about
   to overwrite are all pte_dirty() (pte_sw_dirty() => pte_dirty()),
   so it thinks no update is necessary.

Please see this[1] reproducer.

I think (though I may be wrong) that both step (1) and step (3) are
buggy.

The first patch in this series fixes step (3); instead of checking if
pte_dirty is matching in __cont_access_flags_changed, check pte_hw_dirty
and pte_sw_dirty separately.

The second patch in this series makes step (1) less likely to occur.
Without this patch, we can get the kernel to write a sw-dirty, hw-clean
PTE with the following steps (showing the relevant VMA flags and pgprot
bits):
i.   Create a valid, writable contiguous PTE.
       VMA vmflags:     VM_SHARED | VM_READ | VM_WRITE
       VMA pgprot bits: PTE_RDONLY | PTE_WRITE
       PTE pgprot bits: PTE_DIRTY | PTE_WRITE
ii.  mprotect the VMA to PROT_NONE.
       VMA vmflags:     VM_SHARED
       VMA pgprot bits: PTE_RDONLY
       PTE pgprot bits: PTE_DIRTY | PTE_RDONLY
iii. mprotect the VMA back to PROT_READ | PROT_WRITE.
       VMA vmflags:     VM_SHARED | VM_READ | VM_WRITE
       VMA pgprot bits: PTE_RDONLY | PTE_WRITE
       PTE pgprot bits: PTE_DIRTY | PTE_WRITE | PTE_RDONLY

Applying either one of the two patches in this patchset will fix the
particular issue with HugeTLB pages implemented with contiguous PTEs.
It's possible that only one of these patches should be taken, or that
the right fix is something else entirely.

[1]: https://gist.github.com/48ca/11d1e466deee032cb35aa8c2280f93b0

James Houghton (2):
  arm64: hugetlb: Distinguish between hw and sw dirtiness in
    __cont_access_flags_changed
  arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify

 arch/arm64/include/asm/pgtable.h | 6 ++++++
 arch/arm64/mm/hugetlbpage.c      | 5 ++++-
 2 files changed, 10 insertions(+), 1 deletion(-)


base-commit: 645a9a454fdb7e698a63a275edca6a17ef97afc4
-- 
2.43.0.rc2.451.g8631bc7472-goog



             reply	other threads:[~2023-12-04 17:27 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04 17:26 James Houghton [this message]
2023-12-04 17:26 ` [PATCH 1/2] arm64: hugetlb: Distinguish between hw and sw dirtiness in __cont_access_flags_changed James Houghton
2023-12-04 17:26 ` [PATCH 2/2] arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify James Houghton
2023-12-06 10:26   ` Ryan Roberts
2023-12-11 18:42   ` Will Deacon
2023-12-11 19:01     ` James Houghton
2023-12-05 14:43 ` [PATCH 0/2] arm64: hugetlb: Fix page fault loop for sw-dirty/hw-clean contiguous PTEs Ryan Roberts
2023-12-05 17:54   ` James Houghton
2023-12-06 10:24     ` Ryan Roberts
2023-12-06 21:01       ` James Houghton
2023-12-12 17:22 ` (subset) " Catalin Marinas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231204172646.2541916-1-jthoughton@google.com \
    --to=jthoughton@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=ryan.roberts@arm.com \
    --cc=songmuchun@bytedance.com \
    --cc=steve.capper@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox