Re: Unifying page table walkers

From: Peter Xu <peterx@redhat.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>,
	Vishal Moola <vishal.moola@gmail.com>,
	Jane Chu <jane.chu@oracle.com>,
	Muchun Song <muchun.song@linux.dev>,
	linux-mm@kvack.org
Subject: Re: Unifying page table walkers
Date: Thu, 6 Jun 2024 17:49:30 -0400
Message-ID: <ZmIu6v0hKbQBBrLI@x1n>
In-Reply-To: <ZmIAAjiO4AEd8-Jb@casper.infradead.org>

On Thu, Jun 06, 2024 at 07:29:22PM +0100, Matthew Wilcox wrote:
> The reason we have a separate hugetlb_entry from pmd_entry and pud_entry
> is that it has a different locking context.  It is called with the
> hugetlb_vma_lock held for read (nb: this is not the same as the vma
> lock; see walk_hugetlb_range()).  Why do we need this?  Because of page
> table sharing.

Just a quick comment on this one: I think it's about more than the per-vma
lock.  Oscar and I are actually working on this together (we have had
plenty of discussions, but so far all off-list), and after our refactor the
lock context for the hugetlb_entry() path is as simple as this:

https://github.com/leberus/linux/commit/88e56c1ecaf8c64ba9165aeba74335bdc15d1b56

hugetlb_entry() also exists because it is the only sane way to hook into
the hugetlb API (it used to be huge_pte_offset(), I believe; now it is
hugetlb_walk()), which always walks to a specific level of the hugetlb page
table without telling the caller which level that is (hence the pte_t *
force-cast trick).  pxd_entry() can't be applied if we don't know that
level.  So it's probably not only about the locking.
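
To make the force-cast point concrete, here is a minimal userspace sketch.
Everything prefixed toy_/TOY_ is hypothetical and only mimics the shape of
the problem; the pte_t/pmd_t/pud_t typedefs are simplified stand-ins, not
the kernel's definitions, and this is not the real hugetlb_walk().  It
shows a lookup that hands back a pte_t * no matter which level the entry
actually lives at, so the callback cannot tell a PMD from a PUD:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration only; NOT the kernel types. */
typedef struct { uint64_t val; } pte_t;
typedef struct { uint64_t val; } pmd_t;
typedef struct { uint64_t val; } pud_t;

/* Hypothetical: the level at which a hugetlb mapping's entry lives. */
enum toy_level { TOY_LEVEL_PMD, TOY_LEVEL_PUD };

/* Hypothetical: one entry per level, selected by the huge page size. */
struct toy_hugetlb_vma {
	enum toy_level level;
	pmd_t pmd;
	pud_t pud;
};

/*
 * Mimics the shape of a hugetlb lookup: it walks to whichever level
 * holds the entry and force-casts the result to pte_t *.  The level is
 * lost at this point.  (The kernel is built with -fno-strict-aliasing;
 * this toy only ever reads the uint64_t member.)
 */
static pte_t *toy_hugetlb_walk(struct toy_hugetlb_vma *vma)
{
	if (vma->level == TOY_LEVEL_PUD)
		return (pte_t *)&vma->pud;
	return (pte_t *)&vma->pmd;
}

/*
 * A hugetlb_entry()-style callback: it only sees a pte_t *, so it has
 * no way to dispatch to a per-level pmd_entry()/pud_entry() handler.
 */
static uint64_t toy_hugetlb_entry(pte_t *ptep)
{
	assert(ptep != NULL);
	return ptep->val;
}
```

The callback gets the right value either way, but nothing in its
signature says which level it came from, which is the information a
pxd_entry()-style interface would need.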

Meanwhile, I have a very vague memory that the per-vma lock is also used
for something else, perhaps a race between fallocate() and faults.  But I
may be misremembering; I haven't read that part of the code for quite some
time, since our hugetlb refactoring work doesn't need that knowledge: we
simply keep all the existing behaviors.  Maybe Muchun remembers.

Thanks,

-- 
Peter Xu

