From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-arch@vger.kernel.org, Yin Fengwei <fengwei.yin@intel.com>,
linux-mm@kvack.org
Subject: Re: API for setting multiple PTEs at once
Date: Fri, 3 Feb 2023 02:27:51 +0300
Message-ID: <20230202232751.q4qfm2qrauwtz5bs@box.shutemov.name>
In-Reply-To: <Y9w+AppNv+i1o/o3@casper.infradead.org>

On Thu, Feb 02, 2023 at 10:49:38PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 03, 2023 at 12:48:58AM +0300, Kirill A. Shutemov wrote:
> > On Thu, Feb 02, 2023 at 09:14:23PM +0000, Matthew Wilcox wrote:
> > > For those of you not subscribed, linux-mm is currently discussing
> > > how best to handle page faults on large folios. I simply made it work
> > > when adding large folio support. Now Yin Fengwei is working on
> > > making it fast.
> > >
> > > https://lore.kernel.org/linux-mm/Y9qjn0Y+1ir787nc@casper.infradead.org/
> > > is perhaps the best place to start as it pertains to what the
> > > architecture will see.
> > >
> > > At the bottom of that function, I propose
> > >
> > > +	for (i = 0; i < nr; i++) {
> > > +		set_pte_at(vma->vm_mm, addr, vmf->pte + i, entry);
> > > +		/* no need to invalidate: a not-present page won't be cached */
> > > +		update_mmu_cache(vma, addr, vmf->pte + i);
> > > +		addr += PAGE_SIZE;
> > > +		entry = pte_next(entry);
> > > +	}
> > >
> > > (or I would have, had I not forgotten that pte_t isn't an integral type)
> > >
> > > But I think that some architectures want to mark PTEs specially for
> > > "This is part of a contiguous range" -- ARM, perhaps? So would you like
> > > an API like:
> > >
> > > arch_set_ptes(mm, addr, vmf->pte, entry, nr);
> >
> > Maybe just set_ptes(). arch_ doesn't contribute much.
>
> Sure.
>
> > > update_mmu_cache_range(vma, addr, vmf->pte, nr);
> > >
> > > There are some challenges here. For example, folios may be mapped
> > > askew (ie not naturally aligned). Another problem is that folios may
> > > be unmapped in part (eg mmap(), fault, followed by munmap() of one of
> > > the pages in the folio), and I presume you'd need to go and unmark the
> > > other PTEs in that case. So it's not as simple as just checking whether
> > > 'addr' and 'nr' are in some way compatible.
> >
> > I think the key question is who is responsible for 'nr' being safe. Is it
> > the caller, or does set_ptes() need to check that the range belongs to the
> > same PTE page table, folio, VMA, etc.?
> >
> > I think it has to be done by the caller, and set_ptes() has to be as simple
> > as possible.
>
> Caller guarantees that 'nr' is bounded by all of (vma, PMD table, folio).

Also, the caller is responsible for taking all relevant locks.

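
To make the 'nr' contract concrete, here is a rough user-space sketch of
how a caller might clamp 'nr' before calling set_ptes(). Everything in it
is an assumption for illustration: model_clamp_nr() is a hypothetical
helper, not a kernel API, and the 4KiB page / 512-entry PTE table
constants are just one common configuration.

```c
/* User-space model only; names and constants are illustrative assumptions. */
#define PAGE_SHIFT   12UL   /* assumed 4KiB pages */
#define PTRS_PER_PTE 512UL  /* assumed entries per PTE table */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: clamp 'nr' so that the range starting at 'addr'
 * stays inside one PTE table, inside the VMA (which ends at 'vma_end'),
 * and inside the folio (which has 'folio_pages_left' pages remaining
 * from the page mapped at 'addr').
 */
static unsigned long model_clamp_nr(unsigned long addr, unsigned long vma_end,
				    unsigned long folio_pages_left,
				    unsigned long nr)
{
	/* Index of 'addr' within its PTE table, and entries left after it. */
	unsigned long idx = (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
	unsigned long table_left = PTRS_PER_PTE - idx;
	/* Pages left before the end of the VMA. */
	unsigned long vma_left = (vma_end - addr) >> PAGE_SHIFT;

	nr = min_ul(nr, table_left);
	nr = min_ul(nr, vma_left);
	nr = min_ul(nr, folio_pages_left);
	return nr;
}
```

With the bound computed on the caller side (and the locks already held),
set_ptes() itself would not need to know about VMAs or folios at all.
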
> We don't currently allocate folios larger than PMD size, but perhaps we
> should prepare for that and as part of this same exercise define
>
> set_pmds(mm, addr, vmf->pmd, entry, nr);
>
> ... where 'nr' is the number of PMDs to set, not number of pages.

Sounds good to me.
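
For illustration, a minimal user-space model of the two helpers, showing
only the difference in what 'nr' counts. This is a sketch under stated
assumptions, not a proposed implementation: the model_* names and types
are hypothetical, the entry is wrapped in a struct because pte_t is not
an integral type, and the PFN is assumed to sit at bit PAGE_SHIFT and
above, which is not true on every architecture.

```c
#include <stdint.h>

/* User-space model only; layout and names are illustrative assumptions. */
#define PAGE_SHIFT   12      /* assumed 4KiB pages */
#define PTRS_PER_PTE 512ULL  /* assumed pages covered by one PMD entry */

typedef struct { uint64_t val; } model_pte_t;
typedef struct { uint64_t val; } model_pmd_t;

/* 'nr' counts pages: each successive entry advances by one page. */
static void model_set_ptes(model_pte_t *ptep, model_pte_t entry,
			   unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++) {
		ptep[i] = entry;
		entry.val += 1ULL << PAGE_SHIFT;
	}
}

/* 'nr' counts PMDs: each successive entry advances by PTRS_PER_PTE pages. */
static void model_set_pmds(model_pmd_t *pmdp, model_pmd_t entry,
			   unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++) {
		pmdp[i] = entry;
		entry.val += PTRS_PER_PTE << PAGE_SHIFT;
	}
}
```

An architecture with a "contiguous" hint would replace the loop body with
whatever marking it needs; the caller-side contract stays the same.
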
--
Kiryl Shutsemau / Kirill A. Shutemov