From: Matthew Wilcox <willy@infradead.org>
To: James Houghton <jthoughton@google.com>
Cc: linux-mm@kvack.org, Vishal Moola <vishal.moola@gmail.com>,
Hugh Dickins <hughd@google.com>, Rik van Riel <riel@surriel.com>,
David Hildenbrand <david@redhat.com>,
"Yin, Fengwei" <fengwei.yin@intel.com>
Subject: Re: Folio mapcount
Date: Wed, 8 Feb 2023 02:26:35 +0000
Message-ID: <Y+MIWz/lvM0q+lzO@casper.infradead.org>
In-Reply-To: <CADrL8HUrEgt+1qAtEsOHuQeA+WWnggGfLj8_nqHF0k-pqPi52w@mail.gmail.com>
On Tue, Feb 07, 2023 at 04:35:30PM -0800, James Houghton wrote:
> On Tue, Feb 7, 2023 at 3:35 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Feb 07, 2023 at 03:27:07PM -0800, James Houghton wrote:
> > > So page_vma_mapped_walk() might have to walk up to HPAGE_PMD_NR-ish
> > > PTEs (if we find a bunch of pte_none() PTEs). Just curious, could that
> > > be any slower than what we currently do (like, incrementing up to
> > > HPAGE_PMD_NR-ish subpage mapcounts)? Or is it not a concern?
> >
> > I think it's faster. Both of these operations work on folio_nr_pages()
> > entries ... but a page table is 8 bytes and a struct page is 64 bytes.
> > From a CPU prefetching point of view, they're both linear scans, but
> > PTEs are 8 times denser.
>
> >
> > The other factor to consider is how often we do each of these operations.
> > Mapping a folio happens ~once per call to mmap() (even though it's delayed
> > until page fault time). Querying folio_total_mapcount() happens ... less
> > often, I think? Both are going to be quite rare since generally we map
> > the entire folio at once.
>
> Maybe this is a case where we would see a regression: doing PAGE_SIZE
> UFFDIO_CONTINUEs on a THP. Worst case, go from the end of the THP to
> the beginning (ending up with a PTE-mapped THP at the end).
>
> For the i'th PTE we map / i'th UFFDIO_CONTINUE, we have to check
> `folio_nr_pages() - i` PTEs (for most of the iterations anyway). Seems
> like this scales with the square of the size of the folio, so this
> approach would be kind of a non-starter for HugeTLB (with
> high-granularity mapping), I think.
>
> This example isn't completely contrived: if we did post-copy live
> migration with userfaultfd, we might end up doing something like this.
> I'm curious what you think. :)

I think that's a great corner-case to consider. For hugetlb pages,
we know they're PMD/PUD aligned, so _if_ there's a page table present,
at least one page from the folio is already mapped, and we don't need
to look in the page table to find which one. Similarly, if a folio
that's mapped in part is going to occupy the entire PMD/PUD, we don't
need to iterate within it. And contrariwise, if it's p*d_none(), then
definitely none of the pages are mapped.

That perhaps calls for using a different implementation than
page_vma_mapped_walk(), which should be worth it to optimise this case.