linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zach O'Keefe <zokeefe@google.com>
To: Yang Shi <shy828301@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>,
	David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Peter Xu <peterx@redhat.com>,
	Rongwei Wang <rongwei.wang@linux.alibaba.com>,
	SeongJae Park <sj@kernel.org>, Song Liu <songliubraving@fb.com>,
	Vlastimil Babka <vbabka@suse.cz>, Zi Yan <ziy@nvidia.com>,
	Linux MM <linux-mm@kvack.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Chris Kennelly <ckennelly@google.com>,
	Chris Zankel <chris@zankel.net>, Helge Deller <deller@gmx.de>,
	Hugh Dickins <hughd@google.com>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	Jens Axboe <axboe@kernel.dk>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Matt Turner <mattst88@gmail.com>,
	Max Filippov <jcmvbkbc@gmail.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Minchan Kim <minchan@kernel.org>,
	Patrick Xia <patrickx@google.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	jthoughton@google.com
Subject: Re: [mm-unstable v7 13/18] proc/smaps: add PMDMappable field to smaps
Date: Tue, 12 Jul 2022 09:31:27 -0700	[thread overview]
Message-ID: <Ys2h3yS5qFd+/MIc@google.com> (raw)
In-Reply-To: <CAHbLzkq972yeLJ8yubx3arTeSpwYShutpBFSXDnktwJ8mAyU1w@mail.gmail.com>

On Jul 11 14:37, Yang Shi wrote:
> On Wed, Jul 6, 2022 at 5:06 PM Zach O'Keefe <zokeefe@google.com> wrote:
> >
> > Add PMDMappable field to smaps output which informs the user if memory
> > in the VMA can be PMD-mapped by MADV_COLLAPSE.
> >
> > The distinction from THPeligible is needed for two reasons:
> >
> > 1) For THP, MADV_COLLAPSE is not coupled to THP sysfs controls, which
> >    THPeligible reports.
> >
> > 2) PMDMappable can also be used in HugeTLB fine-granularity mappings,
> >    which are independent from THP.
> 
> Could you please elaborate the usecase? The user checks this hint
> before calling MADV_COLLAPSE? Is it really necessary?
> 
> And, TBH it sounds confusing and we don't have to maintain both
> THPeligible and PMDMappable. We could just relax THPeligible to make
> it return 1 even though THP is disabled by sysfs but MADV_COLLAPSE
> could collapse it if such hint is useful.
> 

Hey Yang,

Thanks for taking the time to review this series again, and thanks for
challenging this.

TLDR: "Is it really necessary" - at the moment, no, probably not .. but I think
it's "useful".

Rationale:

1. IMO, I thought was was confusing seeing:

	...
	AnonHugePages:      2048 kB
	ShmemPmdMapped:        0 kB
	FilePmdMapped:         0 kB
	Shared_Hugetlb:        0 kB
	Private_Hugetlb:       0 kB
	Swap:                  0 kB
	SwapPss:               0 kB
	Locked:                0 kB
	THPeligible:    0
	...

Maybe this could simply be clarified in the docs though.  I guess we can already
get:

	...
	AnonHugePages:         0 kB
	ShmemPmdMapped:        0 kB
	FilePmdMapped:      2048 kB
	Shared_Hugetlb:        0 kB
	Private_Hugetlb:       0 kB
	Swap:                  0 kB
	SwapPss:               0 kB
	Locked:                0 kB
	THPeligible:    0
	...

today[1], so perhaps it's not a big deal.


2. It was useful for debugging - similar to rationale for including
THPeligible1[2], the logic for determining if a VMA is eligible is pretty
complicated. I.e. is this file mapped suitably? Unlike THPeligible, however,
madvise(2) has the ability to set errno on failure to help* diagnose why some
memory isn't being backed.

3. For the immediately-envisioned usecases, the user "knows" about what memory
they are acting on. However, eventually we'd like to experiment with moving THP
utilization policy to userspace. Here, it would be useful if the userspace agent
managing was made aware of what memory it should be managing. I don't have a
working prototype of what this would like yet, however.

4. I thought it was neat that this field could be reused for HugeTLB
fine-granularity mappings - but TBH I'm not sure how useful it'd be there.

I figured relaxing existing THPeligible could break existing users / tests, and
it'd be likewise confusing for them to see THPeligible:	1, but then have faults
fail and they'd then have to go check sysfs settings and vma flags ; we'd be
back in pre-commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma").

Thanks,
Zach

[1] https://lore.kernel.org/linux-mm/YrxbQGiwml24APCx@google.com/


> 
> >
> > Signed-off-by: Zach O'Keefe <zokeefe@google.com>
> > ---
> >  Documentation/filesystems/proc.rst | 10 ++++++++--
> >  fs/proc/task_mmu.c                 |  2 ++
> >  2 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 47e95dbc820d..f207903a57a5 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -466,6 +466,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
> >      MMUPageSize:           4 kB
> >      Locked:                0 kB
> >      THPeligible:           0
> > +    PMDMappable:           0
> >      VmFlags: rd ex mr mw me dw
> >
> >  The first of these lines shows the same information as is displayed for the
> > @@ -518,9 +519,14 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
> >  does not take into account swapped out page of underlying shmem objects.
> >  "Locked" indicates whether the mapping is locked in memory or not.
> >
> > +"PMDMappable" indicates if the memory can be mapped by PMDs - 1 if true, 0
> > +otherwise.  It just shows the current status. Note that this is memory
> > +operable on explicitly by MADV_COLLAPSE.
> > +
> >  "THPeligible" indicates whether the mapping is eligible for allocating THP
> > -pages as well as the THP is PMD mappable or not - 1 if true, 0 otherwise.
> > -It just shows the current status.
> > +pages by the kernel, as well as the THP is PMD mappable or not - 1 if true, 0
> > +otherwise. It just shows the current status.  Note this is memory the kernel can
> > +transparently provide as THPs.
> >
> >  "VmFlags" field deserves a separate description. This member represents the
> >  kernel flags associated with the particular virtual memory area in two letter
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index f8cd58846a28..29f2089456ba 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -867,6 +867,8 @@ static int show_smap(struct seq_file *m, void *v)
> >
> >         seq_printf(m, "THPeligible:    %d\n",
> >                    hugepage_vma_check(vma, vma->vm_flags, true, false, true));
> > +       seq_printf(m, "PMDMappable:    %d\n",
> > +                  hugepage_vma_check(vma, vma->vm_flags, true, false, false));
> >
> >         if (arch_pkeys_enabled())
> >                 seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> > --
> > 2.37.0.rc0.161.g10f37bed90-goog
> >


  reply	other threads:[~2022-07-12 16:31 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-06 23:59 [mm-unstable v7 00/18] mm: userspace hugepage collapse Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 01/18] mm/khugepaged: remove redundant transhuge_vma_suitable() check Zach O'Keefe
2022-07-11 20:38   ` Yang Shi
2022-07-12 17:14     ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 02/18] mm: khugepaged: don't carry huge page to the next loop for !CONFIG_NUMA Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 03/18] mm/khugepaged: add struct collapse_control Zach O'Keefe
2022-07-08 21:01   ` Andrew Morton
2022-07-11 18:29     ` Zach O'Keefe
2022-07-11 18:45       ` Andrew Morton
2022-07-12 14:17         ` Zach O'Keefe
2022-07-11 21:51       ` Yang Shi
2022-07-06 23:59 ` [mm-unstable v7 04/18] mm/khugepaged: dedup and simplify hugepage alloc and charging Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 05/18] mm/khugepaged: pipe enum scan_result codes back to callers Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 06/18] mm/khugepaged: add flag to predicate khugepaged-only behavior Zach O'Keefe
2022-07-11 20:43   ` Yang Shi
2022-07-12 17:06     ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 07/18] mm/thp: add flag to enforce sysfs THP in hugepage_vma_check() Zach O'Keefe
2022-07-11 20:57   ` Yang Shi
2022-07-12 16:58     ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 08/18] mm/khugepaged: record SCAN_PMD_MAPPED when scan_pmd() finds hugepage Zach O'Keefe
2022-07-11 21:03   ` Yang Shi
2022-07-12 16:50     ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 09/18] mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse Zach O'Keefe
2022-07-11 21:22   ` Yang Shi
2022-07-12 16:54     ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 10/18] mm/khugepaged: rename prefix of shared collapse functions Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 11/18] mm/madvise: add huge_memory:mm_madvise_collapse tracepoint Zach O'Keefe
2022-07-11 21:32   ` Yang Shi
2022-07-12 16:21     ` Zach O'Keefe
2022-07-12 17:05       ` Yang Shi
2022-07-12 17:30         ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 12/18] mm/madvise: add MADV_COLLAPSE to process_madvise() Zach O'Keefe
2022-07-08 20:47   ` Andrew Morton
2022-07-13  1:05     ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 13/18] proc/smaps: add PMDMappable field to smaps Zach O'Keefe
2022-07-11 21:37   ` Yang Shi
2022-07-12 16:31     ` Zach O'Keefe [this message]
2022-07-12 17:27       ` Yang Shi
2022-07-12 17:57         ` Zach O'Keefe
2022-07-13 18:02           ` Andrew Morton
2022-07-13 18:40             ` Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 14/18] selftests/vm: modularize collapse selftests Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 15/18] selftests/vm: dedup hugepage allocation logic Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 16/18] selftests/vm: add MADV_COLLAPSE collapse context to selftests Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 17/18] selftests/vm: add selftest to verify recollapse of THPs Zach O'Keefe
2022-07-06 23:59 ` [mm-unstable v7 18/18] selftests/vm: add selftest to verify multi THP collapse Zach O'Keefe
2022-07-14 18:55 ` [RFC] mm: userspace hugepage collapse: file/shmem semantics Zach O'Keefe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ys2h3yS5qFd+/MIc@google.com \
    --to=zokeefe@google.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=arnd@arndb.de \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=axelrasmussen@google.com \
    --cc=chris@zankel.net \
    --cc=ckennelly@google.com \
    --cc=david@redhat.com \
    --cc=deller@gmx.de \
    --cc=hughd@google.com \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jcmvbkbc@gmail.com \
    --cc=jthoughton@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-mm@kvack.org \
    --cc=mattst88@gmail.com \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=patrickx@google.com \
    --cc=peterx@redhat.com \
    --cc=rientjes@google.com \
    --cc=rongwei.wang@linux.alibaba.com \
    --cc=shy828301@gmail.com \
    --cc=sj@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=tsbogend@alpha.franken.de \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox