linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Zach O'Keefe" <zokeefe@google.com>
To: Yang Shi <shy828301@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>,
	David Hildenbrand <david@redhat.com>,
	 David Rientjes <rientjes@google.com>,
	Michal Hocko <mhocko@suse.com>,
	 Pasha Tatashin <pasha.tatashin@soleen.com>,
	SeongJae Park <sj@kernel.org>,  Song Liu <songliubraving@fb.com>,
	Vlastimil Babka <vbabka@suse.cz>, Zi Yan <ziy@nvidia.com>,
	 Linux MM <linux-mm@kvack.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	Chris Kennelly <ckennelly@google.com>,
	 Chris Zankel <chris@zankel.net>, Helge Deller <deller@gmx.de>,
	Hugh Dickins <hughd@google.com>,
	 Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	 "James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	Jens Axboe <axboe@kernel.dk>,
	 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Matt Turner <mattst88@gmail.com>,
	Max Filippov <jcmvbkbc@gmail.com>,
	 Miaohe Lin <linmiaohe@huawei.com>,
	Minchan Kim <minchan@kernel.org>,
	 Patrick Xia <patrickx@google.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	 Peter Xu <peterx@redhat.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Subject: Re: [RFC PATCH 07/14] mm/khugepaged: add vm_flags_ignore to hugepage_vma_revalidate_pmd_count()
Date: Wed, 9 Mar 2022 16:00:29 -0800	[thread overview]
Message-ID: <CAAa6QmTsuvjtAYvthnyY0bv2NRNX2J1Bqj1kDS-XKvUGEGem_Q@mail.gmail.com> (raw)
In-Reply-To: <CAHbLzkqqkzNGT8ELuN6oVwW11==5Tuh08xti27qzgscF88jR0g@mail.gmail.com>

> On Tue, Mar 8, 2022 at 1:35 PM Zach O'Keefe <zokeefe@google.com> wrote:
> >
> > In madvise collapse context, we optionally want to be able to ignore
> > advice from MADV_NOHUGEPAGE-marked regions.
>
> Could you please elaborate why this usecase is valid? Typically
> MADV_NOHUGEPAGE is set when the users really don't want to have THP
> for this area. So it doesn't make too much sense to ignore it IMHO.
>

Hey Yang, thanks for taking time to review and comment.

Semantically, the way I see it, is that MADV_NOHUGEPAGE is a way for
the user to say "I don't want hugepages here", so that the kernel
knows not to do so when faulting memory, and khugepaged can stay away.
However, in MADV_COLLAPSE, the user is explicitly requesting this be
backed by hugepages - so presumably that is exactly what they want.

IOW, if the user didn't want this memory to be backed by hugepages,
they wouldn't be MADV_COLLAPSE'ing it. If there was a range of memory
the user wanted collapsed, but that had some sub-areas marked
MADV_NOHUGEPAGE, they could always issue multiple MADV_COLLAPSE
operations around the excluded regions.

In terms of use cases, I don't have a concrete example, but a user
could hypothetically choose to exclude regions from management from
khugepaged, but still be able to collapse the memory themselves,
when/if they deem appropriate.

> >
> > Add a vm_flags_ignore argument to hugepage_vma_revalidate_pmd_count()
> > which can be used to ignore vm flags used when considering thp
> > eligibility.
> >
> > Signed-off-by: Zach O'Keefe <zokeefe@google.com>
> > ---
> >  mm/khugepaged.c | 18 ++++++++++++------
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 1d20be47bcea..ecbd3fc41c80 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -964,10 +964,14 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
> >  #endif
> >
> >  /*
> > - * Revalidate a vma's eligibility to collapse nr hugepages.
> > + * Revalidate a vma's eligibility to collapse nr hugepages. vm_flags_ignore
> > + * can be used to ignore certain vma_flags that would otherwise be checked -
> > + * the principal example being VM_NOHUGEPAGE which is ignored in madvise
> > + * collapse context.
> >   */
> >  static int hugepage_vma_revalidate_pmd_count(struct mm_struct *mm,
> >                                              unsigned long address, int nr,
> > +                                            unsigned long vm_flags_ignore,
> >                                              struct vm_area_struct **vmap)
> >  {
> >         struct vm_area_struct *vma;
> > @@ -986,7 +990,7 @@ static int hugepage_vma_revalidate_pmd_count(struct mm_struct *mm,
> >         hend = vma->vm_end & HPAGE_PMD_MASK;
> >         if (address < hstart || (address + nr * HPAGE_PMD_SIZE) > hend)
> >                 return SCAN_ADDRESS_RANGE;
> > -       if (!hugepage_vma_check(vma, vma->vm_flags))
> > +       if (!hugepage_vma_check(vma, vma->vm_flags & ~vm_flags_ignore))
> >                 return SCAN_VMA_CHECK;
> >         /* Anon VMA expected */
> >         if (!vma->anon_vma || vma->vm_ops)
> > @@ -1000,9 +1004,11 @@ static int hugepage_vma_revalidate_pmd_count(struct mm_struct *mm,
> >   */
> >
> >  static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
> > +                                  unsigned long vm_flags_ignore,
> >                                    struct vm_area_struct **vmap)
> >  {
> > -       return hugepage_vma_revalidate_pmd_count(mm, address, 1, vmap);
> > +       return hugepage_vma_revalidate_pmd_count(mm, address, 1,
> > +                       vm_flags_ignore, vmap);
> >  }
> >
> >  /*
> > @@ -1043,7 +1049,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
> >                 /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
> >                 if (ret & VM_FAULT_RETRY) {
> >                         mmap_read_lock(mm);
> > -                       if (hugepage_vma_revalidate(mm, haddr, &vma)) {
> > +                       if (hugepage_vma_revalidate(mm, haddr, VM_NONE, &vma)) {
> >                                 /* vma is no longer available, don't continue to swapin */
> >                                 trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> >                                 return false;
> > @@ -1200,7 +1206,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> >         count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
> >
> >         mmap_read_lock(mm);
> > -       result = hugepage_vma_revalidate(mm, address, &vma);
> > +       result = hugepage_vma_revalidate(mm, address, VM_NONE, &vma);
> >         if (result) {
> >                 mmap_read_unlock(mm);
> >                 goto out_nolock;
> > @@ -1232,7 +1238,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> >          */
> >         mmap_write_lock(mm);
> >
> > -       result = hugepage_vma_revalidate(mm, address, &vma);
> > +       result = hugepage_vma_revalidate(mm, address, VM_NONE, &vma);
> >         if (result)
> >                 goto out_up_write;
> >         /* check if the pmd is still valid */
> > --
> > 2.35.1.616.g0bdcbb4464-goog
> >


  reply	other threads:[~2022-03-10  0:01 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-08 21:34 [RFC PATCH 00/14] mm: userspace hugepage collapse Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 01/14] mm/rmap: add mm_find_pmd_raw helper Zach O'Keefe
2022-03-09 22:48   ` Yang Shi
2022-03-08 21:34 ` [RFC PATCH 02/14] mm/khugepaged: add struct collapse_control Zach O'Keefe
2022-03-09 22:53   ` Yang Shi
2022-03-08 21:34 ` [RFC PATCH 03/14] mm/khugepaged: add __do_collapse_huge_page() helper Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 04/14] mm/khugepaged: separate khugepaged_scan_pmd() scan and collapse Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 05/14] mm/khugepaged: add mmap_assert_locked() checks to scan_pmd() Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 06/14] mm/khugepaged: add hugepage_vma_revalidate_pmd_count() Zach O'Keefe
2022-03-09 23:15   ` Yang Shi
2022-03-08 21:34 ` [RFC PATCH 07/14] mm/khugepaged: add vm_flags_ignore to hugepage_vma_revalidate_pmd_count() Zach O'Keefe
2022-03-09 23:17   ` Yang Shi
2022-03-10  0:00     ` Zach O'Keefe [this message]
2022-03-10  0:41       ` Yang Shi
2022-03-10  1:09         ` Zach O'Keefe
2022-03-10  2:16           ` Yang Shi
2022-03-10 15:50             ` Zach O'Keefe
2022-03-10 18:17               ` Yang Shi
2022-03-10 18:46                 ` David Rientjes
2022-03-10 18:58                   ` Zach O'Keefe
2022-03-10 19:54                   ` Yang Shi
2022-03-10 20:24                     ` Zach O'Keefe
2022-03-10 18:53                 ` Zach O'Keefe
2022-03-10 15:56   ` David Hildenbrand
2022-03-10 18:39     ` Zach O'Keefe
2022-03-10 18:54     ` David Rientjes
2022-03-21 14:27       ` Michal Hocko
2022-03-08 21:34 ` [RFC PATCH 08/14] mm/thp: add madv_thp_vm_flags to __transparent_hugepage_enabled() Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 09/14] mm/khugepaged: record SCAN_PAGE_COMPOUND when scan_pmd() finds THP Zach O'Keefe
2022-03-09 23:40   ` Yang Shi
2022-03-10  0:46     ` Zach O'Keefe
2022-03-10  2:05       ` Yang Shi
2022-03-10  8:37         ` Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 10/14] mm/khugepaged: rename khugepaged-specific/not functions Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 11/14] mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse Zach O'Keefe
2022-03-09 23:43   ` Yang Shi
2022-03-10  1:11     ` Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 12/14] mm/madvise: introduce batched madvise(MADV_COLLPASE) collapse Zach O'Keefe
2022-03-10  0:06   ` Yang Shi
2022-03-10 19:26     ` David Rientjes
2022-03-10 20:16       ` Matthew Wilcox
2022-03-11  0:06         ` Zach O'Keefe
2022-03-25 16:51           ` Zach O'Keefe
2022-03-25 19:54             ` Yang Shi
2022-03-08 21:34 ` [RFC PATCH 13/14] mm/madvise: add __madvise_collapse_*_batch() actions Zach O'Keefe
2022-03-08 21:34 ` [RFC PATCH 14/14] mm/madvise: add process_madvise(MADV_COLLAPSE) Zach O'Keefe
2022-03-21 14:32 ` [RFC PATCH 00/14] mm: userspace hugepage collapse Zi Yan
2022-03-21 14:51   ` Zach O'Keefe
2022-03-21 14:37 ` Michal Hocko
2022-03-21 15:46   ` Zach O'Keefe
2022-03-22 12:11     ` Michal Hocko
2022-03-22 15:53       ` Zach O'Keefe
2022-03-29 12:24         ` Michal Hocko
2022-03-30  0:36           ` Zach O'Keefe
2022-03-22  6:40 ` Zach O'Keefe
2022-03-22 12:05   ` Michal Hocko
2022-03-23 13:30     ` Zach O'Keefe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAa6QmTsuvjtAYvthnyY0bv2NRNX2J1Bqj1kDS-XKvUGEGem_Q@mail.gmail.com \
    --to=zokeefe@google.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=arnd@arndb.de \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=axelrasmussen@google.com \
    --cc=chris@zankel.net \
    --cc=ckennelly@google.com \
    --cc=david@redhat.com \
    --cc=deller@gmx.de \
    --cc=hughd@google.com \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jcmvbkbc@gmail.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-mm@kvack.org \
    --cc=mattst88@gmail.com \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=patrickx@google.com \
    --cc=peterx@redhat.com \
    --cc=rientjes@google.com \
    --cc=shy828301@gmail.com \
    --cc=sj@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=tsbogend@alpha.franken.de \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox