From: "Zach O'Keefe" <zokeefe@google.com>
To: Yang Shi <shy828301@gmail.com>, Michal Hocko <mhocko@suse.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>,
	David Hildenbrand <david@redhat.com>,
	 David Rientjes <rientjes@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Peter Xu <peterx@redhat.com>,  Song Liu <songliubraving@fb.com>,
	Linux MM <linux-mm@kvack.org>,
	 Rongwei Wang <rongwei.wang@linux.alibaba.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	Hugh Dickins <hughd@google.com>,
	 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Minchan Kim <minchan@kernel.org>,  SeongJae Park <sj@kernel.org>,
	Pasha Tatashin <pasha.tatashin@soleen.com>
Subject: Re: [RFC] mm: MADV_COLLAPSE semantics
Date: Wed, 25 May 2022 11:09:04 -0700
Message-ID: <CAAa6QmQyVwoP6eGnRdPqmBBDq_DqoEcbeRZk4Xc3KnjxBcu+vw@mail.gmail.com>
In-Reply-To: <CAHbLzkpeK8S0rHePZU_OJ_o8tBc0chUoj06PMkz_q0j8uiYj_A@mail.gmail.com>

Hey Michal and Yang,

Thanks for the feedback!

On Tue, May 24, 2022 at 1:02 PM Yang Shi <shy828301@gmail.com> wrote:
> [...]
> Page reclaim could also cause the THP split. And it may happen at any
> time. I'm not sure how the users or callers could monitor it.

I don't have a good idea of what monitoring would look like, but this
is a great example that shows splitting can happen from underneath us
and we'll have to design accordingly.

Luckily, in this example the page is likely cold and therefore less
worth backing with a THP.

On Wed, May 25, 2022 at 10:33 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Wed, May 25, 2022 at 1:24 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Mon 23-05-22 17:18:32, Zach O'Keefe wrote:
> > [...]
> > > Idea: MADV_COLLAPSE should respect VM_NOHUGEPAGE and "never" THP mode,
> > > but otherwise would attempt to collapse.
> >
> > I do agree that {process_}madvise should fail on VM_NOHUGEPAGE. The
> > process has explicitly noted that THP shouldn't be used on such a VMA
> > and seeing THP could be observed as not complying with that contract.
> >
> > I am not so sure about the global "never" policy, though. The global
> > policy controls _kernel_ driven THPs. As the request to collapse memory
> > comes from the userspace I do not think it should be limited by the
> > kernel policy.

Ya, I agree this would be ideal / is the cleanest. However, Peter
mentioned a non-debug example where users wouldn't be expecting THPs
after setting "never". Though, as Peter points out, I'm not sure how
many users do this with CONFIG_TRANSPARENT_HUGEPAGE=y.

> > I also think it can be beneficial to implement userspace
> > based THP policies and exclude any kernel interference and that could be
> > achieved by global kernel "never" policy and implement the whole
> > functionality by process_madvise.

I don't have a clear picture yet, but even if we move THP collapse
policy to userspace, I imagine we'll still want an informed
application/allocator to be able to MADV_HUGEPAGE known hot memory
and fault in THPs rather than MADV_COLLAPSE after the fact. IOW, I
don't know if we'll ever want "never". When I get started on this
work, I plan to add some prctl(2) interface to disable khugepaged
for processes where a userspace agent has taken responsibility for
THP utilization.

> I'd prefer to respect "never" for now since it is typically used to
> disable THP globally even though the mappings are madvised
> (MADV_HUGEPAGE). IMHO I treat MADV_COLLAPSE as weaker MADV_HUGEPAGE
> (take effect for non-madvised mappings but not flip VM_NOHUGEPAGE) +
> best-effort synchronous THP collapse.

I'm likewise in favor of respecting it until proven otherwise - even
though I agree with Michal that it would be nice to not depend on the
kernel policy / sysfs settings here.
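
For illustration only, the semantics discussed above could be sketched
roughly as follows. This is Python purely for brevity, not kernel code.
The names ThpMode, parse_thp_enabled, collapse_permitted and the
respect_never knob are hypothetical and exist only for this sketch. The
one detail taken from the real system is the format of
/sys/kernel/mm/transparent_hugepage/enabled, which marks the active mode
with brackets (e.g. "always [madvise] never").

```python
from enum import Enum


class ThpMode(Enum):
    """Global THP modes as exposed by the sysfs 'enabled' file."""
    ALWAYS = "always"
    MADVISE = "madvise"
    NEVER = "never"


def parse_thp_enabled(text: str) -> ThpMode:
    """Return the active mode from sysfs-style text.

    The active mode is the bracketed token, e.g. 'always [madvise] never'.
    (Hypothetical helper; it takes the file contents as a string.)
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return ThpMode(token[1:-1])
    raise ValueError(f"no bracketed mode in {text!r}")


def collapse_permitted(vma_nohugepage: bool,
                       mode: ThpMode,
                       respect_never: bool = True) -> bool:
    """Model of the proposed MADV_COLLAPSE gate (assumption, not kernel code).

    - A VMA marked VM_NOHUGEPAGE always refuses the collapse.
    - The global "never" mode refuses it only when respect_never is set.
    - Otherwise the collapse is attempted (best effort).
    """
    if vma_nohugepage:
        return False
    if respect_never and mode is ThpMode.NEVER:
        return False
    return True
```

With respect_never=True this models the "respect it until proven
otherwise" position; respect_never=False models the alternative where a
userspace request is not limited by the global policy. In both cases
VM_NOHUGEPAGE wins.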

> We could lift the restriction in the future if it turns out non
> respecting "never" is more useful.

