From: Shakeel Butt <shakeelb@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
Yang Shi <shy828301@gmail.com>, Peter Xu <peterx@redhat.com>,
Zi Yan <ziy@nvidia.com>, Matthew Wilcox <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: split thp synchronously on MADV_DONTNEED and munmap
Date: Thu, 25 Nov 2021 19:31:08 -0800
Message-ID: <CALvZod7zeLRJQ2rw5AmEULGPVASrqobwmGS4tNSXeXcp5u_EOw@mail.gmail.com>
In-Reply-To: <e21a3088-e7fc-0601-3171-f710d644b27d@redhat.com>
On Thu, Nov 25, 2021 at 12:39 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 25.11.21 03:45, Shakeel Butt wrote:
> > Many applications do sophisticated management of their heap memory to
> > get better performance at low cost. We have a bunch of such
> > applications running in production; examples include caching and
> > data storage services. These applications keep their hot data on
> > THPs for better performance and release the cold data through
> > MADV_DONTNEED to keep the memory cost low.
> >
> > The kernel defers the split and release of THPs until there is memory
> > pressure. This complicates the memory management of these sophisticated
> > applications, which then need to look into the low-level kernel handling
> > of THPs to better gauge their headroom for expansion.
> >
> > More specifically, these applications monitor their cgroup usage to
> > decide whether they can expand their memory footprint or should release
> > some (unneeded/cold) buffers. They use madvise(MADV_DONTNEED) to release
> > the memory, which basically puts the THP on the deferred split list.
> > These deferred THPs are still charged to the cgroup, so the application
> > reads a bloated usage value and makes wrong decisions. In addition, these
> > applications are very latency sensitive and would prefer not to face
> > memory reclaim, given its non-deterministic nature.
> >
> > Internally we added a cgroup interface to trigger the split of deferred
> > THPs for that cgroup, but this is hacky and exposes kernel internals to
> > users. This patch solves the problem in a more general way for users
> > by splitting the THPs synchronously on MADV_DONTNEED. It does the same
> > for munmap() too.
> >
>
> I'll have to defer diving into the code.
>
> Just a comment: It might be good to add that there are still cases where
> splitting the compound page can fail -- for example, if the page is
> still pinned/referenced.
>
> So if you have a THP and pin/reference only, e.g., the first 4k of it
> (O_DIRECT, io_uring fixed buffers), then MADV_DONTNEED/unmap of, e.g.,
> the last 4k of it will not split the THP synchronously.
>
> In addition to explicit user action on a compound page, I remember there
> might be other kernel-internal temporary references that could
> theoretically block splitting, but maybe most of them are, at least for
> now, limited to !compound pages.
>
Hi David,
Thanks for your time (and apologies), but I have to withdraw this patch
for now because my assessment of its performance impact was mistaken.
Let's move the discussion to the other thread and decide next steps there.
thanks,
Shakeel
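For reference, the usage pattern from the quoted patch description (fault in
memory that may be THP-backed, then release a cold sub-range with
MADV_DONTNEED) can be sketched roughly as below. This is an illustration only,
not code from the patch. The function name release_cold_tail, the 2 MiB huge
page size and the 0xAB fill byte are assumptions made for the example, and
whether the kernel actually backs the mapping with a THP depends on system
configuration and on the alignment of the mapping.

```python
import mmap

PAGE = mmap.PAGESIZE
# Assumed huge page size (the usual x86-64 value); not taken from the thread.
THP_SIZE = 2 * 1024 * 1024


def release_cold_tail(length=2 * THP_SIZE, cold_bytes=PAGE):
    """Fault in a private anonymous mapping, then MADV_DONTNEED its tail.

    Hypothetical helper for illustration. Returns the mapping and the
    offset at which the released ("cold") range starts.
    """
    buf = mmap.mmap(-1, length,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE)

    # Hint that THP backing is wanted. It is only a hint: the constant may be
    # missing on this platform and the kernel may reject or ignore the call.
    if hasattr(mmap, "MADV_HUGEPAGE"):
        try:
            buf.madvise(mmap.MADV_HUGEPAGE)
        except OSError:
            pass

    # Touch the first byte of every base page so the range is populated.
    for off in range(0, length, PAGE):
        buf[off] = 0xAB

    # Release only the last cold_bytes of the mapping. If that range lies
    # inside a THP, the THP is partially unmapped; per the quoted description,
    # its split is then deferred until memory pressure.
    cold_start = length - cold_bytes
    buf.madvise(mmap.MADV_DONTNEED, cold_start, cold_bytes)
    return buf, cold_start
```

For a private anonymous mapping on Linux, the released range reads back as
zero-filled pages on the next access while the rest of the mapping keeps its
contents. The deferred-split accounting discussed above is not visible from
this code; observing it would require looking at the cgroup's usage, which is
outside the sketch.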