From: Yafang Shao <laoar.shao@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: hannes@cmpxchg.org, roman.gushchin@linux.dev,
shakeel.butt@linux.dev, muchun.song@linux.dev,
akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/2] memcg: add nomlock to avoid folios being mlocked in a memcg
Date: Mon, 6 Jan 2025 21:59:07 +0800 [thread overview]
Message-ID: <CALOAHbC6QdNbn62ZHRCY-PTNevz+wtxMUWgUnLsLFUc1ZC5+YQ@mail.gmail.com> (raw)
In-Reply-To: <Z3vMaDNdC1_fIVKn@tiehlicka>
On Mon, Jan 6, 2025 at 8:28 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Sun 22-12-24 10:34:12, Yafang Shao wrote:
> > On Sat, Dec 21, 2024 at 3:21 PM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Fri 20-12-24 19:52:16, Yafang Shao wrote:
> > > > On Fri, Dec 20, 2024 at 6:23 PM Michal Hocko <mhocko@suse.com> wrote:
> > > > >
> > > > > On Sun 15-12-24 15:34:13, Yafang Shao wrote:
> > > > > > Implementation Options
> > > > > > ----------------------
> > > > > >
> > > > > > - Solution A: Allow file caches on the unevictable list to become
> > > > > > reclaimable.
> > > > > > This approach would require significant refactoring of the page reclaim
> > > > > > logic.
> > > > > >
> > > > > > - Solution B: Prevent file caches from being moved to the unevictable list
> > > > > > during mlock and ignore the VM_LOCKED flag during page reclaim.
> > > > > > This is a more straightforward solution and is the one we have chosen.
> > > > > > If the file caches are reclaimed from the download-proxy's memcg and
> > > > > > subsequently accessed by tasks in the application’s memcg, a filemap
> > > > > > fault will occur. A new file cache will be faulted in, charged to the
> > > > > > application’s memcg, and locked there.
> > > > >
> > > > > Both options silently break userspace because a non-failing mlock
> > > > > doesn't give the guarantees it is supposed to, AFAICS.
> > > >
> > > > It does not bypass the mlock mechanism; rather, it defers the actual
> > > > locking operation to the page fault path. Could you clarify what you
> > > > mean by "a non-failing mlock"? From what I can see, mlock can indeed
> > > > fail if there isn’t sufficient memory available. With this change, we
> > > > are simply shifting the potential failure point to the page fault path
> > > > instead.
> > >
> > > Your change will cause mlocked pages (as mlock syscall returns success)
> > > to be reclaimable later on. That breaks the basic mlock contract.
> >
> > AFAICS, the mlock() behavior was originally designed with only a
> > single root memory cgroup in mind. In other words, when mlock() was
> > introduced, all locked pages were confined to the same memcg.
>
> yes, and this is the case for any other syscall that might have an impact
> on memory consumption. This is by design. The memory cgroup controller
> aims to provide completely transparent resource control without any
> modifications to applications, as do all other cgroup controllers. If
> memcg (or any other controller) affects the behavior of a specific
> syscall, then this has to be communicated explicitly to the caller.
>
> The purpose of the mlock syscall is to _guarantee_ that memory stays
> resident (never swapped out). There might be additional constraints that
> prevent mlock from succeeding - e.g. rlimits, or a memcg that aims to
> control the amount of mlocked memory - but those failures need to be
> explicitly communicated via syscall failure.
Returning an error code such as EBUSY to userspace would be
straightforward when a task attempts to mlock a page that is already
charged to a different memcg.
>
> > However, this changed with the introduction of memcg support. Now,
> > mlock() can lock pages that belong to a different memcg than the
> > current task. This behavior is not explicitly defined in the mlock()
> > documentation, which could lead to confusion.
>
> This is more of a problem of the cgroup configurations where different
> resource domains are sharing resources. This is not much different from
> when other resources (e.g. shmem) are shared across unrelated cgroups.
However, we have yet to address even a single one of these issues or
reach a consensus on a solution, correct?
>
> > To clarify, I propose updating the mlock() documentation as follows:
>
> This is not really possible because you are effectively breaking an
> existing userspace.
This behavior is neither mandatory nor the default. You are not
obligated to use it if you prefer not to.
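The opt-in behavior above refers to the memory.nomlock file added by PATCH
1/2 of this series. A usage sketch, assuming cgroup v2 and illustrative
paths (the mount point and cgroup name "download-proxy" are hypothetical):

```shell
# Assumes cgroup v2 mounted at /sys/fs/cgroup and an existing memcg named
# "download-proxy"; memory.nomlock is the file proposed in PATCH 1/2.
# The default is presumably 0, i.e. mlock behaves exactly as it does today.
echo 1 > /sys/fs/cgroup/download-proxy/memory.nomlock
cat /sys/fs/cgroup/download-proxy/memory.nomlock
```

Only memcgs that explicitly enable the knob see the changed mlock
semantics; everything else is unaffected.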
--
Regards
Yafang
Thread overview: 15+ messages
2024-12-15 7:34 Yafang Shao
2024-12-15 7:34 ` [RFC PATCH 1/2] mm/memcontrol: add a new cgroup file memory.nomlock Yafang Shao
2024-12-15 7:34 ` [RFC PATCH 2/2] mm: Add support for nomlock to avoid folios being mlocked in a memcg Yafang Shao
2024-12-20 10:23 ` [RFC PATCH 0/2] memcg: add " Michal Hocko
2024-12-20 11:52 ` Yafang Shao
2024-12-21 7:21 ` Michal Hocko
2024-12-22 2:34 ` Yafang Shao
2024-12-25 2:23 ` Yafang Shao
2025-01-06 12:30 ` Michal Hocko
2025-01-06 14:04 ` Yafang Shao
2025-01-07 8:39 ` Michal Hocko
2025-01-07 9:43 ` Yafang Shao
2025-01-06 12:28 ` Michal Hocko
2025-01-06 13:59 ` Yafang Shao [this message]
2025-01-07 10:04 ` Michal Hocko