From: Chris Li <chrisl@kernel.org>
To: Kairui Song <ryncsn@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>,
Hugh Dickins <hughd@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@suse.com>,
linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH 23/24] swap: fix multiple swap leak when after cgroup migrate
Date: Tue, 21 Nov 2023 21:34:37 -0800 [thread overview]
Message-ID: <CAF8kJuOahJMXu0ds8j9XHV81eSknVitvvWmvW4J-O_i=H-ON0g@mail.gmail.com> (raw)
In-Reply-To: <CAMgjq7Ahh2HVA0ym2t0UdX=FYQ52mrKXDueK5GeNK9=2h+adwQ@mail.gmail.com>
Hi Kairui,
On Mon, Nov 20, 2023 at 3:17 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> Huang, Ying <ying.huang@intel.com> 于2023年11月20日周一 15:37写道:
> >
> > Kairui Song <ryncsn@gmail.com> writes:
> >
> > > From: Kairui Song <kasong@tencent.com>
> > >
> > > When a process which previously swapped some memory was moved to
> > > another cgroup, and the cgroup it previous in is dead, then swapped in
> > > pages will be leaked into rootcg. Previous commits fixed the bug for
> > > no readahead path, this commit fix the same issue for readahead path.
> > >
> > > This can be easily reproduced by:
> > > - Setup a SSD or HDD swap.
> > > - Create memory cgroup A, B and C.
> > > - Spawn process P1 in cgroup A and make it swap out some pages.
> > > - Move process P1 to memory cgroup B.
> > > - Destroy cgroup A.
> > > - Do a swapoff in cgroup C
> > > - Swapped in pages is accounted into cgroup C.
> > >
> > > This patch will fix it make the swapped in pages accounted in cgroup B.
> >
> > Accroding to "Memory Ownership" section of
> > Documentation/admin-guide/cgroup-v2.rst,
> >
> > "
> > A memory area is charged to the cgroup which instantiated it and stays
> > charged to the cgroup until the area is released. Migrating a process
> > to a different cgroup doesn't move the memory usages that it
> > instantiated while in the previous cgroup to the new cgroup.
> > "
> >
> > Because we don't move the charge when we move a task from one cgroup to
> > another. It's controversial which cgroup should be charged to.
> > According to the above document, it's acceptable to charge to the cgroup
> > C (cgroup where swapoff happens).
>
> Hi Ying, thank you very much for the info!
>
> It is controversial indeed, just the original behavior is kind of
> counter-intuitive.
>
> Image if there are cgroup P1, and its child cgroup C1 C2. If a process
> swapped out some memory in C1 then moved to C2, and C1 is dead.
> On swapoff the charge will be moved out of P1...
>
> And swapoff often happen on some unlimited cgroup or some cgroup for
> management agent.
>
> If P1 have a memory limit, it can breech the limit easily, we will see
> a process that never leave P1 having a much higher RSS that P1/C1/C2's
> limit.
> And if there is a limit for the management agent cgroup, the agent
> will be OOM instead of OOM in P1.
I think I will reply to another similar email.
If you want OOM in P1, you can have an admin program. fork and execute
a new process, add the new process into P1, then swap off from that
new process.
>
> Simply moving a process between the child cgroup of the same parent
> cgroup won't cause such issue, thing get weird when swapoff is
> involved.
>
> Or maybe we should try to be compatible, and introduce a sysctl or
> cmdline for this?
If the above suggestion works, then you don't need to change swap off?
If you still want to change the charging model. I like to see the
bigger picture, what rules it follows and how it works in other
situations.
Chris
next prev parent reply other threads:[~2023-11-22 5:34 UTC|newest]
Thread overview: 93+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-19 19:47 [PATCH 00/24] Swapin path refactor for optimization and bugfix Kairui Song
2023-11-19 19:47 ` [PATCH 01/24] mm/swap: fix a potential undefined behavior issue Kairui Song
2023-11-19 20:55 ` Matthew Wilcox
2023-11-20 3:35 ` Chris Li
2023-11-20 11:14 ` Kairui Song
2023-11-20 17:34 ` Chris Li
2023-11-19 19:47 ` [PATCH 02/24] mm/swapfile.c: add back some comment Kairui Song
2023-11-19 19:47 ` [PATCH 03/24] mm/swap: move no readahead swapin code to a stand alone helper Kairui Song
2023-11-19 21:00 ` Matthew Wilcox
2023-11-20 11:14 ` Kairui Song
2023-11-20 14:55 ` Dan Carpenter
2023-11-21 5:34 ` Chris Li
2023-11-22 17:33 ` Kairui Song
2023-11-19 19:47 ` [PATCH 04/24] mm/swap: avoid setting page lock bit and doing extra unlock check Kairui Song
2023-11-20 4:17 ` Chris Li
2023-11-20 11:15 ` Kairui Song
2023-11-20 17:44 ` Chris Li
2023-11-22 17:32 ` Kairui Song
2023-11-22 20:57 ` Chris Li
2023-11-24 8:14 ` Kairui Song
2023-11-24 8:37 ` Christopher Li
2023-11-19 19:47 ` [PATCH 05/24] mm/swap: move readahead policy checking into swapin_readahead Kairui Song
2023-11-21 6:15 ` Chris Li
2023-11-21 6:35 ` Kairui Song
2023-11-21 7:41 ` Chris Li
2023-11-21 8:32 ` Kairui Song
2023-11-21 15:24 ` Chris Li
2023-11-19 19:47 ` [PATCH 06/24] swap: rework swapin_no_readahead arguments Kairui Song
2023-11-20 0:20 ` kernel test robot
2023-11-21 6:44 ` Chris Li
2023-11-23 10:51 ` Kairui Song
2023-11-19 19:47 ` [PATCH 07/24] mm/swap: move swap_count to header to be shared Kairui Song
2023-11-21 6:51 ` Chris Li
2023-11-21 7:03 ` Kairui Song
2023-11-19 19:47 ` [PATCH 08/24] mm/swap: check readahead policy per entry Kairui Song
2023-11-20 6:04 ` Huang, Ying
2023-11-20 11:17 ` Kairui Song
2023-11-21 1:10 ` Huang, Ying
2023-11-21 5:20 ` Chris Li
2023-11-21 5:13 ` Chris Li
2023-11-21 7:54 ` Chris Li
2023-11-23 10:52 ` Kairui Song
2023-11-19 19:47 ` [PATCH 09/24] mm/swap: inline __swap_count Kairui Song
2023-11-20 7:41 ` Huang, Ying
2023-11-21 8:02 ` Chris Li
2023-11-19 19:47 ` [PATCH 10/24] mm/swap: remove nr_rotate_swap and related code Kairui Song
2023-11-21 15:45 ` Chris Li
2023-11-19 19:47 ` [PATCH 11/24] mm/swap: also handle swapcache lookup in swapin_readahead Kairui Song
2023-11-20 0:47 ` kernel test robot
2023-11-21 16:06 ` Chris Li
2023-11-24 8:42 ` Kairui Song
2023-11-24 9:10 ` Chris Li
2023-11-19 19:47 ` [PATCH 12/24] mm/swap: simplify arguments for swap_cache_get_folio Kairui Song
2023-11-21 16:36 ` Chris Li
2023-11-19 19:47 ` [PATCH 13/24] swap: simplify swap_cache_get_folio Kairui Song
2023-11-21 16:50 ` Chris Li
2023-11-19 19:47 ` [PATCH 14/24] mm/swap: do shadow lookup as well when doing swap cache lookup Kairui Song
2023-11-21 16:55 ` Chris Li
2023-11-19 19:47 ` [PATCH 15/24] mm/swap: avoid an duplicated swap cache lookup for SYNCHRONOUS_IO device Kairui Song
2023-11-21 17:15 ` Chris Li
2023-11-22 18:08 ` Kairui Song
2023-11-19 19:47 ` [PATCH 16/24] mm/swap: reduce scope of get_swap_device in swapin path Kairui Song
2023-11-19 21:12 ` Matthew Wilcox
2023-11-20 11:14 ` Kairui Song
2023-11-21 17:25 ` Chris Li
2023-11-22 0:36 ` Huang, Ying
2023-11-23 11:13 ` Kairui Song
2023-11-24 0:40 ` Huang, Ying
2023-11-19 19:47 ` [PATCH 17/24] mm/swap: fix false error when swapoff race with swapin Kairui Song
2023-11-19 19:47 ` [PATCH 18/24] mm/swap: introduce a helper non fault swapin Kairui Song
2023-11-20 1:07 ` kernel test robot
2023-11-22 4:40 ` Chris Li
2023-11-28 11:22 ` Kairui Song
2023-12-13 2:22 ` Chris Li
2023-11-19 19:47 ` [PATCH 19/24] shmem, swap: refactor error check on OOM or race Kairui Song
2023-11-20 7:04 ` Chris Li
2023-11-20 11:17 ` Kairui Song
2023-11-19 19:47 ` [PATCH 20/24] swap: simplify and make swap_find_cache static Kairui Song
2023-11-22 5:01 ` Chris Li
2023-11-19 19:47 ` [PATCH 21/24] swap: make swapin_readahead result checking argument mandatory Kairui Song
2023-11-22 5:15 ` Chris Li
2023-11-24 8:14 ` Kairui Song
2023-11-19 19:47 ` [PATCH 22/24] swap: make swap_cluster_readahead static Kairui Song
2023-11-22 5:20 ` Chris Li
2023-11-19 19:47 ` [PATCH 23/24] swap: fix multiple swap leak when after cgroup migrate Kairui Song
2023-11-20 7:35 ` Huang, Ying
2023-11-20 11:17 ` Kairui Song
2023-11-22 5:34 ` Chris Li [this message]
2023-11-19 19:47 ` [PATCH 24/24] mm/swap: change swapin_readahead to swapin_page_fault Kairui Song
2023-11-20 19:09 ` [PATCH 00/24] Swapin path refactor for optimization and bugfix Yosry Ahmed
2023-11-20 20:22 ` Chris Li
2023-11-22 6:46 ` Kairui Song
2023-11-22 6:43 ` Kairui Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAF8kJuOahJMXu0ds8j9XHV81eSknVitvvWmvW4J-O_i=H-ON0g@mail.gmail.com' \
--to=chrisl@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=ryncsn@gmail.com \
--cc=tj@kernel.org \
--cc=willy@infradead.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox