* Re: A path forward to cleaning up dying cgroups?
2025-02-05 18:08 ` Johannes Weiner
@ 2025-02-05 18:16 ` Yosry Ahmed
2025-02-06 4:56 ` Kairui Song
2025-02-05 18:31 ` Roman Gushchin
2025-02-05 18:46 ` Shakeel Butt
2 siblings, 1 reply; 10+ messages in thread
From: Yosry Ahmed @ 2025-02-05 18:16 UTC (permalink / raw)
To: Johannes Weiner
Cc: Hamza Mahfooz, linux-mm, Roman Gushchin, Shakeel Butt,
Andrew Morton, cgroups, linux-kernel, Tejun Heo,
Michal Koutný,
Michal Hocko, Muchun Song, Zach O'Keefe, Kinsey Ho,
Yosry Ahmed, Allen Pais
On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
> On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
> > Cc: Shakeel Butt <shakeel.butt@linux.dev>
> >
> > On 2/5/25 12:48, Hamza Mahfooz wrote:
> > > I was just curious as to what the status of the issue described in [1]
> > > is. It appears that the last time someone took a stab at it was in [2].
>
> If memory serves, the sticking point was whether pages should indeed
> be reparented on cgroup death, or whether they could be moved
> arbitrarily to other cgroups that are still using them.
>
> It's a bit unfortunate, because the reparenting patches were tested
> and reviewed, and the arbitrary recharging was just an idea that
> ttbomk nobody seriously followed up on afterwards.
There was an RFC series [1] for the recharging, but all memcg
maintainers hated it :P
[1] https://lore.kernel.org/lkml/20230720070825.992023-1-yosryahmed@google.com/
>
> We also recently removed the charge moving code from cgroup1, along
> with the subtle page access/locking/accounting rules it imposed on the
> rest of the MM. I'm doubtful there is much appetite in either camp for
> bringing this back.
Yeah, with the charge moving code gone, the case for recharging grows
weaker.
>
> So I would still love to see Muchun's patches merged. They fix a
> seemingly universally experienced operational issue in memcg, and we
> shouldn't hold it up unless somebody actually posts alternative code.
>
> Thoughts?
Adding Zach and Kinsey who were recently looking into this from the
Google side.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: A path forward to cleaning up dying cgroups?
2025-02-05 18:16 ` Yosry Ahmed
@ 2025-02-06 4:56 ` Kairui Song
0 siblings, 0 replies; 10+ messages in thread
From: Kairui Song @ 2025-02-06 4:56 UTC (permalink / raw)
To: Yosry Ahmed, Muchun Song
Cc: Johannes Weiner, Hamza Mahfooz, linux-mm, Roman Gushchin,
Shakeel Butt, Andrew Morton, cgroups, linux-kernel, Tejun Heo,
Michal Koutný,
Michal Hocko, Zach O'Keefe, Kinsey Ho, Yosry Ahmed,
Allen Pais
On Thu, Feb 6, 2025 at 2:16 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
> > On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
> > > Cc: Shakeel Butt <shakeel.butt@linux.dev>
> > >
> > > On 2/5/25 12:48, Hamza Mahfooz wrote:
> > > > I was just curious as to what the status of the issue described in [1]
> > > > is. It appears that the last time someone took a stab at it was in [2].
> >
> > If memory serves, the sticking point was whether pages should indeed
> > be reparented on cgroup death, or whether they could be moved
> > arbitrarily to other cgroups that are still using them.
> >
> > It's a bit unfortunate, because the reparenting patches were tested
> > and reviewed, and the arbitrary recharging was just an idea that
> > ttbomk nobody seriously followed up on afterwards.
>
> There was an RFC series [1] for the recharging, but all memcg
> maintainers hated it :P
>
> [1] https://lore.kernel.org/lkml/20230720070825.992023-1-yosryahmed@google.com/
We have been suffering from dying cgroup issues for years too, and I
just saw this series. Would it be a good idea to combine this with
reparenting instead (if we go with the reparenting approach)?
Using the objcg API to charge the folios does help speed up the
reparenting, but it also adds some overhead and complexity. Just walking
and reparenting the folios seems like a more direct approach.
Another idea: per our observation, dying cgroups have few pages
that are still mapped, as the processes have all exited. Most folios are just
page cache. Shared mapped pages are a small fraction, especially for containers. So a
deferred recharge on access seems good enough? Mapped folios may also
eventually be unmapped and get recharged. And at least this makes
accounting more accurate.
* Re: A path forward to cleaning up dying cgroups?
2025-02-05 18:08 ` Johannes Weiner
2025-02-05 18:16 ` Yosry Ahmed
@ 2025-02-05 18:31 ` Roman Gushchin
2025-02-05 18:46 ` Shakeel Butt
2 siblings, 0 replies; 10+ messages in thread
From: Roman Gushchin @ 2025-02-05 18:31 UTC (permalink / raw)
To: Johannes Weiner
Cc: Hamza Mahfooz, linux-mm, Shakeel Butt, Andrew Morton, cgroups,
linux-kernel, Tejun Heo, Michal Koutný,
Michal Hocko, Muchun Song, Allen Pais, Yosry Ahmed
On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
> On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
> > Cc: Shakeel Butt <shakeel.butt@linux.dev>
> >
> > On 2/5/25 12:48, Hamza Mahfooz wrote:
> > > I was just curious as to what the status of the issue described in [1]
> > > is. It appears that the last time someone took a stab at it was in [2].
>
> If memory serves, the sticking point was whether pages should indeed
> be reparented on cgroup death, or whether they could be moved
> arbitrarily to other cgroups that are still using them.
>
> It's a bit unfortunate, because the reparenting patches were tested
> and reviewed, and the arbitrary recharging was just an idea that
> ttbomk nobody seriously followed up on afterwards.
>
> We also recently removed the charge moving code from cgroup1, along
> with the subtle page access/locking/accounting rules it imposed on the
> rest of the MM. I'm doubtful there is much appetite in either camp for
> bringing this back.
>
> So I would still love to see Muchun's patches merged. They fix a
> seemingly universally experienced operational issue in memcg, and we
> shouldn't hold it up unless somebody actually posts alternative code.
>
> Thoughts?
I don't have a strong opinion here. Reparenting is clearly not perfect,
but I agree that we don't have any better solutions, only vague ideas.
I believe Muchun's code would require some refresh, but generally it is fine
to merge.
This all comes down to the handling of memory shared between cgroups.
Sharing can be spatial (2 or more simultaneously existing cgroups) or
temporal (a cgroup is deleted and recreated, and the workload tries to
reuse old pages). Reparenting turns temporal sharing into spatial sharing.
It helps with dying cgroups, but comes at the cost of permanently wrong
accounting and issues with memory protection.
Thanks!
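To make the point about temporal sharing concrete, the following is a minimal Python sketch, assuming a first-toucher charging rule and a two-level hierarchy. It is not kernel code: the `owner_of` and `parent_of` tables and the `charge`, `reparent`, and `local_usage` helpers are hypothetical names used only for illustration.

```python
from collections import Counter

owner_of = {}                     # folio id -> name of the cgroup it is charged to
parent_of = {"job": "parent"}     # toy two-level hierarchy


def charge(cgroup, folios):
    """First toucher pays; a folio that is already charged stays where it is."""
    for folio in folios:
        owner_of.setdefault(folio, cgroup)


def reparent(cgroup):
    """On cgroup deletion, hand all of its charges to the parent."""
    for folio, owner in owner_of.items():
        if owner == cgroup:
            owner_of[folio] = parent_of[cgroup]


def local_usage():
    """Pages charged directly to each cgroup."""
    return Counter(owner_of.values())


working_set = range(100)

charge("job", working_set)        # first incarnation populates the page cache
usage_before = local_usage()

reparent("job")                   # "job" is deleted, charges move to "parent"
charge("job", working_set)        # recreated "job" reuses the same cached pages
usage_after = local_usage()
```

After the delete and recreate cycle, `usage_after` attributes all 100 pages to `parent` and none to the recreated `job`, even though `job` is the only user. In this toy model that is the "permanently wrong accounting": the pages now look shared between `parent` and `job`, and any limit or protection configured on `job` no longer covers them.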
* Re: A path forward to cleaning up dying cgroups?
2025-02-05 18:08 ` Johannes Weiner
2025-02-05 18:16 ` Yosry Ahmed
2025-02-05 18:31 ` Roman Gushchin
@ 2025-02-05 18:46 ` Shakeel Butt
2025-02-06 3:30 ` Muchun Song
2 siblings, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2025-02-05 18:46 UTC (permalink / raw)
To: Johannes Weiner
Cc: Hamza Mahfooz, linux-mm, Roman Gushchin, Andrew Morton, cgroups,
linux-kernel, Tejun Heo, Michal Koutný,
Michal Hocko, Muchun Song, Allen Pais, Yosry Ahmed
On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
> On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
> > Cc: Shakeel Butt <shakeel.butt@linux.dev>
> >
> > On 2/5/25 12:48, Hamza Mahfooz wrote:
> > > I was just curious as to what the status of the issue described in [1]
> > > is. It appears that the last time someone took a stab at it was in [2].
>
> If memory serves, the sticking point was whether pages should indeed
> be reparented on cgroup death, or whether they could be moved
> arbitrarily to other cgroups that are still using them.
>
> It's a bit unfortunate, because the reparenting patches were tested
> and reviewed, and the arbitrary recharging was just an idea that
> ttbomk nobody seriously followed up on afterwards.
>
> We also recently removed the charge moving code from cgroup1, along
> with the subtle page access/locking/accounting rules it imposed on the
> rest of the MM. I'm doubtful there is much appetite in either camp for
> bringing this back.
>
> So I would still love to see Muchun's patches merged. They fix a
> seemingly universally experienced operational issue in memcg, and we
> shouldn't hold it up unless somebody actually posts alternative code.
>
> Thoughts?
I think the recharging (or whatever the alternative) can be a followup
to this. I agree this is a good change.
* Re: A path forward to cleaning up dying cgroups?
2025-02-05 18:46 ` Shakeel Butt
@ 2025-02-06 3:30 ` Muchun Song
2025-02-06 3:34 ` Waiman Long
2025-02-06 15:51 ` Kamalesh Babulal
0 siblings, 2 replies; 10+ messages in thread
From: Muchun Song @ 2025-02-06 3:30 UTC (permalink / raw)
To: Shakeel Butt
Cc: Johannes Weiner, Hamza Mahfooz, linux-mm, Roman Gushchin,
Andrew Morton, cgroups, linux-kernel, Tejun Heo,
Michal Koutný,
Michal Hocko, Allen Pais, Yosry Ahmed
> On Feb 6, 2025, at 02:46, Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
>> On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
>>> Cc: Shakeel Butt <shakeel.butt@linux.dev>
>>>
>>> On 2/5/25 12:48, Hamza Mahfooz wrote:
>>>> I was just curious as to what the status of the issue described in [1]
>>>> is. It appears that the last time someone took a stab at it was in [2].
>>
>> If memory serves, the sticking point was whether pages should indeed
>> be reparented on cgroup death, or whether they could be moved
>> arbitrarily to other cgroups that are still using them.
>>
>> It's a bit unfortunate, because the reparenting patches were tested
>> and reviewed, and the arbitrary recharging was just an idea that
>> ttbomk nobody seriously followed up on afterwards.
>>
>> We also recently removed the charge moving code from cgroup1, along
>> with the subtle page access/locking/accounting rules it imposed on the
>> rest of the MM. I'm doubtful there is much appetite in either camp for
>> bringing this back.
>>
>> So I would still love to see Muchun's patches merged. They fix a
>> seemingly universally experienced operational issue in memcg, and we
>> shouldn't hold it up unless somebody actually posts alternative code.
>>
>> Thoughts?
>
> I think the recharging (or whatever the alternative) can be a followup
> to this. I agree this is a good change.
I agree with you. We've been encountering dying memory issues for years
on our servers. As Roman said, I need to refresh my patches. So I need
some time for refreshing.
Muchun,
Thanks.
* Re: A path forward to cleaning up dying cgroups?
2025-02-06 3:30 ` Muchun Song
@ 2025-02-06 3:34 ` Waiman Long
2025-02-06 15:51 ` Kamalesh Babulal
1 sibling, 0 replies; 10+ messages in thread
From: Waiman Long @ 2025-02-06 3:34 UTC (permalink / raw)
To: Muchun Song, Shakeel Butt
Cc: Johannes Weiner, Hamza Mahfooz, linux-mm, Roman Gushchin,
Andrew Morton, cgroups, linux-kernel, Tejun Heo,
Michal Koutný,
Michal Hocko, Allen Pais, Yosry Ahmed
On 2/5/25 10:30 PM, Muchun Song wrote:
>
>> On Feb 6, 2025, at 02:46, Shakeel Butt <shakeel.butt@linux.dev> wrote:
>>
>> On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
>>> On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
>>>> Cc: Shakeel Butt <shakeel.butt@linux.dev>
>>>>
>>>> On 2/5/25 12:48, Hamza Mahfooz wrote:
>>>>> I was just curious as to what the status of the issue described in [1]
>>>>> is. It appears that the last time someone took a stab at it was in [2].
>>> If memory serves, the sticking point was whether pages should indeed
>>> be reparented on cgroup death, or whether they could be moved
>>> arbitrarily to other cgroups that are still using them.
>>>
>>> It's a bit unfortunate, because the reparenting patches were tested
>>> and reviewed, and the arbitrary recharging was just an idea that
>>> ttbomk nobody seriously followed up on afterwards.
>>>
>>> We also recently removed the charge moving code from cgroup1, along
>>> with the subtle page access/locking/accounting rules it imposed on the
>>> rest of the MM. I'm doubtful there is much appetite in either camp for
>>> bringing this back.
>>>
>>> So I would still love to see Muchun's patches merged. They fix a
>>> seemingly universally experienced operational issue in memcg, and we
>>> shouldn't hold it up unless somebody actually posts alternative code.
>>>
>>> Thoughts?
>> I think the recharging (or whatever the alternative) can be a followup
>> to this. I agree this is a good change.
> I agree with you. We've been encountering dying memory issues for years
> on our servers. As Roman said, I need to refresh my patches. So I need
> some time for refreshing.
Glad to hear that. I have been waiting for a resolution of the dying
memory cgroup problems for years :-)
Cheers,
Longman
* Re: A path forward to cleaning up dying cgroups?
2025-02-06 3:30 ` Muchun Song
2025-02-06 3:34 ` Waiman Long
@ 2025-02-06 15:51 ` Kamalesh Babulal
1 sibling, 0 replies; 10+ messages in thread
From: Kamalesh Babulal @ 2025-02-06 15:51 UTC (permalink / raw)
To: Muchun Song, Shakeel Butt
Cc: Johannes Weiner, Hamza Mahfooz, linux-mm, Roman Gushchin,
Andrew Morton, cgroups, linux-kernel, Tejun Heo,
Michal Koutný,
Michal Hocko, Allen Pais, Yosry Ahmed
On 06/02/25 09:00, Muchun Song wrote:
>
>
>> On Feb 6, 2025, at 02:46, Shakeel Butt <shakeel.butt@linux.dev> wrote:
>>
>> On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
>>> On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
>>>> Cc: Shakeel Butt <shakeel.butt@linux.dev>
>>>>
>>>> On 2/5/25 12:48, Hamza Mahfooz wrote:
>>>>> I was just curious as to what the status of the issue described in [1]
>>>>> is. It appears that the last time someone took a stab at it was in [2].
>>>
>>> If memory serves, the sticking point was whether pages should indeed
>>> be reparented on cgroup death, or whether they could be moved
>>> arbitrarily to other cgroups that are still using them.
>>>
>>> It's a bit unfortunate, because the reparenting patches were tested
>>> and reviewed, and the arbitrary recharging was just an idea that
>>> ttbomk nobody seriously followed up on afterwards.
>>>
>>> We also recently removed the charge moving code from cgroup1, along
>>> with the subtle page access/locking/accounting rules it imposed on the
>>> rest of the MM. I'm doubtful there is much appetite in either camp for
>>> bringing this back.
>>>
>>> So I would still love to see Muchun's patches merged. They fix a
>>> seemingly universally experienced operational issue in memcg, and we
>>> shouldn't hold it up unless somebody actually posts alternative code.
>>>
>>> Thoughts?
>>
>> I think the recharging (or whatever the alternative) can be a followup
>> to this. I agree this is a good change.
>
> I agree with you. We've been encountering dying memory issues for years
> on our servers. As Roman said, I need to refresh my patches. So I need
> some time for refreshing.
>
We have seen the dying cgroups issue too and look forward to your patches.
Happy to help with testing/reviewing.
--
Thanks,
Kamalesh