On Tue, Apr 11, 2023 at 4:36 PM T.J. Mercier wrote:
>
> When a memcg is removed by userspace it gets offlined by the kernel.
> Offline memcgs are hidden from user space, but they still live in the
> kernel until their reference count drops to 0. New allocations cannot
> be charged to offline memcgs, but existing allocations charged to
> offline memcgs remain charged, and hold a reference to the memcg.
>
> As such, an offline memcg can remain in the kernel indefinitely,
> becoming a zombie memcg. The accumulation of a large number of zombie
> memcgs leads to increased system overhead (mainly percpu data in struct
> mem_cgroup). It also causes some kernel operations that scale with the
> number of memcgs to become less efficient (e.g. reclaim).
>
> There are currently out-of-tree solutions which attempt to
> periodically clean up zombie memcgs by reclaiming from them. However,
> that is not effective for non-reclaimable memory, which it would be
> better to reparent or recharge to an online cgroup. There are also
> proposed changes that would benefit from recharging for shared
> resources like pinned pages, or DMA buffer pages.
>
> Suggested attendees:
> Yosry Ahmed
> Yu Zhao
> T.J. Mercier
> Tejun Heo
> Shakeel Butt
> Muchun Song
> Johannes Weiner
> Roman Gushchin
> Alistair Popple
> Jason Gunthorpe
> Kalesh Singh

For the record, here are the slides that were presented for this
discussion (attached).