BTW, this is cgroup v1 only, I'm working on a patch to bring this back into v2 as discussed in https://lkml.org/lkml/2019/1/3/484. We think a fix is to create a kworker that scans all page caches, dentry caches, etc. in the background; if a referenced memory cgroup is offline, it tries to drop the cache or move it to the parent cgroup. This kworker can wake up periodically, or upon a memory cgroup offline event (or both).
Reparenting has been deprecated for a long time. I don't think we want to bring it back. Actually, css offline is handled by kworker now. I proposed a patch to do force_empty in kworker, please see https://lkml.org/lkml/2019/1/2/377.
Could you elaborate a bit about why reparenting is not a good idea?
There is a similar problem with inodes. After digging into the ext4 code, we found that the inode cache is created with SLAB_ACCOUNT. In this case, the inode is allocated from a slab that belongs to the current memory cgroup. After this memory cgroup goes offline, the inode may still be held by a dentry cache. If another process uses the same file, the inode will be held by that process, preventing the previous memory cgroup from being destroyed until this other process closes the file and the dentry cache is dropped.
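To see whether a machine is affected by this kind of pinning, one rough check is to compare the memory controller's cgroup count in /proc/cgroups with the number of directories still visible in the v1 memory hierarchy. This is only a sketch in Python: the helper names are made up for illustration, the mount point is an assumption (commonly /sys/fs/cgroup/memory), and it relies on my understanding that num_cgroups keeps counting offline cgroups until they are actually freed.

```python
import os


def num_cgroups(proc_cgroups_text, subsys="memory"):
    """Return the num_cgroups column for `subsys` from /proc/cgroups text.

    Expected columns: subsys_name, hierarchy, num_cgroups, enabled.
    Returns None if the subsystem is not listed.
    """
    for line in proc_cgroups_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip the header and blank lines
        fields = line.split()
        if len(fields) >= 3 and fields[0] == subsys:
            return int(fields[2])
    return None


def count_visible(root):
    """Count directories under `root`, including `root` itself."""
    return sum(1 for _ in os.walk(root))


def estimate_lingering(proc_cgroups_text, root):
    """Rough number of memory cgroups that are counted but no longer visible."""
    total = num_cgroups(proc_cgroups_text)
    if total is None:
        return None
    return max(total - count_visible(root), 0)


if __name__ == "__main__":
    # Assumed v1 mount point; adjust for the local setup.
    with open("/proc/cgroups") as f:
        print(estimate_lingering(f.read(), "/sys/fs/cgroup/memory"))
```

A large and growing difference would suggest offline memory cgroups that are still pinned by charged objects such as the inodes described above; a small difference can simply be a race with cgroup creation or removal.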
I'm not sure if you really need kmem charge. If not, you may try cgroup.memory=nokmem.
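cgroup.memory=nokmem is a kernel boot parameter, so it has to be on the kernel command line. A small Python sketch to check whether a running system was booted with it follows; the function name is hypothetical, and it assumes the value is a comma-separated option list (for example nosocket,nokmem).

```python
def kmem_accounting_disabled(cmdline):
    """Return True if any cgroup.memory= parameter lists the nokmem option."""
    prefix = "cgroup.memory="
    for token in cmdline.split():
        if token.startswith(prefix):
            options = token[len(prefix):].split(",")
            if "nokmem" in options:
                return True
    return False


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        print(kmem_accounting_disabled(f.read()))
```

This only tells you what was passed at boot; whether the running kernel honors the option depends on the kernel version.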
A very good hint, we’ll investigate, thanks!