Hello guys, excuse me for dropping in, but I cannot ignore that all of this sounds 99%+ the same as the issue I have been going nuts over for the past two months, ever since I switched kernels from version 3 to 4. Please look at the topic `Caching/buffers become useless after some time`.

What I did not mention there is that cgroups are also mounted and used, though not actively, since I have a scripting issue with setting them up correctly. There is, however, live data in /sys/fs/cgroup/memory/memory.stat, so it might be related to cgroups after all - I had not thought of that until now. It is the same story here as well: echoing 2 into drop_caches solves the issue temporarily, for maybe 2-4 days with lots of I/O.

I can, however, test and play around with cgroups - if anyone wants to suggest disabling them, I would gladly monitor the behaviour (please tell me what to do and how, if necessary). I am also curious: could you disable cgroups as well, just to see whether it helps and whether the problem really is associated with cgroups?

My sysctl settings regarding vm are:

vm.dirty_ratio = 15
vm.dirty_background_ratio = 3
vm.vfs_cache_pressure = 1

I would say (though I am not certain) that the issue has become less significant since I lowered these values; previously I had 90/80 for dirty_ratio and dirty_background_ratio, and I am no longer sure about the cache pressure. Still, there is lots of RAM unallocated - usually at least half, mostly even more, totally unused - and the hosts have 64GB of RAM as well.

I hope this is related, so we can work together on pinpointing it; the issue is not going away for me and causes a lot of headaches, slowing down my entire business.

2018-07-24 12:05 GMT+02:00 Bruce Merry :
> On 18 July 2018 at 19:40, Bruce Merry wrote:
> >> Yes, very easy to produce zombies, though I don't think kernel
> >> provides any way to tell how many zombies exist on the system.
> >>
> >> To create a zombie, first create a memcg node, enter that memcg,
> >> create a tmpfs file of few KiBs, exit the memcg and rmdir the memcg.
> >> That memcg will be a zombie until you delete that tmpfs file.
> >
> > Thanks, that makes sense. I'll see if I can reproduce the issue.
>
> Hi
>
> I've had some time to experiment with this issue, and I've now got a
> way to reproduce it fairly reliably, including with a stock 4.17.8
> kernel. However, it's very phase-of-the-moon stuff, and even
> apparently trivial changes (like switching the order in which the
> files are statted) make the issue disappear.
>
> To reproduce:
> 1. Start cadvisor running. I use the 0.30.2 binary from Github, and
> run it with sudo ./cadvisor-0.30.2 --logtostderr=true
> 2. Run the Python 3 script below, which repeatedly creates a cgroup,
> enters it, stats some files in it, and leaves it again (and removes
> it). It takes a few minutes to run.
> 3. time cat /sys/fs/cgroup/memory/memory.stat. It now takes about 20ms
> for me.
> 4. sudo sysctl vm.drop_caches=2
> 5. time cat /sys/fs/cgroup/memory/memory.stat. It is back to 1-2ms.
>
> I've also added some code to memcg_stat_show to report the number of
> cgroups in the hierarchy (iterations in for_each_mem_cgroup_tree).
> Running the script increases it from ~700 to ~41000. The script
> iterates 250,000 times, so only some fraction of the cgroups become
> zombies.
>
> I also tried the suggestion of force_empty: it makes the problem go
> away, but is also very, very slow (about 0.5s per iteration), and
> given the sensitivity of the test to small changes I don't know how
> meaningful that is.
>
> Reproduction code (if you have tqdm installed you get a nice progress
> bar, but not required). Hopefully Gmail doesn't do any format
> mangling:
>
>
> #!/usr/bin/env python3
> import os
>
> try:
>     from tqdm import trange as range
> except ImportError:
>     pass
>
>
> def clean():
>     try:
>         os.rmdir(name)
>     except FileNotFoundError:
>         pass
>
>
> def move_to(cgroup):
>     with open(cgroup + '/tasks', 'w') as f:
>         print(pid, file=f)
>
>
> pid = os.getpid()
> os.chdir('/sys/fs/cgroup/memory')
> name = 'dummy'
> N = 250000
> clean()
> try:
>     for i in range(N):
>         os.mkdir(name)
>         move_to(name)
>         for filename in ['memory.stat', 'memory.swappiness']:
>             os.stat(os.path.join(name, filename))
>         move_to('user.slice')
>         os.rmdir(name)
> finally:
>     move_to('user.slice')
>     clean()
>
>
> Regards
> Bruce
> --
> Bruce Merry
> Senior Science Processing Developer
> SKA South Africa
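
P.S. For anyone who wants to see a single zombie memcg directly, here is a
minimal sketch in the spirit of the recipe quoted above (create a memcg,
enter it, create a small tmpfs file, leave it again, rmdir it). It is only
an illustration, and the assumptions are mine rather than from the thread:
a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory, /dev/shm
being tmpfs, a user.slice cgroup to move back into (as in Bruce's script),
root privileges, and the placeholder name 'zombie-demo'.

#!/usr/bin/env python3
import os

CG_ROOT = '/sys/fs/cgroup/memory'              # assumed cgroup v1 mount point
ZOMBIE = os.path.join(CG_ROOT, 'zombie-demo')  # placeholder cgroup name
TMPFILE = '/dev/shm/zombie-demo'               # /dev/shm is tmpfs on most distros

pid = os.getpid()

# Create the memcg node and enter it.
os.mkdir(ZOMBIE)
with open(os.path.join(ZOMBIE, 'tasks'), 'w') as f:
    print(pid, file=f)

# Write a few KiB to tmpfs; those pages are charged to the new memcg.
with open(TMPFILE, 'wb') as f:
    f.write(b'x' * 4096)

# Leave the memcg again and remove its directory.
with open(os.path.join(CG_ROOT, 'user.slice', 'tasks'), 'w') as f:
    print(pid, file=f)
os.rmdir(ZOMBIE)

# The directory is gone, but the memcg lingers as a zombie because the
# tmpfs pages are still charged to it.
print('zombie created; delete', TMPFILE, 'to let it go away')

If I understand the quoted recipe correctly, the extra dead cgroup should
then show up in the for_each_mem_cgroup_tree count mentioned above (if you
carry a counter like Bruce's), and deleting the tmpfs file or dropping
caches should let it be reclaimed again.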