* Re: [Bug 200627] New: Stutters and high kernel CPU usage from list_lru_count_one when cache fills memory
       [not found] <bug-200627-27@https.bugzilla.kernel.org/>
@ 2018-07-22 23:40 ` Andrew Morton
  2018-07-22 23:44   ` Kevin Liu
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2018-07-22 23:40 UTC (permalink / raw)
  To: kevin; +Cc: bugzilla-daemon, linux-mm

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 22 Jul 2018 23:33:57 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=200627
>
>             Bug ID: 200627
>            Summary: Stutters and high kernel CPU usage from
>                     list_lru_count_one when cache fills memory

Thanks. Please do note the above request.

>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.18-rc4, 4.16
>           Hardware: x86-64
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: akpm@linux-foundation.org
>           Reporter: kevin@potatofrom.space
>         Regression: No
>
> I've recently noticed stuttering and general sluggishness, in Xorg, Firefox,
> and other graphical applications, when the memory becomes completely filled
> with cache. In `htop`, the stuttering manifests as all CPU cores at 100%
> usage, mostly in kernel mode.

How recently? Were earlier kernels better behaved?

> Doing a `perf top` shows that `list_lru_count_one` causes a lot of overhead:
>
> ```
> Overhead  Shared Object    Symbol
>   18.38%  [kernel]         [k] list_lru_count_one
>    4.90%  [kernel]         [k] nmi
>    3.27%  [kernel]         [k] read_hpet
>    2.66%  [kernel]         [k] super_cache_count
>    1.84%  [kernel]         [k] shrink_slab.part.52
>    1.63%  [kernel]         [k] shmem_unused_huge_count
>    1.19%  restic           [.] 0x00000000002e696c
>    0.98%  restic           [.] 0x00000000002e6a2f
>    0.81%  restic           [.] 0x00000000002e699b
>    0.80%  restic           [.] 0x00000000002e69b6
>    0.79%  restic           [.] 0x00000000002e697d
>    0.74%  .perf-wrapped    [.] rb_next
>    0.62%  [kernel]         [k] _aesni_dec4
>    0.57%  restic           [.] 0x00000000002e6a18
>    0.56%  [kernel]         [k] aesni_xts_crypt8
>    0.51%  restic           [.] 0x000000000005676a
>    0.50%  restic           [.] 0x00000000002e69de
>    0.50%  restic           [.] 0x00000000002e69f1
>    0.43%  restic           [.] 0x00000000002e6a10
>    0.43%  restic           [.] 0x00000000002e69c9
>    0.41%  .perf-wrapped    [.] hpp__sort_overhead
>    0.41%  restic           [.] 0x00000000002e6996
>    0.40%  [kernel]         [k] update_blocked_averages
>    0.38%  restic           [.] 0x00000000002e6a05
>    0.38%  [kernel]         [k] __indirect_thunk_start
>    0.37%  [kernel]         [k] copy_user_enhanced_fast_string
>    0.35%  rclone           [.] crypto/md5.block
> ```
>
> I've seen it hit up to 25% overhead, while normally (when the cache hasn't
> filled up) it only has ~4% overhead. I believe that this is the cause of the
> stutter.
>
> I've kludged together a workaround, as running `echo 3 >
> /proc/sys/vm/drop_caches` every minute keeps the cache from filling up and
> the system responsive, but I was wondering if this was a potential issue in
> the kernel.
>
> More details on my workload:
>
> - Running Docker containers connected via NFS to disk; this computer serves
>   ~20 NFSv4.2 shares, though most of them have fairly light IO.
> - Running a restic backup with rclone, which requires significant CPU usage
>   and does a lot of disk-waiting on hard drives. (It doesn't impact
>   responsiveness when the cache isn't full, though.)
>
> System:
>
> - Linux 4.18-rc4, NixOS unstable
> - Intel i7-4820k
> - 20 GB RAM
> - AMD RX 580
>
> Let me know if there are any more details I can provide or any tests I can run.

^ permalink raw reply	[flat|nested] 4+ messages in thread
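A minimal sketch of the periodic drop_caches stopgap described in the report above, assuming it is run as root from a shell; only the `echo 3 > /proc/sys/vm/drop_caches` step and the one-minute interval come from the report, the sync call and loop wrapper are assumptions. Note that drop_caches only discards clean, reclaimable cache, so this trades cache hit rate for responsiveness rather than addressing the shrinker cost itself.

```sh
#!/bin/sh
# Stopgap only: periodically drop clean page cache, dentries and inodes so
# the cache never fills memory, as described in the report above.
while true; do
    sync                                  # flush dirty data so more cache is actually reclaimable
    echo 3 > /proc/sys/vm/drop_caches     # 3 = page cache + dentries and inodes
    sleep 60
done
```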
* Re: [Bug 200627] New: Stutters and high kernel CPU usage from list_lru_count_one when cache fills memory
  2018-07-22 23:40 ` [Bug 200627] New: Stutters and high kernel CPU usage from list_lru_count_one when cache fills memory Andrew Morton
@ 2018-07-22 23:44   ` Kevin Liu
  2018-07-23  0:02     ` Kevin Liu
  0 siblings, 1 reply; 4+ messages in thread
From: Kevin Liu @ 2018-07-22 23:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bugzilla-daemon, linux-mm

[-- Attachment #1.1: Type: text/plain, Size: 358 bytes --]

> How recently? Were earlier kernels better behaved?

I've seen this issue both on Linux 4.16.15 (admittedly using the -ck
patchset) and on vanilla Linux 4.18-rc4 (which is what I'm currently using).

I'm fairly certain that it did not occur on Linux 4.14.50, which I used
previously, but I will boot back into it to double-check and let you know.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [Bug 200627] New: Stutters and high kernel CPU usage from list_lru_count_one when cache fills memory
  2018-07-22 23:44 ` Kevin Liu
@ 2018-07-23  0:02   ` Kevin Liu
  2018-07-23  1:52     ` Kevin Liu
  0 siblings, 1 reply; 4+ messages in thread
From: Kevin Liu @ 2018-07-23 0:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bugzilla-daemon, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1136 bytes --]

Sorry, not sure if the previous message registered on bugzilla due to the
pgp signature? Including it below.

On 07/22/2018 07:44 PM, Kevin Liu wrote:
>> How recently? Were earlier kernels better behaved?
> I've seen this issue both on Linux 4.16.15 (admittedly using the -ck
> patchset) and on vanilla Linux 4.18-rc4 (which is what I'm currently using).
>
> I'm fairly certain that it did not occur on Linux 4.14.50, which I used
> previously, but I will boot back into it to double-check and let you know.

And yes, booted back into Linux 4.14.54, there appears to be no issue --
list_lru_count_one reaches 6% overhead at most:

Overhead  Shared Object    Symbol
   5.91%  [kernel]         [k] list_lru_count_one
   5.13%  [kernel]         [k] nmi
   4.08%  [kernel]         [k] read_hpet
   1.26%  zma              [.] Zone::CheckAlarms
   1.16%  [kernel]         [k] _raw_spin_lock
   1.07%  restic           [.] 0x00000000002e696c
   1.06%  .perf-wrapped    [.] hpp__sort_overhead

[-- Attachment #2: Type: text/html, Size: 1944 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [Bug 200627] New: Stutters and high kernel CPU usage from list_lru_count_one when cache fills memory
  2018-07-23  0:02 ` Kevin Liu
@ 2018-07-23  1:52   ` Kevin Liu
  0 siblings, 0 replies; 4+ messages in thread
From: Kevin Liu @ 2018-07-23 1:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bugzilla-daemon, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1442 bytes --]

Correction - after using 4.14.53 for a while, I do actually see
list_lru_count_one at the top, but only at ~10% overhead. The responsiveness
is slightly degraded, but still better than how it was on 4.18-rc4.

On 07/22/2018 08:02 PM, Kevin Liu wrote:
> Sorry, not sure if the previous message registered on bugzilla due to
> the pgp signature? Including it below.
>
> On 07/22/2018 07:44 PM, Kevin Liu wrote:
>>> How recently? Were earlier kernels better behaved?
>> I've seen this issue both on Linux 4.16.15 (admittedly using the -ck
>> patchset) and on vanilla Linux 4.18-rc4 (which is what I'm currently using).
>>
>> I'm fairly certain that it did not occur on Linux 4.14.50, which I used
>> previously, but I will boot back into it to double-check and let you know.
>
> And yes, booted back into Linux 4.14.54, there appears to be no issue --
> list_lru_count_one reaches 6% overhead at most:
>
> Overhead  Shared Object    Symbol
>    5.91%  [kernel]         [k] list_lru_count_one
>    5.13%  [kernel]         [k] nmi
>    4.08%  [kernel]         [k] read_hpet
>    1.26%  zma              [.] Zone::CheckAlarms
>    1.16%  [kernel]         [k] _raw_spin_lock
>    1.07%  restic           [.] 0x00000000002e696c
>    1.06%  .perf-wrapped    [.] hpp__sort_overhead

[-- Attachment #2: Type: text/html, Size: 2231 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread
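For anyone wanting to reproduce the before/after comparison quoted in this thread, a minimal sketch of capturing equivalent system-wide profiles with perf; it assumes perf is installed for each kernel under test, and the 30-second window and output filenames are arbitrary choices, not taken from the thread.

```sh
#!/bin/sh
# Record a system-wide, call-graph profile for 30 seconds while the page
# cache is full, then print the hottest symbols so runs on different
# kernels (e.g. 4.14.x vs 4.18-rc4) can be compared side by side.
perf record -a -g -o "perf-$(uname -r).data" -- sleep 30
perf report -i "perf-$(uname -r).data" --stdio | head -n 40
```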
Thread overview: 4+ messages
[not found] <bug-200627-27@https.bugzilla.kernel.org/>
2018-07-22 23:40 ` [Bug 200627] New: Stutters and high kernel CPU usage from list_lru_count_one when cache fills memory Andrew Morton
2018-07-22 23:44 ` Kevin Liu
2018-07-23 0:02 ` Kevin Liu
2018-07-23 1:52 ` Kevin Liu