From: Johannes Weiner <hannes@cmpxchg.org>
To: Balbir Singh <bsingharora@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Tejun Heo <tj@kernel.org>,
surenb@google.com, Vinayak Menon <vinmenon@codeaurora.org>,
Christoph Lameter <cl@linux.com>, Mike Galbraith <efault@gmx.de>,
Shakeel Butt <shakeelb@google.com>, linux-mm <linux-mm@kvack.org>,
cgroups@vger.kernel.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
kernel-team@fb.com
Subject: Re: [PATCH 0/10] psi: pressure stall information for CPU, memory, and IO v2
Date: Tue, 24 Jul 2018 11:15:19 -0400
Message-ID: <20180724151519.GA11598@cmpxchg.org>
In-Reply-To: <CAKTCnzmt_CnfZMMdK9_-rBrL4kUmoE70nVbnE58CJp++FP0CCQ@mail.gmail.com>
Hi Balbir,
On Tue, Jul 24, 2018 at 07:14:02AM +1000, Balbir Singh wrote:
> Does the mechanism scale? I am a little concerned about how frequently
> this infrastructure is monitored/read/acted upon.
I expect most users to poll in the frequency ballpark of the running
averages (10s, 1m, 5m). Our OOMD defaults to 5s polling of the 10s
average; we collect the 1m average once per minute from our machines
and cgroups to log the system/workload health trends in our fleet.
Suren has been experimenting with adaptive polling down to the
millisecond range on Android.
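
As an illustration of polling at that cadence, here is a minimal sketch in Python. It assumes the interface as it later appeared in mainline kernels: a file such as /proc/pressure/memory with "some" and "full" lines carrying avg10, avg60, avg300 and total fields. The file layout in this v2 series may differ. The 20% threshold and the function names are hypothetical, and this is not a description of how OOMD works internally.

```python
import time


def parse_pressure(text):
    """Parse pressure text into {"some": {"avg10": 0.12, ...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Each field looks like "avg10=0.12"; total is kept as a float too.
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def should_act(text, threshold_pct=20.0):
    """Hypothetical policy: react when the 10s 'some' average exceeds the threshold."""
    return parse_pressure(text)["some"]["avg10"] > threshold_pct


def poll(path="/proc/pressure/memory", interval_s=5.0):
    """Read the 10s average every interval_s seconds (5s, as in the text above)."""
    while True:
        with open(path) as f:
            if should_act(f.read()):
                print("memory pressure above threshold")
        time.sleep(interval_s)
```

Calling poll() on a kernel that exposes the file would check the 10s average every five seconds; the parsing and policy functions can be exercised on a captured string without a PSI-enabled kernel.
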
> Why aren't existing mechanisms sufficient
Our existing stuff gives a lot of indication when something *may* be
an issue, like the rate of page reclaim, the number of refaults, the
average number of active processes, one task waiting on a resource.
But the real difference between an issue and a non-issue is how much
it affects your overall goal of making forward progress or reacting to
a request in time. And that's the only thing users really care
about. It doesn't matter whether my system is doing 2314 or 6723 page
refaults per minute, or scanned 8495 pages recently. I need to know
whether I'm losing 1% or 20% of my time on overcommitted memory.
Delayacct is time-based, so it's a step in the right direction, but it
doesn't aggregate tasks and CPUs into compound productivity states to
tell you if only parts of your workload are seeing delays (which is
often tolerable for the purpose of ensuring maximum HW utilization) or
your system overall is not making forward progress. That aggregation
isn't something you can do in userspace with polled delayacct data.
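
The distinction between "parts of the workload are delayed" and "nothing is making progress" can be sketched as below. This is a deliberately simplified model, not the accounting in the patches: it assumes system-wide samples of how many tasks are stalled on a resource and how many are productive, and it borrows the "some"/"full" names from the psi series. The function names and sample layout are hypothetical.

```python
def classify(stalled, productive):
    """Classify one interval by who is stalled and who is still productive."""
    if stalled == 0:
        return "none"
    # Someone is stalled; if nobody else is productive, no progress is made.
    return "some" if productive > 0 else "full"


def pressure_shares(samples):
    """samples: iterable of (duration_s, stalled_tasks, productive_tasks).

    Returns the percentage of elapsed time in which at least one task was
    stalled ("some") and in which all non-idle tasks were stalled ("full").
    """
    totals = {"none": 0.0, "some": 0.0, "full": 0.0}
    for duration, stalled, productive in samples:
        totals[classify(stalled, productive)] += duration
    elapsed = sum(totals.values())
    if elapsed == 0:
        return {"some": 0.0, "full": 0.0}
    # Time in which everyone is stalled also counts as time in which someone is.
    some_time = totals["some"] + totals["full"]
    return {
        "some": 100.0 * some_time / elapsed,
        "full": 100.0 * totals["full"] / elapsed,
    }
```

For example, eight seconds with no stalls, one second with one of four tasks stalled, and one second with all four stalled gives 20% "some" and 10% "full". Per-task delay totals alone cannot recover these numbers, because they do not record which delays overlapped in time.
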
> -- why is the avg delay calculation in the kernel?
For one, as per above, most users will probably be using the standard
averaging windows, and we already have this highly optimized
infrastructure from the load average. I don't see why we shouldn't use
that instead of exporting an obscure number that requires most users
to have an additional library or copy-paste the loadavg code.
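
For readers unfamiliar with the load average math, the following is a floating-point sketch of the same kind of exponentially decaying average. The kernel's loadavg code uses fixed-point arithmetic with precomputed decay constants; this version trades that for readability. The 2s update period is an assumption made for illustration, the 10s window comes from the text above, and the function names are hypothetical.

```python
import math


def decay_factor(period_s, window_s):
    """Weight kept by the old average after one update period."""
    return math.exp(-period_s / window_s)


def update_average(avg, sample, factor):
    """Blend the newest sample into the running average."""
    return avg * factor + sample * (1.0 - factor)


def run(samples, period_s=2.0, window_s=10.0, start=0.0):
    """Feed a sequence of per-period stall percentages through the average."""
    factor = decay_factor(period_s, window_s)
    avg = start
    for sample in samples:
        avg = update_average(avg, sample, factor)
    return avg
```

With a constant 20% stall rate the average starts near 3.6% after one period and converges to 20% over several windows, which is the familiar lag of the load average.
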
I also mentioned the OOM killer as a likely in-kernel user of the
pressure percentages to protect from memory livelocks out of the box,
in which case we have to do this calculation in the kernel anyway.
> There is no talk about the overhead this introduces in general, may be
> the details are in the patches. I'll read through them
I sent an email on benchmarks and overhead in one of the subthreads; I
will include that information in the cover letter in v3.
https://lore.kernel.org/lkml/20180718215644.GB2838@cmpxchg.org/
Thanks!