From: Suren Baghdasaryan <surenb@google.com>
To: Luigi Semenzato <semenzato@google.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: PSI vs. CPU overhead for client computing
Date: Tue, 23 Apr 2019 15:04:16 -0700	[thread overview]
Message-ID: <CAJuCfpHGcDM8c19g_AxWa4FSx++YbTSE70CGW4TiKvrdAg3R+w@mail.gmail.com> (raw)
In-Reply-To: <CAA25o9TV7B5Cej_=snuBcBnNFpfixBEQduTwQZOH0fh5iyXd=A@mail.gmail.com>

Hi Luigi,

On Tue, Apr 23, 2019 at 11:58 AM Luigi Semenzato <semenzato@google.com> wrote:
>
> I and others are working on improving system behavior under memory
> pressure on Chrome OS.  We use zram, which swaps to a
> statically-configured compressed RAM disk.  One challenge that we have
> is that the footprint of our workloads is highly variable.  With zram,
> we have to set the size of the swap partition at boot time.  When the
> (logical) swap partition is full, we're left with some amount of RAM
> usable by file and anonymous pages (we can ignore the rest).  We don't
> get to control this amount dynamically.  Thus if the workload fits
> nicely in it, everything works well.  If it doesn't, then the rate of
> anonymous page faults can be quite high, causing large CPU overhead
> for compression/decompression (as well as for other parts of the MM).
>
> In Chrome OS and Android, we have the luxury that we can reduce
> pressure by terminating processes (tab discard in Chrome OS, app kill
> in Android---which incidentally also runs in parallel with Chrome OS
> on some chromebooks).  To help decide when to reduce pressure, we
> would like to have a reliable and device-independent measure of MM CPU
> overhead.  I have looked into PSI and have a few questions.  I am also
> looking for alternative suggestions.
>
> PSI measures the times spent when some and all tasks are blocked by
> memory allocation.  In some experiments, this doesn't seem to
> correlate too well with CPU overhead (which instead correlates fairly
> well with page fault rates).  Could this be because it includes
> pressure from file page faults?

This might be caused by thrashing (see:
https://elixir.bootlin.com/linux/v5.1-rc6/source/mm/filemap.c#L1114).

>  Is there some way of interpreting PSI
> numbers so that the pressure from file pages is ignored?

I don't think so, but I might be wrong. Notice here
https://elixir.bootlin.com/linux/v5.1-rc6/source/mm/filemap.c#L1111
that you could probably use delayacct to distinguish file thrashing.
However, remember that PSI takes into account the number of CPUs and
the number of currently non-idle tasks in its pressure calculations,
so the raw delay numbers might not be very useful here.
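For completeness, the averaged pressure numbers PSI does export can be
read from /proc/pressure/memory (format per
Documentation/accounting/psi.txt in v5.1). A quick sketch of parsing
that format; the sample values below are made up, on a real system you
would read the file directly:

```python
# Parse PSI pressure lines of the form:
#   some avg10=X avg60=Y avg300=Z total=N
#   full avg10=X avg60=Y avg300=Z total=N
# 'total' is cumulative stall time in microseconds; the avg* fields
# are stall percentages over 10/60/300-second windows.

def parse_psi(text):
    """Return {'some': {...}, 'full': {...}} from PSI file contents."""
    out = {}
    for line in text.splitlines():
        kind, *fields = line.split()
        vals = {}
        for field in fields:
            key, _, val = field.partition('=')
            vals[key] = int(val) if key == 'total' else float(val)
        out[kind] = vals
    return out

# Hypothetical sample; real data comes from /proc/pressure/memory.
sample = ("some avg10=2.04 avg60=0.75 avg300=0.40 total=157622\n"
          "full avg10=0.16 avg60=0.08 avg300=0.04 total=21074\n")
print(parse_psi(sample)["some"]["avg10"])  # 2.04
```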

> What is the purpose of "some" and "full" in the PSI measurements?  The
> chrome browser is a multi-process app and there is a lot of IPC.  When
> process A is blocked on memory allocation, it cannot respond to IPC
> from process B, thus effectively both processes are blocked on
> allocation, but we don't see that.

I don't think PSI would account for such an indirect stall, where A is
waiting on B and B is blocked on memory access. B's stall will be
accounted for, but I don't think A's blocked time will go into the PSI
calculations. Process inter-dependencies are probably out of scope
for PSI.

> Also, there are situations in
> which some "uninteresting" process keep running.  So it's not clear we
> can rely on "full".  Or maybe I am misunderstanding?  "Some" may be a
> better measure, but again it doesn't measure indirect blockage.

Johannes explains the SOME and FULL calculations here:
https://elixir.bootlin.com/linux/v5.1-rc6/source/kernel/sched/psi.c#L76
and includes a couple of examples, the last of which shows FULL > 0
while some tasks are still running.
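To illustrate the single-CPU rules from that comment block (a
simplified sketch, not the kernel code): SOME applies when at least
one task is stalled on memory, FULL when tasks are stalled and none
are productively running. The timeline below is made up, and real PSI
additionally averages across CPUs and weights by non-idle time, which
this ignores:

```python
# Per-tick classification: given counts of runnable tasks and
# memory-stalled tasks on one CPU, decide SOME/FULL for that tick.
def classify(nr_running, nr_stalled):
    some = nr_stalled > 0
    full = some and nr_running == 0
    return some, full

# Hypothetical (running, stalled) counts at successive ticks.
timeline = [(1, 0), (1, 1), (0, 1), (0, 2), (2, 0)]
some_ticks = sum(classify(r, s)[0] for r, s in timeline)
full_ticks = sum(classify(r, s)[1] for r, s in timeline)
print(f"SOME {100 * some_ticks / len(timeline):.0f}%  "
      f"FULL {100 * full_ticks / len(timeline):.0f}%")  # SOME 60%  FULL 40%
```

Note tick 2 (one task running, one stalled): it counts toward SOME but
not FULL, which is why FULL alone can understate pressure when an
"uninteresting" task keeps running.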

> The kernel contains various cpustat measurements, including some
> slightly esoteric ones such as CPUTIME_GUEST and CPUTIME_GUEST_NICE.
> Would adding a CPUTIME_MEM be out of the question?
>
> Thanks!
>

Just my 2 cents; Johannes, being the author, might have more to say here.


Thread overview: 6+ messages
2019-04-23 18:57 Luigi Semenzato
2019-04-23 22:04 ` Suren Baghdasaryan [this message]
2019-04-24  4:54   ` Luigi Semenzato
2019-04-24 14:49     ` Suren Baghdasaryan
2019-04-25 17:31       ` Luigi Semenzato
2019-04-24 16:36   ` Johannes Weiner
