linux-mm.kvack.org archive mirror
From: Chris Murphy <lists@colorremedies.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org
Subject: Re: user space unresponsive, followup: lsf/mm congestion
Date: Wed, 8 Jan 2020 14:14:22 -0700
Message-ID: <CAJCQCtTAU78uECUN6Qt5c_SCUVBCRo8EQ34eVXwsS0M91NS43g@mail.gmail.com>
In-Reply-To: <20200108092501.GO32178@dhcp22.suse.cz>

On Wed, Jan 8, 2020 at 2:25 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 07-01-20 14:25:46, Chris Murphy wrote:
> > On Tue, Jan 7, 2020 at 1:58 PM Michal Hocko <mhocko@kernel.org> wrote:
> [...]
> > > Btw. from a quick look at the sysrq output there seems to be quite a lot
> > > of tasks (more than 1k) running on the system. Only handful of them
> > > belong to the compilation. kswapd is busy and 13 processes in direct
> > > reclaim all swapping out to the disk.
> >
> > There might be many dozens of tabs in Firefox with nothing loaded in
> > them, trying to keep the testing more real world (a compile while
> > browsing) rather than being too deferential to the compile. That does
> > clutter the sysrq+t but it doesn't change the outcome of the central
> > culprit which is the ninja compile, which by default does n+2 jobs
> > where n is the number of virtual CPUs.
>
> How much memory does the compile process eat?

By default it sets jobs to numcpus+2, which is 10 here. But each job
may have two processes, and each process's memory requirement varies
widely, from a few MB to over 1 GB. In the first 20 minutes, about
13000 processes have started and stopped.
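
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes 8 virtual CPUs (inferred from numcpus+2 = 10), two processes
per job, and a round 1 GiB peak per process; the function names are
made up for illustration, and the result is an upper bound, not a
measurement.

```python
GIB = 1024 ** 3


def default_jobs(ncpus: int) -> int:
    """Job count as described above: number of virtual CPUs plus two."""
    return ncpus + 2


def worst_case_bytes(jobs: int, procs_per_job: int, peak_per_proc: int) -> int:
    """Upper bound if every process of every job peaks at the same time."""
    return jobs * procs_per_job * peak_per_proc


jobs = default_jobs(8)  # assumption: 8 virtual CPUs
bound = worst_case_bytes(jobs, procs_per_job=2, peak_per_proc=1 * GIB)
print(f"{jobs} jobs, worst case about {bound / GIB:.0f} GiB")
```

Real usage sits well below this bound most of the time, since few
processes reach their peak together, but it shows how far the default
job count can overshoot RAM.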

I've updated the bug, attaching kernel messages and /proc/vmstat
sampled at 1s intervals, although quite often during the build
multiple seconds of samples were simply skipped because the system was
under too much pressure.
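
For anyone wanting to collect the same kind of data, a minimal sketch
of such a sampler follows. It is not the script used for the bug
attachment; the counter selection (pswpin, pswpout, pgmajfault) and all
function names are my own choices. It records the elapsed time per
sample, so skipped seconds show up as intervals well above 1s.

```python
import time


def parse_vmstat(text):
    """Parse 'name value' lines from /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def delta(prev, curr, keys):
    """Per-interval change for the selected cumulative counters."""
    return {k: curr.get(k, 0) - prev.get(k, 0) for k in keys}


def sample(count, interval=1.0, path="/proc/vmstat",
           keys=("pswpin", "pswpout", "pgmajfault")):
    """Yield (elapsed_seconds, deltas) for `count` consecutive intervals."""
    with open(path) as f:
        prev = parse_vmstat(f.read())
    last = time.monotonic()
    for _ in range(count):
        time.sleep(interval)
        with open(path) as f:
            curr = parse_vmstat(f.read())
        now = time.monotonic()
        # elapsed much larger than interval means the sampler was stalled
        yield now - last, delta(prev, curr, keys)
        prev, last = curr, now
```

Usage would be something like `for elapsed, d in sample(60): print(elapsed, d)`.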

> If you know that the compilation process is too disruptive wrt.
> memory/cpu consumption then you can use cgroups (memory and cpu
> controllers) to throttle that consumption and protect the rest of the
> system. The compilation process will take much more time of course and
> the explicit configuration is obviously less comfortable than out of the
> box auto configuration but the kernel simply doesn't have information to
> prioritize resources.

Yes, but this doesn't scale to regular users who just follow an
upstream's build instructions.
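
For reference, the explicit configuration being suggested could look
roughly like the sketch below, which wraps the build in a transient
systemd scope with memory and CPU limits. The limit values are
arbitrary placeholders, the helper name is hypothetical, and it assumes
cgroup v2 with the memory and cpu controllers delegated to the user
session. No upstream build instructions include anything like it.

```python
import shlex


def throttled_command(build_cmd, memory_max="8G", cpu_quota="400%"):
    """Build an argv that runs build_cmd in a resource-limited systemd scope.

    memory_max and cpu_quota are placeholder values, not recommendations.
    """
    return [
        "systemd-run", "--user", "--scope",
        "-p", f"MemoryMax={memory_max}",  # hard cap on the scope's memory
        "-p", f"CPUQuota={cpu_quota}",    # 400% is roughly four CPUs' worth
        "--",
    ] + list(build_cmd)


argv = throttled_command(["ninja"])
print(shlex.join(argv))
# To actually run it: subprocess.run(argv, check=True)
```

Choosing the limits still requires knowing the machine and the
workload, which is the part regular users can't be expected to do.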

> I do agree that the oom detection could be improved to detect a heavy
> threshing - be it on page cache or swapin/out - and kill something
> rather than leave the system struggling in a highly unproductive state.
> This is far from trivial because what is productive is not something
> kernel can tell easily as it depends on the workload. As mentioned
> elsewhere userspace is likely much better suited to define that policy
> and PSI seems to be a good indicator.

And even user space doesn't know in advance what resources are
required. The user can only guess that the job count was estimated
incorrectly, force a power off, and start over with a lower number of
jobs or whatever.

As for PSI, from the oomd folks it sounds like swap is a requirement.
And yet, because of the poor performance of swapping, quite a lot of
users don't have any swap. Swap is also only sometimes present in
server environments, and rarely in cloud environments. So if there's a
hard requirement on swap existing, PSI isn't a universal solution.
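
For context, the PSI signal in question is exposed in
/proc/pressure/memory on kernels built with CONFIG_PSI (4.20 and
later). Below is a minimal sketch of how a userspace policy might read
it. The 40% threshold and the function names are assumptions for
illustration only; this is not how oomd does it.

```python
def parse_pressure(text):
    """Parse PSI lines such as 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        metrics = {}
        for item in fields[1:]:
            key, _, value = item.partition("=")
            # total is cumulative stall time in microseconds, avg* are percentages
            metrics[key] = int(value) if key == "total" else float(value)
        result[fields[0]] = metrics
    return result


def looks_stalled(pressure, threshold=40.0):
    """True if all non-idle tasks were stalled on memory for at least
    `threshold` percent of the last 10 seconds (hypothetical cutoff)."""
    return pressure.get("full", {}).get("avg10", 0.0) >= threshold


def read_memory_pressure(path="/proc/pressure/memory"):
    with open(path) as f:
        return parse_pressure(f.read())
```

A monitor would poll `read_memory_pressure()` and act when
`looks_stalled()` returns True. Whether that signal is usable without
swap is exactly the open question.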

Thanks,

-- 
Chris Murphy
