linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shakeel Butt <shakeelb@google.com>
To: Tejun Heo <tj@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>, Li Zefan <lizefan@huawei.com>,
	Roman Gushchin <guro@fb.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Greg Thelen <gthelen@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Cgroups <cgroups@vger.kernel.org>,
	linux-doc@vger.kernel.org
Subject: Re: [RFC PATCH] mm: memcontrol: memory+swap accounting for cgroup-v2
Date: Thu, 21 Dec 2017 07:22:20 -0800	[thread overview]
Message-ID: <CALvZod432hzxPZgAypjPsZ33Z==0MxmMdPM3bEBZMea-7GFAVw@mail.gmail.com> (raw)
In-Reply-To: <20171221133726.GD1084507@devbig577.frc2.facebook.com>

On Thu, Dec 21, 2017 at 5:37 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Shakeel.
>
> On Wed, Dec 20, 2017 at 05:15:41PM -0800, Shakeel Butt wrote:
>> Let's say we have a job that allocates 100 MiB memory and suppose 80
>> MiB is anon and 20 MiB is non-anon (file & kmem).
>>
>> [With memsw] Scheduler sets the memsw limit of the job to 100 MiB and
>> memory to max. Now suppose the job tries to allocates memory more than
>> 100 MiB, it will hit the memsw limit and will try to reclaim non-anon
>> memory. The memcg OOM behavior will only depend on the reclaim of
>> non-anon memory and will be independent of the underlying swap device.
>
> Sure, the direct reclaim on memsw limit won't reclaim anon pages, but
> think about how the state at that point would have formed.  You're
> claiming that memsw makes memory allocation and balancing behavior an
> invariant against the performance of the swap device that the machine
> has.  It's simply not possible.
>

I am claiming memory allocations under global pressure will be
affected by the performance of the underlying swap device. However
memory allocations under memcg memory pressure, with memsw, will not
be affected by the performance of the underlying swap device. A job
having 100 MiB limit running on a machine without global memory
pressure will never see swap on hitting 100 MiB memsw limit.

> On top of that, what's the point?
>
> 1. As I wrote earlier, given the current OOM killer implementation,
>    whether OOM kicks in or not is not even that relevant in
>    determining the health of the workload.  There are frequent failure
>    modes where OOM killer fails to kick in while the workload isn't
>    making any meaningful forward progress.
>

Deterministic oom-killer is not the point. The point is to
"consistently limit the anon memory" allocated by the job which only
memsw can provide. A job owner who has requested 100 MiB for a job
sees some instances of the job suffer at 100 MiB and other instances
suffer at 150 MiB, is an inconsistent behavior.

> 2. On hitting memsw limit, the OOM decision is dependent on the
>    performance of the file backing devices.  Why is that necessarily
>    better than being dependent on swap or both, which would increase
>    the reclaim efficiency anyway?  You can't avoid being affected by
>    the underlying hardware one way or the other.
>

This is a separate discussion but still the amount of file backed
pages is known and controlled by the job owner and they have the
option to use a storage service, providing a consistent performance
across different data centers, instead of the physical disks of the
system where the job is running and thus isolating the job's
performance from the speed of the local disk. This is not possible
with swap. The swap (and its performance) is and should be transparent
to the job owners.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-12-21 15:22 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-19  0:01 Shakeel Butt
2017-12-19 12:49 ` Michal Hocko
2017-12-19 15:12   ` Shakeel Butt
2017-12-19 15:24     ` Tejun Heo
2017-12-19 17:23       ` Shakeel Butt
2017-12-19 17:33         ` Tejun Heo
2017-12-19 18:25           ` Shakeel Butt
2017-12-19 21:41             ` Tejun Heo
2017-12-19 22:39               ` Shakeel Butt
2017-12-20 19:37                 ` Tejun Heo
2017-12-20 20:15                   ` Shakeel Butt
2017-12-20 20:27                     ` Shakeel Butt
2017-12-20 23:36                     ` Tejun Heo
2017-12-21  1:15                       ` Shakeel Butt
2017-12-21 13:37                         ` Tejun Heo
2017-12-21 15:22                           ` Shakeel Butt [this message]
2017-12-21 15:33                             ` Shakeel Butt
2017-12-21 17:29                             ` Tejun Heo
2017-12-27 19:49                               ` Shakeel Butt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALvZod432hzxPZgAypjPsZ33Z==0MxmMdPM3bEBZMea-7GFAVw@mail.gmail.com' \
    --to=shakeelb@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=gthelen@google.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan@huawei.com \
    --cc=mhocko@kernel.org \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox