linux-mm.kvack.org archive mirror
From: Nico Pache <npache@redhat.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, aquini@redhat.com,
	shakeelb@google.com, llong@redhat.com, mhocko@suse.com,
	hakavlad@inbox.lv
Subject: Re: [PATCH v3] vm_swappiness=0 should still try to avoid swapping anon memory
Date: Wed, 20 Apr 2022 13:34:58 -0400
Message-ID: <bc9f5209-5c59-c921-d85e-e2e54b2375db@redhat.com>
In-Reply-To: <YmASIHjTVndHHoL4@cmpxchg.org>



On 4/20/22 10:01, Johannes Weiner wrote:
>> My swappiness=0 solution was a minimal approach to regaining the 'avoid swapping
>> ANON' behavior that was previously there, but as Shakeel pointed out, there may
>> be something larger at play.
> 
> So with my patch and swappiness=0 you get excessive swapping on v1 but
> not on v2? And the patch to avoid DEACTIVATE_ANON fixes it?

Correct. I haven't retested the DEACTIVATE_ANON patch since I last worked on
this, but it did cure it. I can build a new kernel with it and verify again.

The larger issue is that our workload has regressed in performance.

With V2 and swappiness=10 we are still seeing some swap, but very little tearing
down of THPs over time. With swappiness=0 it did swap some, but we are not
losing GBs of THPs (with your patch, swappiness=0 has no swap or THP issues on V2).

With V1 and swappiness=(0|10) (with and without your patch), it swaps a ton and
ultimately leads to a significant amount of THP splitting. So the longer the
system/workload runs, the less likely we are to get THPs backing the guest and
the performance gain from THPs is lost.

So your patch does help return the old swappiness=0 behavior, but only for V2.

Ideally we would like to keep swappiness>0. I found that with my patch and
swappiness=0 we get a workaround for this effect on V1, but any other value
still results in the THP issue.


After the workload is run with V2 and swappiness=0 the host system looks like this**:
               total        used        free      shared  buff/cache   available
Mem:       264071432   257536896      927424        4664     5607112     4993184
Swap:        4194300           0     4194300

Node 0 AnonPages:      128145476 kB	Node 1 AnonPages:      128111908 kB
Node 0 AnonHugePages:  128026624 kB	Node 1 AnonHugePages:  128090112 kB

** without your patch there is still some swap and THP splitting but nothing
like the case below.

Same workload on V1/swappiness=0 looks like this:
               total        used        free      shared  buff/cache   available
Mem:       264071432   257169500     1032612        4192     5869320     5357944
Swap:        4194300      623008     3571292

Node 0 AnonPages:      127927156 kB     Node 1 AnonPages:      127701088 kB
Node 0 AnonHugePages:  127789056 kB     Node 1 AnonHugePages:  87552000 kB
                                                               ^^^^^^^^

This leads to the performance regression I'm referring to in later workloads.
V2 used to have a similar effect to V1, but not nearly as bad. Recent updates
upstream fixed this in V2.
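
To put a number on the THP loss, the per-node figures above can be reduced to
the fraction of anonymous memory that is still THP-backed. The following is
only a minimal sketch: it assumes the "Node N Field: value kB" layout shown
above (the format of /sys/devices/system/node/nodeN/meminfo), and the helper
name thp_fraction() is made up for illustration.

```python
import re

# Matches entries such as "Node 1 AnonHugePages:  87552000 kB".
# Assumption: input uses the per-node meminfo layout quoted in this mail.
NODE_FIELD = re.compile(r"Node\s+(\d+)\s+(\w+):\s+(\d+)\s+kB")


def thp_fraction(text):
    """Return {node_id: AnonHugePages / AnonPages} for every node in text."""
    fields = {}
    for node, name, kb in NODE_FIELD.findall(text):
        fields.setdefault(int(node), {})[name] = int(kb)
    return {
        node: vals["AnonHugePages"] / vals["AnonPages"]
        for node, vals in fields.items()
        if vals.get("AnonPages") and "AnonHugePages" in vals
    }


# Snapshots copied from the two runs reported in this mail.
V2_SAMPLE = """
Node 0 AnonPages:      128145476 kB	Node 1 AnonPages:      128111908 kB
Node 0 AnonHugePages:  128026624 kB	Node 1 AnonHugePages:  128090112 kB
"""

V1_SAMPLE = """
Node 0 AnonPages:      127927156 kB     Node 1 AnonPages:      127701088 kB
Node 0 AnonHugePages:  127789056 kB     Node 1 AnonHugePages:  87552000 kB
"""

if __name__ == "__main__":
    for label, sample in (("V2", V2_SAMPLE), ("V1", V1_SAMPLE)):
        for node, frac in sorted(thp_fraction(sample).items()):
            print(f"{label} node {node}: {frac:.1%} of anon memory is THP-backed")
```

On the snapshots above this gives more than 99% on both nodes for V2 and on
node 0 for V1, but only about 68.6% on node 1 for V1.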

The workload tests multiple FS types, so this is most likely not an FS-specific
issue either.

> If you haven't done so, it could be useful to litter shrink_node() and
> get_scan_count() with trace_printk() to try to make sense of all the
> decisions that result in it swapping.
Will do :) I was originally doing some BPF tracing that led me to find the
DEACTIVATE_ANON case.

Thanks,
-- Nico


