linux-mm.kvack.org archive mirror
From: "Bruno Prémont" <bonbons@linux-vserver.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Chris Down <chris@chrisdown.name>
Subject: Re: Memory CG and 5.1 to 5.6 upgrade slows backup
Date: Thu, 9 Apr 2020 12:17:33 +0200
Message-ID: <20200409121733.1a5ba17c@hemera.lan.sysophe.eu>
In-Reply-To: <20200409094615.GE18386@dhcp22.suse.cz>

On Thu, 9 Apr 2020 11:46:15 Michal Hocko <mhocko@kernel.org> wrote:
> [Cc Chris]
> 
> On Thu 09-04-20 11:25:05, Bruno Prémont wrote:
> > Hi,
> > 
> > After upgrading from a 5.1 kernel to a 5.6 kernel on a production
> > system that uses cgroups (v2), the backup process, which runs in a
> > memory.high=2G cgroup, is heavily throttled (there is about 1.5T to
> > back up).
> 
> What does /proc/sys/vm/dirty_* say?

/proc/sys/vm/dirty_background_bytes:0
/proc/sys/vm/dirty_background_ratio:10
/proc/sys/vm/dirty_bytes:0
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:20
/proc/sys/vm/dirty_writeback_centisecs:500
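
For anyone who wants to collect the same listing programmatically, here
is a minimal Python sketch. The helper name read_dirty_sysctls and its
base parameter are made up for illustration; only the
/proc/sys/vm/dirty_* files themselves come from this thread. Note that
dirty_bytes and dirty_background_bytes are both 0 above, so the two
ratio knobs are the ones in effect.

```python
from pathlib import Path


def read_dirty_sysctls(base="/proc/sys/vm"):
    """Return {file name: integer value} for every dirty_* file under base.

    'base' is a hypothetical parameter so the helper can be pointed at a
    saved copy of the directory instead of the live /proc tree.
    """
    values = {}
    for path in sorted(Path(base).glob("dirty_*")):
        values[path.name] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    # Print in the same "path:value" form as the listing above.
    for name, value in read_dirty_sysctls().items():
        print(f"/proc/sys/vm/{name}:{value}")
```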

Captured after restarting the backup task.
After the restart the cgroup has free memory again and things run at
normal speed (until the cgroup's memory gets "full" again).
Current cgroup stats while things run smoothly:

anon 176128
file 633012224
kernel_stack 73728
slab 47173632
sock 364544
shmem 0
file_mapped 10678272
file_dirty 811008
file_writeback 405504
anon_thp 0
inactive_anon 0
active_anon 0
inactive_file 552849408
active_file 79360000
unevictable 0
slab_reclaimable 46411776
slab_unreclaimable 761856
pgfault 8656857
pgmajfault 2145
workingset_refault 8672334
workingset_activate 410586
workingset_nodereclaim 92895
pgrefill 1516540
pgscan 48241750
pgsteal 45655752
pgactivate 7986
pgdeactivate 1483626
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 0
thp_collapse_alloc 0
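
To compare snapshots like the one above, the "key value" layout is easy
to parse. The following is a rough sketch only: parse_flat_keyed is a
hypothetical helper name, and reading pgsteal/pgscan as a "reclaim
efficiency" figure is my own interpretation, not something the kernel
reports. Both counters are cumulative, so the ratio covers the whole
lifetime of the cgroup rather than the current moment.

```python
def parse_flat_keyed(text):
    """Parse 'key value' lines (the memory.stat layout) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


# Excerpt of the stats listed above, values copied verbatim.
SAMPLE = """\
file 633012224
inactive_file 552849408
active_file 79360000
pgscan 48241750
pgsteal 45655752
"""

stats = parse_flat_keyed(SAMPLE)

# Share of scanned pages that were actually reclaimed (cumulative).
reclaim_efficiency = stats["pgsteal"] / stats["pgscan"]

# Share of the file cache that sits on the inactive list.
inactive_share = stats["inactive_file"] / stats["file"]

print(f"pgsteal/pgscan     = {reclaim_efficiency:.3f}")
print(f"inactive_file/file = {inactive_share:.3f}")
```

Taking the difference of two such dicts captured a few seconds apart
would show the current rate instead of the lifetime totals.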


> Is it possible that the reclaim is not making progress on too many
> dirty pages and that triggers the back off mechanism that has been
> implemented recently in 5.4 (have a look at 0e4b01df8659 ("mm,
> memcg: throttle allocators when failing reclaim over memory.high")
> and e26733e0d0ec ("mm, memcg: throttle allocators based on
> ancestral memory.high"))?

Could be, though in that case it is throttling the wrong task/cgroup
as far as I can see (at least judging from the cgroup's memory stats),
or it is being blocked by state external to the cgroup.
I will have a look at those patches to get a better idea of what they
change.

System-wide, at least 10G of the 64G is completely free (it varies
between 10G and 20G free); the rest is ~18G file cache, ~10G
reclaimable slabs, ~5G unreclaimable slabs and 7G otherwise in use.

> Keeping the rest of the email for reference.
> 
> > Most memory usage in that cgroup is for file cache.
> > 
> > Here are the memory details for the cgroup:
> > memory.current:2147225600
> > memory.events:low 0
> > memory.events:high 423774
> > memory.events:max 31131
> > memory.events:oom 0
> > memory.events:oom_kill 0
> > memory.events.local:low 0
> > memory.events.local:high 423774
> > memory.events.local:max 31131
> > memory.events.local:oom 0
> > memory.events.local:oom_kill 0
> > memory.high:2147483648
> > memory.low:33554432
> > memory.max:2415919104
> > memory.min:0
> > memory.oom.group:0
> > memory.pressure:some avg10=90.42 avg60=72.59 avg300=78.30 total=298252577711
> > memory.pressure:full avg10=90.32 avg60=72.53 avg300=78.24 total=295658626500
> > memory.stat:anon 10887168
> > memory.stat:file 2062102528
> > memory.stat:kernel_stack 73728
> > memory.stat:slab 76148736
> > memory.stat:sock 360448
> > memory.stat:shmem 0
> > memory.stat:file_mapped 12029952
> > memory.stat:file_dirty 946176
> > memory.stat:file_writeback 405504
> > memory.stat:anon_thp 0
> > memory.stat:inactive_anon 0
> > memory.stat:active_anon 10121216
> > memory.stat:inactive_file 1954959360
> > memory.stat:active_file 106418176
> > memory.stat:unevictable 0
> > memory.stat:slab_reclaimable 75247616
> > memory.stat:slab_unreclaimable 901120
> > memory.stat:pgfault 8651676
> > memory.stat:pgmajfault 2013
> > memory.stat:workingset_refault 8670651
> > memory.stat:workingset_activate 409200
> > memory.stat:workingset_nodereclaim 62040
> > memory.stat:pgrefill 1513537
> > memory.stat:pgscan 47519855
> > memory.stat:pgsteal 44933838
> > memory.stat:pgactivate 7986
> > memory.stat:pgdeactivate 1480623
> > memory.stat:pglazyfree 0
> > memory.stat:pglazyfreed 0
> > memory.stat:thp_fault_alloc 0
> > memory.stat:thp_collapse_alloc 0
> > 
> > The numbers that change most are pgscan/pgsteal.
> > The backup process regularly seems to be blocked for about 2s, but
> > not within a syscall according to strace.
> > 
> > Is there a way to tell the kernel that this cgroup should not be
> > throttled and that its inactive file cache should be given up
> > (rather quickly) instead?
> > 
> > The aim here is to keep the backup from evicting the production
> > tasks' file cache, but without starving the backup.
> > 
> > 
> > If some useful info is missing, please tell me (if possible also
> > how I can obtain it).
> > 
> > 
> > On a side note, I liked v1's mode of soft/hard memory limit where the
> > memory amount between soft and hard could be used if system has enough
> > free memory. For v2 the difference between high and max seems almost of
> > no use.
> > 
> > A cgroup parameter for impacting RO file cache differently than
> > anonymous memory or otherwise dirty memory would be great too.
> > 
> > 
> > Thanks,
> > Bruno
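
For reference, the headroom implied by the figures quoted above can be
worked out with a short Python sketch. Assumptions on my side: a 4 KiB
page size, and the helper name parse_psi_line, which is hypothetical.
The byte values and the memory.pressure line are copied from the quoted
listing.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_psi_line(line):
    """Split 'full avg10=.. avg60=.. avg300=.. total=..' into (kind, values)."""
    kind, *fields = line.split()
    values = {}
    for field in fields:
        key, _, raw = field.partition("=")
        # 'total' is an integer count of microseconds, the averages are percentages.
        values[key] = int(raw) if key == "total" else float(raw)
    return kind, values


# Values from the quoted memory.current / memory.high / memory.max lines.
memory_current = 2147225600
memory_high = 2147483648
memory_max = 2415919104

headroom = memory_high - memory_current        # bytes left below memory.high
headroom_pages = headroom // PAGE_SIZE         # same, in pages
high_to_max = memory_max - memory_high         # gap between high and max

kind, psi = parse_psi_line(
    "full avg10=90.32 avg60=72.53 avg300=78.24 total=295658626500"
)

print(f"headroom below memory.high: {headroom} bytes ({headroom_pages} pages)")
print(f"memory.max - memory.high:   {high_to_max // (1024 * 1024)} MiB")
print(f"PSI {kind} avg10:            {psi['avg10']}%")
```

In other words the cgroup sits a few dozen pages below memory.high while
the "full" pressure average is around 90%.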

