From: Mateusz Guzik <mjguzik@gmail.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: "zhangpeng (AS)" <zhangpeng362@huawei.com>,
Rongwei Wang <rongwei.wrw@gmail.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
dennisszhou@gmail.com, shakeelb@google.com, jack@suse.cz,
Suren Baghdasaryan <surenb@google.com>,
kent.overstreet@linux.dev, mhocko@suse.cz, vbabka@suse.cz,
Yu Zhao <yuzhao@google.com>,
yu.ma@intel.com, wangkefeng.wang@huawei.com,
sunnanyong@huawei.com
Subject: Re: [RFC PATCH v2 2/2] mm: convert mm's rss stats to use atomic mode
Date: Fri, 17 May 2024 20:08:41 +0200
Message-ID: <iwlpzi4qnpqri6wegibnsvth4yfdszksfvfyiei3qb3a4serbv@zrw3zsp55zoh>
In-Reply-To: <CAMgjq7DgE6NZPR8Sf2nq3vpVG8ZoC03e8aXi-QKbiievi3BB_g@mail.gmail.com>

On Fri, May 17, 2024 at 11:29:57AM +0800, Kairui Song wrote:
> On Thu, May 16, 2024 at 23:14, Mateusz Guzik <mjguzik@gmail.com> wrote:
> > A part of The Real Solution(tm) would make counter allocations scale
> > (including mcid, not just rss) or dodge them (while maintaining the
> > per-cpu distribution, see below for one idea), but that boils down to
> > balancing scalability versus total memory usage. It is trivial to just
> > slap together a per-cpu cache of these allocations and have the problem
> > go away for benchmarking purposes, while probably being too memory
> > hungry for actual usage.
> >
> > I was pondering an allocator with caches per some number of cores (say 4
> > or 8). Microbenchmarks aside I suspect real workloads would not suffer
> > from contention at this kind of granularity. This would trivially reduce
> > memory usage compared to per-cpu caching. I suspect things like
> > mm_struct, task_struct, task stacks and similar would be fine with it.
> >
> > Suppose mm_struct is allocated from a more coarse grained allocator than
> > per-cpu. Total number of cached objects would be lower than it is now.
> > That would also mean these allocated but not currently used mms could
> > hold on to other stuff, for example per-cpu rss and mcid counters. Then
> > should someone fork or exit, alloc/free_percpu would be avoided for most
> > cases. This would scale better and be faster single-threaded than the
> > current state.
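
To make that more concrete, here is a minimal sketch of such a
coarse-grained cache, with one cache shared per group of CPUs. This is
not from any patchset; all names and sizes are made up for
illustration:

#include <linux/cache.h>
#include <linux/slab.h>
#include <linux/smp.h>
#include <linux/spinlock.h>

/* Hypothetical: one object cache shared by every GROUP_SIZE CPUs. */
#define GROUP_SIZE	8
#define CACHE_DEPTH	16

struct group_cache {
	spinlock_t	lock;
	unsigned int	nr;
	void		*objs[CACHE_DEPTH];
} ____cacheline_aligned_in_smp;

/* nr_cpu_ids / GROUP_SIZE entries, allocated and lock-initialized at boot. */
static struct group_cache *group_caches;

static struct group_cache *this_group_cache(void)
{
	/* Migration after the read is harmless; the lock provides correctness. */
	return &group_caches[raw_smp_processor_id() / GROUP_SIZE];
}

static void *group_cache_alloc(struct kmem_cache *s, gfp_t gfp)
{
	struct group_cache *gc = this_group_cache();
	void *obj = NULL;

	spin_lock(&gc->lock);
	if (gc->nr)
		obj = gc->objs[--gc->nr];
	spin_unlock(&gc->lock);

	/* Miss: fall back to the slab allocator. */
	return obj ? obj : kmem_cache_alloc(s, gfp);
}

static void group_cache_free(struct kmem_cache *s, void *obj)
{
	struct group_cache *gc = this_group_cache();

	spin_lock(&gc->lock);
	if (gc->nr < CACHE_DEPTH) {
		gc->objs[gc->nr++] = obj;
		obj = NULL;
	}
	spin_unlock(&gc->lock);

	/* Cache full: give the object back to slab. */
	if (obj)
		kmem_cache_free(s, obj);
}

The point being that an mm_struct parked in such a cache could keep its
per-cpu rss and mcid allocations attached, letting most forks and exits
skip alloc_percpu/free_percpu entirely.
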
>
> And what is the issue with using only one per-CPU cache and flushing
> it on mm switch? There are no more allocations after boot, and the
> total (and fixed) memory usage is just a few unsigned longs per CPU,
> which should be even lower than the old RSS cache solution
> (4 unsigned longs per task). And it scaled very well with every kind
> of microbenchmark and workload I've tested.
>
> Unless the workload keeps doing something like "alloc one page, then
> switch to another mm", I think the performance will already be
> horrible due to cache invalidations and the many switch_*() calls;
> RSS isn't really a concern there.
>
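
For concreteness, here is how I read that scheme; a rough sketch with
invented names, not the code from your patches. Each CPU caches rss
deltas for the mm it is currently running and folds them back into the
mm-wide counters when switching to a different mm:

#include <linux/mm.h>		/* add_mm_counter(), NR_MM_COUNTERS */
#include <linux/percpu.h>

struct rss_cache {
	struct mm_struct *mm;		/* owner of the cached deltas */
	long delta[NR_MM_COUNTERS];
};
static DEFINE_PER_CPU(struct rss_cache, rss_cache);

/* Fast path: a plain per-cpu add, no atomics and no percpu_counter. */
static void rss_cache_add(struct mm_struct *mm, int member, long value)
{
	struct rss_cache *rc = get_cpu_ptr(&rss_cache);

	if (rc->mm == mm)
		rc->delta[member] += value;
	else
		add_mm_counter(mm, member, value);	/* not the cached mm */
	put_cpu_ptr(&rss_cache);
}

/* Would run on the context switch path, with preemption disabled. */
static void rss_cache_switch_mm(struct mm_struct *next)
{
	struct rss_cache *rc = this_cpu_ptr(&rss_cache);
	int i;

	if (rc->mm && rc->mm != next) {
		for (i = 0; i < NR_MM_COUNTERS; i++) {
			if (rc->delta[i]) {
				add_mm_counter(rc->mm, i, rc->delta[i]);
				rc->delta[i] = 0;
			}
		}
	}
	rc->mm = next;
}

(Readers of the counters would also need to fold in the per-cpu deltas;
elided here.) The flush in rss_cache_switch_mm() is the extra context
switch work I refer to below.
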
I only skimmed through your patchset. I do think it is a legitimate
approach, but personally I would not do it like that due to the extra
work on context switches. However, I have 0 say about this, so you will
need to prod the mm overlords to get it moving forward.

Maybe I was not clear enough in my opening e-mail, so I'm going to
reiterate some bits: there are scalability problems in execve even with
your patchset or the one which uses atomics. One of them concerns
another spot which allocates per-cpu memory (the mcid thing). Sorting
that out would possibly also take care of the rss problem; I outlined
an example approach above.