From: Mateusz Guzik <mjguzik@gmail.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: "zhangpeng (AS)" <zhangpeng362@huawei.com>,
Rongwei Wang <rongwei.wrw@gmail.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
dennisszhou@gmail.com, shakeelb@google.com, jack@suse.cz,
Suren Baghdasaryan <surenb@google.com>,
kent.overstreet@linux.dev, mhocko@suse.cz, vbabka@suse.cz,
Yu Zhao <yuzhao@google.com>,
yu.ma@intel.com, wangkefeng.wang@huawei.com,
sunnanyong@huawei.com
Subject: Re: [RFC PATCH v2 2/2] mm: convert mm's rss stats to use atomic mode
Date: Fri, 17 May 2024 20:08:41 +0200
Message-ID: <iwlpzi4qnpqri6wegibnsvth4yfdszksfvfyiei3qb3a4serbv@zrw3zsp55zoh>
In-Reply-To: <CAMgjq7DgE6NZPR8Sf2nq3vpVG8ZoC03e8aXi-QKbiievi3BB_g@mail.gmail.com>

On Fri, May 17, 2024 at 11:29:57AM +0800, Kairui Song wrote:
> On Thu, May 16, 2024 at 23:14, Mateusz Guzik <mjguzik@gmail.com> wrote:
> > A part of The Real Solution(tm) would make counter allocations scale
> > (including mcid, not just rss) or dodge them (while maintaining the
> > per-cpu distribution, see below for one idea), but that boils down to
> > balancing scalability versus total memory usage. It is trivial to just
> > slap together a per-cpu cache of these allocations and have the problem
> > go away for benchmarking purposes, while probably being too memory
> > hungry for actual usage.
> >
> > I was pondering an allocator with caches per some number of cores (say 4
> > or 8). Microbenchmarks aside I suspect real workloads would not suffer
> > from contention at this kind of granularity. This would trivially reduce
> > memory usage compared to per-cpu caching. I suspect things like
> > mm_struct, task_struct, task stacks and similar would be fine with it.
> >
> > Suppose mm_struct is allocated from a more coarse grained allocator than
> > per-cpu. Total number of cached objects would be lower than it is now.
> > That would also mean these allocated but not currently used mms could
> > hold on to other stuff, for example per-cpu rss and mcid counters. Then
> > should someone fork or exit, alloc/free_percpu would be avoided for most
> > cases. This would scale better and be faster single-threaded than the
> > current state.
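
To make that more concrete, here is a minimal sketch of such a
coarse-grained cache, with one cache shared per group of CPUs. This is
not from any patchset; all names and sizes are made up for
illustration:

#include <linux/cache.h>
#include <linux/slab.h>
#include <linux/smp.h>
#include <linux/spinlock.h>

/* Hypothetical: one object cache shared by every GROUP_SIZE CPUs. */
#define GROUP_SIZE	8
#define CACHE_DEPTH	16

struct group_cache {
	spinlock_t	lock;
	unsigned int	nr;
	void		*objs[CACHE_DEPTH];
} ____cacheline_aligned_in_smp;

/* nr_cpu_ids / GROUP_SIZE entries, allocated and lock-initialized at boot. */
static struct group_cache *group_caches;

static struct group_cache *this_group_cache(void)
{
	/* Migration after the read is harmless; the lock provides correctness. */
	return &group_caches[raw_smp_processor_id() / GROUP_SIZE];
}

static void *group_cache_alloc(struct kmem_cache *s, gfp_t gfp)
{
	struct group_cache *gc = this_group_cache();
	void *obj = NULL;

	spin_lock(&gc->lock);
	if (gc->nr)
		obj = gc->objs[--gc->nr];
	spin_unlock(&gc->lock);

	/* Miss: fall back to the slab allocator. */
	return obj ? obj : kmem_cache_alloc(s, gfp);
}

static void group_cache_free(struct kmem_cache *s, void *obj)
{
	struct group_cache *gc = this_group_cache();

	spin_lock(&gc->lock);
	if (gc->nr < CACHE_DEPTH) {
		gc->objs[gc->nr++] = obj;
		obj = NULL;
	}
	spin_unlock(&gc->lock);

	/* Cache full: give the object back to slab. */
	if (obj)
		kmem_cache_free(s, obj);
}

The point being that an mm_struct parked in such a cache could keep its
per-cpu rss and mcid allocations attached, letting most forks and exits
skip alloc_percpu/free_percpu entirely.
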
>
> And what is the issue with using only one per-CPU cache and flushing
> it on mm switch? There are no more allocations after boot, and the
> total (and fixed) memory usage is just a few unsigned longs per CPU,
> which should be even lower than the old RSS cache solution
> (4 unsigned longs per task). And it scaled very well with every kind
> of microbenchmark and workload I've tested.
>
> Unless the workload keeps doing something like "alloc one page, then
> switch to another mm", I think the performance will already be
> horrible due to cache invalidations and the many switch_*() calls;
> RSS isn't really a concern there.
>
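
For concreteness, here is how I read that scheme; a rough sketch with
invented names, not the code from your patches. Each CPU caches rss
deltas for the mm it is currently running and folds them back into the
mm-wide counters when switching to a different mm:

#include <linux/mm.h>		/* add_mm_counter(), NR_MM_COUNTERS */
#include <linux/percpu.h>

struct rss_cache {
	struct mm_struct *mm;		/* owner of the cached deltas */
	long delta[NR_MM_COUNTERS];
};
static DEFINE_PER_CPU(struct rss_cache, rss_cache);

/* Fast path: a plain per-cpu add, no atomics and no percpu_counter. */
static void rss_cache_add(struct mm_struct *mm, int member, long value)
{
	struct rss_cache *rc = get_cpu_ptr(&rss_cache);

	if (rc->mm == mm)
		rc->delta[member] += value;
	else
		add_mm_counter(mm, member, value);	/* not the cached mm */
	put_cpu_ptr(&rss_cache);
}

/* Would run on the context switch path, with preemption disabled. */
static void rss_cache_switch_mm(struct mm_struct *next)
{
	struct rss_cache *rc = this_cpu_ptr(&rss_cache);
	int i;

	if (rc->mm && rc->mm != next) {
		for (i = 0; i < NR_MM_COUNTERS; i++) {
			if (rc->delta[i]) {
				add_mm_counter(rc->mm, i, rc->delta[i]);
				rc->delta[i] = 0;
			}
		}
	}
	rc->mm = next;
}

(Readers of the counters would also need to fold in the per-cpu deltas;
elided here.) The flush in rss_cache_switch_mm() is the extra context
switch work I refer to below.
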
I only skimmed through your patchset. I do think it is a legitimate
approach, but personally I would not do it like that due to the extra
work on context switches. However, I have 0 say about this, so you will
need to prod the mm overlords to get it moving forward.

Maybe I was not clear enough in my opening e-mail, so I'm going to
reiterate some bits: there are scalability problems in execve even with
your patchset or the one which uses atomics. One of them concerns
another spot which allocates per-cpu memory (the mcid thing). Sorting
that out would possibly also take care of the rss problem; I outlined
an example approach above.