linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Dennis Zhou <dennis@kernel.org>,
	linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com,
	akpm@linux-foundation.org, shakeelb@google.com,
	linux-mm@kvack.org
Subject: Re: [PATCH 0/2] execve scalability issues, part 1
Date: Wed, 23 Aug 2023 17:47:28 +0200	[thread overview]
Message-ID: <20230823154728.rpkw6fpwvwqbnnh3@quack3> (raw)
In-Reply-To: <CAGudoHFFt5wvYWrwNkz813KaXBmROJ7YJ67s1h3_CBgcoV2fCA@mail.gmail.com>

On Wed 23-08-23 14:13:20, Mateusz Guzik wrote:
> On 8/23/23, Jan Kara <jack@suse.cz> wrote:
> > On Tue 22-08-23 16:24:56, Mateusz Guzik wrote:
> >> On 8/22/23, Jan Kara <jack@suse.cz> wrote:
> >> Then for single-threaded case an area is allocated for NR_MM_COUNTERS
> >> countes * 2 -- first set updated without any synchro by current
> >> thread. Second set only to be modified by others and protected with
> >> mm->arg_lock. The lock protects remote access to the union to begin
> >> with.
> >
> > arg_lock seems a bit like a hack. How is it related to rss_stat? The scheme
> > with two counters is clever but I'm not 100% convinced the complexity is
> > really worth it. I'm not sure the overhead of always using an atomic
> > counter would really be measurable as atomic counter ops in local CPU cache
> > tend to be cheap. Did you try to measure the difference?
> >
> 
> arg_lock is not as is, it would have to be renamed to something more generic.

Ah, OK.

> Atomics on x86-64 are very expensive to this very day. Here is a
> sample measurement of 2 atomics showing up done by someone else:
> https://lore.kernel.org/oe-lkp/202308141149.d38fdf91-oliver.sang@intel.com/T/#u
> 
> tl;dr it is *really* bad.

I didn't express myself well. Sure atomics are expensive compared to plain
arithmetic operations. But I wanted to say - we had atomics for RSS
counters before commit f1a7941243 ("mm: convert mm's rss stats into
percpu_counter") and people seemed happy with it until there were many CPUs
contending on the updates. So maybe RSS counters aren't used heavily enough
for the difference to practically matter? Probably operation like faulting
in (or unmapping) tmpfs file has the highest chance of showing the cost of
rss accounting compared to the cost of the remainder of the operation...

> > If the second counter proves to be worth it, we could make just that one
> > atomic to avoid the need for abusing some spinlock.
> 
> The spinlock would be there to synchronize against the transition to
> per-cpu -- any trickery is avoided and we trivially know for a fact
> the remote party either sees the per-cpu state if transitioned, or
> local if not. Then one easily knows no updates have been lost and the
> buf for 2 sets of counters can be safely freed.

Yeah, the spinlock makes the transition simpler, I agree.

> While writing down the idea previously I did not realize the per-cpu
> counter ops disable interrupts around the op. That's already very slow
> and the trip should be comparable to paying for an atomic (as in the
> patch which introduced percpu counters here slowed things down for
> single-threaded processes).
> 
> With your proposal the atomic would be there, but interrupt trip could
> be avoided. This would roughly maintain the current cost of doing the
> op (as in it would not get /worse/). My patch would make it lower.
> 
> All that said, I'm going to refrain from writing a patch for the time
> being. If powers to be decide on your approach, I'm not going to argue
> -- I don't think either is a clear winner over the other.

I guess we'll need to code it and compare the results :)

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


  reply	other threads:[~2023-08-23 15:47 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-21 20:28 Mateusz Guzik
2023-08-21 20:28 ` [PATCH 1/2] pcpcntr: add group allocation/free Mateusz Guzik
2023-08-22 13:37   ` Vegard Nossum
2023-08-22 14:06     ` Mateusz Guzik
2023-08-22 17:02   ` Dennis Zhou
2023-08-21 20:28 ` [PATCH 2/2] fork: group allocation of per-cpu counters for mm struct Mateusz Guzik
2023-08-21 21:20   ` Matthew Wilcox
2023-08-21 20:42 ` [PATCH 0/2] execve scalability issues, part 1 Matthew Wilcox
2023-08-21 20:44   ` [PATCH 1/7] mm: Make folios_put() the basis of release_pages() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 2/7] mm: Convert free_unref_page_list() to use folios Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 3/7] mm: Add free_unref_folios() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 4/7] mm: Use folios_put() in __folio_batch_release() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 5/7] memcg: Add mem_cgroup_uncharge_batch() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 6/7] mm: Remove use of folio list from folios_put() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 7/7] mm: Use free_unref_folios() in put_pages_list() Matthew Wilcox (Oracle)
2023-08-21 21:07 ` [PATCH 0/2] execve scalability issues, part 1 Dennis Zhou
2023-08-21 21:39   ` Mateusz Guzik
2023-08-21 22:29     ` Mateusz Guzik
2023-08-22  9:51       ` Jan Kara
2023-08-22 14:24         ` Mateusz Guzik
2023-08-23  9:49           ` Jan Kara
2023-08-23 10:49             ` David Laight
2023-08-23 12:01               ` Mateusz Guzik
2023-08-23 12:13             ` Mateusz Guzik
2023-08-23 15:47               ` Jan Kara [this message]
2023-08-23 16:10                 ` Mateusz Guzik
2023-08-23 16:41                   ` Jan Kara
2023-08-23 17:12                     ` Mateusz Guzik
2023-08-23 20:27             ` Dennis Zhou
2023-08-24  9:19               ` Jan Kara
2023-08-26 18:33 ` Mateusz Guzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230823154728.rpkw6fpwvwqbnnh3@quack3 \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dennis@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mjguzik@gmail.com \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox