From: Yosry Ahmed <yosryahmed@google.com>
To: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>, Jens Axboe <axboe@kernel.dk>,
Zefan Li <lizefan.x@bytedance.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeelb@google.com>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Vasily Averin <vasily.averin@linux.dev>,
cgroups@vger.kernel.org, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
bpf@vger.kernel.org
Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the percpu lock
Date: Fri, 24 Mar 2023 00:22:09 -0700
Message-ID: <CAJD7tkYNZeEytm_Px9_73Y-AYJfHAxaoTmmnO71HW5hd1B5tPg@mail.gmail.com>
In-Reply-To: <ZBz/V5a7/6PZeM7S@slm.duckdns.org>
On Thu, Mar 23, 2023 at 6:39 PM Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Thu, Mar 23, 2023 at 04:00:31AM +0000, Yosry Ahmed wrote:
> > Currently, when sleeping is not allowed during rstat flushing, we hold
> > the global rstat lock with interrupts disabled throughout the entire
> > flush operation. Flushing is an O(# cgroups * # cpus) operation, and
> > having interrupts disabled throughout is dangerous.
> >
> > For some contexts, we may not want to sleep, but can be interrupted
> > (e.g. while holding a spinlock or RCU read lock). As such, do not
> > disable interrupts throughout rstat flushing, only when holding the
> > percpu lock. This breaks down the O(# cgroups * # cpus) duration with
> > interrupts disabled to a series of O(# cgroups) durations.
> >
> > Furthermore, if a cpu is spinning waiting for the global rstat lock, it
> > doesn't need to spin with interrupts disabled anymore.
>
> I'm generally not a fan of big spin locks w/o irq protection. They too often
> become a source of unpredictable latency spikes. As you said, the global
> rstat lock can be held for quite a while. Removing _irq makes irq latency
> better on the CPU but on the other hand it makes it a lot more likely that the
> lock is gonna be held even longer, possibly significantly so depending on
> the configuration and workload which will in turn stall other CPUs waiting
> for the lock. Sure, irqs are being serviced quicker but if the cost is more
> and longer !irq context multi-cpu stalls, what's the point?
>
> I don't think there's anything which requires the global lock to be held
> throughout the entire flushing sequence and irq needs to be disabled when
> grabbing the percpu lock anyway, so why not just release the global lock on
> CPU boundaries instead? We don't really lose anything significant that way.
> The durations of irq disabled sections are still about the same as in the
> currently proposed solution at O(# cgroups) and we avoid the risk of holding
> the global lock for too long unexpectedly from getting hit repeatedly by
> irqs while holding the global lock.

Thanks for taking a look!

I think a problem with this approach is that we risk having to contend
for the global lock at every CPU boundary in atomic contexts. Right
now we contend for the global lock once, and once we have it we go
through all CPUs to flush, only having to contend with updates taking
the percpu locks at that point. If we unconditionally release and
reacquire the global lock at every CPU boundary, then we may contend
for it much more frequently with concurrent flushers.

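To make the trade-off concrete, here is a minimal userspace sketch (not
kernel code) of the two locking patterns being compared. Everything in
it is an assumption for illustration: NR_CPUS is set to 8, global_lock
is a pthread mutex standing in for the global rstat lock, and
flush_one_cpu() is a hypothetical placeholder for the per-CPU work done
under the percpu lock. It only counts how many times each pattern has
to acquire the global lock, i.e. how many opportunities there are to
contend with other flushers.

```c
#include <assert.h>
#include <pthread.h>

#define NR_CPUS 8 /* hypothetical CPU count for this sketch */

/* Userspace stand-in for the global rstat lock; not the kernel lock. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long cpus_flushed; /* only modified under global_lock */

static void lock_global(void)
{
	int ret = pthread_mutex_lock(&global_lock);

	assert(ret == 0);
	(void)ret;
}

static void unlock_global(void)
{
	int ret = pthread_mutex_unlock(&global_lock);

	assert(ret == 0);
	(void)ret;
}

/* Placeholder for the per-CPU work done under the percpu lock. */
static void flush_one_cpu(int cpu)
{
	(void)cpu;
	cpus_flushed++;
}

/* Pattern A: take the global lock once and walk every CPU under it. */
static unsigned long flush_hold_once(void)
{
	unsigned long acquisitions = 0;

	lock_global();
	acquisitions++;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		flush_one_cpu(cpu);
	unlock_global();

	return acquisitions;
}

/* Pattern B: drop and retake the global lock at every CPU boundary. */
static unsigned long flush_release_per_cpu(void)
{
	unsigned long acquisitions = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		lock_global();
		acquisitions++;
		flush_one_cpu(cpu);
		unlock_global();
	}

	return acquisitions;
}
```

With 8 CPUs, flush_hold_once() returns 1 and flush_release_per_cpu()
returns 8. The sketch says nothing about how long each acquisition
waits; that depends on how many flushers are running concurrently.
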
On the memory controller side, concurrent flushers are already held
back to avoid a thundering herd problem on the global rstat lock, but
flushers from outside the memory controller can still compete with
each other or with a flusher from the memory controller. In this case,
we risk more contention on the global lock and concurrent flushers
taking longer, which may end up causing multi-CPU stalls anyway,
right? Also, if we keep _irq when spinning for the lock, then
concurrent flushers still need to spin with irqs disabled, which is
another problem that this series tries to fix.

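As a rough way to put numbers on the irq-disabled windows discussed in
this thread, the sketch below counts the longest irqs-off section in
abstract work units, where one unit is the work for one cgroup on one
CPU. The function names and the unit are assumptions for illustration,
not anything from the rstat code, and the model ignores time spent
waiting for locks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Longest irqs-off section when interrupts stay disabled for the whole
 * flush: every cgroup is visited on every CPU in one stretch.
 */
static uint64_t irq_off_whole_flush(uint64_t nr_cgroups, uint64_t nr_cpus)
{
	/* Guard against overflow in the product. */
	assert(nr_cpus == 0 || nr_cgroups <= UINT64_MAX / nr_cpus);
	return nr_cgroups * nr_cpus;
}

/*
 * Longest irqs-off section when interrupts are disabled only around
 * each percpu lock: one CPU's worth of cgroups at a time.
 */
static uint64_t irq_off_per_cpu(uint64_t nr_cgroups)
{
	return nr_cgroups;
}
```

For example, with 1000 cgroups and 64 CPUs (made-up numbers), the first
returns 64000 units and the second 1000. Both releasing the global lock
on CPU boundaries and the current patch land on the second figure, so
the disagreement is only about how the global lock is held.
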
This is particularly a problem for flushers in atomic contexts. There
is a flusher in mem_cgroup_wb_stats() that flushes while holding
another spinlock, and a flusher in mem_cgroup_usage() that flushes
with irqs disabled. If flushing takes longer due to repeated lock
contention, it hurts such atomic contexts.

I am not sure how much all of this matters in practice; it depends
heavily on the workload and the configuration, as you mentioned. I am
just pointing out the potential disadvantages of reacquiring the lock
at every CPU boundary in atomic contexts.

>
> Thanks.
>
> --
> tejun