From: Michal Hocko <mhocko@suse.com>
To: Waiman Long <longman@redhat.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vladimir Davydov" <vdavydov.dev@gmail.com>
Subject: Re: [PATCH 3/4] mm/memcg: Add a local_lock_t for IRQ and TASK object.
Date: Thu, 3 Feb 2022 11:09:57 +0100
Message-ID: <Yfup9THPcSIPDSoH@dhcp22.suse.cz>
In-Reply-To: <YfumP3u1VCjKHE3b@linutronix.de>

On Thu 03-02-22 10:54:07, Sebastian Andrzej Siewior wrote:
> On 2022-02-01 16:29:35 [+0100], Michal Hocko wrote:
> > > > Sorry, I know that this all is not really related to your work but if
> > > > the original optimization is solely based on artificial benchmarks then
> > > > I would rather drop it and also make your RT patchset easier.
> > > 
> > > Do you have any real-world benchmark in mind? Like something that is
> > > already used for testing/ benchmarking and would fit here?
> > 
> > Anything that even remotely resembles a real allocation heavy workload.
> 
> So I figured out that building the kernel as a user triggers the
> allocation path in both in_task() and in_interrupt(). I booted a
> PREEMPT_NONE kernel and ran "perf stat -r 5 b.sh", where b.sh unpacks a
> kernel and runs an allmodconfig build on /dev/shm. The slow disk should
> not be a problem.
> 
> With the optimisation:
> |  Performance counter stats for './b.sh' (5 runs):
> | 
> |       43.367.405,59 msec task-clock                #   30,901 CPUs utilized            ( +-  0,01% )
> |           7.393.238      context-switches          #  170,499 /sec                     ( +-  0,13% )
> |             832.364      cpu-migrations            #   19,196 /sec                     ( +-  0,15% )
> |         625.235.644      page-faults               #   14,419 K/sec                    ( +-  0,00% )
> | 103.822.081.026.160      cycles                    #    2,394 GHz                      ( +-  0,01% )
> |  75.392.684.840.822      stalled-cycles-frontend   #   72,63% frontend cycles idle     ( +-  0,02% )
> |  54.971.177.787.990      stalled-cycles-backend    #   52,95% backend cycles idle      ( +-  0,02% )
> |  69.543.893.308.966      instructions              #    0,67  insn per cycle
> |                                                    #    1,08  stalled cycles per insn  ( +-  0,00% )
> |  14.585.269.354.314      branches                  #  336,357 M/sec                    ( +-  0,00% )
> |     558.029.270.966      branch-misses             #    3,83% of all branches          ( +-  0,01% )
> |  
> |            1403,441 +- 0,466 seconds time elapsed  ( +-  0,03% )
> 
> 
> With the optimisation disabled:
> |  Performance counter stats for './b.sh' (5 runs):
> | 
> |       43.354.742,31 msec task-clock                #   30,869 CPUs utilized            ( +-  0,01% )
> |           7.394.210      context-switches          #  170,601 /sec                     ( +-  0,06% )
> |             842.835      cpu-migrations            #   19,446 /sec                     ( +-  0,63% )
> |         625.242.341      page-faults               #   14,426 K/sec                    ( +-  0,00% )
> | 103.791.714.272.978      cycles                    #    2,395 GHz                      ( +-  0,01% )
> |  75.369.652.256.425      stalled-cycles-frontend   #   72,64% frontend cycles idle     ( +-  0,01% )
> |  54.947.610.706.450      stalled-cycles-backend    #   52,96% backend cycles idle      ( +-  0,01% )
> |  69.529.388.440.691      instructions              #    0,67  insn per cycle
> |                                                    #    1,08  stalled cycles per insn  ( +-  0,01% )
> |  14.584.515.016.870      branches                  #  336,497 M/sec                    ( +-  0,00% )
> |     557.716.885.609      branch-misses             #    3,82% of all branches          ( +-  0,02% )
> |  
> |             1404,47 +- 1,05 seconds time elapsed  ( +-  0,08% )
> 
> I'm still open to a more specific test ;)

Thanks for this test. I assume that both runs were done inside a
non-root memcg.

Waiman, what was the original motivation for 559271146efc0? As this RT
patch shows, it makes future changes much more complex, and I would
prefer simpler, easier-to-maintain code over micro-optimizations that
have no visible effect on real workloads.
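
As a rough sanity check of "no visible effect", the elapsed times from
the two perf stat runs quoted above can be compared directly. This is
only a sketch: the helper name compare() is made up for illustration,
and treating the two reported "+-" values as independent spreads that
add in quadrature is an assumption, not something perf stat states.

```python
import math

# Elapsed wall-clock time in seconds and the reported "+-" value,
# copied from the two "perf stat -r 5 b.sh" runs quoted above.
with_opt = (1403.441, 0.466)      # optimisation enabled
without_opt = (1404.47, 1.05)     # optimisation disabled


def compare(a, b):
    """Return (delta in s, delta in percent of a, combined spread in s).

    Hypothetical helper. Assumption: the two spreads are independent,
    so they are combined in quadrature.
    """
    delta = b[0] - a[0]
    rel_pct = 100.0 * delta / a[0]
    spread = math.hypot(a[1], b[1])
    return delta, rel_pct, spread


delta, rel_pct, spread = compare(with_opt, without_opt)
print(f"delta = {delta:.3f} s ({rel_pct:.3f} %), spread = {spread:.3f} s")
print("within noise" if abs(delta) <= spread else "outside noise")
```

Under that assumption the difference between the two builds is about
1.03 s on a run of roughly 1400 s, i.e. around 0.07 %, and it is smaller
than the combined spread of the two measurements.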
-- 
Michal Hocko
SUSE Labs


Thread overview: 31+ messages
2022-01-25 16:43 [PATCH 0/4] mm/memcg: Address PREEMPT_RT problems instead of disabling it Sebastian Andrzej Siewior
2022-01-25 16:43 ` [PATCH 1/4] mm/memcg: Disable threshold event handlers on PREEMPT_RT Sebastian Andrzej Siewior
2022-01-26 14:40   ` Michal Hocko
2022-01-26 14:45     ` Sebastian Andrzej Siewior
2022-01-26 15:04       ` Michal Koutný
2022-01-27 13:36         ` Sebastian Andrzej Siewior
2022-01-26 15:21       ` Michal Hocko
2022-01-25 16:43 ` [PATCH 2/4] mm/memcg: Protect per-CPU counter by disabling preemption on PREEMPT_RT where needed Sebastian Andrzej Siewior
2022-01-26 10:06   ` Vlastimil Babka
2022-01-26 11:24     ` Sebastian Andrzej Siewior
2022-01-26 14:56   ` Michal Hocko
2022-01-25 16:43 ` [PATCH 3/4] mm/memcg: Add a local_lock_t for IRQ and TASK object Sebastian Andrzej Siewior
2022-01-26 15:20   ` Michal Hocko
2022-01-27 11:53     ` Sebastian Andrzej Siewior
2022-02-01 12:04       ` Michal Hocko
2022-02-01 12:11         ` Sebastian Andrzej Siewior
2022-02-01 15:29           ` Michal Hocko
2022-02-03  9:54             ` Sebastian Andrzej Siewior
2022-02-03 10:09               ` Michal Hocko [this message]
2022-02-03 11:09                 ` Sebastian Andrzej Siewior
2022-02-08 17:58                 ` Shakeel Butt
2022-02-09  9:17                   ` Michal Hocko
2022-01-26 16:57   ` Vlastimil Babka
2022-01-31 15:06     ` Sebastian Andrzej Siewior
2022-02-03 16:01       ` Vlastimil Babka
2022-02-08 17:17         ` Sebastian Andrzej Siewior
2022-02-08 17:28           ` Michal Hocko
2022-02-09  1:48   ` [mm/memcg] 86895e1e85: WARNING:possible_circular_locking_dependency_detected kernel test robot
2022-01-25 16:43 ` [PATCH 4/4] mm/memcg: Allow the task_obj optimization only on non-PREEMPTIBLE kernels Sebastian Andrzej Siewior
2022-01-25 23:21 ` [PATCH 0/4] mm/memcg: Address PREEMPT_RT problems instead of disabling it Andrew Morton
2022-01-26  7:30   ` Sebastian Andrzej Siewior
