linux-mm.kvack.org archive mirror
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>,
	Aaron Tomlin <atomlin@atomlin.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Russell King <linux@armlinux.org.uk>,
	Huacai Chen <chenhuacai@kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	x86@kernel.org, Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v7 00/13] fold per-CPU vmstats remotely
Date: Wed, 19 Apr 2023 08:14:09 -0300
Message-ID: <ZD/NAaa5TVcL7Mxm@tpad>
In-Reply-To: <20230418150200.027528c155853fea8e4f58b2@linux-foundation.org>

On Tue, Apr 18, 2023 at 03:02:00PM -0700, Andrew Morton wrote:
> On Mon, 20 Mar 2023 15:03:32 -0300 Marcelo Tosatti <mtosatti@redhat.com> wrote:
> 
> > This patch series addresses the following two problems:
> > 
> > 1. A customer provided evidence indicating that a process
> >    was stalled in direct reclaim:
> > 
> > ...
> >
> >  2. With a task that busy loops on a given CPU,
> >     the kworker interruption to execute vmstat_update
> >     is undesired and may exceed latency thresholds
> >     for certain applications.
> > 
> 
> I don't think I'll be sending this upstream in the next merge window. 
> Because it isn't clear that the added complexity in vmstat handling is
> justified.

From my POV this is an incorrect statement (that the complexity in
vmstat handling is not justified).

Andrew, this is the 3rd attempt to fix this problem:

First try:  https://lore.kernel.org/lkml/20220127173037.318440631@fedora.localdomain/

Second try: https://patchew.org/linux/20230105125218.031928326@redhat.com/

Third try: syncing vmstats remotely from vmstat_shepherd (this
patchset).

And also, can you please explain: what is so complicated about the
vmstat handling? cmpxchg has been around and is used all over the
kernel, and nobody considers it "excessively complicated".

> - Michal's request for more clarity on the end-user requirements
>   seems reasonable.

And I explained to Michal in great detail where the end-user
requirements come from. For virtualized workloads, there are two
types of use-cases:

1) The MAC scheduler, for example: processing must occur every 1ms,
and a certain amount of computation takes place (and must finish before
the next 1ms timeframe). A > 50us latency spike as observed by cyclictest
is considered a "failure".

I showed him a 7us trace caused by the vmstat_work interruption, and
explained that it will extend to > 50us in the case of a virtualized vCPU.

2) PLCs. These workloads will also suffer > 50us latency spikes,
which are undesirable.

Can you please explain what additional clarity is required?

RH's performance team, for example, has been performing packet
latency tests and waiting for this issue to be fixed for about 2
years now.

Andrew Theurer, can you please explain what problem the vmstat_work
interruption is causing in your testing?

> - You have indicated that additional changelog material is forthcoming.

Not really.

Do you think additional information in the changelog is necessary?

> - The alternative idea of adding a syscall which tells the kernel
>   "I'm about to go realtime, so please clear away all the pending crap
>   which might later interrupt me" sounds pretty good.
>
>   Partly because there are surely other places where we can use this.
> 
>   Partly because it moves all the crap-clearing into special
>   crap-clearing code paths while adding less burden to the
>   commonly-executed code.
> 
>   And I don't think this alternative has been fully investigated and
>   discussed.

This was tried before:
https://lore.kernel.org/lkml/20220127173037.318440631@fedora.localdomain/

My conclusion from that discussion (and work) is that a special system
call:

1) Does not allow the benefits to be widely applied (only modified
applications will benefit). It is also not portable across different
operating systems.

Removing the vmstat_work interruption is a benefit for HPC workloads, 
for example (in fact, it is a benefit for any kind of application, 
since the interruption causes cache misses).

2) Increases the system call cost for applications which would use
the interface.

So avoiding the vmstat_update interruption, without userspace
knowledge and modifications, is a better solution than a modified
userspace.






