Re: [PATCH v7 00/13] fold per-CPU vmstats remotely

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Marcelo Tosatti <mtosatti@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Aaron Tomlin <atomlin@atomlin.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Russell King <linux@armlinux.org.uk>,
	Huacai Chen <chenhuacai@kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	x86@kernel.org, Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v7 00/13] fold per-CPU vmstats remotely
Date: Wed, 19 Apr 2023 16:15:50 -0300	[thread overview]
Message-ID: <ZEA95uBeUECRvO5e@tpad> (raw)
In-Reply-To: <1a481d68-930e-9418-a9aa-befdcfe36928@suse.cz>

On Wed, Apr 19, 2023 at 06:47:30PM +0200, Vlastimil Babka wrote:
> On 4/19/23 13:29, Marcelo Tosatti wrote:
> > On Wed, Apr 19, 2023 at 08:14:09AM -0300, Marcelo Tosatti wrote:
> >> This was tried before:
> >> https://lore.kernel.org/lkml/20220127173037.318440631@fedora.localdomain/
> >> 
> >> My conclusion from that discussion (and work) is that a special system
> >> call:
> >> 
> >> 1) Does not allow the benefits to be widely applied (only modified
> >> applications will benefit). Is not portable across different operating systems. 
> >> 
> >> Removing the vmstat_work interruption is a benefit for HPC workloads, 
> >> for example (in fact, it is a benefit for any kind of application, 
> >> since the interruption causes cache misses).
> >> 
> >> 2) Increases the system call cost for applications which would use
> >> the interface.
> >> 
> >> So avoiding the vmstat_update update interruption, without userspace 
> >> knowledge and modifications, is a better than solution than a modified
> >> userspace.
> > 
> > Another important point is this: if an application dirties
> > its own per-CPU vmstat cache, while performing a system call,
> > and a vmstat sync event is triggered on a different CPU, you'd have to:
> > 
> > 1) Wait for that CPU to return to userspace and sync its stats
> > (unfeasible).
> > 
> > 2) Queue work to execute on that CPU (undesirable, as that causes
> > an interruption).
> 
> So you're saying the application might do a syscall from the isolcpu, so
> IIUC it cannot expect any latency guarantees at that very moment, 

Why not? cyclictest uses nanosleep and its the main tool for measuring
latency.

> but then
> it immediately starts expecting them again after returning to userspace, 

No, the expectation more generally is this:

For certain types of applications (for example PLC software or
RAN processing), upon occurrence of an event, it is necessary to
complete a certain task in a maximum amount of time (deadline).

One way to express this requirement is with a pair of numbers,
deadline time and execution time, where:

        * deadline time: length of time between event and deadline.
        * execution time: length of time it takes for processing of event
                          to occur on a particular hardware platform
                          (uninterrupted).

The particular values depend on use-case. For the case
where the realtime application executes in a virtualized
guest, an interruption which must be serviced in the host will cause
the following sequence of events:

        1) VM-exit
        2) execution of IPI (and function call) (or switch to kwork
	thread to execute some work item).
        3) VM-entry

Which causes an excess of 50us latency as observed by cyclictest
(this violates the latency requirement of vRAN application with 1ms TTI,
for example).

> and
> a single interruption for a one-time flush after the syscall would be too
> intrusive?

Generally, if you can't complete the task (which involves executing a
number of instructions) before the deadline, then its a problem.

One-time flush? You mean to switch between:

rt-app -> kworker (to execute vmstat_update flush) -> rt-app

My measurement, which probably had vmstat_update code/data in cache, took 7us.
It might be the case that the code to execute must be brought in from
memory, which takes even longer.

> (elsewhere in the thread you described an RT app initialization that may
> generate vmstats to flush and then entry userspace loop, again, would a
> single interruption soon after entering the loop be so critical?)

1) It depends on the application. For the use-case above, where < 50us
interruption is desired, yes it is critical.

2) The interruptions can come from different sources.

Time
0			rt-app executing instruction 1
1			rt-app executing instruction 2
2			scheduler switches between rt-app and kworker
3			kworker runs vmstat_work
4			scheduler switches between kworker and rt-app
5			rt-app executing instruction 3
6			ipi to handle a KVM request IPI
7			fill in your preferred IPI handler

So the argument "a single interruption might not cause your deadline
to be exceeded" fails (because the time to handle the 
different interruptions might sum).

Does that make sense?

> > 3) Remotely sync the vmstat for that CPU.
> > 
> > 
> > 
> 
>

next prev parent reply	other threads:[~2023-04-20  2:42 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-20 18:03 Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 01/13] vmstat: allow_direct_reclaim should use zone_page_state_snapshot Marcelo Tosatti
2023-03-20 18:21   ` Michal Hocko
2023-03-20 18:32     ` Marcelo Tosatti
2023-03-22 10:03       ` Michal Hocko
2023-03-20 18:03 ` [PATCH v7 02/13] this_cpu_cmpxchg: ARM64: switch this_cpu_cmpxchg to locked, add _local function Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 03/13] this_cpu_cmpxchg: loongarch: " Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 04/13] this_cpu_cmpxchg: S390: " Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 05/13] this_cpu_cmpxchg: x86: " Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 06/13] add this_cpu_cmpxchg_local and asm-generic definitions Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 07/13] convert this_cpu_cmpxchg users to this_cpu_cmpxchg_local Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 08/13] mm/vmstat: switch counter modification to cmpxchg Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 09/13] vmstat: switch per-cpu vmstat counters to 32-bits Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 10/13] mm/vmstat: use xchg in cpu_vm_stats_fold Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 11/13] mm/vmstat: switch vmstat shepherd to flush per-CPU counters remotely Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 12/13] mm/vmstat: refresh stats remotely instead of via work item Marcelo Tosatti
2023-03-20 18:03 ` [PATCH v7 13/13] vmstat: add pcp remote node draining via cpu_vm_stats_fold Marcelo Tosatti
2023-03-20 20:43   ` Tim Chen
2023-03-22  1:20     ` Marcelo Tosatti
2023-03-20 18:25 ` [PATCH v7 00/13] fold per-CPU vmstats remotely Michal Hocko
2023-03-20 19:07   ` Marcelo Tosatti
2023-03-22 10:13     ` Michal Hocko
2023-03-22 11:23       ` Marcelo Tosatti
2023-03-22 13:35         ` Michal Hocko
2023-03-22 14:20           ` Marcelo Tosatti
2023-03-23  7:51             ` Michal Hocko
2023-03-23 10:52               ` Marcelo Tosatti
2023-03-23 10:59                 ` Marcelo Tosatti
2023-03-23 12:17                 ` Michal Hocko
2023-03-23 13:30                   ` Marcelo Tosatti
2023-03-23 13:32                     ` Marcelo Tosatti
2023-04-18 22:02 ` Andrew Morton
2023-04-19 11:14   ` Marcelo Tosatti
2023-04-19 11:15     ` Marcelo Tosatti
2023-04-19 13:44       ` Andrew Theurer
2023-04-20  7:55         ` Michal Hocko
2023-04-23  1:25           ` Marcelo Tosatti
2023-04-19 11:29     ` Marcelo Tosatti
2023-04-19 11:59       ` Marcelo Tosatti
2023-04-19 12:24         ` Frederic Weisbecker
2023-04-19 13:48           ` Marcelo Tosatti
2023-04-19 14:35             ` Michal Hocko
2023-04-19 16:35               ` Marcelo Tosatti
2023-04-20  8:40                 ` Michal Hocko
2023-04-23  1:10                   ` Marcelo Tosatti
2023-04-20 13:45                 ` Marcelo Tosatti
2023-04-26 14:34                   ` Marcelo Tosatti
2023-04-27  8:31                     ` Michal Hocko
2023-04-27 14:59                       ` Marcelo Tosatti
2023-04-26 15:04                   ` Vlastimil Babka
2023-04-26 16:10                     ` Marcelo Tosatti
2023-04-27  8:39                       ` Michal Hocko
2023-04-27 16:25                         ` Marcelo Tosatti
2023-04-19 16:47       ` Vlastimil Babka
2023-04-19 19:15         ` Marcelo Tosatti [this message]
2023-05-03 13:51           ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZEA95uBeUECRvO5e@tpad \
    --to=mtosatti@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=atomlin@atomlin.com \
    --cc=chenhuacai@kernel.org \
    --cc=cl@linux.com \
    --cc=frederic@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox