From: Marcelo Tosatti <mtosatti@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>,
Leonardo Bras <leobras.c@gmail.com>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-mm@kvack.org, Johannes Weiner <hannes@cmpxchg.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Vlastimil Babka <vbabka@suse.cz>,
Hyeonggon Yoo <42.hyeyoo@gmail.com>,
Leonardo Bras <leobras@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>,
Frederic Weisbecker <fweisbecker@suse.de>
Subject: Re: [PATCH 0/4] Introduce QPW for per-cpu operations
Date: Thu, 26 Feb 2026 08:41:09 -0300 [thread overview]
Message-ID: <aaAxVbt/DQyTgIs3@tpad> (raw)
In-Reply-To: <aZ9ugjKvb4U7_R93@pavilion.home>
On Wed, Feb 25, 2026 at 10:49:54PM +0100, Frederic Weisbecker wrote:
<snip>
> > There are specific parts of a simulation that are intensive, but
> > researchers try to minimize them:
> >
> > I/O Operations: Writing "checkpoints" or large trajectory files to disk
> > (using write()). This is why high-end HPC systems use Asynchronous I/O
> > or dedicated I/O nodes—to keep the compute cores from getting bogged
> > down in system calls.
> >
> > Memory Allocation: Constantly calling malloc/free involves the brk or
> > mmap system calls. Optimized simulation tools pre-allocate all the
> > memory they need at startup to avoid this.
>
> Ok. I asked a similar question and got this (you made me use an LLM for the
> first time btw, I held out for 4 years... I'm sure I can wait 4 more years until
> the next usage :o)
You should use it more often, it can save a significant amount of time
:-)
> ### 2. The "Slow Path" (System Calls / Syscalls)
>
> Passing through the kernel (a syscall) is necessary in certain situations, but it is "expensive" because it forces a **context switch**, which flushes CPU caches.
>
> * **Initialization:** During startup (`MPI_Init`), many syscalls are used to create sockets, map shared memory (`mmap`), and configure network interfaces.
> * **Standard TCP/IP:** If you are not using a high-performance network (RDMA) but simple Ethernet instead, MPI must call `send()` and `recv()`, which are syscalls. The Linux kernel then takes over to manage the TCP/IP stack.
> * **Sleep Mode (Blocking):** If an MPI process waits for a message for too long, it may decide to "go to sleep" to yield the CPU to another task via syscalls like `futex()` or `poll()`.
>
> **In summary:** MPI synchronization aims to be **100% User-Space** (via memory polling) to avoid syscall latency. It is precisely because MPI tries to bypass the kernel that we use `nohz_full`: we are asking the kernel not to even "knock on the CPU's door" with its clock interruptions.
Of course, there is a cost to system calls. However, considering
"low latency applications must necessarily remain in userspace,
therefore lets optimize only for that case" is limiting IMHO.
Should avoid interruptions whenever possible, for isolated CPUs
(in userspace _and_ kernelspace).
next prev parent reply other threads:[~2026-03-02 15:53 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-06 14:34 Marcelo Tosatti
2026-02-06 14:34 ` [PATCH 1/4] Introducing qpw_lock() and per-cpu queue & flush work Marcelo Tosatti
2026-02-06 15:20 ` Marcelo Tosatti
2026-02-07 0:16 ` Leonardo Bras
2026-02-11 12:09 ` Marcelo Tosatti
2026-02-14 21:32 ` Leonardo Bras
2026-02-06 14:34 ` [PATCH 2/4] mm/swap: move bh draining into a separate workqueue Marcelo Tosatti
2026-02-06 14:34 ` [PATCH 3/4] swap: apply new queue_percpu_work_on() interface Marcelo Tosatti
2026-02-07 1:06 ` Leonardo Bras
2026-02-26 15:49 ` Marcelo Tosatti
2026-02-06 14:34 ` [PATCH 4/4] slub: " Marcelo Tosatti
2026-02-07 1:27 ` Leonardo Bras
2026-02-06 23:56 ` [PATCH 0/4] Introduce QPW for per-cpu operations Leonardo Bras
2026-02-10 14:01 ` Michal Hocko
2026-02-11 12:01 ` Marcelo Tosatti
2026-02-11 12:11 ` Marcelo Tosatti
2026-02-14 21:35 ` Leonardo Bras
2026-02-11 16:38 ` Michal Hocko
2026-02-11 16:50 ` Marcelo Tosatti
2026-02-11 16:59 ` Vlastimil Babka
2026-02-11 17:07 ` Michal Hocko
2026-02-14 22:02 ` Leonardo Bras
2026-02-16 11:00 ` Michal Hocko
2026-02-19 15:27 ` Marcelo Tosatti
2026-02-19 19:30 ` Michal Hocko
2026-02-20 14:30 ` Marcelo Tosatti
2026-02-23 9:18 ` Michal Hocko
2026-02-23 21:56 ` Frederic Weisbecker
2026-02-24 17:23 ` Marcelo Tosatti
2026-02-25 21:49 ` Frederic Weisbecker
2026-02-26 7:06 ` Michal Hocko
2026-02-26 11:41 ` Marcelo Tosatti [this message]
2026-02-20 10:48 ` Vlastimil Babka
2026-02-20 12:31 ` Michal Hocko
2026-02-20 17:35 ` Marcelo Tosatti
2026-02-20 17:58 ` Vlastimil Babka
2026-02-20 19:01 ` Marcelo Tosatti
2026-02-23 9:11 ` Michal Hocko
2026-02-23 11:20 ` Marcelo Tosatti
2026-02-24 14:40 ` Frederic Weisbecker
2026-02-24 18:12 ` Marcelo Tosatti
2026-02-20 16:51 ` Marcelo Tosatti
2026-02-20 16:55 ` Marcelo Tosatti
2026-02-20 22:38 ` Leonardo Bras
2026-02-23 18:09 ` Vlastimil Babka
2026-02-26 18:24 ` Marcelo Tosatti
2026-02-20 21:58 ` Leonardo Bras
2026-02-23 9:06 ` Michal Hocko
2026-02-28 1:23 ` Leonardo Bras
2026-02-19 13:15 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aaAxVbt/DQyTgIs3@tpad \
--to=mtosatti@redhat.com \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=boqun.feng@gmail.com \
--cc=cgroups@vger.kernel.org \
--cc=cl@linux.com \
--cc=frederic@kernel.org \
--cc=fweisbecker@suse.de \
--cc=hannes@cmpxchg.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=leobras.c@gmail.com \
--cc=leobras@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=longman@redhat.com \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=shakeel.butt@linux.dev \
--cc=tglx@linutronix.de \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox