From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Bo Li <libo.gcs85@bytedance.com>
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org,
kees@kernel.org, akpm@linux-foundation.org, david@redhat.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
peterz@infradead.org, dietmar.eggemann@arm.com, hpa@zytor.com,
acme@kernel.org, namhyung@kernel.org, mark.rutland@arm.com,
alexander.shishkin@linux.intel.com, jolsa@kernel.org,
irogers@google.com, adrian.hunter@intel.com,
kan.liang@linux.intel.com, viro@zeniv.linux.org.uk,
brauner@kernel.org, jack@suse.cz, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, rostedt@goodmis.org, bsegall@google.com,
mgorman@suse.de, vschneid@redhat.com, jannh@google.com,
pfalcato@suse.de, riel@surriel.com, harry.yoo@oracle.com,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
duanxiongchun@bytedance.com, yinhongbo@bytedance.com,
dengliang.1214@bytedance.com, xieyongji@bytedance.com,
chaiwen.cc@bytedance.com, songmuchun@bytedance.com,
yuanzhu@bytedance.com, chengguozhu@bytedance.com,
sunjiadong.lff@bytedance.com
Subject: Re: [RFC v2 00/35] optimize cost of inter-process communication
Date: Fri, 30 May 2025 10:33:31 +0100 [thread overview]
Message-ID: <8c98c8e0-95e1-4292-8116-79d803962d5f@lucifer.local> (raw)
In-Reply-To: <cover.1748594840.git.libo.gcs85@bytedance.com>
Bo,
You have outstanding feedback on your v1 from me and Dave Hansen. I'm not
quite sure why you're sending a v2 without responding to that.
This isn't how the upstream kernel works...
Thanks, Lorenzo
On Fri, May 30, 2025 at 05:27:28PM +0800, Bo Li wrote:
> Changelog:
>
> v2:
> - Port the RPAL functions to the latest v6.15 kernel.
> - Add a supplementary introduction to the application scenarios and
> security considerations of RPAL.
>
> link to v1:
> https://lore.kernel.org/lkml/CAP2HCOmAkRVTci0ObtyW=3v6GFOrt9zCn2NwLUbZ+Di49xkBiw@mail.gmail.com/
>
> --------------------------------------------------------------------------
>
> # Introduction
>
> We mainly apply RPAL to the service mesh architecture widely adopted in
> modern cloud-native data centers. Before the rise of the service mesh,
> network functions were typically integrated into monolithic applications
> as libraries, and the main business programs invoked them through
> function calls. To let the main business programs and the network
> functions be developed, operated, and maintained independently, the
> service mesh moved the network functions out of the main business
> programs into separate processes (called sidecars). The main business
> program and the sidecar now interact through inter-process communication
> (IPC), and this added IPC has sharply increased resource consumption in
> cloud-native data centers; it can consume more than 10% of the CPU of an
> entire microservice cluster.
>
> To recover the efficient function-call mechanism of the monolithic
> architecture under the service mesh, we introduce the RPAL (Running
> Process As Library) architecture, which shares the virtual address space
> between processes and performs thread switching in user mode. Analyzing
> the service mesh architecture, we found that memory isolation between
> the main business program and the sidecar is not particularly important:
> the two are split from a single application and were an integral part of
> the original monolith. What matters more is that the two processes
> remain independent of each other, since they must be developed and
> maintained separately to preserve the architectural advantages of the
> service mesh. RPAL therefore breaks the isolation between processes
> while preserving their independence. We think RPAL can also be applied
> to other scenarios featuring sidecar-like architectures, such as
> distributed file storage systems in LLM infrastructure.
>
> In RPAL architecture, multiple processes share a virtual address space, so
> this architecture can be regarded as an advanced version of the Linux
> shared memory mechanism:
>
> 1. Traditional shared memory requires two processes to negotiate to ensure
> the mapping of the same piece of memory. In RPAL architecture, two RPAL
> processes still need to reach a consensus before they can successfully
> invoke the relevant system calls of RPAL to share the virtual address
> space.
> 2. Traditional shared memory only shares part of the data. In RPAL
> architecture, processes that have established an RPAL communication
> relationship share one virtual address space, and all user memory (such
> as data segments and code segments) of each RPAL process is shared among
> these processes. However, a process cannot access the memory of other
> processes at arbitrary times: we use the MPK mechanism to ensure that
> another process's memory is only accessible while special RPAL functions
> are being called. Any other access triggers a page fault.
> 3. In RPAL architecture, to keep the execution context of the shared
> code consistent (such as the stack and thread-local storage), we further
> implement thread context switching in user mode on top of the shared
> virtual address space, enabling threads of different processes to switch
> directly and quickly in user mode without falling into kernel mode for a
> slow switch.
>
> # Background
>
> In traditional inter-process communication (IPC) scenarios, Unix domain
> sockets are commonly used in conjunction with the epoll() family for event
> multiplexing. IPC operations involve system calls on both the data and
> control planes, thereby imposing a non-trivial overhead on the interacting
> processes. Even when shared memory is employed to optimize the data plane,
> two data copies still remain. Specifically, data is initially copied from
> a process's private memory space into the shared memory area, and then it
> is copied from the shared memory into the private memory of another
> process.
>
> This poses a question: Is it possible to reduce the overhead of IPC with
> only minimal modifications at the application level? To address this, we
> observed that the functionality of IPC, which encompasses data transfer
> and invocation of the target thread, is similar to a function call, where
> arguments are passed and the callee function is invoked to process them.
> Inspired by this analogy, we introduce RPAL (Running Process As
> Library), a framework designed to enable one process to invoke another
> as if making a local function call, all without going through the
> kernel.
>
> # Design
>
> First, let’s formalize RPAL’s core objectives:
>
> 1. Data-plane efficiency: Reduce the number of data copies from two (in the
> shared memory solution) to one.
> 2. Control-plane optimization: Eliminate the overhead of system calls
> and the kernel's thread switches.
> 3. Application compatibility: Minimize the modifications to existing
> applications that utilize Unix domain sockets and the epoll() family.
>
> To attain the first objective, processes that use RPAL share the same
> virtual address space. So one process can access another's data directly
> via a data pointer. This means data can be transferred from one process to
> another with just one copy operation.
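>
> As a minimal sketch of the data plane (rpal_peer_buffer() is a
> hypothetical helper for illustration, not the actual RPAL API): with a
> shared address space the sender can write straight into the receiver's
> buffer, so only one copy remains:
>
>     #include <string.h>
>
>     /* Hypothetical helper: returns a pointer into the receiver's
>      * memory, valid because both processes share the address space. */
>     void *dst = rpal_peer_buffer(peer, len);
>     memcpy(dst, msg, len);   /* the single remaining copy */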
>
> To meet the second goal, RPAL relies on the shared address space to do
> lightweight context switching in user space, which we call an "RPAL call".
> This allows one process to execute another process's code just like a
> local function call.
>
> To achieve the third target, RPAL stays compatible with the epoll family
> of functions, like epoll_create(), epoll_wait(), and epoll_ctl(). If an
> application uses epoll for IPC, developers can switch to RPAL with just a
> few small changes. For instance, you can just replace epoll_wait() with
> rpal_epoll_wait(). The basic epoll procedure, where a process waits for
> another to write to a monitored descriptor using an epoll file descriptor,
> still works fine with RPAL.
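>
> For illustration, assuming rpal_epoll_wait() keeps the epoll_wait()
> prototype (the actual declaration lives in samples/rpal/librpal), the
> change is a one-line substitution; epfd, events, MAX_EVENTS, and
> timeout_ms are the application's existing epoll state:
>
>     /* Before: standard epoll-based event loop. */
>     int n = epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);
>
>     /* After: same loop via RPAL; wakeups from an RPAL peer can be
>      * delivered without entering the kernel. */
>     int n = rpal_epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);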
>
> ## Address space sharing
>
> For address space sharing, RPAL partitions the entire userspace virtual
> address space and allocates non-overlapping memory ranges to each
> process. On x86_64 with 4-level paging, RPAL uses the range covered by a
> single top-level page table entry, i.e. one entire PUD (Page Upper
> Directory) table, which is 512GB. This restricts each process's virtual
> address space to 512GB on x86_64, sufficient for most applications in
> our scenario. The rationale is straightforward: address space sharing
> can then be achieved simply by copying that one entry, which maps the
> whole PUD from one process's page table into another's. So one process
> can directly use a data pointer to access another's memory.
>
>
> |------------| <- 0
> |------------| <- 512 GB
> | Process A |
> |------------| <- 2*512 GB
> |------------| <- n*512 GB
> | Process B |
> |------------| <- (n+1)*512 GB
> |------------| <- STACK_TOP
> | Kernel |
> |------------|
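>
> A minimal sketch of the slot arithmetic (the names below are
> illustrative, not from the patch set):
>
>     /* One slot per process: 512 GB = 2^39 bytes, the range covered
>      * by one top-level page table entry on x86_64. */
>     #define RPAL_SLOT_SHIFT 39
>
>     static unsigned long rpal_slot_base(unsigned int n)
>     {
>             return (unsigned long)n << RPAL_SLOT_SHIFT;
>     }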
>
> ## RPAL call
>
> We refer to the lightweight userspace context switching mechanism as RPAL
> call. It enables the caller (or sender) thread of one process to directly
> switch to the callee (or receiver) thread of another process.
>
> When Process A’s caller thread initiates an RPAL call to Process B’s
> callee thread, the CPU saves the caller’s context and loads the callee’s
> context. This enables direct userspace control flow transfer from the
> caller to the callee. After the callee finishes data processing, the CPU
> saves Process B’s callee context and switches back to Process A’s caller
> context, completing a full IPC cycle.
>
>
> |------------| |---------------------|
> | Process A | | Process B |
> | |-------| | | |-------| |
> | | caller| --- RPAL call --> | | callee| handle |
> | | thread| <------------------ | thread| -> event |
> | |-------| | | |-------| |
> |------------| |---------------------|
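>
> The actual switch is implemented in assembly (see
> samples/rpal/librpal/*_sysv_elf_gas.S). Conceptually it resembles POSIX
> swapcontext(3), except that RPAL stays entirely in user space
> (swapcontext() itself enters the kernel to update the signal mask) and
> also has to switch state such as PKRU and the TLS base. Analogy only:
>
>     #include <ucontext.h>
>
>     ucontext_t caller_ctx, callee_ctx;
>
>     /* callee_ctx prepared earlier with getcontext()/makecontext().
>      * Save the caller's registers and stack, load the callee's, and
>      * resume execution in the callee. */
>     swapcontext(&caller_ctx, &callee_ctx);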
>
> # Security and compatibility with kernel subsystems
>
> ## Memory protection between processes
>
> Since processes using RPAL share the address space, unintended
> cross-process memory access may occur and corrupt the data of another
> process. To mitigate this, we leverage Memory Protection Keys (MPK) on x86
> architectures.
>
> MPK assigns 4 bits in each page table entry to a "protection key", which
> is paired with a userspace register (PKRU). The PKRU register defines
> access permissions for memory regions protected by specific keys (for
> detailed implementation, refer to the kernel documentation "Memory
> Protection Keys"). With MPK, even though the address space is shared
> among processes, cross-process access is restricted: a process can only
> access the memory protected by a key if its PKRU register is configured
> with the corresponding permission. This ensures that processes cannot
> access each other’s memory unless an explicit PKRU configuration is set.
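>
> As a generic illustration of the MPK primitives (using the glibc pkey_*
> wrappers; this is not RPAL's internal code, and error checks are
> omitted):
>
>     #define _GNU_SOURCE
>     #include <sys/mman.h>
>
>     int pkey = pkey_alloc(0, 0);
>     void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
>                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
>     /* Tag the region with the key, then drop access rights in this
>      * thread's PKRU; any access now faults until rights are restored
>      * with pkey_set(pkey, 0). */
>     pkey_mprotect(buf, 4096, PROT_READ | PROT_WRITE, pkey);
>     pkey_set(pkey, PKEY_DISABLE_ACCESS);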
>
> ## Page fault handling and TLB flushing
>
> Due to the shared address space architecture, both page fault handling and
> TLB flushing require careful consideration. For instance, when Process A
> accesses Process B’s memory, a page fault may occur in Process A's
> context, but the faulting address belongs to Process B. In this case, we
> must pass Process B's mm_struct to the page fault handler.
>
> TLB flushing is more complex. When a thread flushes the TLB, the memory
> being flushed may be accessed not only by other threads of the current
> process but also, because the address space is shared, by threads of any
> other process sharing that address space. Therefore, the cpumask used
> for TLB flushing must be the union of the mm_cpumasks of all processes
> that share the address space, as sketched below.
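>
> A sketch of that union (rpal_for_each_shared_mm() is a hypothetical
> iterator over the sharing group, not the submitted code):
>
>     #include <linux/cpumask.h>
>     #include <linux/mm_types.h>
>
>     struct cpumask flush_mask;
>     struct mm_struct *shared;
>
>     cpumask_clear(&flush_mask);
>     rpal_for_each_shared_mm(shared)          /* hypothetical iterator */
>             cpumask_or(&flush_mask, &flush_mask, mm_cpumask(shared));
>     /* flush_mask now covers every CPU that may hold stale TLB
>      * entries for the shared address space. */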
>
> ## Lazy switch of kernel context
>
> In RPAL, a mismatch may arise between the user context and the kernel
> context. The RPAL call is designed solely to switch the user context,
> leaving the kernel context unchanged. For instance, when an RPAL call
> takes place, transitioning from the caller thread to the callee thread,
> and a system call is subsequently initiated within the callee thread,
> the kernel would incorrectly use the caller's kernel context (such as
> the kernel stack) to process that system call.
>
> To resolve context mismatch issues, a kernel context switch is triggered at
> the kernel entry point when the callee initiates a syscall or an
> exception/interrupt occurs. This mechanism ensures context consistency
> before processing system calls, interrupts, or exceptions. We refer to this
> kernel context switch as a "lazy switch" because it defers the switching
> operation from the traditional thread switch point to the next kernel entry
> point.
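>
> In outline (hypothetical names, not the patch's code): the kernel entry
> path looks up which task the current user context belongs to, e.g. via
> the fsbase-to-task mapping added in this series, and switches kernel
> contexts if it is not current:
>
>     struct task_struct *owner = rpal_user_context_owner(); /* e.g. fsbase */
>     if (unlikely(owner != current))
>             rpal_lazy_switch(owner);  /* swap kernel stack/context first */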
>
> Lazy switches should be minimized as much as possible, as they
> significantly degrade performance. We currently use RPAL in an RPC
> framework in which the RPC sender thread relies on the RPAL call to
> invoke the RPC receiver thread entirely in user space. In most cases the
> receiver thread makes no system calls and its code execution time is
> relatively short, which effectively reduces the probability of a lazy
> switch occurring.
>
> ## Time slice correction
>
> After an RPAL call, the callee's user mode code executes. However, the
> kernel incorrectly attributes this CPU time to the caller due to the
> unchanged kernel context.
>
> To resolve this, we use the Time Stamp Counter (TSC) register to measure
> the CPU time consumed by the callee thread in user space. The kernel
> then uses this user-reported timing data to adjust the CPU accounting
> for both the caller and callee threads, similar to how CPU steal time is
> implemented.
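>
> Measuring the callee's user-space cycles is plain TSC arithmetic (the
> interface for reporting it to the kernel is RPAL-specific and not shown;
> handle_event() stands in for the callee's work):
>
>     #include <x86intrin.h>
>
>     unsigned long long t0 = __rdtsc();
>     handle_event();                          /* callee's work */
>     unsigned long long callee_cycles = __rdtsc() - t0;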
>
> ## Process recovery
>
> Since processes can access each other’s memory, there is a risk that the
> target process’s memory may become invalid at the access time (e.g., if
> the target process has exited unexpectedly). The kernel must handle such
> cases; otherwise, the accessing process could be terminated due to
> failures originating from another process.
>
> To address this issue, each thread of the process pre-establishes a
> recovery point before accessing the memory of other processes. When such
> an invalid access occurs, the thread traps into the kernel. Inside the
> page fault handler, the kernel restores the user context of the thread
> to the recovery point. This mechanism ensures that processes maintain
> mutual independence, preventing cascading failures caused by
> cross-process memory issues.
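>
> The effect resembles a thread-local sigsetjmp()/siglongjmp() pair around
> cross-process access, except that it is the kernel's page fault handler
> that restores the user context. Analogy only; read_peer_memory() and
> handle_peer_failure() are hypothetical:
>
>     #include <setjmp.h>
>
>     static __thread sigjmp_buf recovery_point;
>
>     if (sigsetjmp(recovery_point, 1) == 0) {
>             read_peer_memory(peer_ptr);  /* may fault if peer died */
>     } else {
>             /* Resumed here after the fault: treat the peer as gone. */
>             handle_peer_failure();
>     }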
>
> # Performance
>
> To quantify the performance improvements driven by RPAL, we measured
> latency both before and after its deployment. Experiments were conducted on
> a server equipped with two Intel(R) Xeon(R) Platinum 8336C CPUs (2.30 GHz)
> and 1 TB of memory. Latency was defined as the duration from when the
> client thread initiates a message to when the server thread is invoked and
> receives it.
>
> During testing, the client transmitted 1 million 32-byte messages, and we
> computed the per-message average latency. The results are as follows:
>
> *****************
> Without RPAL: Message length: 32 bytes, Total TSC cycles: 19616222534,
> Message count: 1000000, Average latency: 19616 cycles
> With RPAL: Message length: 32 bytes, Total TSC cycles: 1703459326,
> Message count: 1000000, Average latency: 1703 cycles
> *****************
>
> These results confirm that RPAL delivers substantial latency
> improvements over the current epoll implementation, achieving a
> 17,913-cycle reduction (an ~91.3% improvement) for 32-byte messages.
>
> We have applied RPAL to an RPC framework that is widely used in our data
> center. With RPAL, we have achieved up to a 15.5% reduction in the CPU
> utilization of the participating processes in real-world microservice
> scenarios. The gains stem primarily from eliminating control-plane
> overhead via userspace context switches; in addition, address space
> sharing significantly reduces the number of memory copies.
>
> # Future Work
>
> Currently, RPAL requires the MPK (Memory Protection Key) hardware
> feature, which is supported by a range of Intel CPUs. On AMD, MPK is
> supported only on recent processors, specifically 3rd Generation AMD
> EPYC™ Processors and later generations. Patch sets that extend RPAL
> support to systems lacking MPK hardware will be provided later.
>
> Accompanying test programs are provided in the samples/rpal/ directory,
> and the user-mode RPAL library, which implements the user-space RPAL
> call, is in the samples/rpal/librpal directory.
>
> We hope for community discussion and feedback on RPAL's optimization
> approach and architecture, and we look forward to your comments.
>
> Bo Li (35):
> Kbuild: rpal support
> RPAL: add struct rpal_service
> RPAL: add service registration interface
> RPAL: add member to task_struct and mm_struct
> RPAL: enable virtual address space partitions
> RPAL: add user interface
> RPAL: enable shared page mmap
> RPAL: enable sender/receiver registration
> RPAL: enable address space sharing
> RPAL: allow service enable/disable
> RPAL: add service request/release
> RPAL: enable service disable notification
> RPAL: add tlb flushing support
> RPAL: enable page fault handling
> RPAL: add sender/receiver state
> RPAL: add cpu lock interface
> RPAL: add a mapping between fsbase and tasks
> sched: pick a specified task
> RPAL: add lazy switch main logic
> RPAL: add rpal_ret_from_lazy_switch
> RPAL: add kernel entry handling for lazy switch
> RPAL: rebuild receiver state
> RPAL: resume cpumask when fork
> RPAL: critical section optimization
> RPAL: add MPK initialization and interface
> RPAL: enable MPK support
> RPAL: add epoll support
> RPAL: add rpal_uds_fdmap() support
> RPAL: fix race condition in pkru update
> RPAL: fix pkru setup when fork
> RPAL: add receiver waker
> RPAL: fix unknown nmi on AMD CPU
> RPAL: enable time slice correction
> RPAL: enable fast epoll wait
> samples/rpal: add RPAL samples
>
> arch/x86/Kbuild | 2 +
> arch/x86/Kconfig | 2 +
> arch/x86/entry/entry_64.S | 160 ++
> arch/x86/events/amd/core.c | 14 +
> arch/x86/include/asm/pgtable.h | 25 +
> arch/x86/include/asm/pgtable_types.h | 11 +
> arch/x86/include/asm/tlbflush.h | 10 +
> arch/x86/kernel/asm-offsets.c | 3 +
> arch/x86/kernel/cpu/common.c | 8 +-
> arch/x86/kernel/fpu/core.c | 8 +-
> arch/x86/kernel/nmi.c | 20 +
> arch/x86/kernel/process.c | 25 +-
> arch/x86/kernel/process_64.c | 118 +
> arch/x86/mm/fault.c | 271 ++
> arch/x86/mm/mmap.c | 10 +
> arch/x86/mm/tlb.c | 172 ++
> arch/x86/rpal/Kconfig | 21 +
> arch/x86/rpal/Makefile | 6 +
> arch/x86/rpal/core.c | 477 ++++
> arch/x86/rpal/internal.h | 69 +
> arch/x86/rpal/mm.c | 426 +++
> arch/x86/rpal/pku.c | 196 ++
> arch/x86/rpal/proc.c | 279 ++
> arch/x86/rpal/service.c | 776 ++++++
> arch/x86/rpal/thread.c | 313 +++
> fs/binfmt_elf.c | 98 +-
> fs/eventpoll.c | 320 +++
> fs/exec.c | 11 +
> include/linux/mm_types.h | 3 +
> include/linux/rpal.h | 633 +++++
> include/linux/sched.h | 21 +
> init/init_task.c | 6 +
> kernel/exit.c | 5 +
> kernel/fork.c | 32 +
> kernel/sched/core.c | 676 +++++
> kernel/sched/fair.c | 109 +
> kernel/sched/sched.h | 8 +
> mm/mmap.c | 16 +
> mm/mprotect.c | 106 +
> mm/rmap.c | 4 +
> mm/vma.c | 18 +
> samples/rpal/Makefile | 17 +
> samples/rpal/asm_define.c | 14 +
> samples/rpal/client.c | 178 ++
> samples/rpal/librpal/asm_define.h | 6 +
> samples/rpal/librpal/asm_x86_64_rpal_call.S | 57 +
> samples/rpal/librpal/debug.h | 12 +
> samples/rpal/librpal/fiber.c | 119 +
> samples/rpal/librpal/fiber.h | 64 +
> .../rpal/librpal/jump_x86_64_sysv_elf_gas.S | 81 +
> .../rpal/librpal/make_x86_64_sysv_elf_gas.S | 82 +
> .../rpal/librpal/ontop_x86_64_sysv_elf_gas.S | 84 +
> samples/rpal/librpal/private.h | 341 +++
> samples/rpal/librpal/rpal.c | 2351 +++++++++++++++++
> samples/rpal/librpal/rpal.h | 149 ++
> samples/rpal/librpal/rpal_pkru.h | 78 +
> samples/rpal/librpal/rpal_queue.c | 239 ++
> samples/rpal/librpal/rpal_queue.h | 55 +
> samples/rpal/librpal/rpal_x86_64_call_ret.S | 45 +
> samples/rpal/offset.sh | 5 +
> samples/rpal/server.c | 249 ++
> 61 files changed, 9710 insertions(+), 4 deletions(-)
> create mode 100644 arch/x86/rpal/Kconfig
> create mode 100644 arch/x86/rpal/Makefile
> create mode 100644 arch/x86/rpal/core.c
> create mode 100644 arch/x86/rpal/internal.h
> create mode 100644 arch/x86/rpal/mm.c
> create mode 100644 arch/x86/rpal/pku.c
> create mode 100644 arch/x86/rpal/proc.c
> create mode 100644 arch/x86/rpal/service.c
> create mode 100644 arch/x86/rpal/thread.c
> create mode 100644 include/linux/rpal.h
> create mode 100644 samples/rpal/Makefile
> create mode 100644 samples/rpal/asm_define.c
> create mode 100644 samples/rpal/client.c
> create mode 100644 samples/rpal/librpal/asm_define.h
> create mode 100644 samples/rpal/librpal/asm_x86_64_rpal_call.S
> create mode 100644 samples/rpal/librpal/debug.h
> create mode 100644 samples/rpal/librpal/fiber.c
> create mode 100644 samples/rpal/librpal/fiber.h
> create mode 100644 samples/rpal/librpal/jump_x86_64_sysv_elf_gas.S
> create mode 100644 samples/rpal/librpal/make_x86_64_sysv_elf_gas.S
> create mode 100644 samples/rpal/librpal/ontop_x86_64_sysv_elf_gas.S
> create mode 100644 samples/rpal/librpal/private.h
> create mode 100644 samples/rpal/librpal/rpal.c
> create mode 100644 samples/rpal/librpal/rpal.h
> create mode 100644 samples/rpal/librpal/rpal_pkru.h
> create mode 100644 samples/rpal/librpal/rpal_queue.c
> create mode 100644 samples/rpal/librpal/rpal_queue.h
> create mode 100644 samples/rpal/librpal/rpal_x86_64_call_ret.S
> create mode 100755 samples/rpal/offset.sh
> create mode 100644 samples/rpal/server.c
>
> --
> 2.20.1
>