From: Bo Li
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org,
	kees@kernel.org, akpm@linux-foundation.org, david@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	peterz@infradead.org
Cc: dietmar.eggemann@arm.com, hpa@zytor.com, acme@kernel.org,
	namhyung@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, jack@suse.cz, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
	jannh@google.com, pfalcato@suse.de, riel@surriel.com,
	harry.yoo@oracle.com, linux-kernel@vger.kernel.org,
	linux-perf-users@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, duanxiongchun@bytedance.com,
	yinhongbo@bytedance.com, dengliang.1214@bytedance.com,
	xieyongji@bytedance.com, chaiwen.cc@bytedance.com,
	songmuchun@bytedance.com, yuanzhu@bytedance.com,
	chengguozhu@bytedance.com, sunjiadong.lff@bytedance.com, Bo Li
Subject: [RFC v2 24/35] RPAL: critical section optimization
Date: Fri, 30 May 2025 17:27:52 +0800
Message-Id: <47c919a7d65cb5def07c561e29305d39d9df925f.1748594841.git.libo.gcs85@bytedance.com>

The critical section is defined as the user mode code segment within the
receiver that executes when control returns from the receiver to the
sender. This code segment, located in the receiver, involves operations
such as switching the fsbase register and changing the stack pointer.

Handling the critical section can be categorized into two scenarios:

- First Scenario: If no lazy switch has occurred prior to the return and
  the fsbase switch is incomplete, a lazy switch is triggered to
  transition the kernel context from the sender to the receiver. After
  the fsbase is updated in user mode, another lazy switch occurs to
  revert the kernel context from the receiver back to the sender. This
  results in two unnecessary lazy switches.

- Second Scenario: If a lazy switch has already occurred during
  execution of the critical section, the lazy switch can be preemptively
  triggered. This avoids re-entering the kernel solely to initiate
  another lazy switch.

The implementation of the critical section involves modifying the fsbase
register in kernel mode and setting the sender's user mode context to a
predefined state. These steps minimize redundant user/kernel transitions
and lazy switches.

Signed-off-by: Bo Li
---
 arch/x86/rpal/core.c    | 88 ++++++++++++++++++++++++++++++++++++++++-
 arch/x86/rpal/service.c | 12 ++++++
 include/linux/rpal.h    |  6 +++
 3 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/arch/x86/rpal/core.c b/arch/x86/rpal/core.c
index c48df1ce4324..406d54788bac 100644
--- a/arch/x86/rpal/core.c
+++ b/arch/x86/rpal/core.c
@@ -219,14 +219,98 @@ static inline struct task_struct *rpal_misidentify(void)
 	return next;
 }
 
+static bool in_ret_section(struct rpal_service *rs, unsigned long ip)
+{
+	return ip >= rs->rsm.rcs.ret_begin && ip < rs->rsm.rcs.ret_end;
+}
+
+/*
+ * rpal_update_fsbase - fastpath when RPAL call returns
+ * @regs: pt_regs saved in kernel entry
+ *
+ * If the user is executing rpal call return code and has not
+ * updated fsbase yet, force the fsbase update so that a lazy
+ * switch can be performed immediately.
+ */
+static inline void rpal_update_fsbase(struct pt_regs *regs)
+{
+	struct rpal_service *cur = rpal_current_service();
+	struct task_struct *sender = current->rpal_rd->sender;
+
+	if (in_ret_section(cur, regs->ip))
+		wrfsbase(sender->thread.fsbase);
+}
+
+/*
+ * rpal_skip_receiver_code - skip rpal call return code
+ * @next: the next task to be lazy switched to.
+ * @regs: pt_regs saved in kernel entry
+ *
+ * If the user is executing rpal call return code and we are about
+ * to perform a lazy switch, skip the remaining return code to
+ * release the receiver's stack. This avoids a stack conflict when
+ * more than one sender calls the receiver.
+ */
+static inline void rpal_skip_receiver_code(struct task_struct *next,
+					   struct pt_regs *regs)
+{
+	rebuild_sender_stack(next->rpal_sd, regs);
+}
+
+/*
+ * rpal_skip_lazy_switch - skip lazy switch when rpal call returns
+ * @next: the next task to be lazy switched to.
+ * @regs: pt_regs saved in kernel entry
+ *
+ * If the user is executing rpal call return code and we have not
+ * performed a lazy switch, there is no need to perform a lazy switch
+ * now. Update fsbase and other state to avoid the lazy switch.
+ */
+static inline struct task_struct *
+rpal_skip_lazy_switch(struct task_struct *next, struct pt_regs *regs)
+{
+	struct rpal_service *tgt;
+
+	tgt = next->rpal_rs;
+	if (in_ret_section(tgt, regs->ip)) {
+		wrfsbase(current->thread.fsbase);
+		rebuild_sender_stack(current->rpal_sd, regs);
+		rpal_clear_task_thread_flag(next, RPAL_LAZY_SWITCHED_BIT);
+		next->rpal_rd->sender = NULL;
+		next = NULL;
+	}
+	return next;
+}
+
+static struct task_struct *rpal_fix_critical_section(struct task_struct *next,
+						     struct pt_regs *regs)
+{
+	struct rpal_service *cur = rpal_current_service();
+
+	/* sender->receiver */
+	if (rpal_test_task_thread_flag(next, RPAL_LAZY_SWITCHED_BIT))
+		next = rpal_skip_lazy_switch(next, regs);
+	/* receiver->sender */
+	else if (rpal_is_correct_address(cur, regs->ip))
+		rpal_skip_receiver_code(next, regs);
+
+	return next;
+}
+
 static inline struct task_struct *
 rpal_kernel_context_switch(struct pt_regs *regs)
 {
 	struct task_struct *next = NULL;
 
+	if (rpal_test_current_thread_flag(RPAL_LAZY_SWITCHED_BIT))
+		rpal_update_fsbase(regs);
+
 	next = rpal_misidentify();
-	if (unlikely(next != NULL))
-		next = rpal_do_kernel_context_switch(next, regs);
+	if (unlikely(next != NULL)) {
+		next = rpal_fix_critical_section(next, regs);
+		if (next)
+			next = rpal_do_kernel_context_switch(next, regs);
+	}
 
 	return next;
 }
diff --git a/arch/x86/rpal/service.c b/arch/x86/rpal/service.c
index 49458321e7dc..16e94d710445 100644
--- a/arch/x86/rpal/service.c
+++ b/arch/x86/rpal/service.c
@@ -545,6 +545,13 @@ int rpal_release_service(u64 key)
 	return ret;
 }
 
+static bool rpal_check_critical_section(struct rpal_service *rs,
+					struct rpal_critical_section *rcs)
+{
+	return rpal_is_correct_address(rs, rcs->ret_begin) &&
+	       rpal_is_correct_address(rs, rcs->ret_end);
+}
+
 int rpal_enable_service(unsigned long arg)
 {
 	struct rpal_service *cur = rpal_current_service();
@@ -562,6 +569,11 @@ int rpal_enable_service(unsigned long arg)
 		goto out;
 	}
 
+	if (!rpal_check_critical_section(cur, &rsm.rcs)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	mutex_lock(&cur->mutex);
 	if (!cur->enabled) {
 		cur->rsm = rsm;
diff --git a/include/linux/rpal.h b/include/linux/rpal.h
index b24176f3f245..4f1d92053818 100644
--- a/include/linux/rpal.h
+++ b/include/linux/rpal.h
@@ -122,12 +122,18 @@ enum rpal_sender_state {
 	RPAL_SENDER_STATE_KERNEL_RET,
 };
 
+struct rpal_critical_section {
+	unsigned long ret_begin;
+	unsigned long ret_end;
+};
+
 /*
  * user_meta will be sent to other service when requested.
  */
 struct rpal_service_metadata {
 	unsigned long version;
 	void __user *user_meta;
+	struct rpal_critical_section rcs;
 };
 
 struct rpal_request_arg {
-- 
2.20.1
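
For readers following the control flow, the decision taken in
rpal_fix_critical_section() can be modelled in plain userspace C. This is
only a sketch and not kernel code: struct critical_section, struct
service_range, enum fixup, in_service() and classify() are hypothetical
names introduced for illustration. In particular, in_service() assumes
that rpal_is_correct_address() is a simple range check on the service's
address space, which this patch does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace mirror of struct rpal_critical_section. */
struct critical_section {
	unsigned long ret_begin;	/* first address of the return code */
	unsigned long ret_end;		/* one past its last address */
};

/* Hypothetical stand-in for a service's address range (assumption). */
struct service_range {
	unsigned long base;
	unsigned long end;		/* exclusive */
};

enum fixup {
	FIXUP_NONE,		/* perform the lazy switch unchanged */
	FIXUP_SKIP_SWITCH,	/* finish the return in the kernel, no switch */
	FIXUP_SKIP_RET_CODE,	/* switch, but skip the receiver's return code */
};

/* Same half-open test as in_ret_section() in the patch. */
static bool in_ret_section(const struct critical_section *cs, unsigned long ip)
{
	return ip >= cs->ret_begin && ip < cs->ret_end;
}

/* Assumed behaviour of rpal_is_correct_address(): a plain range check. */
static bool in_service(const struct service_range *r, unsigned long addr)
{
	return addr >= r->base && addr < r->end;
}

/*
 * Models the branch structure of rpal_fix_critical_section():
 * the LAZY_SWITCHED flag of the next task selects the direction.
 */
static enum fixup classify(bool next_lazy_switched,
			   const struct critical_section *next_cs,
			   const struct service_range *cur,
			   unsigned long ip)
{
	assert(next_cs && cur);

	/* sender->receiver: skip the switch if ip is in the return code */
	if (next_lazy_switched)
		return in_ret_section(next_cs, ip) ? FIXUP_SKIP_SWITCH
						   : FIXUP_NONE;
	/* receiver->sender: drop the rest of the receiver's return code */
	if (in_service(cur, ip))
		return FIXUP_SKIP_RET_CODE;
	return FIXUP_NONE;
}
```

With a return section of [0x1000, 0x1040) and a current service range of
[0x1000, 0x2000), classify(true, &cs, &cur, 0x1010) yields
FIXUP_SKIP_SWITCH, while the same ip with next_lazy_switched == false
takes the receiver->sender branch and yields FIXUP_SKIP_RET_CODE.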