From: Bo Li
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org,
	kees@kernel.org, akpm@linux-foundation.org, david@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	peterz@infradead.org
Cc: dietmar.eggemann@arm.com, hpa@zytor.com, acme@kernel.org,
	namhyung@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, jack@suse.cz, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
	jannh@google.com, pfalcato@suse.de, riel@surriel.com,
	harry.yoo@oracle.com, linux-kernel@vger.kernel.org,
	linux-perf-users@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, duanxiongchun@bytedance.com,
	yinhongbo@bytedance.com, dengliang.1214@bytedance.com,
	xieyongji@bytedance.com, chaiwen.cc@bytedance.com,
	songmuchun@bytedance.com, yuanzhu@bytedance.com,
	chengguozhu@bytedance.com, sunjiadong.lff@bytedance.com, Bo Li
Subject: [RFC v2 33/35] RPAL: enable time slice correction
Date: Fri, 30 May 2025 17:28:01 +0800
Message-Id: <8941a17e12edce00c1cc1c78f4dd3e1bf28e47c0.1748594841.git.libo.gcs85@bytedance.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)

After an RPAL call, the receiver's user-mode code executes. However, the
kernel still attributes this CPU time to the sender because the kernel
context has not changed, which results in incorrect runtime statistics.

This patch adds a new member, total_time, to both
rpal_sender_call_context and rpal_receiver_call_context. This member
tracks how much runtime (measured in CPU cycles read with
rdtsc_ordered()) has been incorrectly accounted for.
The kernel reads total_time at the entry of __schedule(), converts it
into a per-task correction, and applies that correction to delta in
update_rq_clock_task().

Additionally, since RPAL calls occur in user space, runtime statistics
are typically calculated by user space. However, when a lazy switch
happens, the kernel takes over. To address this, the patch introduces a
start_time member to record when an RPAL call is initiated, enabling the
kernel to accurately calculate the runtime that needs correction.

Signed-off-by: Bo Li
---
 arch/x86/rpal/core.c   |  8 ++++++++
 arch/x86/rpal/thread.c |  6 ++++++
 include/linux/rpal.h   |  3 +++
 include/linux/sched.h  |  1 +
 init/init_task.c       |  1 +
 kernel/fork.c          |  1 +
 kernel/sched/core.c    | 42 ++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 62 insertions(+)

diff --git a/arch/x86/rpal/core.c b/arch/x86/rpal/core.c
index 92281b557a6c..2ac5d932f69c 100644
--- a/arch/x86/rpal/core.c
+++ b/arch/x86/rpal/core.c
@@ -144,6 +144,13 @@ rpal_do_kernel_context_switch(struct task_struct *next, struct pt_regs *regs)
 	struct task_struct *prev = current;
 
 	if (rpal_test_task_thread_flag(next, RPAL_LAZY_SWITCHED_BIT)) {
+		struct rpal_receiver_call_context *rcc = next->rpal_rd->rcc;
+		struct rpal_sender_call_context *scc = current->rpal_sd->scc;
+		u64 slice = rdtsc_ordered() - scc->start_time;
+
+		rcc->total_time += slice;
+		scc->total_time += slice;
+
 		rpal_resume_ep(next);
 		current->rpal_sd->receiver = next;
 		rpal_lock_cpu(current);
@@ -169,6 +176,7 @@ rpal_do_kernel_context_switch(struct task_struct *next, struct pt_regs *regs)
 		rpal_schedule(next);
 		rpal_clear_task_thread_flag(prev, RPAL_LAZY_SWITCHED_BIT);
 		prev->rpal_rd->sender = NULL;
+		next->rpal_sd->scc->start_time = rdtsc_ordered();
 	}
 	if (unlikely(!irqs_disabled())) {
 		local_irq_disable();
diff --git a/arch/x86/rpal/thread.c b/arch/x86/rpal/thread.c
index 51c9eec639cb..5cd0be631521 100644
--- a/arch/x86/rpal/thread.c
+++ b/arch/x86/rpal/thread.c
@@ -99,6 +99,8 @@ int rpal_register_sender(unsigned long addr)
 	rsd->scc = (struct rpal_sender_call_context *)(addr - rsp->user_start +
 						       rsp->kernel_start);
 	rsd->receiver = NULL;
+	rsd->scc->start_time = 0;
+	rsd->scc->total_time = 0;
 
 	current->rpal_sd = rsd;
 	rpal_set_current_thread_flag(RPAL_SENDER_BIT);
@@ -182,6 +184,7 @@ int rpal_register_receiver(unsigned long addr)
 		(struct rpal_receiver_call_context *)(addr - rsp->user_start +
 						      rsp->kernel_start);
 	rrd->sender = NULL;
+	rrd->rcc->total_time = 0;
 
 	current->rpal_rd = rrd;
 	rpal_set_current_thread_flag(RPAL_RECEIVER_BIT);
@@ -289,6 +292,9 @@ int rpal_rebuild_sender_context_on_fault(struct pt_regs *regs,
 				rpal_pkey_to_pkru(rpal_current_service()->pkey),
 				RPAL_PKRU_SET);
 #endif
+			if (!rpal_is_correct_address(rpal_current_service(), regs->ip))
+				/* receiver has crashed */
+				scc->total_time += rdtsc_ordered() - scc->start_time;
 			return 0;
 		}
 	}
diff --git a/include/linux/rpal.h b/include/linux/rpal.h
index 1d8c1bdc90f2..f5f4da63f28c 100644
--- a/include/linux/rpal.h
+++ b/include/linux/rpal.h
@@ -310,6 +310,7 @@ struct rpal_receiver_call_context {
 	void __user *events;
 	int maxevents;
 	int timeout;
+	int64_t total_time;
 };
 
 /* recovery point for sender */
@@ -325,6 +326,8 @@ struct rpal_sender_call_context {
 	struct rpal_task_context rtc;
 	struct rpal_error_context ec;
 	int sender_id;
+	s64 start_time;
+	s64 total_time;
 };
 
 /* End */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5f25cc09fb71..a03113fecdc5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1663,6 +1663,7 @@ struct task_struct {
 		struct rpal_sender_data *rpal_sd;
 		struct rpal_receiver_data *rpal_rd;
 	};
+	s64 rpal_steal_time;
 #endif
 
 	/* CPU-specific state of this task: */
diff --git a/init/init_task.c b/init/init_task.c
index 2eb08b96e66b..3606cf701dfe 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -224,6 +224,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 	.rpal_rs = NULL,
 	.rpal_flag = 0,
 	.rpal_cd = NULL,
+	.rpal_steal_time = 0,
 #endif
 };
 EXPORT_SYMBOL(init_task);
diff --git a/kernel/fork.c b/kernel/fork.c
index 11cba74d07c8..ff6331a28987 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1222,6 +1222,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	tsk->rpal_rs = NULL;
 	tsk->rpal_flag = 0;
 	tsk->rpal_cd = NULL;
+	tsk->rpal_steal_time = 0;
 #endif
 
 	return tsk;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c219ada29d34..d6f8e0d76fc0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -789,6 +789,14 @@ static void update_rq_clock_task(struct rq *rq, s64 delta)
 		delta -= steal;
 	}
 #endif
+#ifdef CONFIG_RPAL
+	if (unlikely(current->rpal_steal_time != 0)) {
+		delta += current->rpal_steal_time;
+		if (unlikely(delta < 0))
+			delta = 0;
+		current->rpal_steal_time = 0;
+	}
+#endif
 
 	rq->clock_task += delta;
 
@@ -6872,6 +6880,36 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 	return true;
 }
 
+#ifdef CONFIG_RPAL
+static void rpal_acct_runtime(void)
+{
+	if (rpal_current_service()) {
+		if (rpal_test_task_thread_flag(current, RPAL_SENDER_BIT) &&
+		    current->rpal_sd->scc->total_time != 0) {
+			struct rpal_sender_call_context *scc =
+				current->rpal_sd->scc;
+
+			u64 slice =
+				native_sched_clock_from_tsc(scc->total_time) -
+				native_sched_clock_from_tsc(0);
+			current->rpal_steal_time -= slice;
+			scc->total_time = 0;
+		} else if (rpal_test_task_thread_flag(current,
+						      RPAL_RECEIVER_BIT) &&
+			   current->rpal_rd->rcc->total_time != 0) {
+			struct rpal_receiver_call_context *rcc =
+				current->rpal_rd->rcc;
+
+			u64 slice =
+				native_sched_clock_from_tsc(rcc->total_time) -
+				native_sched_clock_from_tsc(0);
+			current->rpal_steal_time += slice;
+			rcc->total_time = 0;
+		}
+	}
+}
+#endif
+
 /*
  * __schedule() is the main scheduler function.
  *
@@ -6926,6 +6964,10 @@ static void __sched notrace __schedule(int sched_mode)
 	struct rq *rq;
 	int cpu;
 
+#ifdef CONFIG_RPAL
+	rpal_acct_runtime();
+#endif
+
 	trace_sched_entry_tp(preempt, CALLER_ADDR0);
 
 	cpu = smp_processor_id();
-- 
2.20.1
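
For readers who want to follow the bookkeeping without the kernel
plumbing, here is a minimal user-space sketch of the three steps in this
patch: charging the slice at lazy-switch time, folding total_time into a
signed per-task correction at schedule entry, and applying that
correction to delta with a clamp at zero. It is only an illustration
under stated assumptions. struct call_ctx, struct task, on_lazy_switch(),
acct_runtime(), corrected_delta() and the fixed CYCLES_PER_NS rate are
hypothetical names introduced for this sketch. The patch itself converts
cycles with native_sched_clock_from_tsc() and operates on current.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the sender/receiver call contexts. */
struct call_ctx {
	int64_t start_time;	/* cycle count when the RPAL call began */
	int64_t total_time;	/* misattributed cycles not yet folded in */
};

/* Hypothetical stand-in for the task_struct field rpal_steal_time. */
struct task {
	int64_t steal_time;	/* pending correction in ns, signed */
};

/* Assumption: a fixed cycle rate, purely for illustration. */
#define CYCLES_PER_NS 2

static int64_t cycles_to_ns(int64_t cycles)
{
	return cycles / CYCLES_PER_NS;
}

/* Lazy switch: the elapsed slice is recorded on both sides of the call. */
static void on_lazy_switch(struct call_ctx *scc, struct call_ctx *rcc,
			   int64_t now)
{
	int64_t slice;

	assert(now >= scc->start_time);
	slice = now - scc->start_time;
	scc->total_time += slice;
	rcc->total_time += slice;
}

/* Schedule entry: the sender is credited, the receiver is debited. */
static void acct_runtime(struct task *t, struct call_ctx *ctx, int is_sender)
{
	int64_t slice;

	if (ctx->total_time == 0)
		return;

	slice = cycles_to_ns(ctx->total_time);
	t->steal_time += is_sender ? -slice : slice;
	ctx->total_time = 0;
}

/* Clock update: apply the pending correction once, never below zero. */
static int64_t corrected_delta(struct task *t, int64_t delta)
{
	if (t->steal_time != 0) {
		delta += t->steal_time;
		if (delta < 0)
			delta = 0;
		t->steal_time = 0;
	}
	return delta;
}
```

With the assumed rate of 2 cycles per ns, a call that starts at cycle
1000 and is lazily switched at cycle 5000 yields a 4000-cycle slice, so
2000 ns move from the sender's delta to the receiver's delta on their
next clock updates. The clamp mirrors the "delta < 0" check in
update_rq_clock_task() above.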