From: Bo Li
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org,
	kees@kernel.org, akpm@linux-foundation.org, david@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org, peterz@infradead.org
Cc: dietmar.eggemann@arm.com, hpa@zytor.com, acme@kernel.org,
	namhyung@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com,
	adrian.hunter@intel.com, kan.liang@linux.intel.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
	rppt@kernel.org, surenb@google.com, mhocko@suse.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, jannh@google.com,
	pfalcato@suse.de, riel@surriel.com, harry.yoo@oracle.com,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	duanxiongchun@bytedance.com, yinhongbo@bytedance.com,
	dengliang.1214@bytedance.com, xieyongji@bytedance.com,
	chaiwen.cc@bytedance.com, songmuchun@bytedance.com, yuanzhu@bytedance.com,
	chengguozhu@bytedance.com, sunjiadong.lff@bytedance.com, Bo Li
Subject: [RFC v2 31/35] RPAL: add receiver waker
Date: Fri, 30 May 2025 17:27:59 +0800
Message-Id: <198278a03d91ab7e0e17d782c657da85cff741bb.1748594841.git.libo.gcs85@bytedance.com>

During an RPAL call, the receiver thread stays in the TASK_INTERRUPTIBLE
state and cannot be woken, so wakeups may be missed. For example, if no
kernel event occurs during the entire RPAL call, the receiver thread
remains in the TASK_INTERRUPTIBLE state after the RPAL call completes.

To address this, RPAL sets a flag (RPAL_WAKE_BIT) on the receiver whenever
a wakeup finds it in this unwakeable state, and introduces a "waker"
delayed work. The waker work reschedules itself every tick and checks for
receiver threads that have missed wakeups; any it finds are woken up. For
epoll, the waker also checks for pending user-mode events and wakes the
receiver thread if such events exist.
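
The flag-and-retry scheme described above can be sketched as a small
user-space model. This is only an illustration under stated assumptions,
not the kernel code: `struct receiver`, `struct waker`, `try_wake()`,
`waker_tick()` and `MAX_RECEIVERS` are hypothetical names, there is no
locking, and a failed wakeup is assumed to leave the flag set so that the
next tick retries it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bound; stands in for RPAL_MAX_RECEIVER_NUM. */
#define MAX_RECEIVERS 8

enum recv_state { RECV_WAIT, RECV_RUNNING, RECV_CALL };

/* Simplified receiver: one field per piece of state the waker looks at. */
struct receiver {
	enum recv_state state;
	bool wake_flag;    /* models RPAL_WAKE_BIT */
	bool user_pending; /* models RPAL_USER_PENDING in ep_pending */
	bool sleeping;     /* models TASK_INTERRUPTIBLE */
};

/* Simplified wake list: a fixed array instead of a locked list_head. */
struct waker {
	struct receiver *list[MAX_RECEIVERS];
	size_t count;
};

static void waker_insert(struct waker *w, struct receiver *r)
{
	assert(w->count < MAX_RECEIVERS);
	w->list[w->count++] = r;
}

/*
 * Models one wakeup attempt. The flag is cleared first; if the receiver
 * is in the middle of a call the wakeup is refused and the flag is set
 * again so that a later tick can retry. Returns true if the receiver
 * was woken.
 */
static bool try_wake(struct receiver *r)
{
	r->wake_flag = false;
	if (r->state == RECV_CALL) {
		r->wake_flag = true;
		return false;
	}
	r->state = RECV_RUNNING; /* assumption: a woken receiver runs */
	r->sleeping = false;
	return true;
}

/*
 * Models one run of the periodic waker. A receiver is due if it has a
 * missed wakeup recorded, or if it is waiting while user-mode events
 * are pending (the epoll case). Returns the number of receivers woken.
 */
static int waker_tick(struct waker *w)
{
	int woken = 0;

	for (size_t i = 0; i < w->count; i++) {
		struct receiver *r = w->list[i];
		bool due = r->wake_flag;

		if (!due && r->user_pending && r->state == RECV_WAIT) {
			r->state = RECV_RUNNING; /* models the WAIT -> RUNNING cmpxchg */
			due = true;
		}
		if (due && try_wake(r))
			woken++;
	}
	return woken;
}
```

In this model a receiver that is woken while in the call state is skipped
by every tick until the call ends, and is then woken by the first tick
that follows.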
Signed-off-by: Bo Li
---
 arch/x86/rpal/internal.h |  4 ++
 arch/x86/rpal/service.c  | 98 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/rpal/thread.c   |  3 ++
 include/linux/rpal.h     | 11 +++++
 kernel/sched/core.c      |  3 ++
 5 files changed, 119 insertions(+)

diff --git a/arch/x86/rpal/internal.h b/arch/x86/rpal/internal.h
index e03f8a90619d..117357dabdec 100644
--- a/arch/x86/rpal/internal.h
+++ b/arch/x86/rpal/internal.h
@@ -22,6 +22,10 @@ int rpal_enable_service(unsigned long arg);
 int rpal_disable_service(void);
 int rpal_request_service(unsigned long arg);
 int rpal_release_service(u64 key);
+void rpal_insert_wake_list(struct rpal_service *rs,
+			   struct rpal_receiver_data *rrd);
+void rpal_remove_wake_list(struct rpal_service *rs,
+			   struct rpal_receiver_data *rrd);
 
 /* mm.c */
 static inline struct rpal_shared_page *
diff --git a/arch/x86/rpal/service.c b/arch/x86/rpal/service.c
index 9fd568fa9a29..6fefb7a7729c 100644
--- a/arch/x86/rpal/service.c
+++ b/arch/x86/rpal/service.c
@@ -143,6 +143,99 @@ static void delete_service(struct rpal_service *rs)
 	spin_unlock_irqrestore(&hash_table_lock, flags);
 }
 
+void rpal_insert_wake_list(struct rpal_service *rs,
+			   struct rpal_receiver_data *rrd)
+{
+	unsigned long flags;
+	struct rpal_waker_struct *waker = &rs->waker;
+
+	spin_lock_irqsave(&waker->lock, flags);
+	list_add_tail(&rrd->wake_list, &waker->wake_head);
+	spin_unlock_irqrestore(&waker->lock, flags);
+	pr_debug("rpal debug: [%d] insert wake list\n", current->pid);
+}
+
+void rpal_remove_wake_list(struct rpal_service *rs,
+			   struct rpal_receiver_data *rrd)
+{
+	unsigned long flags;
+	struct rpal_waker_struct *waker = &rs->waker;
+
+	spin_lock_irqsave(&waker->lock, flags);
+	list_del(&rrd->wake_list);
+	spin_unlock_irqrestore(&waker->lock, flags);
+	pr_debug("rpal debug: [%d] remove wake list\n", current->pid);
+}
+
+/* Takes waker->lock itself; callers must not hold it */
+static inline void rpal_wake_all(struct rpal_waker_struct *waker)
+{
+	struct task_struct *wake_list[RPAL_MAX_RECEIVER_NUM];
+	struct list_head *list;
+	unsigned long flags;
+	int i, cnt = 0;
+
+	spin_lock_irqsave(&waker->lock, flags);
+	list_for_each(list, &waker->wake_head) {
+		struct task_struct *task;
+		struct rpal_receiver_call_context *rcc;
+		struct rpal_receiver_data *rrd;
+		int pending;
+
+		rrd = list_entry(list, struct rpal_receiver_data, wake_list);
+		task = rrd->rcd.bp_task;
+		rcc = rrd->rcc;
+
+		pending = atomic_read(&rcc->ep_pending) & RPAL_USER_PENDING;
+
+		if (rpal_test_task_thread_flag(task, RPAL_WAKE_BIT) ||
+		    (pending && atomic_cmpxchg(&rcc->receiver_state,
+					       RPAL_RECEIVER_STATE_WAIT,
+					       RPAL_RECEIVER_STATE_RUNNING) ==
+				     RPAL_RECEIVER_STATE_WAIT)) {
+			wake_list[cnt] = task;
+			cnt++;
+		}
+	}
+	spin_unlock_irqrestore(&waker->lock, flags);
+
+	for (i = 0; i < cnt; i++)
+		wake_up_process(wake_list[i]);
+}
+
+static void rpal_wake_callback(struct work_struct *work)
+{
+	struct rpal_waker_struct *waker =
+		container_of(work, struct rpal_waker_struct, waker_work.work);
+
+	rpal_wake_all(waker);
+	/* Re-arm so that the check runs again on the next tick */
+	schedule_delayed_work(&waker->waker_work, 1);
+}
+
+static void rpal_enable_waker(struct rpal_waker_struct *waker)
+{
+	INIT_DELAYED_WORK(&waker->waker_work, rpal_wake_callback);
+	schedule_delayed_work(&waker->waker_work, 1);
+	pr_debug("rpal debug: [%d] enable waker\n", current->pid);
+}
+
+static void rpal_disable_waker(struct rpal_waker_struct *waker)
+{
+	unsigned long flags;
+	struct list_head *p, *n;
+
+	cancel_delayed_work_sync(&waker->waker_work);
+	rpal_wake_all(waker);
+	spin_lock_irqsave(&waker->lock, flags);
+	list_for_each_safe(p, n, &waker->wake_head) {
+		list_del_init(p);
+	}
+	INIT_LIST_HEAD(&waker->wake_head);
+	spin_unlock_irqrestore(&waker->lock, flags);
+	pr_debug("rpal debug: [%d] disable waker\n", current->pid);
+}
+
 static inline unsigned long calculate_base_address(int id)
 {
 	return RPAL_ADDRESS_SPACE_LOW + RPAL_ADDR_SPACE_SIZE * id;
@@ -213,6 +306,10 @@ struct rpal_service *rpal_register_service(void)
 	rs->pku_on = PKU_ON_FALSE;
 	rpal_service_pku_init();
 #endif
+	spin_lock_init(&rs->waker.lock);
+	INIT_LIST_HEAD(&rs->waker.wake_head);
+	/* receiver may miss wake up if in lazy switch, try to wake it later */
+	rpal_enable_waker(&rs->waker);
 
 	rs->bad_service = false;
 	rs->base = calculate_base_address(rs->id);
@@ -257,6 +354,7 @@ void rpal_unregister_service(struct rpal_service *rs)
 	schedule();
 
 	delete_service(rs);
+	rpal_disable_waker(&rs->waker);
 
 	pr_debug("rpal: unregister service, id: %d, tgid: %d\n",
 		 rs->id, rs->group_leader->tgid);
diff --git a/arch/x86/rpal/thread.c b/arch/x86/rpal/thread.c
index fcc592baaac0..51c9eec639cb 100644
--- a/arch/x86/rpal/thread.c
+++ b/arch/x86/rpal/thread.c
@@ -186,6 +186,8 @@ int rpal_register_receiver(unsigned long addr)
 	current->rpal_rd = rrd;
 	rpal_set_current_thread_flag(RPAL_RECEIVER_BIT);
 
+	rpal_insert_wake_list(cur, rrd);
+
 	atomic_inc(&cur->thread_cnt);
 
 	return 0;
@@ -214,6 +216,7 @@ int rpal_unregister_receiver(void)
 
 	clear_fs_tsk_map();
 	rpal_put_shared_page(rrd->rsp);
+	rpal_remove_wake_list(cur, rrd);
 	rpal_clear_current_thread_flag(RPAL_RECEIVER_BIT);
 	rpal_free_thread_pending(&rrd->rcd);
 	kfree(rrd);
diff --git a/include/linux/rpal.h b/include/linux/rpal.h
index 16a3c80383f7..1d8c1bdc90f2 100644
--- a/include/linux/rpal.h
+++ b/include/linux/rpal.h
@@ -116,6 +116,7 @@ enum rpal_task_flag_bits {
 	RPAL_RECEIVER_BIT,
 	RPAL_CPU_LOCKED_BIT,
 	RPAL_LAZY_SWITCHED_BIT,
+	RPAL_WAKE_BIT,
 };
 
 enum rpal_receiver_state {
@@ -189,6 +190,12 @@ struct rpal_fsbase_tsk_map {
 	struct task_struct *tsk;
 };
 
+struct rpal_waker_struct {
+	spinlock_t lock;
+	struct list_head wake_head;
+	struct delayed_work waker_work;
+};
+
 /*
  * Each RPAL process (a.k.a RPAL service) should have a pointer to
  * struct rpal_service in all its tasks' task_struct.
@@ -255,6 +262,9 @@ struct rpal_service {
 	int pkey;
 #endif
 
+	/* receiver thread waker */
+	struct rpal_waker_struct waker;
+
 	/* delayed service put work */
 	struct delayed_work delayed_put_work;
 
@@ -347,6 +357,7 @@ struct rpal_receiver_data {
 	struct fd f;
 	struct hrtimer_sleeper ep_sleeper;
 	wait_queue_entry_t ep_wait;
+	struct list_head wake_list;
 };
 
 struct rpal_sender_data {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 486d59bdd3fc..c219ada29d34 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3943,6 +3943,7 @@ static bool rpal_check_state(struct task_struct *p)
 	struct rpal_receiver_call_context *rcc = p->rpal_rd->rcc;
 	int state;
 
+	rpal_clear_task_thread_flag(p, RPAL_WAKE_BIT);
 retry:
 	state = atomic_read(&rcc->receiver_state) & RPAL_RECEIVER_STATE_MASK;
 	switch (state) {
@@ -3957,6 +3958,7 @@ static bool rpal_check_state(struct task_struct *p)
 	case RPAL_RECEIVER_STATE_RUNNING:
 		break;
 	case RPAL_RECEIVER_STATE_CALL:
+		rpal_set_task_thread_flag(p, RPAL_WAKE_BIT);
 		ret = false;
 		break;
 	default:
@@ -4522,6 +4524,7 @@ int rpal_try_to_wake_up(struct task_struct *p)
 
 	BUG_ON(READ_ONCE(p->__state) == TASK_RUNNING);
 
+	rpal_clear_task_thread_flag(p, RPAL_WAKE_BIT);
 	scoped_guard (raw_spinlock_irqsave, &p->pi_lock) {
 		smp_mb__after_spinlock();
 		if (!ttwu_state_match(p, TASK_NORMAL, &success))
-- 
2.20.1