From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 9 Apr 2025 21:08:34 +0200
From: Peter Zijlstra
To: Mathieu Desnoyers
Cc: Gabriele Monaco, linux-kernel@vger.kernel.org, Andrew Morton,
	Ingo Molnar, "Paul E. McKenney", linux-mm@kvack.org, Ingo Molnar,
	Shuah Khan
Subject: Re: [PATCH v12 2/3] sched: Move task_mm_cid_work to mm work_struct
Message-ID: <20250409190834.GQ9833@noisy.programming.kicks-ass.net>
References: <20250311062849.72083-1-gmonaco@redhat.com>
	<20250311062849.72083-3-gmonaco@redhat.com>
	<20250409140303.GA9833@noisy.programming.kicks-ass.net>
	<20250409152025.GK9833@noisy.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender:
 owner-linux-mm@kvack.org
Precedence: bulk

On Wed, Apr 09, 2025 at 11:53:05AM -0400, Mathieu Desnoyers wrote:
> On 2025-04-09 11:20, Peter Zijlstra wrote:
> > On Wed, Apr 09, 2025 at 10:15:42AM -0400, Mathieu Desnoyers wrote:
> > > On 2025-04-09 10:03, Peter Zijlstra wrote:
> > > > On Tue, Mar 11, 2025 at 07:28:45AM +0100, Gabriele Monaco wrote:
> > > > > +static inline void rseq_preempt_from_tick(struct task_struct *t)
> > > > > +{
> > > > > +	u64 rtime = t->se.sum_exec_runtime - t->se.prev_sum_exec_runtime;
> > > > > +
> > > > > +	if (rtime > RSEQ_UNPREEMPTED_THRESHOLD)
> > > > > +		rseq_preempt(t);
> > > > > +}
> > > >
> > > > This confused me.
> > > >
> > > > The goal seems to be to tickle __rseq_handle_notify_resume() so it'll
> > > > end up queueing that work thing. But why do we want to set PREEMPT_BIT
> > > > here?
> > >
> > > In that scenario, we trigger (from tick) the fact that we may recompact the
> > > mm_cid, and thus need to update the rseq mm_cid field before returning to
> > > userspace.
> > >
> > > Changing the value of the mm_cid field while userspace is within a rseq
> > > critical section should abort the critical section, because the rseq
> > > critical section should be able to expect the mm_cid to be invariant
> > > for the whole c.s..
> >
> > But, if we run that compaction in a worker, what guarantees the
> > compaction is done and mm_cid is stable, by the time this task returns
> > to userspace again?
>
> So let's say we have a task which is running and not preempted by any
> other task on a cpu for a long time.
>
> The idea is to have the tick do two things:
>
> A) trigger the mm_cid recompaction,
>
> B) trigger an update of the task's rseq->mm_cid field at some point
>    after recompaction, so it can get a mm_cid value closer to 0.
>
> So in its current form this patch will indeed trigger rseq_preempt()
> for *every tick* after the task has run for more than 100ms, which
> I don't think is intended. This should be fixed.
>
> Also, doing just an rseq_preempt() is not the correct approach, as
> AFAIU it won't force the long running task to release the currently
> held mm_cid value.
>
> I think we need something that looks like the following based on the
> current patch:
>
> - rename rseq_preempt_from_tick() to rseq_tick(),
>
> - modify rseq_tick() to ensure it calls rseq_set_notify_resume(t)
>   rather than rseq_preempt().
>
> - modify rseq_tick() to ensure it only calls it once every
>   RSEQ_UNPREEMPTED_THRESHOLD, rather than every tick after
>   RSEQ_UNPREEMPTED_THRESHOLD.
>
> - modify rseq_tick() so at some point after the work has
>   compacted mm_cids, we do the same things as switch_mm_cid()
>   does, namely to release the currently held cid and get a likely
>   smaller one (closer to 0). If the value changes, then we should
>   trigger rseq_preempt() so the task updates the mm_cid field before
>   returning to userspace and restarts ongoing rseq critical section.
>
> Thoughts ?

Yes, that seems better. Also be sure there's a comment around there
somewhere that explains this. Because I'm sure I'll have forgotten all
about this in a few months' time :-)
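
For the "once every RSEQ_UNPREEMPTED_THRESHOLD" part, a rough userspace
model of the difference might look like the below. This is only a sketch
under assumptions, not kernel code: struct model_task, last_rseq_tick,
notify_count and every model_*() helper are made up for illustration, the
threshold is taken as the 100ms mentioned above, and the cid
release/re-acquire step is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to be 100ms in ns, going by the "100ms" quoted in the thread. */
#define RSEQ_UNPREEMPTED_THRESHOLD	(100ULL * 1000 * 1000)

/* Hypothetical stand-in for the few task fields involved. */
struct model_task {
	uint64_t sum_exec_runtime;	/* total runtime so far, ns */
	uint64_t prev_sum_exec_runtime;	/* runtime when last scheduled in, ns */
	uint64_t last_rseq_tick;	/* hypothetical: runtime at last notify */
	unsigned int notify_count;	/* number of notify-resume requests */
};

/* Models a notify-resume request by counting it. */
static void model_set_notify_resume(struct model_task *t)
{
	t->notify_count++;
}

/* Shape of the posted patch: fires on every tick past the threshold. */
void model_rseq_preempt_from_tick(struct model_task *t)
{
	uint64_t rtime = t->sum_exec_runtime - t->prev_sum_exec_runtime;

	if (rtime > RSEQ_UNPREEMPTED_THRESHOLD)
		model_set_notify_resume(t);
}

/* Suggested shape: fire at most once per threshold of unpreempted runtime. */
void model_rseq_tick(struct model_task *t)
{
	uint64_t base = t->prev_sum_exec_runtime;

	/* Measure from the last notify if it happened during this run. */
	if (t->last_rseq_tick > base)
		base = t->last_rseq_tick;

	/* Runtime only grows, so it can never be behind the base. */
	assert(t->sum_exec_runtime >= base);
	if (t->sum_exec_runtime - base <= RSEQ_UNPREEMPTED_THRESHOLD)
		return;

	t->last_rseq_tick = t->sum_exec_runtime;
	model_set_notify_resume(t);
}

/*
 * Simulate one unpreempted stretch of total_ns with a tick every tick_ns
 * (tick_ns must be non-zero) and return how many requests were made.
 */
unsigned int model_run(void (*tick)(struct model_task *),
		       uint64_t total_ns, uint64_t tick_ns)
{
	struct model_task t = { 0 };

	for (uint64_t now = tick_ns; now <= total_ns; now += tick_ns) {
		t.sum_exec_runtime = now;
		tick(&t);
	}
	return t.notify_count;
}
```

With a 4ms tick over one second of unpreempted runtime,
model_run(model_rseq_preempt_from_tick, ...) counts 225 requests, while
model_run(model_rseq_tick, ...) counts 9.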