From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35])
    by mail.linuxfoundation.org (Postfix) with ESMTPS id 52C1697
    for ; Wed, 12 Aug 2015 01:16:23 +0000 (UTC)
Received: from mail-oi0-f47.google.com (mail-oi0-f47.google.com [209.85.218.47])
    by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 09B5A15D
    for ; Wed, 12 Aug 2015 01:16:21 +0000 (UTC)
Received: by oip136 with SMTP id 136so1468113oip.1
    for ; Tue, 11 Aug 2015 18:16:21 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20150812005117.GJ3895@linux.vnet.ibm.com>
References: <20150811183312.GE3895@linux.vnet.ibm.com>
    <20150811214729.GH3895@linux.vnet.ibm.com>
    <20150812005117.GJ3895@linux.vnet.ibm.com>
From: Andy Lutomirski
Date: Tue, 11 Aug 2015 18:16:01 -0700
Message-ID:
To: Paul McKenney
Content-Type: text/plain; charset=UTF-8
Cc: "ksummit-discuss@lists.linuxfoundation.org", Peter Zijlstra, "linux-kernel@vger.kernel.org", Ingo Molnar, Chris Metcalf, Christoph Lameter
Subject: Re: [Ksummit-discuss] [BELATED CORE TOPIC] context tracking / nohz / RCU state
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

On Tue, Aug 11, 2015 at 5:51 PM, Paul E. McKenney wrote:
> On Tue, Aug 11, 2015 at 02:52:59PM -0700, Andy Lutomirski wrote:
>> On Tue, Aug 11, 2015 at 2:47 PM, Paul E. McKenney
>> wrote:
>> > On Tue, Aug 11, 2015 at 12:07:54PM -0700, Andy Lutomirski wrote:
>> >> On Tue, Aug 11, 2015 at 11:33 AM, Paul E. McKenney
>> >> wrote:
>> >> > On Tue, Aug 11, 2015 at 10:49:36AM -0700, Andy Lutomirski wrote:
>> >> >> This is a bit late, but here goes anyway.
>> >> >>
>> >> >> Having played with the x86 context tracking hooks for awhile, I think
>> >> >> it would be nice if core code that needs to be aware of CPU context
>> >> >> (kernel, user, idle, guest, etc) could come up with single,
>> >> >> comprehensible, easily validated set of hooks that arch code is
>> >> >> supposed to call.
>> >> >>
>> >> >> Currently we have:
>> >> >>
>> >> >> - RCU hooks, which come in a wide variety to notify about IRQs, NMIs, etc.
>> >> >
>> >> > Something about people yelling at me for waking up idle CPUs, thus
>> >> > degrading their battery lifetimes. ;-)
>> >> >
>> >> >> - Context tracking hooks. Only used by some arches. Calling these
>> >> >> calls the RCU hooks for you in most cases. They have weird
>> >> >> interactions with interrupts and they're slow.
>> >> >
>> >> > Combining these would be good, but there are subtleties. For example,
>> >> > some arches don't have context tracking, but RCU still needs to correctly
>> >> > identify idle CPUs without in any way interrupting or awakening that CPU.
>> >> > It would be good to make this faster, but it does have to work.
>> >>
>> >> Could we maybe have one set of old RCU-only (no context tracking)
>> >> callbacks and a completely separate set of callbacks for arches that
>> >> support full context tracking? The implementation of the latter would
>> >> presumably call into RCU.
>> >
>> > It should be possible for RCU to use context tracking if it is available
>> > and to have RCU maintain its own state otherwise, if that is what you
>> > are getting at. Assuming that the decision is global and made at either
>> > build or boot time, anyway. Having some CPUs tracking context and others
>> > not sounds like an invitation for subtle bugs.
>>
>> I think that, if this happens, the decision should be made at build
>> time, per arch, and not be configurable. If x86_64 uses context
>> tracking, then I think x86_64 shouldn't need additional RCU callbacks,
>> assuming that context tracking is comprehensive enough for RCU's
>> purposes.
>
> If by "shouldn't need additional RCU callbacks" you mean that x86_64
> shouldn't need to call the existing rcu_user_enter() and rcu_user_exit()
> functions, I agree. Ditto for rcu_irq_enter(), rcu_irq_exit(),
> rcu_nmi_enter(), rcu_nmi_exit(), I would guess.
> But would be necessary
> to invoke rcu_idle_enter() and rcu_idle_exit(), especially for
> CONFIG_NO_HZ_FULL_SYSIDLE=y kernels.

Except that something wants vtime for idle, too, so maybe just
kernel_to_idle(). On the other hand, the idle loop is already fully
stocked with vtime stuff.

--Andy
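
[Editorial sketch, not part of the original message.] To make the
"single set of hooks" idea concrete, here is a small userspace model in C
of what one transition hook might look like if it fed both RCU and vtime
accounting. Everything in it is an assumption made for illustration:
struct ctx_state, ctx_transition(), idle_to_kernel() and the "now"
timestamp parameter are hypothetical names, and kernel_to_idle() is
borrowed from the message only as a name, not as an existing kernel
interface. The rule that RCU watches a CPU only while it runs kernel code
is also a deliberate simplification.

```c
/* Userspace model only; none of these names are real kernel API. */
enum cpu_ctx { CTX_KERNEL, CTX_USER, CTX_IDLE, CTX_GUEST, CTX_NR };

struct ctx_state {
    enum cpu_ctx cur;             /* context the CPU is in right now */
    int rcu_watching;             /* 1 while RCU must treat this CPU as active */
    unsigned long since;          /* timestamp of the last transition */
    unsigned long vtime[CTX_NR];  /* time accounted to each context */
};

/*
 * The one hook that arch code would call. It charges the elapsed time
 * to the context being left (the vtime side) and then updates the
 * RCU-facing flag (the RCU side), so no separate callbacks are needed.
 */
static void ctx_transition(struct ctx_state *s, enum cpu_ctx next,
                           unsigned long now)
{
    s->vtime[s->cur] += now - s->since;
    s->since = now;
    /* Simplifying assumption: only kernel context needs RCU's attention. */
    s->rcu_watching = (next == CTX_KERNEL);
    s->cur = next;
}

/* Thin named wrappers, in the spirit of kernel_to_idle() above. */
static void kernel_to_idle(struct ctx_state *s, unsigned long now)
{
    ctx_transition(s, CTX_IDLE, now);
}

static void idle_to_kernel(struct ctx_state *s, unsigned long now)
{
    ctx_transition(s, CTX_KERNEL, now);
}
```

In this model the idle loop would call kernel_to_idle() on the way in and
idle_to_kernel() on the way out, and both the RCU state and the idle
vtime fall out of the same call. Whether a real implementation could be
this simple depends on the IRQ and NMI nesting questions raised earlier
in the thread, which the sketch ignores.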