Date: Thu, 13 Aug 2020 17:54:12 +0200
From: Michal Hocko
To: "Paul E. McKenney"
Cc: Thomas Gleixner, Uladzislau Rezki, LKML, RCU, linux-mm@kvack.org,
	Andrew Morton, Vlastimil Babka, Matthew Wilcox, "Theodore Y . Ts'o",
	Joel Fernandes, Sebastian Andrzej Siewior, Oleksiy Avramchenko,
	Peter Zijlstra
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag
Message-ID: <20200813155412.GP9477@dhcp22.suse.cz>
In-Reply-To: <20200813154159.GR4295@paulmck-ThinkPad-P72>

On Thu 13-08-20 08:41:59, Paul E. McKenney wrote:
> On Thu, Aug 13, 2020 at 04:53:35PM +0200, Michal Hocko wrote:
> > On Thu 13-08-20 16:34:57, Thomas Gleixner wrote:
> > > Michal Hocko writes:
> > > > On Thu 13-08-20 15:22:00, Thomas Gleixner wrote:
> > > >> It basically requires to convert the wait queue to something else. Is
> > > >> the waitqueue strict single waiter?
> > > >
> > > > I would have to double check. From what I remember only kswapd should
> > > > ever sleep on it.
> > >
> > > That would make it trivial as we could simply switch it over to rcu_wait.
> > >
> > > >> So that should be:
> > > >>
> > > >> 	if (!preemptible() && gfp == GFP_RT_NOWAIT)
> > > >>
> > > >> which is limiting the damage to those callers which hand in
> > > >> GFP_RT_NOWAIT.
> > > >>
> > > >> lockdep will yell at invocations with gfp != GFP_RT_NOWAIT when it hits
> > > >> zone->lock in the wrong context. And we want to know about that so we
> > > >> can look at the caller and figure out how to solve it.
> > > >
> > > > Yes, that would have to somehow need to annotate the zone_lock to be ok
> > > > in those paths so that lockdep doesn't complain.
> > >
> > > That opens the worst of all cans of worms. If we start this here then
> > > Joe programmer and his dog will use these lockdep annotation to evade
> > > warnings and when exposed to RT it will fall apart in pieces. Just that
> > > at that point Joe programmer moved on to something else and the usual
> > > suspects can mop up the pieces. We've seen that all over the place and
> > > some people even disable lockdep temporarily because annotations don't
> > > help.
> >
> > Hmm. I am likely missing something really important here. We have two
> > problems at hand:
> > 1) RT will become broken as soon as this new RCU functionality which
> >    requires an allocation from inside of raw_spinlock hits the RT tree
> > 2) lockdep splats which are telling us that early because of the
> >    raw_spinlock -> spin_lock dependency.
>
> That is a reasonable high-level summary.
>
> > 1) can be handled by bailing out whenever we have to use
> >    zone->lock inside the buddy allocator - essentially even more strict
> >    NOWAIT semantic than we have for RT tree - proposed (pseudo) patch is
> >    trying to describe that.
>
> Unless I am missing something subtle, the problem with this approach
> is that in production-environment CONFIG_PREEMPT_NONE=y kernels, there
> is no way at runtime to distinguish between holding a spinlock on the
> one hand and holding a raw spinlock on the other. Therefore, without
> some sort of indication from the caller, this approach will not make
> CONFIG_PREEMPT_NONE=y users happy.

If the whole bailout is guarded by a CONFIG_PREEMPT_RT-specific atomicity
check then there is no functional problem - GFP_RT_SAFE would still be
GFP_NOWAIT, so functionally the allocator will still do the right thing.

[...]

> > That would require changing NOWAIT/ATOMIC allocations semantic quite
> > drastically for !RT kernels as well. I am not sure this is something we
> > can do. Or maybe I am just missing your point.
>
> Exactly, and avoiding changing this semantic for current users is
> precisely why we are proposing some sort of indication to be passed
> into the allocation request. In Uladzislau's patch, this was the
> __GFP_NO_LOCKS flag, but whatever works.

As I've tried to explain already, I would really hope we can do without
any new gfp flags. We are running out of them and they tend to generate
a lot of maintenance burden. There is a lot of abuse etc. We should also
not expose such an implementation detail of the allocator to callers
because that would make future changes even harder. The alias, on the
other hand, already builds on top of the existing NOWAIT semantic; it just
helps the allocator to complain about wrong usage and doesn't expose any
internals.

-- 
Michal Hocko
SUSE Labs
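
To make the alias idea above a bit more concrete, here is a minimal
userspace sketch. It is not kernel code and not the proposed patch: every
model_* / MODEL_* name is hypothetical, the flag value is arbitrary, and
the CONFIG_PREEMPT_RT switch, preemptible() and "this request would need
zone->lock" are modelled as plain booleans. It only tries to show that an
alias of NOWAIT plus an RT-only bail-out leaves !RT behaviour unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel definitions. */
typedef unsigned int model_gfp_t;

#define MODEL_GFP_NOWAIT   0x1u              /* arbitrary illustrative value */
#define MODEL_GFP_RT_SAFE  MODEL_GFP_NOWAIT  /* alias: no new flag bit is used */

struct model_ctx {
	bool preempt_rt;       /* stands in for CONFIG_PREEMPT_RT=y */
	bool preemptible;      /* stands in for preemptible() */
	bool needs_zone_lock;  /* request cannot be served without zone->lock */
};

enum model_result {
	MODEL_ALLOCATED,   /* request was served */
	MODEL_BAILED_OUT,  /* allocator refused rather than take zone->lock */
};

static enum model_result model_alloc(const struct model_ctx *ctx, model_gfp_t gfp)
{
	assert(ctx != NULL);

	/*
	 * The bail-out exists only in the RT configuration. Without RT the
	 * alias behaves exactly like plain NOWAIT.
	 */
	if (ctx->preempt_rt && ctx->needs_zone_lock &&
	    !ctx->preemptible && gfp == MODEL_GFP_RT_SAFE)
		return MODEL_BAILED_OUT;

	return MODEL_ALLOCATED;
}
```

In this model a caller in atomic context on an RT configuration gets
MODEL_BAILED_OUT only when zone->lock would be needed; the same call
without RT, or a call that does not need the lock, is served as before.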