Date: Tue, 12 Sep 2023 09:49:08 -0700
From: "Paul E. McKenney"
To: "Liam R. Howlett", Geert Uytterhoeven, Andrew Morton,
	maple-tree@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org, Shanker Donthineni
Subject: Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible
Message-ID: <1189edca-9029-4c2e-ba71-b3a1c15b61dc@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <495849d6-1dc6-4f38-bce7-23c50df3a99f@paulmck-laptop>
	<20230911235452.xhtnt7ply7ayr53x@revolver>
	<33150b55-970c-4607-9015-af0e50e4112d@paulmck-laptop>
	<62936d98-6353-486e-8535-86c9f90bc7f4@paulmck-laptop>
	<20230912135617.dnhyk4h5c555l2yg@revolver>
	<9e85adf9-2e1f-4bed-a58e-9ca629c03579@paulmck-laptop>
	<20230912154423.gcb5rzwzh4jbcaw7@revolver>
In-Reply-To: <20230912154423.gcb5rzwzh4jbcaw7@revolver>

On Tue, Sep 12, 2023 at 11:44:23AM -0400, Liam R. Howlett wrote:
> * Paul E. McKenney [230912 11:07]:
> > On Tue, Sep 12, 2023 at 09:56:17AM -0400, Liam R. Howlett wrote:
> > > * Paul E. McKenney [230912 06:00]:
> > > > On Tue, Sep 12, 2023 at 10:34:44AM +0200, Geert Uytterhoeven wrote:
> > > > > Hi Paul,
> > > > >
> > > > > On Tue, Sep 12, 2023 at 10:30 AM Paul E. McKenney wrote:
> > > > > > On Tue, Sep 12, 2023 at 10:23:37AM +0200, Geert Uytterhoeven wrote:
> > > > > > > On Tue, Sep 12, 2023 at 10:14 AM Paul E. McKenney wrote:
> > > > > > > > On Mon, Sep 11, 2023 at 07:54:52PM -0400, Liam R. Howlett wrote:
> > > > > > > > > * Paul E. McKenney [230906 14:03]:
> > > > > > > > > > On Wed, Sep 06, 2023 at 01:29:54PM -0400, Liam R. Howlett wrote:
> > > > > > > > > > > * Paul E. McKenney [230906 13:24]:
> > > > > > > > > > > > On Wed, Sep 06, 2023 at 11:23:25AM -0400, Liam R. Howlett wrote:
> > > > > > > > > > > > > (Adding Paul & Shanker to Cc list.. please see below for why)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Apologies on the late response, I was away and have been struggling to
> > > > > > > > > > > > > get a working PPC32 test environment.
> > > > > > > > > > > > >
> > > > > > > > > > > > > * Geert Uytterhoeven [230829 12:42]:
> > > > > > > > > > > > > > Hi Liam,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, 18 Aug 2023, Liam R. Howlett wrote:
> > > > > > > > > > > > > > > The current implementation of append may cause duplicate data and/or
> > > > > > > > > > > > > > > incorrect ranges to be returned to a reader during an update. Although
> > > > > > > > > > > > > > > this has not been reported or seen, disable the append write operation
> > > > > > > > > > > > > > > while the tree is in rcu mode out of an abundance of caution.
> > > > > > > > > > > > >
> > > > > > > > > > > > > ...
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > ...
> > > > > > > > >
> > > > > > > > > > > > > > RCU-related configs:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >     $ grep RCU .config
> > > > > > > > > > > > > >     # RCU Subsystem
> > > > > > > > > > > > > >     CONFIG_TINY_RCU=y
> > > > > > > >
> > > > > > > > I must have been asleep last time I looked at this. I was looking at
> > > > > > > > Tree RCU. Please accept my apologies for my lapse. :-/
> > > > > > > >
> > > > > > > > However, Tiny RCU's call_rcu() also avoids enabling IRQs, so I would
> > > > > > > > have said the same thing, albeit after looking at a lot less RCU code.
> > > > > > > >
> > > > > > > > TL;DR:
> > > > > > > >
> > > > > > > > 1.	Try making the __setup_irq() function's call to mutex_lock()
> > > > > > > > 	instead be as follows:
> > > > > > > >
> > > > > > > > 	if (!mutex_trylock(&desc->request_mutex))
> > > > > > > > 		mutex_lock(&desc->request_mutex);
> > > > > > > >
> > > > > > > > 	This might fail if __setup_irq() has other dependencies on a
> > > > > > > > 	fully operational scheduler.
> > > > > > > >
> > > > > > > > 2.	Move that ppc32 call to __setup_irq() much later, most definitely
> > > > > > > > 	after interrupts have been enabled and the scheduler is fully
> > > > > > > > 	operational. Invoking mutex_lock() before that time is not a
> > > > > > > > 	good idea. ;-)
> > > > > > >
> > > > > > > There is no call to __setup_irq() from arch/powerpc/?
> > > > > >
> > > > > > Glad it is not just me, given that I didn't see a direct call, either. So
> > > > > > later in this email, I asked Liam to put a WARN_ON_ONCE(irqs_disabled())
> > > > > > just before that mutex_lock() in __setup_irq().
> > >
> > > I had already found that this is the mutex lock that is enabling them.
> > > I surrounded the mutex lock to ensure it was not enabled before, but was
> > > after. Here is the findings:
> > >
> > > kernel/irq/manage.c:1587 __setup_irq:
> > > [    0.000000] [c0e65ec0] [c00e9b00] __setup_irq+0x6c4/0x840 (unreliable)
> > > [    0.000000] [c0e65ef0] [c00e9d74] request_threaded_irq+0xf8/0x1f4
> > > [    0.000000] [c0e65f20] [c0c27168] pmac_pic_init+0x204/0x5f8
> > > [    0.000000] [c0e65f80] [c0c1f544] init_IRQ+0xac/0x12c
> > > [    0.000000] [c0e65fa0] [c0c1cad0] start_kernel+0x544/0x6d4
> > >
> > > Note your line number will be slightly different due to my debug. This
> > > is the WARN _after_ the mutex lock.
> > >
> > > > > >
> > > > > > Either way, invoking mutex_lock() early in boot before interrupts have
> > > > > > been enabled is a bad idea. ;-)
> > > > >
> > > > > I'll add that WARN_ON_ONCE() too, and will report back later today...
> > > >
> > > > Thank you, looking forward to hearing the outcome!
> > > >
> > > > > > > Note that there are (possibly different) issues seen on ppc32 and on arm32
> > > > > > > (Renesas RZ/A in particular, but not on other Renesas ARM systems).
> > > > > > >
> > > > > > > I saw an issue on arm32 with cfeb6ae8bcb96ccf, but not with cfeb6ae8bcb96ccf^.
> > > > > > > Other people saw an issue on ppc32 with both cfeb6ae8bcb96ccf and
> > > > > > > cfeb6ae8bcb96ccf^.
> > > > > >
> > > > > > I look forward to hearing what is the issue in both cases.
> > > > >
> > > > > For RZ/A, my problem report is
> > > > > https://lore.kernel.org/all/3f86d58e-7f36-c6b4-c43a-2a7bcffd3bd@linux-m68k.org/
> > > >
> > > > Thank you, Geert!
> > > >
> > > > Huh. Is that patch you reverted causing Maple Tree or related code
> > > > to attempt to acquire mutexes in early boot before interrupts have
> > > > been enabled?
> > > >
> > > > If that added WARN_ON_ONCE() doesn't trigger early, another approach
> > > > would be to put it at the beginning of mutex_lock(). Or for that matter
> > > > at the beginning of might_sleep().
> > >
> > > Yeah, I put many WARN() calls through the code as well as tracking down
> > > where TIF_NEED_RESCHED was set; the tiny.c call_rcu().
> > >
> > >
> > > So my findings summarized:
> > >
> > > 1. My change to the maple tree makes call_rcu() more likely on early boot.
> > > 2. The initial thread setup is always set to idle state
> > > 3. call_rcu() tiny sets TIF_NEED_RESCHED since is_idle_task(current)
> > > 4. init_IRQ() takes a mutex lock which will enable the interrupts since
> > > TIF_NEED_RESCHED is set.
> > >
> > > I don't know which of these things is "wrong".
> >
> > Doing early-boot call_rcu() is OK.
> >
> > The initial thread eventually becomes the idle thread for the boot CPU.
> > See rest_init() in init/main.c.
> >
> > I can certainly make Tiny call_rcu() refrain from invoking resched_cpu()
> > during boot, as shown in the (untested) patch below. This might result in
> > boot-time hangs, though.
>
> If we set the current thread as !idle, then we don't need to add
> overhead to every call_rcu(), and you've already tracked down where I
> need to change the flags back to idle. Patch below.

I personally like your patch way better than mine, but it will need
both eyes and time on it. I wouldn't put it past someone to assume that
the boot CPU is running the idle thread early in boot. :-/

> > The thought of doing mutex_lock() before interrupts are enabled on the
> > boot CPU strikes me as very wrong. Others might argue that the fact
> > that __might_resched() explicitly avoids complaining when system_state
> > is equal to SYSTEM_BOOTING constitutes evidence that such calls are OK.
> > (Which might be why enabling debug suppressed the problem.) Except that
> > if you actually try sleeping at that time, nothing good can possibly
> > happen.
>
> Does lockdep check for SYSTEM_BOOTING as well? That could be another
> reason?

Not from what I can see, but I could be missing something.

> > So my question is why is it useful to setup interrupts that early, given
> > that interrupts cannot possibly happen until the boot CPU enables them?
>
> I don't know for sure, but there are 'preallocated IRQs' which end up
> grouped 0-15, then I see another one added at 55 after the mpic console
> output. I suspect it's so that they can be added as they are discovered
> during early boot?

Christophe argues that the interrupt stacks must be allocated early on,
and that this acquires a mutex.

> The below is not fully tested, but qemu stops throwing the warning on
> boot and it doesn't add instructions to call_rcu().

Two points in its favor! ;-)

							Thanx, Paul

> ------------------------------------------------------------------------
> diff --git a/init/main.c b/init/main.c
> index dbe1fe76be34..fd4739918a94 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -696,7 +696,7 @@ noinline void __ref __noreturn rest_init(void)
>  	 */
>  	rcu_read_lock();
>  	tsk = find_task_by_pid_ns(pid, &init_pid_ns);
> -	tsk->flags |= PF_NO_SETAFFINITY;
> +	tsk->flags |= PF_NO_SETAFFINITY & PF_IDLE;
>  	set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id()));
>  	rcu_read_unlock();
>  
> @@ -943,6 +943,7 @@ void start_kernel(void)
>  	 * time - but meanwhile we still have a functioning scheduler.
>  	 */
>  	sched_init();
> +	current->flags &= ~PF_IDLE;
>  
>  	if (WARN(!irqs_disabled(),
>  		 "Interrupts were enabled *very* early, fixing it\n"))
>
> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> > index fec804b79080..f00fb0855e4b 100644
> > --- a/kernel/rcu/tiny.c
> > +++ b/kernel/rcu/tiny.c
> > @@ -192,7 +192,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> >  	rcu_ctrlblk.curtail = &head->next;
> >  	local_irq_restore(flags);
> >  
> > -	if (unlikely(is_idle_task(current))) {
> > +	if (unlikely(is_idle_task(current)) && system_state > SYSTEM_BOOTING) {
> >  		/* force scheduling for rcu_qs() */
> >  		resched_cpu(0);
> >  	}