From: Qi Zheng
Date: Wed, 12 Apr 2023 14:45:33 +0800
Subject: Re: [PATCH] mm: slub: annotate kmem_cache_node->list_lock as raw_spinlock
To: Boqun Feng
Cc: Vlastimil Babka, 42.hyeyoo@gmail.com, akpm@linux-foundation.org,
 roman.gushchin@linux.dev, iamjoonsoo.kim@lge.com, rientjes@google.com,
 penberg@kernel.org, cl@linux.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Zhao Gongyi, Sebastian Andrzej Siewior,
 Thomas Gleixner, RCU, "Paul E . McKenney"
Message-ID: <31a423b5-5662-6358-a10d-489126ee0b01@bytedance.com>
References: <20230411130854.46795-1-zhengqi.arch@bytedance.com> <932bf921-a076-e166-4f95-1adb24d544cf@bytedance.com>

On 2023/4/12 13:51, Boqun Feng wrote:
> On Tue, Apr 11, 2023 at 10:25:06PM +0800, Qi Zheng wrote:
>>
>>
>> On 2023/4/11 22:19, Vlastimil Babka wrote:
>>> On 4/11/23 16:08, Qi Zheng wrote:
>>>>
>>>>
>>>> On 2023/4/11 21:40, Vlastimil Babka wrote:
>>>>> On 4/11/23 15:08, Qi Zheng wrote:
>>>>>> The list_lock can be held in the critical section of
>>>>>> raw_spinlock, and then lockdep will complain about it
>>>>>> like below:
>>>>>>
>>>>>> =============================
>>>>>> [ BUG: Invalid wait context ]
>>>>>> 6.3.0-rc6-next-20230411 #7 Not tainted
>>>>>> -----------------------------
>>>>>> swapper/0/1 is trying to lock:
>>>>>> ffff888100055418 (&n->list_lock){....}-{3:3}, at: ___slab_alloc+0x73d/0x1330
>>>>>> other info that might help us debug this:
>>>>>> context-{5:5}
>>>>>> 2 locks held by swapper/0/1:
>>>>>>  #0: ffffffff824e8160 (rcu_tasks.cbs_gbl_lock){....}-{2:2}, at: cblist_init_generic+0x22/0x2d0
>>>>>>  #1: ffff888136bede50 (&ACCESS_PRIVATE(rtpcp, lock)){....}-{2:2}, at: cblist_init_generic+0x232/0x2d0
>>>>>> stack backtrace:
>>>>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc6-next-20230411 #7
>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
>>>>>> Call Trace:
>>>>>>
>>>>>>  dump_stack_lvl+0x77/0xc0
>>>>>>  __lock_acquire+0xa65/0x2950
>>>>>>  ? arch_stack_walk+0x65/0xf0
>>>>>>  ? arch_stack_walk+0x65/0xf0
>>>>>>  ? unwind_next_frame+0x602/0x8d0
>>>>>>  lock_acquire+0xe0/0x300
>>>>>>  ? ___slab_alloc+0x73d/0x1330
>>>>>>  ? find_usage_forwards+0x39/0x50
>>>>>>  ? check_irq_usage+0x162/0xa70
>>>>>>  ? __bfs+0x10c/0x2c0
>>>>>>  _raw_spin_lock_irqsave+0x4f/0x90
>>>>>>  ? ___slab_alloc+0x73d/0x1330
>>>>>>  ___slab_alloc+0x73d/0x1330
>>>>>>  ? fill_pool+0x16b/0x2a0
>>>>>>  ? look_up_lock_class+0x5d/0x160
>>>>>>  ? register_lock_class+0x48/0x500
>>>>>>  ? __lock_acquire+0xabc/0x2950
>>>>>>  ? fill_pool+0x16b/0x2a0
>>>>>>  kmem_cache_alloc+0x358/0x3b0
>>>>>>  ? __lock_acquire+0xabc/0x2950
>>>>>>  fill_pool+0x16b/0x2a0
>>>>>>  ? __debug_object_init+0x292/0x560
>>>>>>  ? lock_acquire+0xe0/0x300
>>>>>>  ? cblist_init_generic+0x232/0x2d0
>>>>>>  __debug_object_init+0x2c/0x560
>
> This "__debug_object_init" is because INIT_WORK() is called in
> cblist_init_generic(), so..

Yes, a more precise call stack is as follows:

cblist_init_generic
--> INIT_WORK
--> lockdep_init_map
--> lockdep_init_map_type
--> register_lock_class
--> init_data_structures_once
--> init_rcu_head
--> debug_object_init
--> __debug_object_init

>
>>>>>>  cblist_init_generic+0x147/0x2d0
>>>>>>  rcu_init_tasks_generic+0x15/0x190
>>>>>>  kernel_init_freeable+0x6e/0x3e0
>>>>>>  ? rest_init+0x1e0/0x1e0
>>>>>>  kernel_init+0x1b/0x1d0
>>>>>>  ? rest_init+0x1e0/0x1e0
>>>>>>  ret_from_fork+0x1f/0x30
>>>>>>
>>>>>>
>>>>>> The fill_pool() can only be called in the !PREEMPT_RT kernel
>>>>>> or in the preemptible context of the PREEMPT_RT kernel, so
>>>>>> the above warning is not a real issue, but it's better to
>>>>>> annotate kmem_cache_node->list_lock as raw_spinlock to get
>>>>>> rid of such issue.
>>>>>
>>>>> + CC some RT and RCU people
>>>>
>>>> Thanks.
>>>>
>>>>>
>>>>> AFAIK raw_spinlock is not just an annotation, but on RT it changes the
>>>>> implementation from preemptible mutex to actual spin lock, so it would be
>>>>
>>>> Yeah.
>>>>
>>>>> rather unfortunate to do that for a spurious warning. Can it be somehow
>>>>> fixed in a better way?
>
> ... probably a better fix is to drop locks and call INIT_WORK(), or make
> the cblist_init_generic() lockless (or part lockless), given it's just
> initializing the cblist, it's probably doable. But I haven't taken a
> careful look yet.

This might be a workable solution for this particular warning, but I
have also seen other stacks like the following on v5.15:

[ 30.349171] Call Trace:
[ 30.349171]
[ 30.349171]  dump_stack_lvl+0x69/0x97
[ 30.349171]  __lock_acquire+0x4a0/0x1aa0
[ 30.349171]  lock_acquire+0x275/0x2e0
[ 30.349171]  _raw_spin_lock_irqsave+0x4c/0x90
[ 30.349171]  ___slab_alloc.constprop.95+0x3ea/0xa80
[ 30.349171]  __slab_alloc.isra.89.constprop.94+0x1c/0x30
[ 30.349171]  kmem_cache_alloc+0x2bd/0x320
[ 30.349171]  fill_pool+0x1b2/0x2d0
[ 30.349171]  __debug_object_init+0x2c/0x500
[ 30.349171]  debug_object_activate+0x136/0x200
[ 30.349171]  add_timer+0x10b/0x170
[ 30.349171]  queue_delayed_work_on+0x63/0xa0
[ 30.349171]  init_mm_internals+0x226/0x2b0
[ 30.349171]  kernel_init_freeable+0x82/0x24e
[ 30.349171]  kernel_init+0x17/0x140
[ 30.349171]  ret_from_fork+0x1f/0x30
[ 30.349171]

So I'm not sure whether we should fix the individual cases one by one
or whether there should be a general solution.

Thanks,
Qi

>
> Regards,
> Boqun
>
>>>>
>>>> It's indeed unfortunate for the warning in the commit message. But
>>>> functions like kmem_cache_alloc(GFP_ATOMIC) may indeed be called
>>>> in the critical section of raw_spinlock or in the hardirq context, which
>>>
>>> Hmm, I thought they may not, actually.
>>>
>>>> will cause problem in the PREEMPT_RT kernel. So I still think it is
>>>> reasonable to convert kmem_cache_node->list_lock to raw_spinlock type.
>>>
>>> It wouldn't be the complete solution anyway. Once we allow even a GFP_ATOMIC
>>> slab allocation for such context, it means also page allocation can happen
>>> to refill the slabs, so lockdep will eventually complain about zone->lock,
>>> and who knows what else.
>>
>> Oh, indeed. :(
>>
>>>
>>>> In addition, there are many fix patches for this kind of warning in the
>>>> git log, so I also think there should be a general and better solution. :)
>>>
>>> Maybe, but given above, I doubt it's this one.
>>>
>>>>
>>>>>
>>>>
>>>
>>
>> --
>> Thanks,
>> Qi

-- 
Thanks,
Qi
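
For anyone decoding the "Invalid wait context" splat quoted above: the
sketch below is a simplified userspace model of the rule lockdep appears
to apply there, not the kernel code itself. It assumes the numbers in
{2:2}, {3:3} and context-{5:5} follow the ordering of the kernel's
enum lockdep_wait_type with CONFIG_PROVE_RAW_LOCK_NESTING enabled
(2 = raw spinlock, 3 = spinlock_t, 5 = plain task context). The names
wait_type, WAIT_* and toy_wait_context_ok() are made up for this
illustration.

```c
#include <stddef.h>

/*
 * Toy copy of the wait-type ordering (assumed to match the kernel's
 * enum lockdep_wait_type with CONFIG_PROVE_RAW_LOCK_NESTING=y).
 */
enum wait_type {
	WAIT_INV = 0,	/* not checked */
	WAIT_FREE,	/* wait-free, e.g. RCU */
	WAIT_SPIN,	/* raw_spinlock_t: always spins */
	WAIT_CONFIG,	/* spinlock_t: may sleep on PREEMPT_RT */
	WAIT_SLEEP,	/* mutex and other sleeping locks */
	WAIT_MAX,	/* plain task context, nothing held */
};

/*
 * Hypothetical helper: returns 1 if a lock whose outer wait type is
 * next_outer may be taken, 0 if it would be an invalid wait context.
 *
 * The effective context is the strictest (smallest) inner wait type
 * among the starting context and all locks currently held; a new lock
 * is only allowed if its wait type does not exceed that.
 */
int toy_wait_context_ok(enum wait_type ctx,
			const enum wait_type *held_inner, size_t nheld,
			enum wait_type next_outer)
{
	enum wait_type curr = ctx;
	size_t i;

	for (i = 0; i < nheld; i++) {
		/* WAIT_INV means "unchecked", so it never tightens curr. */
		if (held_inner[i] != WAIT_INV && held_inner[i] < curr)
			curr = held_inner[i];
	}

	return next_outer <= curr;
}
```

Under this model, with the two {2:2} locks from the splat held, taking a
{3:3} lock such as n->list_lock is rejected, while a {2:2} lock is
accepted. That is why annotating list_lock as a raw spinlock would
silence the report, and also why it would not help with the next
non-raw lock reached from the same context (zone->lock, as Vlastimil
points out).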