From: Qi Zheng <zhengqi.arch@bytedance.com>
Date: Thu, 13 Apr 2023 00:47:14 +0800
Subject: Re: [PATCH] mm: slub: annotate kmem_cache_node->list_lock as raw_spinlock
To: Waiman Long
Cc: 42.hyeyoo@gmail.com, akpm@linux-foundation.org, roman.gushchin@linux.dev,
    iamjoonsoo.kim@lge.com, rientjes@google.com, penberg@kernel.org,
    cl@linux.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Zhao Gongyi, Sebastian Andrzej Siewior, Thomas Gleixner, RCU,
    "Paul E. McKenney", Peter Zijlstra, Vlastimil Babka, Boqun Feng,
    "Zhang, Qiang1"
In-Reply-To: <752cafb6-fd26-0168-f871-d2d4afe417bc@redhat.com>
References: <20230411130854.46795-1-zhengqi.arch@bytedance.com>
    <932bf921-a076-e166-4f95-1adb24d544cf@bytedance.com>
    <29efad1c-5ad4-5d26-b1b9-eeee6119e711@bytedance.com>
    <7f928c82-0aaf-5fac-6a54-a3d95a87b296@bytedance.com>
    <752cafb6-fd26-0168-f871-d2d4afe417bc@redhat.com>

McKenney" , Peter Zijlstra , Vlastimil Babka , Boqun Feng , "Zhang, Qiang1" References: <20230411130854.46795-1-zhengqi.arch@bytedance.com> <932bf921-a076-e166-4f95-1adb24d544cf@bytedance.com> <29efad1c-5ad4-5d26-b1b9-eeee6119e711@bytedance.com> <7f928c82-0aaf-5fac-6a54-a3d95a87b296@bytedance.com> <752cafb6-fd26-0168-f871-d2d4afe417bc@redhat.com> From: Qi Zheng In-Reply-To: <752cafb6-fd26-0168-f871-d2d4afe417bc@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspam-User: X-Rspamd-Queue-Id: 2CBE740019 X-Rspamd-Server: rspam01 X-Stat-Signature: b4w94h8bje1u4akr14uf4zptskws6epj X-HE-Tag: 1681318045-963556 X-HE-Meta: U2FsdGVkX19WgBXrzLtw0SY9v+B3xI4t2pN5nOYnJiMIQ3ZfQkG9C4h2bdTGRlWWjytD3LvYV3xrLCSBd+gX+LGuEmgqwpC+08+G0q+8p6xNETA8p7vGu8QEM34o9IJYH4C76b+Z/I2gZmjJM0s4jqJh4aFBc1d/SKMbfPmHR7rtix64ZF+mxELBeqOyNnEhJZthc9VQ3wWdHYSqeki+pckPALQXFebhj/1ciWwPbHAT0GwFyaKm9NauvXl/2B4noutDkZ/shLDyAtl4J2EaSISq8omCZT8aoXBXPAKrPCLtevPLRrVhcwOy+rPd4Kp9aw5uuTQm4/muDgP/QbRgdLY2cpDbPQjMSN7TxIN8rJYZoW2LLEq5qESxSZ2Le8+ibN7qawRQLajM90wpkeMp8gqwLRPy5JyO+e2A8Zp2REhhcEw0GaT5HIKuhEv3/WnVFCMrWkg0gwICo0TvIZFQ5ufq4AJBSukHwbF89weUxxpCDfS2ffwZCV1nz+VNubmFyM7mQcNI9s5uaqEvj461txnVT0icmFZLDczDGKAyK5gblFmMrWbTIKjdO6XPGxIe1tF/Ua9P++6BmnP2hHdv3A95eRQaHPD8n6649QsRZDK+zqNWGrYAC4Db9MjdKpS3om68A8GaCmgFxzKSxoSTjFnbV/zJeRukclWSSApgHw8WsTdey27punTgBBSH/6Qt+AnvjamjbegfdWMTvfL/lzKx4tvdS4+VrS4wcBHO7BIlyjT4e8ewwAt2jM7nBMSbDNLEK0eMq31kpBWWNXsbSlFnwRGZweTZZn5Nyk0S9MQF6BIuXJ1G9/+s06mUBN93s+zcxnHXNL/IWSyjkK2Bha/8Jqt8hRehWp1hO4nW7IKFNCrDQoEe9B759PmAz8r3SvimaWywmFE+ocr872v9ahEKIu4ChEIpH8VJrouVqvovzf72Az4OTWKTVmmuLhpuolZMD45XTRBqK08CWi/ uz1g/x10 QplvkZlOjgzGw8jESwzlv1TrZkOpIiP3WTAOxsMUWzhf7RktWtiovQXJw924sVBzgrjThUjMtTIngjsCncYISl3PWPEtSF85S4JZ7cm2msT8TzsnUCtYHO9MIT9R35C7D/+EACJYjTPNsmz8TbHuPdtPmqXdeVtzCDLTDKp4yvLFJhR1xqfAvXqXUaF9QUDvT8ezzGz8IyrNxPjSSUeWlISgZ++DPgy39lj+LnMRAH5XBYs/9H/sMZ7HR2pfwWbm1r2UOW5ZHZkPFzpD0yMs1rbXeEqQ+LousLCRv+tBZbP9xvYggmGi4AvLru1VLVl26pYwqJi8uJeew5HfMcRGKZhpy8KjEumzrlNbklwB0IbeX/GNCKxDi8fSTNknwzG4hB3tdqps4H/flq8kC2YdEvapA73kskJ09fglBLw/c78pG3p+ZNDLvIR/wvTCtjJmlNp0L/GTNH49Y4SPokUd4MGVpuzhx4om9RP5EMooggXnGCZoM9nbGFItOpTRbnEJ3/51LPqGDoYkWj+XlZkke69NyKbSyo9DbhAP+k+fTeXhlSDM= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2023/4/12 21:09, Waiman Long wrote: > On 4/12/23 04:32, Qi Zheng wrote: >> >> >> On 2023/4/12 15:30, Qi Zheng wrote: >>> >>> >>> On 2023/4/12 14:50, Vlastimil Babka wrote: >>>> >>>> >>>> On 4/12/23 08:44, Zhang, Qiang1 wrote: >>>>>> >>>>>> >>>>>> On 2023/4/11 22:19, Vlastimil Babka wrote: >>>>>>> On 4/11/23 16:08, Qi Zheng wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 2023/4/11 21:40, Vlastimil Babka wrote: >>>>>>>>> On 4/11/23 15:08, Qi Zheng wrote: >>>>>>>>>> The list_lock can be held in the critical section of >>>>>>>>>> raw_spinlock, and then lockdep will complain about it >>>>>>>>>> like below: >>>>>>>>>> >>>>>>>>>>     ============================= >>>>>>>>>>     [ BUG: Invalid wait context ] >>>>>>>>>>     6.3.0-rc6-next-20230411 #7 Not tainted >>>>>>>>>>     ----------------------------- >>>>>>>>>>     swapper/0/1 is trying to lock: >>>>>>>>>>     ffff888100055418 (&n->list_lock){....}-{3:3}, at: >>>>>>>>>> ___slab_alloc+0x73d/0x1330 >>>>>>>>>>     other info that might help us debug this: >>>>>>>>>>     context-{5:5} >>>>>>>>>>     2 locks held by swapper/0/1: >>>>>>>>>>      #0: ffffffff824e8160 >>>>>>>>>> (rcu_tasks.cbs_gbl_lock){....}-{2:2}, 
>>>>>>>>>
>>>>>>>>> + CC some RT and RCU people
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>>> AFAIK raw_spinlock is not just an annotation, but on RT it changes
>>>>>>>>> the implementation from preemptible mutex to actual spin lock, so
>>>>>>>>> it would be
>>>>>>>>
>>>>>>>> Yeah.
>>>>>>>>
>>>>>>>>> rather unfortunate to do that for a spurious warning. Can it be
>>>>>>>>> somehow fixed in a better way?
>>>>>>
>>>>>> ... probably a better fix is to drop locks and call INIT_WORK(), or
>>>>>> make the cblist_init_generic() lockless (or part lockless), given
>>>>>> it's just initializing the cblist, it's probably doable. But I
>>>>>> haven't taken a careful look yet.
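(A minimal sketch of that suggestion, with hypothetical names -- this is not the real cblist_init_generic() from kernel/rcu/tasks.h. The point is simply to call INIT_WORK() before taking the raw lock, so the debugobjects pool refill can never run inside a raw-spinlock critical section.)

```
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct cblist_pcpu_sketch {             /* illustrative stand-in type */
        raw_spinlock_t lock;
        struct list_head cblist;
        struct work_struct work;
};

static void cblist_work_fn_sketch(struct work_struct *work) { }

static void cblist_init_sketch(struct cblist_pcpu_sketch *pcp)
{
        unsigned long flags;

        /*
         * INIT_WORK() can end up in __debug_object_init() -> fill_pool(),
         * which allocates via the slab allocator and thus takes a
         * spinlock_t, so call it while no raw_spinlock_t is held.
         */
        INIT_WORK(&pcp->work, cblist_work_fn_sketch);

        raw_spin_lock_irqsave(&pcp->lock, flags);
        /* Only plain initialization remains under the raw lock. */
        INIT_LIST_HEAD(&pcp->cblist);
        raw_spin_unlock_irqrestore(&pcp->lock, flags);
}
```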
>>>>>>
>>>>>
>>>>> This is just one of the paths that triggers an invalid wait context;
>>>>> the following paths can also trigger it:
>>>>>
>>>>> [  129.914547] [ BUG: Invalid wait context ]
>>>>> [  129.914775] 6.3.0-rc1-yocto-standard+ #2 Not tainted
>>>>> [  129.915044] -----------------------------
>>>>> [  129.915272] kworker/2:0/28 is trying to lock:
>>>>> [  129.915516] ffff88815660f570 (&c->lock){-.-.}-{3:3}, at: ___slab_alloc+0x68/0x12e0
>>>>> [  129.915967] other info that might help us debug this:
>>>>> [  129.916241] context-{5:5}
>>>>> [  129.916392] 3 locks held by kworker/2:0/28:
>>>>> [  129.916642]  #0: ffff888100084d48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x515/0xba0
>>>>> [  129.917145]  #1: ffff888100c17dd0 ((work_completion)(&(&krcp->monitor_work)->work)){+.+.}-{0:0}, at: process_on0
>>>>> [  129.917758]  #2: ffff8881565f8508 (krc.lock){....}-{2:2}, at: kfree_rcu_monitor+0x29f/0x810
>>>>> [  129.918207] stack backtrace:
>>>>> [  129.918374] CPU: 2 PID: 28 Comm: kworker/2:0 Not tainted 6.3.0-rc1-yocto-standard+ #2
>>>>> [  129.918784] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.o4
>>>>> [  129.919397] Workqueue: events kfree_rcu_monitor
>>>>> [  129.919662] Call Trace:
>>>>> [  129.919812]  <TASK>
>>>>> [  129.919941]  dump_stack_lvl+0x64/0xb0
>>>>> [  129.920171]  dump_stack+0x10/0x20
>>>>> [  129.920372]  __lock_acquire+0xeb8/0x3a80
>>>>> [  129.920603]  ? ret_from_fork+0x2c/0x50
>>>>> [  129.920824]  ? __pfx___lock_acquire+0x10/0x10
>>>>> [  129.921068]  ? unwind_next_frame.part.0+0x1ba/0x3c0
>>>>> [  129.921343]  ? ret_from_fork+0x2c/0x50
>>>>> [  129.921573]  ? __this_cpu_preempt_check+0x13/0x20
>>>>> [  129.921847]  lock_acquire+0x194/0x480
>>>>> [  129.922060]  ? ___slab_alloc+0x68/0x12e0
>>>>> [  129.922293]  ? __pfx_lock_acquire+0x10/0x10
>>>>> [  129.922529]  ? __pfx_mark_lock.part.0+0x10/0x10
>>>>> [  129.922778]  ? __kasan_check_read+0x11/0x20
>>>>> [  129.922998]  ___slab_alloc+0x9a/0x12e0
>>>>> [  129.923222]  ? ___slab_alloc+0x68/0x12e0
>>>>> [  129.923452]  ? __pfx_mark_lock.part.0+0x10/0x10
>>>>> [  129.923706]  ? __kasan_check_read+0x11/0x20
>>>>> [  129.923937]  ? fill_pool+0x22a/0x370
>>>>> [  129.924161]  ? __lock_acquire+0xf5b/0x3a80
>>>>> [  129.924387]  ? fill_pool+0x22a/0x370
>>>>> [  129.924590]  __slab_alloc.constprop.0+0x5b/0x90
>>>>> [  129.924832]  kmem_cache_alloc+0x296/0x3d0
>>>>> [  129.925073]  ? fill_pool+0x22a/0x370
>>>>> [  129.925291]  fill_pool+0x22a/0x370
>>>>> [  129.925495]  ? __pfx_fill_pool+0x10/0x10
>>>>> [  129.925718]  ? __pfx___lock_acquire+0x10/0x10
>>>>> [  129.926034]  ? __kasan_check_read+0x11/0x20
>>>>> [  129.926269]  ? check_chain_key+0x200/0x2b0
>>>>> [  129.926503]  __debug_object_init+0x82/0x8c0
>>>>> [  129.926734]  ? __pfx_lock_release+0x10/0x10
>>>>> [  129.926984]  ? __pfx___debug_object_init+0x10/0x10
>>>>> [  129.927249]  ? __kasan_check_read+0x11/0x20
>>>>> [  129.927498]  ? do_raw_spin_unlock+0x9c/0x100
>>>>> [  129.927758]  debug_object_activate+0x2d1/0x2f0
>>>>> [  129.928022]  ? __pfx_debug_object_activate+0x10/0x10
>>>>> [  129.928300]  ? __this_cpu_preempt_check+0x13/0x20
>>>>> [  129.928583]  __call_rcu_common.constprop.0+0x94/0xeb0
>>>>> [  129.928897]  ? __this_cpu_preempt_check+0x13/0x20
>>>>> [  129.929186]  ? __pfx_rcu_work_rcufn+0x10/0x10
>>>>> [  129.929459]  ? __pfx___call_rcu_common.constprop.0+0x10/0x10
>>>>> [  129.929803]  ? __pfx_lock_acquired+0x10/0x10
>>>>> [  129.930067]  ? __pfx_do_raw_spin_trylock+0x10/0x10
>>>>> [  129.930363]  ? kfree_rcu_monitor+0x29f/0x810
>>>>> [  129.930627]  call_rcu+0xe/0x20
>>>>> [  129.930821]  queue_rcu_work+0x4f/0x60
>>>>> [  129.931050]  kfree_rcu_monitor+0x5d3/0x810
>>>>> [  129.931302]  ? __pfx_kfree_rcu_monitor+0x10/0x10
>>>>> [  129.931587]  ? __this_cpu_preempt_check+0x13/0x20
>>>>> [  129.931878]  process_one_work+0x607/0xba0
>>>>> [  129.932129]  ? __pfx_process_one_work+0x10/0x10
>>>>> [  129.932408]  ? worker_thread+0xd6/0x710
>>>>> [  129.932653]  worker_thread+0x2d4/0x710
>>>>> [  129.932888]  ? __pfx_worker_thread+0x10/0x10
>>>>> [  129.933154]  kthread+0x18b/0x1c0
>>>>> [  129.933363]  ? __pfx_kthread+0x10/0x10
>>>>> [  129.933598]  ret_from_fork+0x2c/0x50
>>>>> [  129.933825]  </TASK>
>>>>>
>>>>> Maybe no need to convert ->list_lock to raw_spinlock.
>>>>>
>>>>> --- a/lib/debugobjects.c
>>>>> +++ b/lib/debugobjects.c
>>>>> @@ -562,10 +562,10 @@ __debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack
>>>>>          unsigned long flags;
>>>>>
>>>>>          /*
>>>>> -        * On RT enabled kernels the pool refill must happen in preemptible
>>>>> +        * The pool refill must happen in preemptible
>>>>>           * context:
>>>>>           */
>>>>> -       if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible())
>>>>> +       if (preemptible())
>>>>>                  fill_pool();
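(Why preemptible() is the right guard here: raw_spin_lock_irqsave() disables interrupts, so preemptible() is false inside any raw-lock critical section and the refill is simply skipped there, on both RT and !RT kernels. A condensed sketch of the resulting logic:)

```
static void fill_pool_guard_sketch(void)
{
        /*
         * preemptible() is (preempt_count() == 0 && !irqs_disabled()),
         * which is always false inside raw_spin_lock_irqsave(). With the
         * change above, fill_pool() -- and the spinlock_t taken by its
         * slab allocation -- can therefore no longer nest under a raw
         * spinlock.
         */
        if (preemptible())
                fill_pool();
}
```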
>>>>
>>>> +CC Peterz
>>>>
>>>> Aha so this is in fact another case where the code is written with
>>>> actual differences between PREEMPT_RT and !PREEMPT_RT in mind, but
>>>> CONFIG_PROVE_RAW_LOCK_NESTING always assumes PREEMPT_RT?
>>>
>>> Maybe we should make CONFIG_PROVE_RAW_LOCK_NESTING depend on
>>> CONFIG_PREEMPT_RT (see the sketch after the quoted discussion below):
>>
>> I found a discussion [1] of why CONFIG_PROVE_RAW_LOCK_NESTING didn't
>> depend on CONFIG_PREEMPT_RT before in the commit history:
>>
>> ```
>> >>> We now always get a "Invalid wait context" warning with
>> >>> CONFIG_PROVE_RAW_LOCK_NESTING=y, see the full warning below:
>> >>>
>> >>>        [    0.705900] =============================
>> >>>        [    0.706002] [ BUG: Invalid wait context ]
>> >>>        [    0.706180] 5.13.0+ #4 Not tainted
>> >>>        [    0.706349] -----------------------------
>>
>> I believe the purpose of CONFIG_PROVE_RAW_LOCK_NESTING is experimental
>> and it is turned off by default. Turning it on can cause problems, as
>> shown in your lockdep splat. Limiting it to just PREEMPT_RT would defeat
>> its purpose of finding potential spinlock nesting problems in
>> non-PREEMPT_RT kernels.
>>
>> > As far as I know, a spinlock can nest another spinlock. In a
>> > non-PREEMPT_RT kernel spin_lock and raw_spin_lock are the same, so here
>> > acquiring a spin_lock in hardirq context is acceptable and the warning
>> > is not needed. My knowledge on this is not enough; will dig into this.
>> >
>> >> The point is to fix the issue found,
>> >
>> > Agree. I thought there was a spinlock usage issue, but by checking the
>> > deactivate_slab context, it looks like the spinlock usage is fine.
>> > Maybe I'm missing something?
>>
>> Yes, spinlock and raw spinlock are the same in a non-RT kernel. They are
>> only different in an RT kernel. However, the non-RT kernel is also more
>> heavily tested than its RT counterpart. The purpose of this config
>> option is to expose spinlock nesting problems in more areas of the code.
>> If you look at the config help text of PROVE_RAW_LOCK_NESTING:
>>
>>          help
>>           Enable the raw_spinlock vs. spinlock nesting checks which ensure
>>           that the lock nesting rules for PREEMPT_RT enabled kernels are
>>           not violated.
>>
>>           NOTE: There are known nesting problems. So if you enable this
>>           option expect lockdep splats until these problems have been fully
>>           addressed which is work in progress. This config switch allows to
>>           identify and analyze these problems. It will be removed and the
>>           check permanently enabled once the main issues have been fixed.
>>
>>           If unsure, select N.
>>
>> So lockdep splats are expected. It will take time to address all the
>> issues found.
>> ```
>>
>> Also +Waiman Long.
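(For concreteness, the dependency change floated above would look roughly like this sketch against lib/Kconfig.debug; the prompt and depends lines shown are approximate, and the reply below argues against making the change:)

```
config PROVE_RAW_LOCK_NESTING
        bool "Enable raw_spinlock - spinlock nesting checks"
        depends on PROVE_LOCKING && PREEMPT_RT  # sketch: "&& PREEMPT_RT" added
```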
>
> I believe the purpose of not making PROVE_RAW_LOCK_NESTING depend on
> PREEMPT_RT is to allow people to discover this kind of nested locking
> problem without enabling PREEMPT_RT.
>
> Anyway, I don't think you can change list_lock to a raw spinlock.
> According to mm/slub.c:
>
>  * Lock order:
>  *   1. slab_mutex (Global Mutex)
>  *   2. node->list_lock (Spinlock)
>  *   3. kmem_cache->cpu_slab->lock (Local lock)
>  *   4. slab_lock(slab) (Only on some arches)
>  *   5. object_map_lock (Only for debugging)
>
> For PREEMPT_RT, the local lock is a per-cpu spinlock (rt_mutex), so
> list_lock has to be a spinlock as well.

Got it. Thanks for such a detailed explanation!

>
> Cheers,
> Longman

-- 
Thanks,
Qi