Subject: Re: [PATCH] dma-resv: lockdep-prime address_space->i_mmap_rwsem for dma-resv
From: Thomas Hellström (Intel)
To: Daniel Vetter
Cc: DRI Development, Intel Graphics Development, Daniel Vetter, Sumit Semwal,
    Christian König, "open list:DMA BUFFER SHARING FRAMEWORK",
    "moderated list:DMA BUFFER SHARING FRAMEWORK", Dave Chinner, Qian Cai,
    linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Andrew Morton,
    Jason Gunthorpe, Linux MM, linux-rdma, Maarten Lankhorst
Date: Thu, 30 Jul 2020 18:45:14 +0200
Message-ID: <60f2b14f-8cef-f515-9cf5-bdbc02d9c63c@shipmail.org>
References: <20200728135839.1035515-1-daniel.vetter@ffwll.ch>
 <38cbc4fb-3a88-47c4-2d6c-4d90f9be42e7@shipmail.org>

On 7/30/20 3:17 PM, Daniel Vetter wrote:
> On Thu, Jul 30, 2020 at 2:17 PM Thomas Hellström (Intel) wrote:
>>
>> On 7/28/20 3:58 PM, Daniel Vetter wrote:
>>> GPU drivers need this in their shrinkers, to be able to throw out
>>> mmap'ed buffers. Note that we also need dma_resv_lock in shrinkers,
>>> but that loop is resolved by trylocking in shrinkers.
>>>
>>> So full hierarchy is now (ignore some of the other branches we already
>>> have primed):
>>>
>>> mmap_read_lock -> dma_resv -> shrinkers -> i_mmap_lock_write
>>>
>>> I hope that's not inconsistent with anything mm or fs does, adding
>>> relevant people.
>>>
>> Looks OK to me. The mapping_dirty_helpers run under the i_mmap_lock, but
>> don't allocate any memory AFAICT.
>>
>> Since huge page-table-entry splitting may happen under the i_mmap_lock
>> from unmap_mapping_range() it might be worth figuring out how new page
>> directory pages are allocated, though.
> ofc I'm not an mm expert at all, but I did try to scroll through all
> i_mmap_lock_write/read callers. Found the following:
>
> - kernel/events/uprobes.c in build_map_info:
>
>     /*
>      * Needs GFP_NOWAIT to avoid i_mmap_rwsem recursion through
>      * reclaim. This is optimistic, no harm done if it fails.
>      */
>
> - I got lost in the hugetlb.c code and couldn't convince myself it's
> not allocating page directories at various levels with something else
> than GFP_KERNEL.
>
> So looks like the recursion is clearly there and known, but the
> hugepage code is too complex and flying over my head.
> -Daniel

OK, so I inverted your annotation and ran a memory hog, and got the
below splat. So clearly your proposed reclaim->i_mmap_lock locking order
is an already established one.

So

Reviewed-by: Thomas Hellström

8<---------------------------------------------------------------------------------------------

[  308.324654] WARNING: possible circular locking dependency detected
[  308.324655] 5.8.0-rc2+ #16 Not tainted
[  308.324656] ------------------------------------------------------
[  308.324657] kswapd0/98 is trying to acquire lock:
[  308.324658] ffff92a16f758428 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: rmap_walk_file+0x1c0/0x2f0
[  308.324663]
               but task is already holding lock:
[  308.324664] ffffffffb0960240 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
[  308.324666]
               which lock already depends on the new lock.
[  308.324667]
               the existing dependency chain (in reverse order) is:
[  308.324667]
               -> #1 (fs_reclaim){+.+.}-{0:0}:
[  308.324670]        fs_reclaim_acquire+0x34/0x40
[  308.324672]        dma_resv_lockdep+0x186/0x224
[  308.324675]        do_one_initcall+0x5d/0x2c0
[  308.324676]        kernel_init_freeable+0x222/0x288
[  308.324678]        kernel_init+0xa/0x107
[  308.324679]        ret_from_fork+0x1f/0x30
[  308.324680]
               -> #0 (&mapping->i_mmap_rwsem){++++}-{3:3}:
[  308.324682]        __lock_acquire+0x119f/0x1fc0
[  308.324683]        lock_acquire+0xa4/0x3b0
[  308.324685]        down_read+0x2d/0x110
[  308.324686]        rmap_walk_file+0x1c0/0x2f0
[  308.324687]        page_referenced+0x133/0x150
[  308.324689]        shrink_active_list+0x142/0x610
[  308.324690]        balance_pgdat+0x229/0x620
[  308.324691]        kswapd+0x200/0x470
[  308.324693]        kthread+0x11f/0x140
[  308.324694]        ret_from_fork+0x1f/0x30
[  308.324694]
               other info that might help us debug this:
[  308.324695]  Possible unsafe locking scenario:
[  308.324695]        CPU0                    CPU1
[  308.324696]        ----                    ----
[  308.324696]   lock(fs_reclaim);
[  308.324697]                                lock(&mapping->i_mmap_rwsem);
[  308.324698]                                lock(fs_reclaim);
[  308.324699]   lock(&mapping->i_mmap_rwsem);
[  308.324699]
                *** DEADLOCK ***
[  308.324700] 1 lock held by kswapd0/98:
[  308.324701]  #0: ffffffffb0960240 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
[  308.324702]
               stack backtrace:
[  308.324704] CPU: 1 PID: 98 Comm: kswapd0 Not tainted 5.8.0-rc2+ #16
[  308.324705] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
[  308.324706] Call Trace:
[  308.324710]  dump_stack+0x92/0xc8
[  308.324711]  check_noncircular+0x12d/0x150
[  308.324713]  __lock_acquire+0x119f/0x1fc0
[  308.324715]  lock_acquire+0xa4/0x3b0
[  308.324716]  ? rmap_walk_file+0x1c0/0x2f0
[  308.324717]  ? __lock_acquire+0x394/0x1fc0
[  308.324719]  down_read+0x2d/0x110
[  308.324720]  ? rmap_walk_file+0x1c0/0x2f0
[  308.324721]  rmap_walk_file+0x1c0/0x2f0
[  308.324722]  page_referenced+0x133/0x150
[  308.324724]  ? __page_set_anon_rmap+0x70/0x70
[  308.324725]  ? page_get_anon_vma+0x190/0x190
[  308.324726]  shrink_active_list+0x142/0x610
[  308.324728]  balance_pgdat+0x229/0x620
[  308.324730]  kswapd+0x200/0x470
[  308.324731]  ? lockdep_hardirqs_on_prepare+0xf5/0x170
[  308.324733]  ? finish_wait+0x80/0x80
[  308.324734]  ? balance_pgdat+0x620/0x620
[  308.324736]  kthread+0x11f/0x140
[  308.324737]  ? kthread_create_worker_on_cpu+0x40/0x40
[  308.324739]  ret_from_fork+0x1f/0x30

>> /Thomas
>>
>>
>>
>
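
For anyone reading the splat without a lockdep background: the check it reports can be pictured as cycle detection on a graph of "held -> acquired" lock-class edges. The sketch below is a userspace toy model in Python, not kernel code. The LockGraph class and its method names are invented for illustration, and the two lock names are plain strings standing in for the lock classes in the splat. It assumes, as the dependency chain above suggests, that the inverted annotation recorded i_mmap_rwsem -> fs_reclaim at init time and that kswapd later took i_mmap_rwsem from within reclaim.

```python
from collections import defaultdict


class LockGraph:
    """Toy lock-order validator (hypothetical; not the kernel's lockdep)."""

    def __init__(self):
        # edges[a] holds every lock class that was acquired while a was held
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search: can dst be reached from src?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'new taken while holding held'.

        Returns False if the new edge would close a cycle, True otherwise.
        """
        if self._reachable(new, held):
            return False  # new already leads back to held: possible deadlock
        self.edges[held].add(new)
        return True


graph = LockGraph()

# Inverted priming annotation: fs_reclaim entered while i_mmap_rwsem is held.
primed_ok = graph.acquire("i_mmap_rwsem", "fs_reclaim")

# kswapd under memory pressure: i_mmap_rwsem taken from within reclaim.
kswapd_ok = graph.acquire("fs_reclaim", "i_mmap_rwsem")

print(primed_ok, kswapd_ok)  # True False
```

In this model, priming the order the patch actually proposes (fs_reclaim -> i_mmap_rwsem) and then replaying the kswapd step records the same edge twice and reports no cycle, which is consistent with the conclusion that reclaim -> i_mmap_lock is already an established order.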