Date: Thu, 17 Sep 2020 15:19:42 +0200
From: Daniel Vetter
To: Thomas Hellström (Intel)
Cc: Daniel Vetter, DRI Development, Intel Graphics Development,
 Daniel Vetter, Sumit Semwal, Christian König,
 "open list:DMA BUFFER SHARING FRAMEWORK",
 "moderated list:DMA BUFFER SHARING FRAMEWORK",
 Dave Chinner, Qian Cai, linux-xfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Andrew Morton, Jason Gunthorpe,
 Linux MM, linux-rdma, Maarten Lankhorst
Subject: Re: [PATCH] dma-resv: lockdep-prime address_space->i_mmap_rwsem for dma-resv
Message-ID: <20200917131942.GX438822@phenom.ffwll.local>
References: <20200728135839.1035515-1-daniel.vetter@ffwll.ch>
 <38cbc4fb-3a88-47c4-2d6c-4d90f9be42e7@shipmail.org>
 <60f2b14f-8cef-f515-9cf5-bdbc02d9c63c@shipmail.org>
In-Reply-To: <60f2b14f-8cef-f515-9cf5-bdbc02d9c63c@shipmail.org>

On Thu, Jul 30, 2020 at 06:45:14PM +0200, Thomas Hellström (Intel) wrote:
>
> On 7/30/20 3:17 PM, Daniel Vetter wrote:
> > On Thu, Jul 30, 2020 at 2:17 PM Thomas Hellström (Intel)
> > wrote:
> > >
> > > On 7/28/20 3:58 PM, Daniel Vetter wrote:
> > > > GPU drivers need this in their shrinkers, to be able to throw out
> > > > mmap'ed buffers. Note that we also need dma_resv_lock in shrinkers,
> > > > but that loop is resolved by trylocking in shrinkers.
> > > >
> > > > So full hierarchy is now (ignore some of the other branches we already
> > > > have primed):
> > > >
> > > > mmap_read_lock -> dma_resv -> shrinkers -> i_mmap_lock_write
> > > >
> > > > I hope that's not inconsistent with anything mm or fs does, adding
> > > > relevant people.
> > > >
> > > Looks OK to me. The mapping_dirty_helpers run under the i_mmap_lock, but
> > > don't allocate any memory AFAICT.
> > >
> > > Since huge page-table-entry splitting may happen under the i_mmap_lock
> > > from unmap_mapping_range() it might be worth figuring out how new page
> > > directory pages are allocated, though.
> > ofc I'm not an mm expert at all, but I did try to scroll through all
> > i_mmap_lock_write/read callers. Found the following:
> >
> > - kernel/events/uprobes.c in build_map_info:
> >
> >               /*
> >                * Needs GFP_NOWAIT to avoid i_mmap_rwsem recursion through
> >                * reclaim. This is optimistic, no harm done if it fails.
> >                */
> >
> > - I got lost in the hugetlb.c code and couldn't convince myself it's
> > not allocating page directories at various levels with something else
> > than GFP_KERNEL.
> >
> > So looks like the recursion is clearly there and known, but the
> > hugepage code is too complex and flying over my head.
> > -Daniel
>
> OK, so I inverted your annotation and ran a memory hog, and got the below
> splat. So clearly your proposed reclaim->i_mmap_lock locking order is an
> already established one.
>
> So
>
> Reviewed-by: Thomas Hellström

No one is complaining that this is a terrible idea, and there are two
reviews from people who know this stuff, so I went ahead and pushed this to
drm-misc-next. Thanks for taking a look at this.
-Daniel

>
> 8<---------------------------------------------------------------------------------------------
>
> [  308.324654] WARNING: possible circular locking dependency detected
> [  308.324655] 5.8.0-rc2+ #16 Not tainted
> [  308.324656] ------------------------------------------------------
> [  308.324657] kswapd0/98 is trying to acquire lock:
> [  308.324658] ffff92a16f758428 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: rmap_walk_file+0x1c0/0x2f0
> [  308.324663]
>                but task is already holding lock:
> [  308.324664] ffffffffb0960240 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
> [  308.324666]
>                which lock already depends on the new lock.
>
> [  308.324667]
>                the existing dependency chain (in reverse order) is:
> [  308.324667]
>                -> #1 (fs_reclaim){+.+.}-{0:0}:
> [  308.324670]        fs_reclaim_acquire+0x34/0x40
> [  308.324672]        dma_resv_lockdep+0x186/0x224
> [  308.324675]        do_one_initcall+0x5d/0x2c0
> [  308.324676]        kernel_init_freeable+0x222/0x288
> [  308.324678]        kernel_init+0xa/0x107
> [  308.324679]        ret_from_fork+0x1f/0x30
> [  308.324680]
>                -> #0 (&mapping->i_mmap_rwsem){++++}-{3:3}:
> [  308.324682]        __lock_acquire+0x119f/0x1fc0
> [  308.324683]        lock_acquire+0xa4/0x3b0
> [  308.324685]        down_read+0x2d/0x110
> [  308.324686]        rmap_walk_file+0x1c0/0x2f0
> [  308.324687]        page_referenced+0x133/0x150
> [  308.324689]        shrink_active_list+0x142/0x610
> [  308.324690]        balance_pgdat+0x229/0x620
> [  308.324691]        kswapd+0x200/0x470
> [  308.324693]        kthread+0x11f/0x140
> [  308.324694]        ret_from_fork+0x1f/0x30
> [  308.324694]
>                other info that might help us debug this:
>
> [  308.324695]  Possible unsafe locking scenario:
>
> [  308.324695]        CPU0                    CPU1
> [  308.324696]        ----                    ----
> [  308.324696]   lock(fs_reclaim);
> [  308.324697]                                lock(&mapping->i_mmap_rwsem);
> [  308.324698]                                lock(fs_reclaim);
> [  308.324699]   lock(&mapping->i_mmap_rwsem);
> [  308.324699]
>                 *** DEADLOCK ***
>
> [  308.324700] 1 lock held by kswapd0/98:
> [  308.324701]  #0: ffffffffb0960240 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
> [  308.324702]
>                stack backtrace:
> [  308.324704] CPU: 1 PID: 98 Comm: kswapd0 Not tainted 5.8.0-rc2+ #16
> [  308.324705] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
> [  308.324706] Call Trace:
> [  308.324710]  dump_stack+0x92/0xc8
> [  308.324711]  check_noncircular+0x12d/0x150
> [  308.324713]  __lock_acquire+0x119f/0x1fc0
> [  308.324715]  lock_acquire+0xa4/0x3b0
> [  308.324716]  ? rmap_walk_file+0x1c0/0x2f0
> [  308.324717]  ? __lock_acquire+0x394/0x1fc0
> [  308.324719]  down_read+0x2d/0x110
> [  308.324720]  ? rmap_walk_file+0x1c0/0x2f0
> [  308.324721]  rmap_walk_file+0x1c0/0x2f0
> [  308.324722]  page_referenced+0x133/0x150
> [  308.324724]  ? __page_set_anon_rmap+0x70/0x70
> [  308.324725]  ? page_get_anon_vma+0x190/0x190
> [  308.324726]  shrink_active_list+0x142/0x610
> [  308.324728]  balance_pgdat+0x229/0x620
> [  308.324730]  kswapd+0x200/0x470
> [  308.324731]  ? lockdep_hardirqs_on_prepare+0xf5/0x170
> [  308.324733]  ? finish_wait+0x80/0x80
> [  308.324734]  ? balance_pgdat+0x620/0x620
> [  308.324736]  kthread+0x11f/0x140
> [  308.324737]  ? kthread_create_worker_on_cpu+0x40/0x40
> [  308.324739]  ret_from_fork+0x1f/0x30
>
>
>
> > > /Thomas
> > >
> > >
> > >
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
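
For anyone following the ordering argument in this thread, here is a rough
user-space sketch of it. It is a toy model only, not kernel code and not how
lockdep is implemented: the LockOrderGraph class and its acquire() and
prime() methods are hypothetical names made up for illustration, and
"fs_reclaim" is assumed to stand in for the "shrinkers" step of the
hierarchy. It records "B taken while A held" edges and rejects an
acquisition that would close a cycle, which is roughly the situation the
splat reports after the annotation was inverted.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; not the kernel's lockdep)."""

    def __init__(self):
        # edges[a] = set of lock classes seen acquired while a was held
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search for a recorded path src -> ... -> dst.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'new taken while held'; False if that would close a cycle."""
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True

    def prime(self, chain):
        """Record a whole hierarchy up front, outermost lock first."""
        return all(self.acquire(a, b) for a, b in zip(chain, chain[1:]))


# Hierarchy from the patch description, with fs_reclaim as the reclaim step.
HIERARCHY = ["mmap_read_lock", "dma_resv", "fs_reclaim", "i_mmap_rwsem"]

proposed = LockOrderGraph()
proposed_ok = proposed.prime(HIERARCHY)
# kswapd walking rmap takes i_mmap_rwsem under fs_reclaim: same order.
reclaim_ok = proposed.acquire("fs_reclaim", "i_mmap_rwsem")

inverted = LockOrderGraph()
inverted.prime(["i_mmap_rwsem", "fs_reclaim"])
# The same reclaim path now contradicts the primed (inverted) order.
inverted_ok = inverted.acquire("fs_reclaim", "i_mmap_rwsem")
```

With the proposed order primed, the reclaim-path acquisition is accepted;
with the inverted order primed, the same acquisition is rejected, which is
the sense in which the reclaim -> i_mmap_lock order is already established.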