From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 066A7C1975A for ; Wed, 25 Mar 2020 08:01:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A4C3B206F8 for ; Wed, 25 Mar 2020 08:01:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A4C3B206F8 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4F79F6B0010; Wed, 25 Mar 2020 04:01:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4CE566B0032; Wed, 25 Mar 2020 04:01:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 40BA96B0036; Wed, 25 Mar 2020 04:01:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0008.hostedemail.com [216.40.44.8]) by kanga.kvack.org (Postfix) with ESMTP id 2672B6B0010 for ; Wed, 25 Mar 2020 04:01:21 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A7381181AD0A1 for ; Wed, 25 Mar 2020 08:01:20 +0000 (UTC) X-FDA: 76633139520.10.use49_28c3eb7068a62 X-HE-Tag: use49_28c3eb7068a62 X-Filterd-Recvd-Size: 5125 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf46.hostedemail.com (Postfix) with ESMTP for ; Wed, 25 Mar 2020 08:01:19 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 98B06AC2C; Wed, 25 Mar 2020 08:01:18 +0000 (UTC) Date: Wed, 25 Mar 2020 09:01:17 +0100 From: Michal Hocko To: Jason Gunthorpe Cc: linux-mm@kvack.org, =?iso-8859-1?B?Suly9G1l?= Glisse , Christoph Hellwig Subject: Re: [PATCH v3] mm/mmu_notifier: prevent unpaired invalidate_start and invalidate_end Message-ID: <20200325080117.GY19542@dhcp22.suse.cz> References: <20200211205252.GA10003@ziepe.ca> <20200228135006.GA30885@ziepe.ca> <20200324194137.GQ13183@mellanox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20200324194137.GQ13183@mellanox.com> Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue 24-03-20 16:41:37, Jason Gunthorpe wrote: > On Fri, Feb 28, 2020 at 09:50:06AM -0400, Jason Gunthorpe wrote: > > On Tue, Feb 11, 2020 at 04:52:52PM -0400, Jason Gunthorpe wrote: > > > Many users of the mmu_notifier invalidate_range callbacks maintain > > > locking/counters/etc on a paired basis and have long expected that > > > invalidate_range_start/end() are always paired. > > >=20 > > > For instance kvm_mmu_notifier_invalidate_range_end() undoes > > > kvm->mmu_notifier_count which was incremented during start(). > > >=20 > > > The recent change to add non-blocking notifiers breaks this assumpt= ion > > > when multiple notifiers are present in the list. When EAGAIN is ret= urned > > > from an invalidate_range_start() then no invalidate_range_ends() ar= e > > > called, even if the subscription's start had previously been called= . > > >=20 > > > Unfortunately, due to the RCU list traversal we can't reliably gene= rate a > > > subset of the linked list representing the notifiers already called= to > > > generate an invalidate_range_end() pairing. > > >=20 > > > One case works correctly, if only one subscription requires > > > invalidate_range_end() and it is the last entry in the hlist. In th= is > > > case, when invalidate_range_start() returns -EAGAIN there will be n= othing > > > to unwind. > > >=20 > > > Keep the notifier hlist sorted so that notifiers that require > > > invalidate_range_end() are always last, and if two are added then d= isable > > > non-blocking invalidation for the mm. > > >=20 > > > A warning is printed for this case, if in future we determine this = never > > > happens then we can simply fail during registration when there are > > > unsupported combinations of notifiers. > > >=20 > > > Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu n= otifiers") > > > Cc: Michal Hocko > > > Cc: "J=E9r=F4me Glisse" > > > Cc: Christoph Hellwig > > > Signed-off-by: Jason Gunthorpe > > > mm/mmu_notifier.c | 53 +++++++++++++++++++++++++++++++++++++++++++= +--- > > > 1 file changed, 50 insertions(+), 3 deletions(-) > > >=20 > > > v1: https://lore.kernel.org/linux-mm/20190724152858.GB28493@ziepe.c= a/ > > > v2: https://lore.kernel.org/linux-mm/20190807191627.GA3008@ziepe.ca= / > > > * Abandon attempting to fix it by calling invalidate_range_end() du= ring an > > > EAGAIN start > > > * Just trivially ban multiple subscriptions > > > v3: > > > * Be more sophisticated, ban only multiple subscriptions if the res= ult is > > > a failure. Allows multiple subscriptions without invalidate_range= _end > > > * Include a printk when this condition is hit (Michal) > > >=20 > > > At this point the rework Christoph requested during the first posti= ng > > > is completed and there are now only 3 drivers using > > > invalidate_range_end(): > > >=20 > > > drivers/misc/mic/scif/scif_dma.c: .invalidate_range_end =3D s= cif_mmu_notifier_invalidate_range_end}; > > > drivers/misc/sgi-gru/grutlbpurge.c: .invalidate_range_end =3D= gru_invalidate_range_end, > > > virt/kvm/kvm_main.c: .invalidate_range_end =3D kvm_mmu_notifie= r_invalidate_range_end, > > >=20 > > > While I think it is unlikely that any of these drivers will be used= in > > > combination with each other, display a printk in hopes to check. > > >=20 > > > Someday I expect to just fail the registration on this condition. > > >=20 > > > I think this also addresses Michal's concern about a 'big hammer' a= s > > > it probably won't ever trigger now. > >=20 > > I'm going to put this in linux-next to see if there are any reports o= f > > the pr_warn failing. > >=20 > > Michal, are you happy with this solution now? >=20 > It's been a month in linux-next now, with no complaints. If there are > no comments I will go ahead to send it in the hmm PR. I will not block this but it still looks like a wrong approach. A more robust solution would be to allow calling invalidate_range_end even for the failing invalidate_start. --=20 Michal Hocko SUSE Labs