Date: Fri, 28 Feb 2020 09:50:06 -0400
From: Jason Gunthorpe
To: linux-mm@kvack.org
Cc: Michal Hocko, Jérôme Glisse, Christoph Hellwig
Subject: Re: [PATCH v3] mm/mmu_notifier: prevent unpaired invalidate_start and invalidate_end
Message-ID: <20200228135006.GA30885@ziepe.ca>
References: <20200211205252.GA10003@ziepe.ca>
In-Reply-To: <20200211205252.GA10003@ziepe.ca>

On Tue, Feb 11, 2020 at 04:52:52PM -0400, Jason Gunthorpe wrote:
> Many users of the mmu_notifier invalidate_range callbacks maintain
> locking/counters/etc on a paired basis and have long expected that
> invalidate_range_start/end() are always paired.
>
> For instance kvm_mmu_notifier_invalidate_range_end() undoes
> kvm->mmu_notifier_count which was incremented during start().
>
> The recent change to add non-blocking notifiers breaks this assumption
> when multiple notifiers are present in the list. When EAGAIN is returned
> from an invalidate_range_start() then no invalidate_range_ends() are
> called, even if the subscription's start had previously been called.
>
> Unfortunately, due to the RCU list traversal we can't reliably generate a
> subset of the linked list representing the notifiers already called to
> generate an invalidate_range_end() pairing.
>
> One case works correctly, if only one subscription requires
> invalidate_range_end() and it is the last entry in the hlist. In this
> case, when invalidate_range_start() returns -EAGAIN there will be nothing
> to unwind.
>
> Keep the notifier hlist sorted so that notifiers that require
> invalidate_range_end() are always last, and if two are added then disable
> non-blocking invalidation for the mm.
>
> A warning is printed for this case, if in future we determine this never
> happens then we can simply fail during registration when there are
> unsupported combinations of notifiers.
>
> Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
> Cc: Michal Hocko
> Cc: "Jérôme Glisse"
> Cc: Christoph Hellwig
> Signed-off-by: Jason Gunthorpe
>  mm/mmu_notifier.c | 53 ++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 50 insertions(+), 3 deletions(-)
>
> v1: https://lore.kernel.org/linux-mm/20190724152858.GB28493@ziepe.ca/
> v2: https://lore.kernel.org/linux-mm/20190807191627.GA3008@ziepe.ca/
>  * Abandon attempting to fix it by calling invalidate_range_end() during an
>    EAGAIN start
>  * Just trivially ban multiple subscriptions
> v3:
>  * Be more sophisticated, ban only multiple subscriptions if the result is
>    a failure.
>    Allows multiple subscriptions without invalidate_range_end
>  * Include a printk when this condition is hit (Michal)
>
> At this point the rework Christoph requested during the first posting
> is completed and there are now only 3 drivers using
> invalidate_range_end():
>
> drivers/misc/mic/scif/scif_dma.c: .invalidate_range_end = scif_mmu_notifier_invalidate_range_end};
> drivers/misc/sgi-gru/grutlbpurge.c: .invalidate_range_end = gru_invalidate_range_end,
> virt/kvm/kvm_main.c: .invalidate_range_end = kvm_mmu_notifier_invalidate_range_end,
>
> While I think it is unlikely that any of these drivers will be used in
> combination with each other, display a printk in hopes to check.
>
> Someday I expect to just fail the registration on this condition.
>
> I think this also addresses Michal's concern about a 'big hammer' as
> it probably won't ever trigger now.

I'm going to put this in linux-next to see if there are any reports of
the pr_warn failing.

Michal, are you happy with this solution now?

Thanks,
Jason
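
For reference, the ordering rule from the quoted commit message can be
modelled in userspace roughly as below. This is a simplified sketch under
stated assumptions, not the code from the patch: struct notifier,
struct notifier_list, notifier_list_init() and notifier_register() are
hypothetical names, a plain singly linked list stands in for the
RCU-protected hlist, and all locking is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; not the kernel's struct mmu_notifier. */
struct notifier {
	const char *name;
	bool has_range_end;	/* implements invalidate_range_end() */
	struct notifier *next;
};

/* Hypothetical per-mm state; field names are assumptions. */
struct notifier_list {
	struct notifier *head;
	bool has_range_end;	/* one range_end user is already registered */
	bool nonblocking_ok;	/* non-blocking invalidation still allowed */
};

void notifier_list_init(struct notifier_list *list)
{
	list->head = NULL;
	list->has_range_end = false;
	list->nonblocking_ok = true;
}

/*
 * Keep the list sorted so that subscriptions needing
 * invalidate_range_end() always come last.
 */
void notifier_register(struct notifier_list *list, struct notifier *n)
{
	struct notifier **pp;

	if (!n->has_range_end) {
		/* Nothing to unwind for this one, so it goes first. */
		n->next = list->head;
		list->head = n;
		return;
	}

	if (list->has_range_end) {
		/*
		 * A second range_end user: a failing start() could no
		 * longer be unwound, so give up on non-blocking mode.
		 */
		fprintf(stderr,
			"warning: %s is a second range_end user, "
			"disabling non-blocking invalidation\n", n->name);
		list->nonblocking_ok = false;
	}
	list->has_range_end = true;

	/* Append at the tail, after every start-only subscription. */
	for (pp = &list->head; *pp; pp = &(*pp)->next)
		;
	n->next = NULL;
	*pp = n;
}
```

With a single range_end user at the tail, a start() that returns -EAGAIN
on that last entry leaves no earlier range_end user to unwind, which is
the one case the commit message says works correctly.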