Date: Wed, 10 Mar 2021 21:50:13 -0400
From: Jason Gunthorpe
To: Sean Christopherson
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	David Rientjes, Ben Gardon, Michal Hocko, Jérôme Glisse,
	Andrea Arcangeli, Johannes Weiner, Dimitri Sivanich
Subject: Re: [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired with range_start()
Message-ID: <20210311015013.GS444867@ziepe.ca>
References: <20210310213117.1444147-1-seanjc@google.com> <20210311002807.GQ444867@ziepe.ca>

On Wed, Mar 10, 2021 at 05:20:01PM -0800, Sean Christopherson wrote:

> > Which I believe is fatal to kvm? These notifiers certainly do not only
> > happen at process exit.
>
> My point about the process dying is that the existing bug that causes
> mmu_notifier_count to become imbalanced is benign only because the process is
> being killed, and thus KVM will stop running its vCPUs.

Are you saying we only call non-blocking invalidate during a process
exit event??

> > So, both of the remaining _end users become corrupted with this patch!
>
> I don't follow.
> mn_hlist_invalidate_range_start() iterates over all
> notifiers, even if a notifier earlier in the chain failed. How will
> KVM become imbalanced?

Er, ok, that got left in a weird way. There is another "bug" where end
is not supposed to be called if the start failed.

> The existing _end users never fail their _start. If KVM started failing its
> start, then yes, it could get corrupted.

Well, maybe that is the way out of this now. If we don't permit a
start to fail if there is an end then we have no problem to unwind it
as we can continue to call everything. This can't be backported too
far though, the itree notifier conversions are what made the WARN_ON
safe today.

Something very approximately like this is closer to my preference:

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 61ee40ed804ee5..6d5cd20f81dadc 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -501,10 +501,25 @@ static int mn_hlist_invalidate_range_start(
 						"");
 				WARN_ON(mmu_notifier_range_blockable(range) ||
 					_ret != -EAGAIN);
+				/*
+				 * We call all the notifiers on any EAGAIN,
+				 * there is no way for a notifier to know if
+				 * its start method failed, thus a start that
+				 * does EAGAIN can't also do end.
+				 */
+				WARN_ON(ops->invalidate_range_end);
 				ret = _ret;
 			}
 		}
 	}
+
+	if (ret) {
+		/* Must be non-blocking to get here*/
+		hlist_for_each_entry_rcu (subscription, &subscriptions->list,
+					  hlist, srcu_read_lock_held(&srcu))
+			subscription->ops->invalidate_range_end(subscription,
+								range);
+	}
 	srcu_read_unlock(&srcu, id);
 
 	return ret;
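
To make the pairing rule above concrete, here is a small userspace model
of the idea. It is only a sketch and not kernel code: struct toy_notifier,
toy_invalidate_range_start(), toy_warnings and the two example callbacks
are all hypothetical names, a plain array stands in for the SRCU-protected
hlist, and a counter stands in for WARN_ON(). The NULL check before
calling end in the unwind loop is an assumption of this model and is not
part of the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mmu_notifier; every name here is hypothetical. */
struct toy_notifier {
	int (*start)(struct toy_notifier *n, bool blockable);	/* may be NULL */
	void (*end)(struct toy_notifier *n);			/* may be NULL */
	int count;	/* in-flight invalidations, in the spirit of mmu_notifier_count */
};

/* Counts the conditions the real code would flag with WARN_ON(). */
static int toy_warnings;

/* A KVM-like user: has both callbacks and never fails its start. */
static int counting_start(struct toy_notifier *n, bool blockable)
{
	(void)blockable;
	n->count++;
	return 0;
}

static void counting_end(struct toy_notifier *n)
{
	n->count--;
}

/* A start-only user that cannot make progress without sleeping. */
static int eagain_start(struct toy_notifier *n, bool blockable)
{
	(void)n;
	return blockable ? 0 : -EAGAIN;
}

static int toy_invalidate_range_start(struct toy_notifier **subs, size_t nr,
				      bool blockable)
{
	int ret = 0;
	size_t i;

	assert(subs != NULL || nr == 0);

	/* Every start is called, even after an earlier one has failed. */
	for (i = 0; i < nr; i++) {
		int _ret;

		if (!subs[i]->start)
			continue;
		_ret = subs[i]->start(subs[i], blockable);
		if (_ret) {
			/* Only a non-blocking start may fail, and only with -EAGAIN. */
			if (blockable || _ret != -EAGAIN)
				toy_warnings++;
			/* A start that can fail must not also have an end. */
			if (subs[i]->end)
				toy_warnings++;
			ret = _ret;
		}
	}

	if (ret) {
		/*
		 * The caller will not call end after a failed start, so
		 * unwind here. This is safe only because no notifier with
		 * an end callback was allowed to fail its start.
		 */
		for (i = 0; i < nr; i++)
			if (subs[i]->end)
				subs[i]->end(subs[i]);
	}

	return ret;
}
```

With one start-only notifier using eagain_start and one KVM-like notifier
using counting_start/counting_end, a non-blockable call returns -EAGAIN
and the KVM-like count goes back to zero, because every start ran and
every end was then called to unwind it.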