From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03FE4C77B73 for ; Tue, 23 May 2023 00:07:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4338E900003; Mon, 22 May 2023 20:07:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3E3B6900002; Mon, 22 May 2023 20:07:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 283E2900003; Mon, 22 May 2023 20:07:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 13AF7900002 for ; Mon, 22 May 2023 20:07:01 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id CE4C51201F2 for ; Tue, 23 May 2023 00:07:00 +0000 (UTC) X-FDA: 80819579400.28.0744852 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) by imf19.hostedemail.com (Postfix) with ESMTP id 090101A001D for ; Tue, 23 May 2023 00:06:58 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=einRQC06; spf=pass (imf19.hostedemail.com: domain of 3oQNsZAYKCIIykgtpimuumrk.iusrot03-ssq1giq.uxm@flex--seanjc.bounces.google.com designates 209.85.215.201 as permitted sender) smtp.mailfrom=3oQNsZAYKCIIykgtpimuumrk.iusrot03-ssq1giq.uxm@flex--seanjc.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1684800419; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nqpMDrC4KK3NQ6AuoTcuN/aH10d3Gs6pWZJ4L+6LYdI=; b=1eJzSeqfyDniyx7XsMAV+OGHtwKnSIk/XyYWyVGYTnElt74aeiWI/MF7zVSxj3Zg9Btfo5 gPvyyBciDBgde8huzHBHCyJxVHZL+YL4Q5OlBL+ve7rqOgktvP98WbCcsqJVsDiRVAMfvZ rSqfTKUODuL6PsK1gA5syMi8h4q4xP0= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1684800419; a=rsa-sha256; cv=none; b=hLomfjo72omEbJ2SQPDzSvKnr+ZZ5aMZvnXFCKAIrRlJ4R0DeW20yZtOdjl+XWPymhBoqr G1vGOrr0YMmHYUVBXDTlADxIdmBQEmAzDPHHY7Mr3dhH5bcoJkVfWfrDihhT3PAxiOlU9z kjWdBXY/KPUvdYLwk2B1OekIcKwZ1Uk= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=einRQC06; spf=pass (imf19.hostedemail.com: domain of 3oQNsZAYKCIIykgtpimuumrk.iusrot03-ssq1giq.uxm@flex--seanjc.bounces.google.com designates 209.85.215.201 as permitted sender) smtp.mailfrom=3oQNsZAYKCIIykgtpimuumrk.iusrot03-ssq1giq.uxm@flex--seanjc.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-52855ba7539so3568960a12.3 for ; Mon, 22 May 2023 17:06:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1684800418; x=1687392418; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=nqpMDrC4KK3NQ6AuoTcuN/aH10d3Gs6pWZJ4L+6LYdI=; b=einRQC06NM+pLkPcWJKuFkA3JrqGZ4jSC+L44hYXq+E/rLb8tn1KKR5KGxdVgGH9fL z4wM40qWv+kR5vQgB2cbnU9DNxDeI2IC0tmcs88RbHAhC13vUSJeJwXY2E2Dld8hrPg3 MG3L4X+OVgZf3FszqPcq/K+aN6nVAet/JMgxgQXoHdcBP0my7P2nA29a9lYONLLhRYMZ s2S/qIxe2XDDMncs9mGy6cUgAaVEpju9PTe00dW7/KviwcI5VWtDKN+UuNKlHfU72jrL KU7Iq+8yaRoCE4WlHn45jJU/E5rxEradrcAdgs1h1bhJKao7Ozt805O7vAzm8fUMRQot nQfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684800418; x=1687392418; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=nqpMDrC4KK3NQ6AuoTcuN/aH10d3Gs6pWZJ4L+6LYdI=; b=YV8BScZp61oo/w5WOrYwsUJ+IBzFT/kBabnFHmR22xjyCudeoQUsXDRLORRHwIti2g Dm5Qnw34NpCpXp12KthMHqAt1I6HQekZgjtMt2gVJ7ZggPFj7h/4Ta7w/tqLdYAB8Xbk 1xbnwl391zO0lIPIXLKiEJvt5CUhXi4aklM2LJbZfnGvT55RX3N1KA0EEc6FiqNwLh39 m5vThLPk/NEJBhsK6FcLjdE2e6+Mgo0DgdR1E2Xx9hgkq8Apbhgm5jtdxqmwjVCcZguN gTrXkRHPstZN8c62vQ5kdTSQTMoe4g1rwI2YXRG9ZKiy6K2O7WQg++cJzbwGc+Elf5KA 81bg== X-Gm-Message-State: AC+VfDzcgque2+9uqovRdCCHQpSLZk546wMTNitVRFrNStJ1ArorEZ7B qIcU+ytSSY2cvPvfHAk3/XC/PuJ40nI= X-Google-Smtp-Source: ACHHUZ6HjllyZzTlaaKR4BoUDyFce04HZ9UhEoY+TcxZpDQZdfHOuNAQJ4T7tsiqCTcoicnwg0ExZ7H4Ffw= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a63:5f91:0:b0:52c:b362:1894 with SMTP id t139-20020a635f91000000b0052cb3621894mr2851167pgb.3.1684800417778; Mon, 22 May 2023 17:06:57 -0700 (PDT) Date: Mon, 22 May 2023 17:06:56 -0700 In-Reply-To: <875y8jappz.fsf@nvidia.com> Mime-Version: 1.0 References: <20230522063725.284686-1-apopple@nvidia.com> <875y8jappz.fsf@nvidia.com> Message-ID: Subject: Re: [PATCH] mmu_notifiers: Notify on pte permission upgrades From: Sean Christopherson To: Alistair Popple Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, robin.murphy@arm.com, will@kernel.org, nicolinc@nvidia.com, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, jgg@nvidia.com, John Hubbard Content-Type: text/plain; charset="us-ascii" X-Rspamd-Queue-Id: 090101A001D X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: kq1o6it5fij66c9a1u7azxzdr561s9zs X-HE-Tag: 1684800418-26134 X-HE-Meta: U2FsdGVkX1/8gCf+4QOKASjtDxA42XHrJ2j6/LaGxtAleifKujR6WGEP4CZJsQb6IulzCeM/fIN0Td8RYECqbn66hOY966ElpRAkE8ZF/yccohrZNXj6FlOMV7pphEXvS0ob6B5fPZLTUnNHxfxFy43/g2LykwKLC/rlDHtu2yveXbswKTWINf+yXXzmrXSD7Pbn3hXr4zgFwn9XtERmmNN6BdaubXSlMRG9qSSvutS6zISQZB0oDRECO2ihNdQlQJAJgR0ezN2j2ISPwC+4Q89TnvFZlZQJXZ+NVOs+bshvOAzwKUEgp06o76WI+E0Cvney4IKkEDQFyp9PyPfJ268wpIYvInxX5iWjoLqIBxQ1uqFGqaBBu20WPeRUqSexaKgSa8yGS4pjc1+F06BAetdVURrpKl5082XscLkDYPpSSG6PYlejhrVMxrgWYYB7QqrA0zwJKLE+/OR6V5lLRl+2TUDpxedNW253m8DXHO5ZvaiOKxmC6CPGF3JE9/YpFhHLc1e3zz8CJDzK03Gj9z8oHCFEw6pe1IdA0sAFjFp2q19oRAhCEDLZDbxfjv0q1J5EYDVGNbo30jTKPvZ08AD1KF2ONFcvlpo3MRztYgBJusOLfo8S5Vwb2IaUoMQLDvWs3CQ/YqMnXYFhwe9mYEU7sMP9f6cEK3gKv+R+4RlFXqy51aOXACYtJp0GODoxdtStXvnGiZdX+2V0oahyTDsBLldEUzlKmILh9Iw6sdZb5seWRhyPsuX+9h4RSHTYAA65GEEqLK9SPc58y5e7jiq1+Qp11m8VZYM6SvpprguH8xAE/S9rwL7Ei7DodnZJ7NaRvmjSc3yguf+12/8v4FlkVFLmtRokufhsX9dAfdktLHpLxZwoP/MByyunsbtrE7B+Lc+ukhN07cR236DesN2Fibn9SNctmVjACjpDKyQRyHo3y5KLjo0Gk07o2ed0aoG/kl07esSgwQrsf/I dlNy61DQ MhjVN7O4wFgFdSo1u8sO/+JX09hAEdSkcCJl/+VPFR0GxPeZ2tYCwnNhSsSvXfCkEd4sp7ISKVOOXi+W+ua+NvtCQt20cbziv9OZDwK0qI7QN6vK5MZs3aT4Ct/5rFk4SZlgmnLevL30DHt3mi9xhN0LDcAAMQtqUP/tQPlUPW4KRlasW3g3WbEw3hiHlh8PVMmQMj5YmIQhZLdU55o+ADb9/Y3+jjKStMGD1tf007gRj6AdTjEyQjffcjjI5r9/wjZ/8y4IQDvSJkXflwCxP/Sm3ON8R8pJAxKfXjyTvknccyJxSVbIft7JSbZNbQ4bQFYFe/CHS4BLzOKfoJ1erfBTXvWP2cmUFILKjzOKqZzEVfGe6gwBvMYJl0wTAgF4NAskdkh3gIKPIcfAh6lmqZsNazw8zlLxGQV+FCIj+PsBkyzj+rtFXJM/1e0sSr+c5glcaI89zqDoy57tBR1NiqGzjEOVma/rvE93VCQfY4GmVOck4rdILHVrwNYXz5ZI6MkEE+73gWWBga+nFZ2a6w0pzhqWTjCQkehhDKQKLpmENNulMIPFXddSNZ9zNHVkFvE/zBE0z9l+7oAddSdm2ybPLJtAYNq4A+Tmw0zt5C1rnPqA/lHZCNgs7aXOkNLqynPw54WLmwokAPhZ4EAEQIjROZg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, May 23, 2023, Alistair Popple wrote: > > Sean Christopherson writes: > >> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h > >> index d6c06e140277..f14d68f119d8 100644 > >> --- a/include/linux/mmu_notifier.h > >> +++ b/include/linux/mmu_notifier.h > >> @@ -31,6 +31,11 @@ struct mmu_interval_notifier; > >> * pages in the range so to mirror those changes the user must inspect the CPU > >> * page table (from the end callback). > >> * > >> + * @MMU_NOTIFY_PROTECTION_UPGRAGE: update is due to a change from read-only to > >> + * read-write for pages in the range. This must not be used to upgrade > >> + * permissions on secondary PTEs, rather it should only be used to invalidate > >> + * caches such as secondary TLBs that may cache old read-only entries. > > > > This is a poor fit for invalidate_range_{start,end}(). All other uses bookend > > the primary MMU update, i.e. call start() _before_ changing PTEs. The comments > > are somewhat stale as they talk only about "unmapped", but the contract between > > the primary MMU and the secondary MMU is otherwise quite clear on when the primary > > MMU will invoke start() and end(). > > > > * invalidate_range_start() is called when all pages in the > > * range are still mapped and have at least a refcount of one. > > * > > * invalidate_range_end() is called when all pages in the > > * range have been unmapped and the pages have been freed by > > * the VM. > > > > I'm also confused as to how this actually fixes ARM's SMMU. Unless I'm looking > > at the wrong SMMU implementation, the SMMU implemenents only invalidate_range(), > > not the start()/end() variants. > > mmu_invalidate_range_end() calls the invalidate_range() callback if > the start()/end() variants aren't set. Ahh, thanks. > > static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = { > > .invalidate_range = arm_smmu_mm_invalidate_range, > > .release = arm_smmu_mm_release, > > .free_notifier = arm_smmu_mmu_notifier_free, > > }; > > > > Again from include/linux/mmu_notifier.h, not implementing the start()/end() hooks > > is perfectly valid. And AFAICT, the existing invalidate_range() hook is pretty > > much a perfect fit for what you want to achieve. > > Right, I didn't take that approach because it doesn't allow an event > type to be passed which would allow them to be filtered on platforms > which don't require this. > > I had also assumed the invalidate_range() callbacks were allowed to > sleep, hence couldn't be called under PTL. That's certainly true of mmu > interval notifier callbacks, but Catalin reminded me that calls such as > ptep_clear_flush_notify() already call invalidate_range() callback under > PTL so I guess we already assume drivers don't sleep in their > invalidate_range() callbacks. I will update the comments to reflect > that. > > > * If invalidate_range() is used to manage a non-CPU TLB with > > * shared page-tables, it not necessary to implement the > > * invalidate_range_start()/end() notifiers, as > > * invalidate_range() already catches the points in time when an > > * external TLB range needs to be flushed. For more in depth > > * discussion on this see Documentation/mm/mmu_notifier.rst > > > > Even worse, this change may silently regress performance for secondary MMUs that > > haven't yet taken advantage of the event type, e.g. KVM will zap all of KVM's PTEs > > in response to the upgrade, instead of waiting until the guest actually tries to > > utilize the new protections. > > Yeah, I like the idea of introducing a > ptep_set_access_flags_notify(). That way this won't regress performance > on platforms that don't need it. Note this isn't a new feature but > rather a bugfix. It's unclear to me why KVM on ARM hasn't already run > into this issue, but I'm no KVM expert. Thanks for the feedback. KVM manages its own page tables and so does its own TLB invalidations as needed, e.g. KVM can and does change KVM's stage-2 PTEs from read-only to read/write irrespective of mmu_notifiers. I assume the SMMU issue arises only because the SMMU is reusing the host kernel's (stage-1?) page tables.