From: Yan Zhao <yan.y.zhao@intel.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, mike.kravetz@oracle.com,
	apopple@nvidia.com, jgg@nvidia.com, rppt@kernel.org,
	akpm@linux-foundation.org, kevin.tian@intel.com, david@redhat.com,
	Yan Zhao <yan.y.zhao@intel.com>
Subject: [RFC PATCH v2 3/5] mm/mmu_notifier: introduce a new callback .numa_protect
Date: Thu, 10 Aug 2023 17:00:08 +0800
Message-Id: <20230810090008.26122-1-yan.y.zhao@intel.com>
In-Reply-To: <20230810085636.25914-1-yan.y.zhao@intel.com>
References: <20230810085636.25914-1-yan.y.zhao@intel.com>
This .numa_protect callback is called when PROT_NONE has definitely been
set on a PTE or a huge PMD for NUMA migration. With this callback, an mmu
notifier subscriber (e.g. KVM) can unmap only the NUMA-migration-protected
pages in its handler, rather than unmapping a wider range that also
contains pages which are obviously not NUMA-migratable.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 include/linux/mmu_notifier.h | 15 +++++++++++++++
 mm/mmu_notifier.c            | 18 ++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index a6dc829a4bce..a173db83b071 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -132,6 +132,10 @@ struct mmu_notifier_ops {
 			   unsigned long address,
 			   pte_t pte);
 
+	void (*numa_protect)(struct mmu_notifier *subscription,
+			     struct mm_struct *mm,
+			     unsigned long start,
+			     unsigned long end);
 	/*
 	 * invalidate_range_start() and invalidate_range_end() must be
 	 * paired and are called only when the mmap_lock and/or the
@@ -395,6 +399,9 @@ extern int __mmu_notifier_test_young(struct mm_struct *mm,
 				     unsigned long address);
 extern void __mmu_notifier_change_pte(struct mm_struct *mm,
 				      unsigned long address, pte_t pte);
+extern void __mmu_notifier_numa_protect(struct mm_struct *mm,
+					unsigned long start,
+					unsigned long end);
 extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *r);
 extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r,
 						bool only_end);
@@ -448,6 +455,14 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 		__mmu_notifier_change_pte(mm, address, pte);
 }
 
+static inline void mmu_notifier_numa_protect(struct mm_struct *mm,
+					     unsigned long start,
+					     unsigned long end)
+{
+	if (mm_has_notifiers(mm))
+		__mmu_notifier_numa_protect(mm, start, end);
+}
+
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 50c0dde1354f..fc96fbd46e1d 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -382,6 +382,24 @@ int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	return young;
 }
 
+void __mmu_notifier_numa_protect(struct mm_struct *mm,
+				 unsigned long start,
+				 unsigned long end)
+{
+	struct mmu_notifier *subscription;
+	int id;
+
+	id = srcu_read_lock(&srcu);
+	hlist_for_each_entry_rcu(subscription,
+				 &mm->notifier_subscriptions->list, hlist,
+				 srcu_read_lock_held(&srcu)) {
+		if (subscription->ops->numa_protect)
+			subscription->ops->numa_protect(subscription, mm, start,
+							end);
+	}
+	srcu_read_unlock(&srcu, id);
+}
+
 int __mmu_notifier_clear_young(struct mm_struct *mm,
 			       unsigned long start,
 			       unsigned long end)
-- 
2.17.1