From: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>, Christoph Lameter <cl@linux.com>,
Matthew Wilcox <willy@infradead.org>,
Mike Kravetz <mike.kravetz@oracle.com>,
David Hildenbrand <david@redhat.com>,
Suren Baghdasaryan <surenb@google.com>,
Yang Shi <shy828301@gmail.com>,
Sidhartha Kumar <sidhartha.kumar@oracle.com>,
Vishal Moola <vishal.moola@gmail.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Tejun Heo <tj@kernel.org>,
Mel Gorman <mgorman@techsingularity.net>,
Michal Hocko <mhocko@suse.com>,
"Huang, Ying" <ying.huang@intel.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 01/12] hugetlbfs: drop shared NUMA mempolicy pretence
Date: Tue, 3 Oct 2023 02:15:09 -0700 (PDT) [thread overview]
Message-ID: <cae82d4b-904a-faaf-282a-34fcc188c81f@google.com> (raw)
In-Reply-To: <ebc0987e-beff-8bfb-9283-234c2cbd17c5@google.com>
hugetlbfs_fallocate() goes through the motions of pasting a shared NUMA
mempolicy onto its pseudo-vma, but how could there ever be a shared NUMA
mempolicy for this file? hugetlb_vm_ops has never offered a set_policy
method, and hugetlbfs_parse_param() has never supported any mpol options
for a mount-wide default policy.
It's just an illusion: clean it away so as not to confuse others, giving
us more freedom to adjust shmem's set_policy/get_policy implementation.
But hugetlbfs_inode_info is still required, just to accommodate seals.
Yes, shared NUMA mempolicy support could be added to hugetlbfs, with a
set_policy method and/or mpol mount option (Andi's first posting did
include an admitted-unsatisfactory hugetlb_set_policy()); but it seems
that nobody has bothered to add that in the nineteen years since v2.6.7
made it possible, and there is at least one company that has invested
enough into hugetlbfs, that I guess they have learnt well enough how to
manage its NUMA, without needing shared mempolicy.
Remove linux/mempolicy.h from linux/hugetlb.h: include linux/pagemap.h in
its place, because hugetlb.h's recently added use of filemap_lock_folio()
requires that (although most .configs and .c's get it in some other way).
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/hugetlbfs/inode.c | 41 +----------------------------------------
include/linux/hugetlb.h | 3 +--
2 files changed, 2 insertions(+), 42 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 926d01c493fb..0586c90cb9a5 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -83,29 +83,6 @@ static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
{}
};
-#ifdef CONFIG_NUMA
-static inline void hugetlb_set_vma_policy(struct vm_area_struct *vma,
- struct inode *inode, pgoff_t index)
-{
- vma->vm_policy = mpol_shared_policy_lookup(&HUGETLBFS_I(inode)->policy,
- index);
-}
-
-static inline void hugetlb_drop_vma_policy(struct vm_area_struct *vma)
-{
- mpol_cond_put(vma->vm_policy);
-}
-#else
-static inline void hugetlb_set_vma_policy(struct vm_area_struct *vma,
- struct inode *inode, pgoff_t index)
-{
-}
-
-static inline void hugetlb_drop_vma_policy(struct vm_area_struct *vma)
-{
-}
-#endif
-
/*
* Mask used when checking the page offset value passed in via system
* calls. This value will be converted to a loff_t which is signed.
@@ -853,8 +830,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
/*
* Initialize a pseudo vma as this is required by the huge page
- * allocation routines. If NUMA is configured, use page index
- * as input to create an allocation policy.
+ * allocation routines.
*/
vma_init(&pseudo_vma, mm);
vm_flags_init(&pseudo_vma, VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
@@ -902,9 +878,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
* folios in these areas, we need to consume the reserves
* to keep reservation accounting consistent.
*/
- hugetlb_set_vma_policy(&pseudo_vma, inode, index);
folio = alloc_hugetlb_folio(&pseudo_vma, addr, 0);
- hugetlb_drop_vma_policy(&pseudo_vma);
if (IS_ERR(folio)) {
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
error = PTR_ERR(folio);
@@ -1283,18 +1257,6 @@ static struct inode *hugetlbfs_alloc_inode(struct super_block *sb)
hugetlbfs_inc_free_inodes(sbinfo);
return NULL;
}
-
- /*
- * Any time after allocation, hugetlbfs_destroy_inode can be called
- * for the inode. mpol_free_shared_policy is unconditionally called
- * as part of hugetlbfs_destroy_inode. So, initialize policy here
- * in case of a quick call to destroy.
- *
- * Note that the policy is initialized even if we are creating a
- * private inode. This simplifies hugetlbfs_destroy_inode.
- */
- mpol_shared_policy_init(&p->policy, NULL);
-
return &p->vfs_inode;
}
@@ -1306,7 +1268,6 @@ static void hugetlbfs_free_inode(struct inode *inode)
static void hugetlbfs_destroy_inode(struct inode *inode)
{
hugetlbfs_inc_free_inodes(HUGETLBFS_SB(inode->i_sb));
- mpol_free_shared_policy(&HUGETLBFS_I(inode)->policy);
}
static const struct address_space_operations hugetlbfs_aops = {
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 3c4427a2396d..a574e26e18a2 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -30,7 +30,7 @@ void free_huge_folio(struct folio *folio);
#ifdef CONFIG_HUGETLB_PAGE
-#include <linux/mempolicy.h>
+#include <linux/pagemap.h>
#include <linux/shm.h>
#include <asm/tlbflush.h>
@@ -513,7 +513,6 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
}
struct hugetlbfs_inode_info {
- struct shared_policy policy;
struct inode vfs_inode;
unsigned int seals;
};
--
2.35.3
next prev parent reply other threads:[~2023-10-03 9:15 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-03 9:12 [PATCH v2 00/12] mempolicy: cleanups leading to NUMA mpol without vma Hugh Dickins
2023-10-03 9:15 ` Hugh Dickins [this message]
2023-10-03 9:16 ` [PATCH v2 02/12] kernfs: drop shared NUMA mempolicy hooks Hugh Dickins
2023-10-03 9:17 ` [PATCH v2 03/12] mempolicy: fix migrate_pages(2) syscall return nr_failed Hugh Dickins
2023-10-07 7:27 ` Huang, Ying
2023-10-03 9:19 ` [PATCH v2 04/12] mempolicy trivia: delete those ancient pr_debug()s Hugh Dickins
2023-10-03 9:20 ` [PATCH v2 05/12] mempolicy trivia: slightly more consistent naming Hugh Dickins
2023-10-03 9:21 ` [PATCH v2 06/12] mempolicy trivia: use pgoff_t in shared mempolicy tree Hugh Dickins
2023-10-03 9:22 ` [PATCH v2 07/12] mempolicy: mpol_shared_policy_init() without pseudo-vma Hugh Dickins
2023-10-03 9:24 ` [PATCH v2 08/12] mempolicy: remove confusing MPOL_MF_LAZY dead code Hugh Dickins
2023-10-03 22:28 ` Yang Shi
2023-10-03 9:25 ` [PATCH v2 09/12] mm: add page_rmappable_folio() wrapper Hugh Dickins
2023-10-03 9:26 ` [PATCH v2 10/12] mempolicy: alloc_pages_mpol() for NUMA policy without vma Hugh Dickins
2023-10-19 20:39 ` [PATCH v3 " Hugh Dickins
2023-10-23 16:53 ` domenico cerasuolo
2023-10-23 17:53 ` Andrew Morton
2023-10-23 18:10 ` domenico cerasuolo
2023-10-23 19:05 ` Johannes Weiner
2023-10-23 19:48 ` Hugh Dickins
2023-10-24 6:44 ` [PATCH] mempolicy: alloc_pages_mpol() for NUMA policy without vma: fix Hugh Dickins
2023-10-24 8:17 ` kernel test robot
2023-10-24 15:56 ` Hugh Dickins
2023-10-24 16:09 ` [PATCH v2] " Hugh Dickins
2023-10-23 18:34 ` [PATCH v3 10/12] mempolicy: alloc_pages_mpol() for NUMA policy without vma Zi Yan
2023-10-23 21:10 ` Hugh Dickins
2023-10-23 21:13 ` Zi Yan
2023-10-03 9:27 ` [PATCH v2 11/12] mempolicy: mmap_lock is not needed while migrating folios Hugh Dickins
2023-10-03 9:29 ` [PATCH v2 12/12] mempolicy: migration attempt to match interleave nodes Hugh Dickins
2023-10-24 6:50 ` [PATCH] mempolicy: migration attempt to match interleave nodes: fix Hugh Dickins
2023-10-24 15:18 ` Liam R. Howlett
2023-10-24 16:32 ` Hugh Dickins
2023-10-24 16:45 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cae82d4b-904a-faaf-282a-34fcc188c81f@google.com \
--to=hughd@google.com \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=david@redhat.com \
--cc=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=shy828301@gmail.com \
--cc=sidhartha.kumar@oracle.com \
--cc=surenb@google.com \
--cc=tj@kernel.org \
--cc=vishal.moola@gmail.com \
--cc=wangkefeng.wang@huawei.com \
--cc=willy@infradead.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox