linux-mm.kvack.org archive mirror
From: Hugh Dickins <hughd@google.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Christoph Lameter <cl@linux.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Mike Kravetz <mike.kravetz@oracle.com>,
	 David Hildenbrand <david@redhat.com>,
	 Suren Baghdasaryan <surenb@google.com>,
	Yang Shi <shy828301@gmail.com>,
	 Sidhartha Kumar <sidhartha.kumar@oracle.com>,
	 Vishal Moola <vishal.moola@gmail.com>,
	 Kefeng Wang <wangkefeng.wang@huawei.com>,
	 Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Tejun Heo <tj@kernel.org>,
	 Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 01/12] hugetlbfs: drop shared NUMA mempolicy pretence
Date: Tue, 26 Sep 2023 15:26:30 -0700 (PDT)
Message-ID: <45aa39c0-9b14-3e5-a81a-70a6403a8432@google.com>
In-Reply-To: <ZRINx/53KKUibbGb@tassilo>

On Mon, 25 Sep 2023, Andi Kleen wrote:
> On Mon, Sep 25, 2023 at 01:21:10AM -0700, Hugh Dickins wrote:
> > hugetlbfs_fallocate() goes through the motions of pasting a shared NUMA
> > mempolicy onto its pseudo-vma, but how could there ever be a shared NUMA
> > mempolicy for this file?  hugetlb_vm_ops has never offered a set_policy
> > method, and hugetlbfs_parse_param() has never supported any mpol options
> > for a mount-wide default policy.
> > 
> > It's just an illusion: clean it away so as not to confuse others, giving
> > us more freedom to adjust shmem's set_policy/get_policy implementation.
> > But hugetlbfs_inode_info is still required, just to accommodate seals.
> > 
> > Yes, shared NUMA mempolicy support could be added to hugetlbfs, with a
> > set_policy method and/or mpol mount option (Andi's first posting did
> > include an admitted-unsatisfactory hugetlb_set_policy()); but it seems
> > that nobody has bothered to add that in the nineteen years since v2.6.7
> > made it possible, and there is at least one company that has invested
> > enough into hugetlbfs, that I guess they have learnt well enough how to
> > manage its NUMA, without needing shared mempolicy.
> 
> TBH I'm not sure people in general rely on shared mempolicy. The
> original use case for it was to modify the NUMA policy of non-anonymous
> shared memory files without modifying the program (e.g. Oracle
> database's shared memory segments).

Ah, "without modifying the program": that makes a lot of sense, but
I had never thought of it that way - I just saw it as the right way to
manage the shared object (though an outlier, since we have so many other
msyscall()s which do not manage the underlying shared object in this way).

> 
> But I don't think that particular usage model ever got any real
> traction: at least I haven't seen any real usage of it outside my tests.

If the hugetlbfs support had actually gone in, I imagine Oracle would
have managed it that way; but they seem to have survived well without.

> 
> I suspect people either are fine with just process policy or modify the
> program, in which case it's not a big burden to modify every user,
> so process policy or vma based mbind policy works fine.
> 
> Maybe it would be an interesting experiment to disable it everywhere
> with some flag and see if anybody complains.
> 
> On the other hand it might be Hyrum'ed by now.

This is interesting info, Andi, thank you for providing it.

I'm torn.  shmem and mempolicy (and struct vm_operations_struct) would
certainly be simpler without shared mempolicy: but I frankly don't have
the time and courage to experiment with deprecating it now; and it is
fundamentally right that such policy should be kept with the object,
not with its mappings.  I've assumed for years that it has to stay.

Hugh



Thread overview: 31+ messages
2023-09-25  8:17 [PATCH 00/12] mempolicy: cleanups leading to NUMA mpol without vma Hugh Dickins
2023-09-25  8:21 ` [PATCH 01/12] hugetlbfs: drop shared NUMA mempolicy pretence Hugh Dickins
2023-09-25 22:09   ` Matthew Wilcox
2023-09-25 22:46   ` Andi Kleen
2023-09-26 22:26     ` Hugh Dickins [this message]
2023-09-25  8:22 ` [PATCH 02/12] kernfs: drop shared NUMA mempolicy hooks Hugh Dickins
2023-09-25 22:10   ` Matthew Wilcox
2023-09-25  8:24 ` [PATCH 03/12] mempolicy: fix migrate_pages(2) syscall return nr_failed Hugh Dickins
2023-09-25 22:22   ` Matthew Wilcox
2023-09-26 20:47     ` Hugh Dickins
2023-09-27  8:02   ` Huang, Ying
2023-09-30  4:20     ` Hugh Dickins
2023-09-25  8:25 ` [PATCH 04/12] mempolicy trivia: delete those ancient pr_debug()s Hugh Dickins
2023-09-25 22:23   ` Matthew Wilcox
2023-09-25  8:26 ` [PATCH 05/12] mempolicy trivia: slightly more consistent naming Hugh Dickins
2023-09-25 22:28   ` Matthew Wilcox
2023-09-25  8:28 ` [PATCH 06/12] mempolicy trivia: use pgoff_t in shared mempolicy tree Hugh Dickins
2023-09-25 22:31   ` Matthew Wilcox
2023-09-25 22:38     ` Matthew Wilcox
2023-09-26 21:19     ` Hugh Dickins
2023-09-25  8:29 ` [PATCH 07/12] mempolicy: mpol_shared_policy_init() without pseudo-vma Hugh Dickins
2023-09-25 22:50   ` Matthew Wilcox
2023-09-26 21:36     ` Hugh Dickins
2023-09-25  8:30 ` [PATCH 08/12] mempolicy: remove confusing MPOL_MF_LAZY dead code Hugh Dickins
2023-09-25 22:52   ` Matthew Wilcox
2023-09-25  8:32 ` [PATCH 09/12] mm: add page_rmappable_folio() wrapper Hugh Dickins
2023-09-25 22:58   ` Matthew Wilcox
2023-09-26 21:58     ` Hugh Dickins
2023-09-25  8:33 ` [PATCH 10/12] mempolicy: alloc_pages_mpol() for NUMA policy without vma Hugh Dickins
2023-09-25  8:35 ` [PATCH 11/12] mempolicy: mmap_lock is not needed while migrating folios Hugh Dickins
2023-09-25  8:36 ` [PATCH 12/12] mempolicy: migration attempt to match interleave nodes Hugh Dickins
