From: David Hildenbrand <david@redhat.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org,
	npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador <osalvador@suse.de>,
	Michal Hocko <mhocko@suse.com>,
	Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [PATCH v5 4/7] mm/hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks
Date: Tue, 25 Jul 2023 20:06:10 +0200
Message-ID: <e1a4430e-d3ae-711b-7efa-5085934b62fd@redhat.com>
In-Reply-To: <20230725100212.531277-5-aneesh.kumar@linux.ibm.com>

On 25.07.23 12:02, Aneesh Kumar K.V wrote:
> Currently, the memmap_on_memory feature is only supported with memory block
> sizes that result in vmemmap pages covering full pageblocks. This is
> because the memory onlining/offlining code requires applicable ranges to be
> pageblock-aligned, for example, to set the migratetypes properly.
> 
> Lift that restriction by reserving more pages than required for the
> vmemmap, so that the usable start address can be pageblock-aligned for
> different memory block sizes. Using this facility implies the kernel will
> reserve some pages in every memory block. This allows the memmap on memory
> feature to be widely useful with different memory block sizes.
> 
> For example, with a 64K page size and a 256MiB memory block size, we need 4
> pages to map the vmemmap; to align things correctly we end up adding a
> reserve of 28 pages. That is, for every 4096 pages, 28 pages get reserved.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>   .../admin-guide/mm/memory-hotplug.rst         |  12 ++
>   mm/memory_hotplug.c                           | 121 ++++++++++++++++--
>   2 files changed, 119 insertions(+), 14 deletions(-)
> 
> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
> index bd77841041af..2994958c7ce8 100644
> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
> @@ -433,6 +433,18 @@ The following module parameters are currently defined:
>   				 memory in a way that huge pages in bigger
>   				 granularity cannot be formed on hotplugged
>   				 memory.
> +
> +				 With value "force" it could result in memory
> +				 wastage due to memmap size limitations. For
> +				 example, if the memmap for a memory block
> +				 requires 1 MiB, but the pageblock size is 2
> +				 MiB, 1 MiB of hotplugged memory will be wasted.
> +				 Note that there are still cases where the
> +				 feature cannot be enforced: for example, if the
> +				 memmap is smaller than a single page, or if the
> +				 architecture does not support the forced mode
> +				 in all configurations.
> +
>   ``online_policy``		 read-write: Set the basic policy used for
>   				 automatic zone selection when onlining memory
>   				 blocks without specifying a target zone.
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 457824a6ecb8..5b472e137898 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -41,17 +41,89 @@
>   #include "internal.h"
>   #include "shuffle.h"
>   
> +enum {
> +	MEMMAP_ON_MEMORY_DISABLE = 0,
> +	MEMMAP_ON_MEMORY_ENABLE,
> +	MEMMAP_ON_MEMORY_FORCE,
> +};
> +
> +static int memmap_mode __read_mostly = MEMMAP_ON_MEMORY_DISABLE;
> +
> +static inline unsigned long memory_block_memmap_pages(void)
> +{
> +	unsigned long memmap_size;
> +
> +	memmap_size = PHYS_PFN(memory_block_size_bytes()) * sizeof(struct page);
> +	return memmap_size >> PAGE_SHIFT;

I'd really move a !page variant (memory_block_memmap_size()) to the 
previous patch and use it in mhp_supports_memmap_on_memory() and 
arch_supports_memmap_on_memory().

Then, in this patch, reuse that function in 
memory_block_memmap_on_memory_pages() and ...

> +}
> +
> +static inline unsigned long memory_block_memmap_on_memory_pages(void)
> +{
> +	unsigned long nr_pages = memory_block_memmap_pages();

... do here a

nr_pages = PHYS_PFN(memory_block_memmap_size());


Conceptually, it would be even cleaner to have here

nr_pages = PFN_UP(memory_block_memmap_size());

even though one can argue that mhp_supports_memmap_on_memory() will make
sure that the unaligned value (memory_block_memmap_size()) covers full
pages; at least to me it looks cleaner that way. No strong opinion.
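
Roughly what I have in mind (only a sketch, untested, and assuming
memory_block_memmap_size() gets introduced in the previous patch as
suggested above):

static inline unsigned long memory_block_memmap_size(void)
{
	/* Size of the memmap (in bytes) covering one memory block. */
	return PHYS_PFN(memory_block_size_bytes()) * sizeof(struct page);
}

static inline unsigned long memory_block_memmap_on_memory_pages(void)
{
	unsigned long nr_pages = PFN_UP(memory_block_memmap_size());

	/*
	 * In "forced" memmap_on_memory mode, we add extra pages to align the
	 * vmemmap size to cover full pageblocks. That way, we can add memory
	 * even if the vmemmap size is not properly aligned, however, we might
	 * waste memory.
	 */
	if (memmap_mode == MEMMAP_ON_MEMORY_FORCE)
		return pageblock_align(nr_pages);
	return nr_pages;
}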


> +
> +	/*
> +	 * In "forced" memmap_on_memory mode, we add extra pages to align the
> +	 * vmemmap size to cover full pageblocks. That way, we can add memory
> +	 * even if the vmemmap size is not properly aligned, however, we might waste
> +	 * memory.
> +	 */
> +	if (memmap_mode == MEMMAP_ON_MEMORY_FORCE)
> +		return pageblock_align(nr_pages);
> +	return nr_pages;
> +}
> +
>   #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
>   /*
>    * memory_hotplug.memmap_on_memory parameter
>    */
> -static bool memmap_on_memory __ro_after_init;
> -module_param(memmap_on_memory, bool, 0444);
> -MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
> +static int set_memmap_mode(const char *val, const struct kernel_param *kp)
> +{
> +	int ret, mode;
> +	bool enabled;
> +
> +	if (sysfs_streq(val, "force") ||  sysfs_streq(val, "FORCE")) {
> +		mode =  MEMMAP_ON_MEMORY_FORCE;
> +		goto matched;
> +	}

Avoid the goto + label

} else {
	ret = kstrtobool(val, &enabled);
	...
}

*((int *)kp->arg) =  mode;
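
FWIW, untested and just to illustrate, the whole thing without the goto
could then look something like:

static int set_memmap_mode(const char *val, const struct kernel_param *kp)
{
	int ret, mode;
	bool enabled;

	if (sysfs_streq(val, "force") || sysfs_streq(val, "FORCE")) {
		mode = MEMMAP_ON_MEMORY_FORCE;
	} else {
		ret = kstrtobool(val, &enabled);
		if (ret < 0)
			return ret;
		mode = enabled ? MEMMAP_ON_MEMORY_ENABLE :
				 MEMMAP_ON_MEMORY_DISABLE;
	}

	*((int *)kp->arg) = mode;
	if (mode == MEMMAP_ON_MEMORY_FORCE) {
		unsigned long memmap_pages = memory_block_memmap_on_memory_pages();

		pr_info("Memory hotplug will reserve %lu pages in each memory block\n",
			memmap_pages - memory_block_memmap_pages());
	}
	return 0;
}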

> +
> +	ret = kstrtobool(val, &enabled);
> +	if (ret < 0)
> +		return ret;
> +	if (enabled)
> +		mode =  MEMMAP_ON_MEMORY_ENABLE;
> +	else
> +		mode =  MEMMAP_ON_MEMORY_DISABLE;
> +
> +matched:
> +	*((int *)kp->arg) =  mode;
> +	if (mode == MEMMAP_ON_MEMORY_FORCE) {
> +		unsigned long memmap_pages = memory_block_memmap_on_memory_pages();
> +
> +		pr_info("Memory hotplug will reserve %ld pages in each memory block\n",
> +			memmap_pages - memory_block_memmap_pages());

pr_info_once() ?

> +	}
> +	return 0;
> +}
> +

[...]

>   	/*
>   	 * Besides having arch support and the feature enabled at runtime, we
> @@ -1294,10 +1366,28 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
>   	 *       altmap as an alternative source of memory, and we do not exactly
>   	 *       populate a single PMD.
>   	 */
> -	return mhp_memmap_on_memory() &&
> -	       size == memory_block_size_bytes() &&
> -	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)) &&
> -	       arch_supports_memmap_on_memory(size);
> +	if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
> +		return false;
> +
> +	/*
> +	 * Make sure the vmemmap allocation is fully contained
> +	 * so that we always allocate vmemmap memory from altmap area.
> +	 */
> +	if (!IS_ALIGNED(vmemmap_size, PAGE_SIZE))
> +		return false;
> +
> +	/*
> +	 * start pfn should be pageblock_nr_pages aligned for correctly
> +	 * setting migrate types
> +	 */
> +	if (!pageblock_aligned(memmap_pages))
> +		return false;
> +
> +	if (memmap_pages == PHYS_PFN(memory_block_size_bytes()))
> +		/* No effective hotplugged memory doesn't make sense. */
> +		return false;
> +
> +	return arch_supports_memmap_on_memory(size);
>   }
>   
>   /*
> @@ -1310,7 +1400,10 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
>   {
>   	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
>   	enum memblock_flags memblock_flags = MEMBLOCK_NONE;
> -	struct vmem_altmap mhp_altmap = {};
> +	struct vmem_altmap mhp_altmap = {
> +		.base_pfn =  PHYS_PFN(res->start),
> +		.end_pfn  =  PHYS_PFN(res->end),

Is it required to set .end_pfn, and if so, shouldn't we also set it to
base_pfn + memory_block_memmap_on_memory_pages()?

We also don't set it in the try_remove_memory() part.
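
If we do want to set it, I'd expect something like the following (untested,
just to illustrate what I mean):

	struct vmem_altmap mhp_altmap = {
		.base_pfn = PHYS_PFN(res->start),
		/* Only the vmemmap pages at the start are covered by the altmap. */
		.end_pfn  = PHYS_PFN(res->start) +
			    memory_block_memmap_on_memory_pages(),
	};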



With these things addressed, feel free to add

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



Thread overview: 18+ messages
2023-07-25 10:02 [PATCH v5 0/7] Add support for memmap on memory feature on ppc64 Aneesh Kumar K.V
2023-07-25 10:02 ` [PATCH v5 1/7] mm/hotplug: Simplify ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE kconfig Aneesh Kumar K.V
2023-07-25 10:02 ` [PATCH v5 2/7] mm/hotplug: Allow memmap on memory hotplug request to fallback Aneesh Kumar K.V
2023-07-25 10:02 ` [PATCH v5 3/7] mm/hotplug: Allow architecture to override memmap on memory support check Aneesh Kumar K.V
2023-07-25 10:02 ` [PATCH v5 4/7] mm/hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks Aneesh Kumar K.V
2023-07-25 18:06   ` David Hildenbrand [this message]
2023-07-26  4:25     ` Aneesh Kumar K.V
2023-07-26  9:04       ` David Hildenbrand
     [not found]         ` <9d1448d3-a43a-5305-68aa-d82111fe077a@linux.ibm.com>
2023-07-26 16:39           ` David Hildenbrand
2023-07-25 10:02 ` [PATCH v5 5/7] powerpc/book3s64/memhotplug: Enable memmap on memory for radix Aneesh Kumar K.V
2023-07-25 10:09   ` David Hildenbrand
2023-07-25 10:02 ` [PATCH v5 6/7] mm/hotplug: Embed vmem_altmap details in memory block Aneesh Kumar K.V
2023-07-26  9:41   ` David Hildenbrand
2023-07-26 10:31     ` Aneesh Kumar K.V
2023-07-26 16:43       ` David Hildenbrand
2023-07-25 10:02 ` [PATCH v5 7/7] mm/hotplug: Enable runtime update of memmap_on_memory parameter Aneesh Kumar K.V
2023-07-25 17:52   ` David Hildenbrand
2023-07-25 10:06 ` [PATCH v5 0/7] Add support for memmap on memory feature on ppc64 David Hildenbrand
