From: Oscar Salvador <osalvador@suse.de>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Andrew Morton <akpm@linux-foundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Michal Hocko <mhocko@suse.com>,
Rafael Parra <rparrazo@redhat.com>
Subject: Re: [PATCH v1 2/2] drivers/base/memory: determine and store zone for single-zone memory blocks
Date: Mon, 31 Jan 2022 12:29:09 +0100 [thread overview]
Message-ID: <20220131112909.GB18027@linux> (raw)
In-Reply-To: <20220128152620.168715-3-david@redhat.com>
On Fri, Jan 28, 2022 at 04:26:20PM +0100, David Hildenbrand wrote:
> For memory hot(un)plug, we only really care about memory blocks that:
> * span a single zone (and, thereby, a single node)
> * are completely System RAM (IOW, no holes, no ZONE_DEVICE)
> If one of these conditions is not met, we reject memory offlining.
> Hotplugged memory blocks (starting out offline), always meet both
> conditions.
This has been always hard for me to follow, so bear with me.
I remember we changed the memory-hotplug policy, not long ago, wrt.
what we can online/offline so we could get rid of certain assumptions like
"there are no holes in this memblock, so it can go" etc.
AFAIR, we can only offline if the memory
1) belongs to a single node ("which is always the case for
hotplugged-memory, boot memory is trickier")
2) does not have any holes
3) spans a single zone
These are the only requeriments we have atm, right?
By default, hotplugged memory already complies with they all three,
only in case of ZONE_DEVICE stuff we might violate 2) and 3).
> There are three scenarios to handle:
...
...
> @@ -225,6 +226,9 @@ static int memory_block_offline(struct memory_block *mem)
> unsigned long nr_vmemmap_pages = mem->nr_vmemmap_pages;
> int ret;
>
> + if (!mem->zone)
> + return -EBUSY;
Should not we return -EINVAL? I mean, -EBUSY reads like this might be a
temporary error which might get fixed later on, but that isn't the case.
> @@ -234,7 +238,7 @@ static int memory_block_offline(struct memory_block *mem)
> -nr_vmemmap_pages);
>
> ret = offline_pages(start_pfn + nr_vmemmap_pages,
> - nr_pages - nr_vmemmap_pages, mem->group);
> + nr_pages - nr_vmemmap_pages, mem->zone, mem->group);
Why not passing the node as well?
> +static struct zone *early_node_zone_for_memory_block(struct memory_block *mem,
> + int nid)
> +{
> + const unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
> + const unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
> + struct zone *zone, *matching_zone = NULL;
> + pg_data_t *pgdat = NODE_DATA(nid);
I was about to complain because in init_memory_block() you call
early_node_zone_for_memory_block() with nid == NUMA_NODE_NODE, but then
I saw that NODE_DATA on !CONFIG_NUMA falls to contig_page_data.
So, I guess we cannot really reach this on CONFIG_NUMA machines with nid
being NUMA_NO_NODE, right? (do we want to add a check just in case?)
> +#ifdef CONFIG_NUMA
> +void memory_block_set_nid(struct memory_block *mem, int nid,
> + enum meminit_context context)
But we also set the zone? (Only for boot memory)
> @@ -663,6 +734,17 @@ static int init_memory_block(unsigned long block_id, unsigned long state,
> mem->nr_vmemmap_pages = nr_vmemmap_pages;
> INIT_LIST_HEAD(&mem->group_next);
>
> +#ifndef CONFIG_NUMA
> + if (state == MEM_ONLINE)
> + /*
> + * MEM_ONLINE at this point imples early memory. With NUMA,
implies
--
Oscar Salvador
SUSE Labs
next prev parent reply other threads:[~2022-01-31 11:29 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-28 15:26 [PATCH v1 0/2] " David Hildenbrand
2022-01-28 15:26 ` [PATCH v1 1/2] drivers/base/node: rename link_mem_sections() to register_memory_block_under_node() David Hildenbrand
2022-01-31 10:38 ` Oscar Salvador
2022-01-28 15:26 ` [PATCH v1 2/2] drivers/base/memory: determine and store zone for single-zone memory blocks David Hildenbrand
2022-01-31 10:42 ` David Hildenbrand
2022-01-31 11:29 ` Oscar Salvador [this message]
2022-01-31 11:40 ` David Hildenbrand
2022-01-31 11:42 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220131112909.GB18027@linux \
--to=osalvador@suse.de \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=rafael@kernel.org \
--cc=rparrazo@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox