From: David Hildenbrand <david@redhat.com>
To: Florian Fainelli <f.fainelli@gmail.com>,
Chris Goldsworthy <quic_cgoldswo@quicinc.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>,
Doug Berger <opendmb@gmail.com>
Subject: Re: [RFC] arm64: mm: update max_pfn after memory hotplug
Date: Fri, 24 Sep 2021 10:17:46 +0200
Message-ID: <41789cad-76c6-0ea5-4aa1-3e4a52acff86@redhat.com>
In-Reply-To: <6eb8319d-acba-b69a-5db3-5dca9ef426e8@gmail.com>
On 24.09.21 04:47, Florian Fainelli wrote:
>
>
> On 9/23/2021 3:54 PM, Chris Goldsworthy wrote:
>> From: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
>>
>> After new memory blocks have been hotplugged, max_pfn and max_low_pfn
>> needs updating to reflect on new PFNs being hot added to system.
>>
>> Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
>> Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
>> ---
>> arch/arm64/mm/mmu.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index cfd9deb..fd85b51 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1499,6 +1499,11 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> if (ret)
>> __remove_pgd_mapping(swapper_pg_dir,
>> __phys_to_virt(start), size);
>> + else {
>> + max_pfn = PFN_UP(start + size);
>> + max_low_pfn = max_pfn;
>> + }
>
> This is a drive by review, but it got me thinking about your changes a bit:
>
> - if you raise max_pfn when you hotplug memory, don't you need to lower
> it when you hot unplug memory as well?
The issue with lowering is that you actually have to do some search to
figure out the actual value -- and it's not really worth the trouble.
Raising the limit is easy.
With memory hotunplug, anybody wanting to take a look at a "struct page"
via a pfn has to do a pfn_to_online_page() either way. That will fail if
there isn't actually a memmap anymore because the memory has been
unplugged. So "max_pfn" is really just a hint about the highest pfn
worth looking at, and it can be larger than the highest pfn that is
actually present.
Take a look at the example usage in fs/proc/page.c:kpageflags_read():
pfn_to_online_page() will simply fail and stable_page_flags() will
indicate KPF_NOPAGE.
It's just as if we now had a big memory hole at the end of memory.
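To illustrate why a stale, too-large max_pfn is harmless, here is a minimal userspace sketch. It is a toy model, not kernel code: every toy_* name, the section size and the section count are made up for illustration, and toy_pfn_to_online_page() only imitates the "returns NULL when the memmap is gone" behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#define TOY_PAGES_PER_SECTION 8UL
#define TOY_NR_SECTIONS       4UL

struct toy_page { unsigned long flags; };

static struct toy_page toy_memmap[TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION];
static bool toy_section_online[TOY_NR_SECTIONS];
static unsigned long toy_max_pfn;	/* upper-bound hint only */

/* Imitates pfn_to_online_page(): NULL unless the pfn's section is online. */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	unsigned long sec = pfn / TOY_PAGES_PER_SECTION;

	if (sec >= TOY_NR_SECTIONS || !toy_section_online[sec])
		return NULL;
	return &toy_memmap[pfn];
}

/* Hot-add: mark the section online and raise the hint if needed. */
static void toy_hotplug(unsigned long sec)
{
	unsigned long end_pfn = (sec + 1) * TOY_PAGES_PER_SECTION;

	assert(sec < TOY_NR_SECTIONS);
	toy_section_online[sec] = true;
	if (end_pfn > toy_max_pfn)
		toy_max_pfn = end_pfn;
}

/* Hot-remove: the section goes offline, the hint is deliberately left alone. */
static void toy_hotunplug(unsigned long sec)
{
	assert(sec < TOY_NR_SECTIONS);
	toy_section_online[sec] = false;
}

/* A pfn walker: scan up to the hint and skip pfns without an online page. */
static unsigned long toy_count_online_pages(void)
{
	unsigned long pfn, count = 0;

	for (pfn = 0; pfn < toy_max_pfn; pfn++)
		if (toy_pfn_to_online_page(pfn))
			count++;
	return count;
}
```

After plugging sections 0 and 1 and unplugging section 1 again, toy_max_pfn still covers both sections, but the walker only counts the pages of section 0: the unplugged range behaves like a hole at the end of memory.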
>
> - suppose that you have a platform which maps physical memory into the
> CPU's address space at 0x00_4000_0000 (1GB offset) and the kernel boots
> with 2GB of DRAM plugged by default. At that point we have not
> registered a swiotlb because we have less than 4GB of addressable
> physical memory, there is no IOMMU in that system, it's a happy world.
> Now assume that we plug an additional 2GB of DRAM into that system
> adjacent to the previous 2GB, from 0x00_C0000_0000 through
> 0x14_0000_0000, now we have physical addresses above 4GB, but we still
> don't have a swiotlb, some of our DMA_BIT_MASK(32) peripherals are going
> to be unable to DMA from that hot plugged memory, but they could if we
> had a swiotlb.
That's why platforms that hotplug memory should indicate the maximum
possible PFN via some mechanism during boot. On x86-64 (and IIRC also
arm64 now), this is done via the ACPI SRAT.
And that's where "max_possible_pfn" and "max_pfn" differ. See
drivers/acpi/numa/srat.c:acpi_numa_memory_affinity_init():
max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
Using max_possible_pfn, the OS can properly set up the swiotlb, even
though it wouldn't currently be required when just looking at max_pfn.
I documented that for virtio-mem in
https://virtio-mem.gitlab.io/user-guide/user-guide-linux.html
"swiotlb and DMA memory".
>
> - now let's go even further but this is very contrived. Assume that the
> firmware has somewhat created a reserved memory region with a 'no-map'
> attribute thus indicating it does not want a struct page to be created
> for a specific PFN range, is it valid to "blindly" raise max_pfn if that
> region were to be at the end of the just hot-plugged memory?
no-map means that no direct mapping is to be created, right? We would
still have a memmap IIRC, and the pages are PG_reserved.
Again, I think this is very similar to just having no-map regions like
random memory holes within the existing memory layout.
What Chris proposes here is very similar to
arch/x86/mm/init_64.c:update_end_of_memory_vars() called during
arch_add_memory()->add_pages() on x86-64.
--
Thanks,
David / dhildenb