linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Florian Fainelli <f.fainelli@gmail.com>,
	Chris Goldsworthy <quic_cgoldswo@quicinc.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>,
	Doug Berger <opendmb@gmail.com>
Subject: Re: [RFC] arm64: mm: update max_pfn after memory hotplug
Date: Fri, 24 Sep 2021 10:17:46 +0200
Message-ID: <41789cad-76c6-0ea5-4aa1-3e4a52acff86@redhat.com>
In-Reply-To: <6eb8319d-acba-b69a-5db3-5dca9ef426e8@gmail.com>

On 24.09.21 04:47, Florian Fainelli wrote:
> 
> 
> On 9/23/2021 3:54 PM, Chris Goldsworthy wrote:
>> From: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
>>
>> After new memory blocks have been hotplugged, max_pfn and max_low_pfn
>> need updating to reflect the new PFNs being hot-added to the system.
>>
>> Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
>> Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
>> ---
>>    arch/arm64/mm/mmu.c | 5 +++++
>>    1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index cfd9deb..fd85b51 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1499,6 +1499,11 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>    	if (ret)
>>    		__remove_pgd_mapping(swapper_pg_dir,
>>    				     __phys_to_virt(start), size);
>> +	else {
>> +		max_pfn = PFN_UP(start + size);
>> +		max_low_pfn = max_pfn;
>> +	}
> 
> This is a drive by review, but it got me thinking about your changes a bit:
> 
> - if you raise max_pfn when you hotplug memory, don't you need to lower
> it when you hot unplug memory as well?

The issue with lowering is that you actually have to do some search to 
figure out the actual value -- and it's not really worth the trouble. 
Raising the limit is easy.
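
To illustrate the asymmetry, here is a rough userspace sketch (not kernel
code). The per-block "present" array, PAGES_PER_BLOCK and
highest_present_end_pfn() are made up for this example. Raising only needs
a single comparison against the new end pfn, whereas lowering needs a scan
like this to find the highest block that is still present:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: 128 MiB memory blocks of 4 KiB pages. */
#define PAGES_PER_BLOCK 32768UL

/*
 * Hypothetical helper: recompute the end pfn after an unplug by scanning
 * downwards for the highest block that is still present.
 * Returns 0 if no block is present at all.
 */
static unsigned long highest_present_end_pfn(const bool *present,
					     size_t nr_blocks)
{
	size_t i;

	assert(present != NULL || nr_blocks == 0);

	for (i = nr_blocks; i > 0; i--) {
		if (present[i - 1])
			return i * PAGES_PER_BLOCK;
	}
	return 0;
}
```

The cost of that scan grows with the size of the hole left behind, which
is the "search" I mean above.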

With memory hotunplug, anybody wanting to take a look at a "struct page" 
via a pfn has to do a pfn_to_online_page() either way. That will fail if 
there isn't actually a memmap anymore because the memory has been 
unplugged. So "max_pfn" is really just a hint about the highest pfn worth 
looking at, and it can be larger than the highest pfn actually present.

Take a look at the example usage in fs/proc/page.c:kpageflags_read():

pfn_to_online_page() will simply fail and stable_page_flags() will 
indicate a KPF_NOPAGE.

It is just as if we now had a big memory hole at the end of memory.
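
A small userspace model of that pattern, assuming one "online" flag per
memory section. struct mem_model, pfn_is_online() and count_nopage() are
invented for this sketch and only mimic the behavior described above; they
are not the kernel implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: 128 MiB sections of 4 KiB pages. */
#define PAGES_PER_SECTION 32768UL

struct mem_model {
	unsigned long max_pfn;	/* upper bound hint, may be too large */
	size_t nr_sections;	/* number of entries in online[] */
	const bool *online;	/* true if the section has an online memmap */
};

/* Stand-in for the pfn_to_online_page() check: false means "no page". */
static bool pfn_is_online(const struct mem_model *m, unsigned long pfn)
{
	size_t sec = pfn / PAGES_PER_SECTION;

	return sec < m->nr_sections && m->online[sec];
}

/* Walk [0, max_pfn) like a pfn walker would; count "no page" results. */
static unsigned long count_nopage(const struct mem_model *m)
{
	unsigned long pfn, nopage = 0;

	assert(m != NULL && m->online != NULL);

	for (pfn = 0; pfn < m->max_pfn; pfn++) {
		if (!pfn_is_online(m, pfn))
			nopage++;
	}
	return nopage;
}
```

With the last section unplugged and max_pfn left untouched, the walker
simply sees one section's worth of "no page" entries at the end; nothing
dereferences a memmap that is gone.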

> 
> - suppose that you have a platform which maps physical memory into the
> CPU's address space at 0x00_4000_0000 (1GB offset) and the kernel boots
> with 2GB of DRAM plugged by default. At that point we have not
> registered a swiotlb because we have less than 4GB of addressable
> physical memory, there is no IOMMU in that system, it's a happy world.
> Now assume that we plug an additional 2GB of DRAM into that system
> adjacent to the previous 2GB, from 0x00_C000_0000 through
> 0x01_4000_0000, now we have physical addresses above 4GB, but we still
> don't have a swiotlb, some of our DMA_BIT_MASK(32) peripherals are going
> to be unable to DMA from that hot plugged memory, but they could if we
> had a swiotlb.

That's why platforms that hotplug memory should indicate the maximum 
possible PFN via some mechanism during boot. On x86-64 (and IIRC also 
arm64 now), this is done via the ACPI SRAT.

And that's where "max_possible_pfn" and "max_pfn" differ. See 
drivers/acpi/numa/srat.c:acpi_numa_memory_affinity_init():

	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));


Using max_possible_pfn, the OS can properly set up the swiotlb, even 
though it wouldn't currently be required when just looking at max_pfn.

I documented that for virtio-mem in
	https://virtio-mem.gitlab.io/user-guide/user-guide-linux.html
"swiotlb and DMA memory".
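
To put numbers on your example, here is a userspace sketch assuming 4 KiB
pages. needs_bounce_buffer() and DMA_LIMIT_32BIT are names I made up for
illustration; the real decision in the kernel is more involved:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define PAGE_SIZE	(1ULL << PAGE_SHIFT)
#define PFN_UP(x)	(((uint64_t)(x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

/* First physical address a DMA_BIT_MASK(32) device cannot reach. */
#define DMA_LIMIT_32BIT	(1ULL << 32)

/*
 * Hypothetical check: does memory below limit_pfn (exclusive) extend
 * beyond what a 32-bit DMA device can address?
 */
static bool needs_bounce_buffer(uint64_t limit_pfn)
{
	/* Guard against overflow when converting the pfn to an address. */
	assert(limit_pfn <= (UINT64_MAX >> PAGE_SHIFT));

	return (limit_pfn << PAGE_SHIFT) > DMA_LIMIT_32BIT;
}
```

With boot memory ending at 0xC000_0000, max_pfn is 0xC0000 and the check
says no. If firmware announced the hotpluggable range up to 0x1_4000_0000,
PFN_UP(end - 1) gives a max_possible_pfn of 0x140000 and the check says
yes, so the swiotlb could be set up at boot.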

> 
> - now let's go even further but this is very contrived. Assume that the
> firmware has somehow created a reserved memory region with a 'no-map'
> attribute thus indicating it does not want a struct page to be created
> for a specific PFN range, is it valid to "blindly" raise max_pfn if that
> region were to be at the end of the just hot-plugged memory?

no-map means that no direct mapping is to be created, right? We would 
still have a memmap IIRC, and the pages are PG_reserved.

Again, I think this is very similar to just having no-map regions like 
random memory holes within the existing memory layout.


What Chris proposes here is very similar to 
arch/x86/mm/init_64.c:update_end_of_memory_vars() called during 
arch_add_memory()->add_pages() on x86-64.

-- 
Thanks,

David / dhildenb


