From: David Hildenbrand <david@redhat.com>
To: Will Deacon <will@kernel.org>
Cc: "Anshuman Khandual" <anshuman.khandual@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Ard Biesheuvel" <ardb@kernel.org>,
"Mark Rutland" <mark.rutland@arm.com>,
"James Morse" <james.morse@arm.com>,
"Robin Murphy" <robin.murphy@arm.com>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Mike Rapoport" <rppt@linux.ibm.com>
Subject: Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
Date: Tue, 2 Feb 2021 13:56:10 +0100
Message-ID: <4d8f5156-8628-5531-1485-322ad92aa15c@redhat.com>
In-Reply-To: <20210202125152.GC16868@willie-the-truck>

On 02.02.21 13:51, Will Deacon wrote:
> On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
>> On 02.02.21 13:35, Will Deacon wrote:
>>> On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
>>>> On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
>>>>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>>>>> backing for that pfn. It should always return positive for memory ranges
>>>>> backed with struct page mapping. But currently pfn_valid() fails for all
>>>>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>>>>
>>>>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>>>>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>>>>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>>>>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>>>>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>>>>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>>>>> into the system via memremap_pages() called from a driver, their respective
>>>>> memory sections will not have SECTION_IS_EARLY set.
>>>>>
>>>>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>>>>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>>>>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>>>>> skipped as it's always going to be positive and that will be an optimization
>>>>> for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
>>>>> hotplugged memory too will not have SECTION_IS_EARLY set for their sections.
>>>>>
>>>>> Skipping memblock_is_map_memory() for all non early memory sections would
>>>>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>>>>> performance for normal hotplug memory as well.
>>>>
>>>> Hmm. Although I follow your logic, this does seem to rely on an awful lot of
>>>> assumptions to continue to hold true as the kernel evolves. In particular,
>>>> how do we ensure that early sections are always fully backed with
>>>
>>> Sorry, typo here: ^^^ should be *non-early* sections.
>>
>> It might be a good idea to have a look at generic
>> include/linux/mmzone.h:pfn_valid()
>
> The generic implementation already makes assumptions that aren't true on
> arm64, so that's why we've ended up with our own implementation. But the
> patches here put us in a position where I worry that pfn_valid() may return
> 'true' in future for cases where the underlying struct page is either
> non-existent or bogus, and debugging those failures really sucks. We had a
> raft of those back when NOMAP was introduced and I don't want to re-live
> that experience.

Yeah, and I agree when it comes to boot memory. However, the generic
memory hotplug/memremap infrastructure (i.e., !early sections) does not
allow for the kind of special cases you mention, and getting that wrong
would break quite a lot of other code. So I wouldn't worry about that
part too much for now.
>
>> As I expressed already, long term we should really get rid of the arm64
>> variant and rather special-case the generic one. Then we won't go out of
>> sync - just as it happened with ZONE_DEVICE handling here.
>
> Why does this have to be long term? This ZONE_DEVICE stuff could be the
> carrot on the stick :)

Yes, I suggested doing it now, but Anshuman convinced me that doing a
simple fix upfront might be cleaner, for example when it comes to
backporting :)

--
Thanks,
David / dhildenb