From: David Hildenbrand <david@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Michal Hocko <mhocko@kernel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: Re: [PATCH v1] drivers/base/memory.c: Don't access uninitialized memmaps in soft_offline_page_store()
Date: Mon, 14 Oct 2019 10:30:52 +0200
Message-ID: <9fd1f157-d812-3a3b-813a-d34e0cc53f96@redhat.com>
In-Reply-To: <20191011151634.0b566c9e32e8d0e11181d025@linux-foundation.org>
On 12.10.19 00:16, Andrew Morton wrote:
> On Thu, 10 Oct 2019 16:12:00 +0200 David Hildenbrand <david@redhat.com> wrote:
>
>> Uninitialized memmaps contain garbage and in the worst case trigger kernel
>> BUGs, especially with CONFIG_PAGE_POISONING. They should not get
>> touched.
>>
>> Right now, when trying to soft-offline a PFN that resides on a memory
>> block that was never onlined, one gets a misleading error with
>> CONFIG_PAGE_POISONING:
>> :/# echo 5637144576 > /sys/devices/system/memory/soft_offline_page
>> [ 23.097167] soft offline: 0x150000 page already poisoned
>>
>> But the actual result depends on the garbage in the memmap.
>>
>> soft_offline_page() can only work with online pages, it returns -EIO in
>> case of ZONE_DEVICE. Make sure to only forward pages that are online
>> (iow, managed by the buddy) and, therefore, have an initialized memmap.
>>
>> Add a check against pfn_to_online_page() and similarly return -EIO.
>>
>> Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e86b319
>
> Should this be cc:stable?
I think yes, more on that below.
>
> What is the relationship between this and some similar fixes in the
> series "mm/memory_hotplug: Shrink zones before removing memory", v6?
In general, they all have the same root cause. With f1dd2cd13c4b, we
started to initialize the memmap when onlining memory; however, we at
least zeroed it out when adding the memory. With d0dc12e86b319, we
removed the zeroing and added conditional poisoning instead.
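
To make the check from the patch description concrete, here is a minimal
userspace sketch in C. It is a model only, not the kernel code: every
model_* name is hypothetical, the 4 KiB page size and 128 MiB section size
are assumed values, and the -ENXIO return for a PFN without any memmap is
an assumption that is not stated in this thread. The only behavior taken
from the patch description is "not online => -EIO".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, 128 MiB memory sections. */
#define MODEL_PAGE_SHIFT        12
#define MODEL_SECTION_SHIFT     27
#define MODEL_PFN_SECTION_SHIFT (MODEL_SECTION_SHIFT - MODEL_PAGE_SHIFT)
#define MODEL_NR_SECTIONS       64

/* Hypothetical per-section state standing in for the real bookkeeping. */
struct model_section {
	bool present; /* memory was added: a memmap exists, content is garbage */
	bool online;  /* memory was onlined: the memmap has been initialized */
};

static struct model_section model_sections[MODEL_NR_SECTIONS];

/* Adding memory allocates the memmap but does not initialize it. */
static void model_add_memory(unsigned int section)
{
	assert(section < MODEL_NR_SECTIONS);
	model_sections[section].present = true;
	model_sections[section].online = false;
}

/* Onlining memory is the point where the memmap gets initialized. */
static void model_online_memory(unsigned int section)
{
	assert(section < MODEL_NR_SECTIONS);
	model_sections[section].online = true;
}

/* Stand-in for "there is a memmap for this PFN". */
static bool model_pfn_valid(uint64_t pfn)
{
	uint64_t section = pfn >> MODEL_PFN_SECTION_SHIFT;

	return section < MODEL_NR_SECTIONS && model_sections[section].present;
}

/* Stand-in for "pfn_to_online_page(pfn) != NULL". */
static bool model_pfn_online(uint64_t pfn)
{
	return model_pfn_valid(pfn) &&
	       model_sections[pfn >> MODEL_PFN_SECTION_SHIFT].online;
}

/* Models the sysfs store: takes a physical address, returns 0 or -errno. */
static int model_soft_offline_store(uint64_t addr)
{
	uint64_t pfn = addr >> MODEL_PAGE_SHIFT;

	if (!model_pfn_valid(pfn))
		return -ENXIO; /* assumption, see the note above */
	if (!model_pfn_online(pfn))
		return -EIO;   /* never touch an uninitialized memmap */
	return 0;              /* the real soft offlining would start here */
}
```

With the address from the report, 5637144576 >> 12 is PFN 0x150000. In this
model the store returns -EIO while the containing block has only been added,
and 0 once it has been onlined.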

All these BUGs can be reproduced by adding memory and keeping some
memory blocks offline. Most distributions either online memory directly
in the kernel when it is added, or userspace onlines it via udev rules.
s390x is special, because there we don't online memory blocks by default
in user space. So on !s390x systems, these BUGs are quite hard to reproduce.

With "mm/memory_hotplug: Shrink zones before removing memory" these BUGs
become easier to reproduce, because it is then sufficient to offline a
memory block that was previously onlined.

Also, devmem with "driver reserved memory" (the part for which we don't
initialize the memmap) is able to trigger these BUGs, but that feature
is more recent AFAIK.

As for cc:stable, I am not sure it applies to all patches. Some really
only trigger when page poisoning is active, but don't result in any
damage (as observed so far). We really produce damage in case we
dereference the NID/zone via the garbage memmap (and probably when
doing a page_to_pfn(pfn_to_page(garbage_page))).
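
As a rough illustration of why decoding the NID from a garbage memmap is
dangerous, here is a small userspace sketch in C. The bit layout, the
field width, the number of nodes and all model_* names are made up for
this example and do not match any real kernel configuration; the all-ones
fill is just one example of garbage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the node id lives in the top 6 bits of flags. */
#define MODEL_NODES_WIDTH     6
#define MODEL_NODES_SHIFT     (64 - MODEL_NODES_WIDTH)
#define MODEL_NODES_MASK      ((UINT64_C(1) << MODEL_NODES_WIDTH) - 1)
#define MODEL_NR_ONLINE_NODES 2 /* assumed: only nodes 0 and 1 exist */

struct model_page {
	uint64_t flags;
};

/* Fill one memmap entry with a byte pattern (0x00 = zeroed, else garbage). */
static void model_fill_page(struct model_page *page, unsigned char byte)
{
	assert(page != NULL);
	memset(page, byte, sizeof(*page));
}

/* Decode the node id from the flags, trusting whatever is stored there. */
static unsigned int model_page_to_nid(const struct model_page *page)
{
	return (unsigned int)((page->flags >> MODEL_NODES_SHIFT) &
			      MODEL_NODES_MASK);
}

/* A decoded node id is only safe to use as an index if the node exists. */
static bool model_nid_usable(unsigned int nid)
{
	return nid < MODEL_NR_ONLINE_NODES;
}
```

A zeroed entry decodes to node 0, which exists in this model. An entry
filled with 0xff decodes to node 63, so any per-node array indexed with it
would be accessed out of bounds. That is the kind of damage meant above.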
But here, it is quite hard to tell what could happen, so I guess, if in
doubt, better add cc:stable?
>
> Should any of the patches in "mm/memory_hotplug: Shrink zones before
> removing memory", v6 be cc:stable?
>
I'll go over all patches and reply to the relevant ones.
So for this patch, please add:
Cc: stable@vger.kernel.org # v4.13+
--
Thanks,
David / dhildenb