From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>, stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
mhocko@suse.com, vbabka@suse.com,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH STABLE 4.9] mm: Avoid calling build_all_zonelists_init under hotplug context
Date: Tue, 18 Aug 2020 14:26:02 +0200
Message-ID: <4c604399-8a92-16e4-8fea-682bb0abb1dc@redhat.com>
In-Reply-To: <20200818110046.6664-1-osalvador@suse.de>

On 18.08.20 13:00, Oscar Salvador wrote:
> Recently a customer of ours experienced a crash when booting the
> system while enabling memory-hotplug.
>
> The problem is that Normal zones on different nodes don't get their private
> zone->pageset allocated, and keep sharing the initial boot_pageset.
> The sharing between zones is normally safe as explained by the comment for
> boot_pageset - it's a percpu structure, and manipulations are done with
> disabled interrupts, and boot_pageset is set up in a way that any page placed
> on its pcplist is immediately flushed to shared zone's freelist, because
> pcp->high == 1.
> However, the hotplug operation updates pcp->high to a higher value as it
> expects to be operating on a private pageset.
>
> The problem is in build_all_zonelists(), which is called when the first range
> of pages is onlined for the Normal zone of node X or Y:
>
> 	if (system_state == SYSTEM_BOOTING) {
> 		build_all_zonelists_init();
> 	} else {
> #ifdef CONFIG_MEMORY_HOTPLUG
> 		if (zone)
> 			setup_zone_pageset(zone);
> #endif
> 		/* we have to stop all cpus to guarantee there is no user
> 		   of zonelist */
> 		stop_machine(__build_all_zonelists, pgdat, NULL);
> 		/* cpuset refresh routine should be here */
> 	}
>
> When called during hotplug, it should execute the setup_zone_pageset(zone)
> which allocates the private pageset.
> However, with memhp_default_state=online, this happens early, while
> system_state == SYSTEM_BOOTING is still true, hence this step is skipped
> (and build_all_zonelists_init() is probably unsafe anyway at this point).
>
> Another hotplug operation on the same zone then leads to zone_pcp_update(zone)
> called from online_pages(), which updates the pcp->high for the shared
> boot_pageset to a value higher than 1.
> At that point, pages freed from Node X and Y Normal zones can end up on the same
> pcplist, and from there they can be freed to the wrong zone's freelist,
> leading to corruption and crashes.
>
> Please note that upstream has fixed this differently (and unintentionally) by
> adding another boot state (SYSTEM_SCHEDULING), which is set before smp_init().
> That should happen before memory hotplug events even with memhp_default_state=online.
> Backporting that would be too intrusive.
>
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Debugged-by: Vlastimil Babka <vbabka@suse.cz>
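
To make the failure mode described above concrete, here is a minimal
user-space sketch of one pcplist shared by two zones. It is a toy model, not
kernel code: toy_zone, toy_page, toy_pcp, toy_flush(), toy_free_page() and
toy_run() are all hypothetical names, and the flush is simplified to hand
every queued page to the zone of the page whose free triggered it. That
simplification is an assumption made for illustration, not a copy of the 4.9
free path.

```c
#include <assert.h>

#define TOY_PCP_MAX 8

/* Hypothetical stand-ins for a zone, a page and a per-cpu pageset. */
struct toy_zone { const char *name; int nr_free; };
struct toy_page { struct toy_zone *zone; };
struct toy_pcp {
	int high;                          /* flush threshold, modelled on pcp->high */
	int count;                         /* pages currently queued */
	struct toy_page *list[TOY_PCP_MAX];
};

/*
 * Assumption: the flush returns every queued page to the zone passed in,
 * without checking which zone each page belongs to. Returns the number of
 * pages that landed on a foreign zone's free count.
 */
static int toy_flush(struct toy_pcp *pcp, struct toy_zone *zone)
{
	int misplaced = 0;

	while (pcp->count > 0) {
		struct toy_page *page = pcp->list[--pcp->count];

		if (page->zone != zone)
			misplaced++;
		zone->nr_free++;
	}
	return misplaced;
}

/* Queue a freed page; flush to the freeing page's zone once 'high' is reached. */
static int toy_free_page(struct toy_pcp *pcp, struct toy_page *page)
{
	pcp->list[pcp->count++] = page;
	if (pcp->count >= pcp->high)
		return toy_flush(pcp, page->zone);
	return 0;
}

/* Free pages alternating between two zones through one shared pageset. */
int toy_run(int high)
{
	struct toy_zone x = { "node X Normal", 0 };
	struct toy_zone y = { "node Y Normal", 0 };
	struct toy_page pages[4] = { { &x }, { &y }, { &x }, { &y } };
	struct toy_pcp shared = { .high = high };
	int misplaced = 0;
	int i;

	assert(high >= 1 && high <= TOY_PCP_MAX);
	for (i = 0; i < 4; i++)
		misplaced += toy_free_page(&shared, &pages[i]);
	return misplaced;
}
```

In this model toy_run(1) reports no misplaced pages, since each page is
flushed by its own free and so goes back to its own zone. toy_run(2) reports
two, because a page from node X is still queued when a free for node Y
triggers the flush.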
So, we have ACPI running and already adding DIMMs while booting? Crazy.

Looks sane to me. Thanks!
--
Thanks,
David / dhildenb