From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
mhocko@suse.com, linux-mm@kvack.org, dan.j.williams@intel.com,
shile.zhang@linux.alibaba.com, daniel.m.jordan@oracle.com,
ktkhai@virtuozzo.com, david@redhat.com, jmorris@namei.org,
sashal@kernel.org
Subject: Re: [PATCH] mm: initialize deferred pages with interrupts enabled
Date: Wed, 1 Apr 2020 16:08:55 -0400
Message-ID: <20200401200855.d23xcwznr5cm67p2@ca-dmjordan1.us.oracle.com>
In-Reply-To: <20200401200027.vsm5roobllewniea@ca-dmjordan1.us.oracle.com>

On Wed, Apr 01, 2020 at 04:00:27PM -0400, Daniel Jordan wrote:
> On Wed, Apr 01, 2020 at 03:32:38PM -0400, Pavel Tatashin wrote:
> > Initializing struct pages is a long task and keeping interrupts disabled
> > for the duration of this operation introduces a number of problems.
> >
> > 1. jiffies are not updated for long period of time, and thus incorrect time
> > is reported. See proposed solution and discussion here:
> > lkml/20200311123848.118638-1-shile.zhang@linux.alibaba.com
> > 2. It prevents farther improving deferred page initialization by allowing
>
> not allowing
> > inter-node multi-threading.
>
> intra-node
>
> ...
> > After:
> > [ 1.632580] node 0 initialised, 12051227 pages in 436ms
>
> Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
> Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
>
> > Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
>
> Freezing jiffies for a while during boot sounds like stable to me, so
>
> Cc: <stable@vger.kernel.org> [4.17.x+]
>
>
> Can you please add a comment to mmzone.h above node_size_lock, something like
>
> * Must be held any time you expect node_start_pfn,
> * node_present_pages, node_spanned_pages or nr_zones to stay constant.
> + * Also synchronizes pgdat->first_deferred_pfn during deferred page
> + * init.
> ...
> spinlock_t node_size_lock;
>
> > @@ -1854,18 +1859,6 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
> > return false;
> >
> > pgdat_resize_lock(pgdat, &flags);
> > -
> > - /*
> > - * If deferred pages have been initialized while we were waiting for
> > - * the lock, return true, as the zone was grown. The caller will retry
> > - * this zone. We won't return to this function since the caller also
> > - * has this static branch.
> > - */
> > - if (!static_branch_unlikely(&deferred_pages)) {
> > - pgdat_resize_unlock(pgdat, &flags);
> > - return true;
> > - }
> > -
>
> Huh, looks like this wasn't needed even before this change.
>
>
> The rest looks fine.
>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>

...except that I forgot about the touch_nmi_watchdog() calls.  I think you'd
need something like this before your patch.

---8<---
From: Daniel Jordan <daniel.m.jordan@oracle.com>
Date: Fri, 27 Mar 2020 17:29:05 -0400
Subject: [PATCH] mm: call touch_nmi_watchdog() on max order boundaries in
 deferred init

deferred_init_memmap() disables interrupts the entire time, so it calls
touch_nmi_watchdog() periodically to avoid soft lockup splats.  Soon it
will run with interrupts enabled, at which point cond_resched() should
be used instead.

deferred_grow_zone() makes the same watchdog calls through code shared
with deferred init but will continue to run with interrupts disabled, so
it can't call cond_resched().

Pull the watchdog calls up to these two places to allow the first to be
changed later, independently of the second.  The frequency reduces from
twice per pageblock (init and free) to once per max order block.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
---
 mm/page_alloc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 212734c4f8b0..4cf18c534233 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1639,7 +1639,6 @@ static void __init deferred_free_pages(unsigned long pfn,
 		} else if (!(pfn & nr_pgmask)) {
 			deferred_free_range(pfn - nr_free, nr_free);
 			nr_free = 1;
-			touch_nmi_watchdog();
 		} else {
 			nr_free++;
 		}
@@ -1669,7 +1668,6 @@ static unsigned long __init deferred_init_pages(struct zone *zone,
 			continue;
 		} else if (!page || !(pfn & nr_pgmask)) {
 			page = pfn_to_page(pfn);
-			touch_nmi_watchdog();
 		} else {
 			page++;
 		}
@@ -1813,8 +1811,10 @@ static int __init deferred_init_memmap(void *data)
 	 * that we can avoid introducing any issues with the buddy
 	 * allocator.
 	 */
-	while (spfn < epfn)
+	while (spfn < epfn) {
 		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+		touch_nmi_watchdog();
+	}
 zone_empty:
 	pgdat_resize_unlock(pgdat, &flags);
 
@@ -1908,6 +1908,7 @@ deferred_grow_zone_locked(pg_data_t *pgdat, struct zone *zone,
 		first_deferred_pfn = spfn;
 
 		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+		touch_nmi_watchdog();
 
 		/* We should only stop along section boundaries */
 		if ((first_deferred_pfn ^ spfn) < PAGES_PER_SECTION)
--
2.25.0