On 04/15/2015 09:38 AM, Mel Gorman wrote:
>> However, there were 2 bootup problems in the dmesg log that needed
>> to be addressed.
>> 1. There were 2 vmalloc allocation failures:
>> [ 2.284686] vmalloc: allocation failure, allocated 16578404352 of
>> 17179873280 bytes
>> [ 10.399938] vmalloc: allocation failure, allocated 7970922496 of
>> 8589938688 bytes
>>
>> 2. There were 2 soft lockup warnings:
>> [ 57.319453] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s!
>> [swapper/0:1]
>> [ 85.409263] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
>> [swapper/0:1]
>>
>> Once those problems are fixed, the patch should be in a pretty good
>> shape. I have attached the dmesg log for your reference.
>>
> The obvious conclusion is that initialising 1G per node is not enough for
> really large machines. Can you try this on top? It's untested but should
> work. The low value was chosen because it happened to work and I wanted
> to get test coverage on common hardware but broke is broke.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f2c96d02662f..6b3bec304e35 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -276,9 +276,9 @@ static inline bool update_defer_init(pg_data_t *pgdat,
>       if (pgdat->first_deferred_pfn != ULONG_MAX)
>               return false;
>
> -     /* Initialise at least 1G per zone */
> +     /* Initialise at least 32G per node */
>       (*nr_initialised)++;
> -     if (*nr_initialised > (1UL << (30 - PAGE_SHIFT)) &&
> +     if (*nr_initialised > (32UL << (30 - PAGE_SHIFT)) &&
>           (pfn & (PAGES_PER_SECTION - 1)) == 0) {
>               pgdat->first_deferred_pfn = pfn;
>               return false;
>

I applied the patch and the boot time was 299s instead of 298s, so
practically the same. The two issues that I reported previously are
both gone. Attached is the new dmesg log for your reference.

Cheers,
Longman
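
For reference, the threshold arithmetic in the quoted patch can be checked
with a small user-space sketch. It is only an illustration, not kernel code:
the helper name defer_threshold_pages() is hypothetical, and PAGE_SHIFT = 12
(4 KiB pages) is an assumption that the thread does not state.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 4 KiB pages (PAGE_SHIFT = 12); not stated in the thread. */
#define PAGE_SHIFT 12

/*
 * Hypothetical helper: number of pages corresponding to `gib` GiB,
 * mirroring the expression (N UL << (30 - PAGE_SHIFT)) in the patch.
 */
static unsigned long defer_threshold_pages(unsigned long gib)
{
    /* Guard against shifting bits off the top of unsigned long. */
    assert(gib <= (ULONG_MAX >> (30 - PAGE_SHIFT)));
    return gib << (30 - PAGE_SHIFT);
}
```

Under that assumption, the old 1G threshold is 262144 pages and the new 32G
threshold is 8388608 pages initialised before the rest is deferred.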