linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: Re: getting oom/stalls for ltp test cpuset01 with latest/4.9 kernel
Date: Mon, 16 Jan 2017 14:22:52 +0100
Message-ID: <a374d6b6-c299-b50d-d7e0-f85ac78525aa@suse.cz>
In-Reply-To: <CAFpQJXUq_O=UAhCb7fwq2txYxg_owO77rRdQFUjR0_Mj9p=3pA@mail.gmail.com>

On 01/16/2017 11:41 AM, Ganapatrao Kulkarni wrote:
> On Fri, Jan 13, 2017 at 2:36 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 01/13/2017 05:35 AM, Ganapatrao Kulkarni wrote:
>>> On Thu, Jan 12, 2017 at 4:40 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>> On 01/11/2017 05:46 PM, Michal Hocko wrote:
>>>>>
>>>>> On Wed 11-01-17 21:52:29, Ganapatrao Kulkarni wrote:
>>>>>
>>>>>> [ 2398.169391] Node 1 Normal: 951*4kB (UME) 1308*8kB (UME) 1034*16kB
>>>>>> (UME) 742*32kB (UME) 581*64kB (UME) 450*128kB (UME) 362*256kB (UME)
>>>>>> 275*512kB (ME) 189*1024kB (UM) 117*2048kB (ME) 2742*4096kB (M) = 12047196kB
>>>>>
>>>>>
>>>>> Most of the memblocks are marked Unmovable (except for the 4MB blocks)
>>>>
>>>>
>>>> No, UME here means that e.g. 4kB blocks are available on unmovable, movable
>>>> and reclaimable lists.
>>>>
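
As a sanity check on how to read such a line: each entry is "count*size
(types)", where the letters only say which free lists hold blocks of that
order, and the trailing number is the sum over all orders. Below is a minimal
userspace sketch that redoes that sum for the Node 1 Normal line quoted above.
The struct and function names are made up for illustration and are not kernel
code.

```c
#include <stddef.h>

/* One "count*sizekB (types)" entry from the free-list summary line. */
struct free_entry {
    unsigned long count;    /* number of free blocks of this order */
    unsigned long size_kb;  /* block size in kB */
    const char *types;      /* free-list letters as printed, e.g. "UME" */
};

/* Values copied from the "Node 1 Normal" line quoted in this thread. */
static const struct free_entry node1_normal[] = {
    {  951,    4, "UME" },
    { 1308,    8, "UME" },
    { 1034,   16, "UME" },
    {  742,   32, "UME" },
    {  581,   64, "UME" },
    {  450,  128, "UME" },
    {  362,  256, "UME" },
    {  275,  512, "ME"  },
    {  189, 1024, "UM"  },
    {  117, 2048, "ME"  },
    { 2742, 4096, "M"   },
};

/* Sum count * size over all orders; the result is in kB. */
static unsigned long total_free_kb(const struct free_entry *e, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += e[i].count * e[i].size_kb;
    return total;
}
```

Calling total_free_kb() on node1_normal gives 12047196, the same total the
kernel printed, and 2742 of those blocks (11231232 kB) are order-10 blocks
listed only under M.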
>>>>> which shouldn't matter because we can fallback to unmovable blocks for
>>>>> movable allocation AFAIR so we shouldn't really fail the request. I
>>>>> really fail to see what is going on there but it smells really
>>>>> suspicious.
>>>>
>>>>
>>>> Perhaps there's something wrong with zonelists and we are skipping the Node
>>>> 1 Normal zone. Or there's some race with cpuset operations (but can't see
>>>> how).
>>>>
>>>> The question is, how reproducible is this? And what exactly does the test
>>>> cpuset01 do? Is it doing multiple things in a loop that could be reduced
>>>> to a single testcase?
>>>
>>> IIUC, the parent process changes the nodes in cpuset.mems in a loop,
>>> while the child processes (one per CPU) keep allocating and freeing
>>> 10 pages until the execution time is over.
>>> more details at
>>> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/cpuset/cpuset01.c
>>
>> Ah, thanks for explaining. Looks like there might be a race where determining
>> ac.preferred_zone using current_mems_allowed as ac.nodemask skips the only zone
>> that is allowed after the cpuset.mems update, and we only recalculate
>> ac.preferred_zone for allocations that are allowed to escape cpusets/watermarks.
>> Thus we see only part of the zonelist, missing the only allowed zone. This would
>> be due to commit 682a3385e773 ("mm, page_alloc: inline the fast path of the
>> zonelist iterator") and/or some others from that series.
>>
>> Could you try with the following patch please? It also tries to protect from
>> a race with the removal of the last non-root cpuset, which could cause
>> cpusets_enabled() to become false in the middle of the function.
>>
>> ----8<----
>> From 9f041839401681f2678edf5040c851d11963c5fe Mon Sep 17 00:00:00 2001
>> From: Vlastimil Babka <vbabka@suse.cz>
>> Date: Fri, 13 Jan 2017 10:01:26 +0100
>> Subject: [PATCH] mm, page_alloc: fix race with cpuset update or removal
>>
>> Changelog and S-O-B TBD.
>> ---
>>  mm/page_alloc.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6de9440e3ae2..c397f146843a 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3775,9 +3775,17 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>>         /*
>>          * Restore the original nodemask if it was potentially replaced with
>>          * &cpuset_current_mems_allowed to optimize the fast-path attempt.
>> +        * Also recalculate the starting point for the zonelist iterator or
>> +        * we could end up iterating over non-eligible zones endlessly.
>>          */
>> -       if (cpusets_enabled())
>> +       if (unlikely(ac.nodemask != nodemask)) {
>>                 ac.nodemask = nodemask;
>> +               ac.preferred_zoneref = first_zones_zonelist(ac.zonelist,
>> +                                               ac.high_zoneidx, ac.nodemask);
>> +               if (!ac.preferred_zoneref)
>> +                       goto no_zone;
>> +       }
>> +
>>         page = __alloc_pages_slowpath(alloc_mask, order, &ac);
>>
>>  no_zone:
>> --
>> 2.11.0
>>
> 
> this patch did not fix the issue.
> issue still exists!

Hmm, that's unfortunate.

> I did a bisect: this test passes in 4.4, 4.5 and 4.6,
> and has been failing since 4.7-rc1.

4.7 would match the commit I was trying to fix. But I don't see other
problems now. Could you bisect to a single commit then, to be sure? Thanks.
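
For reference, here is a small userspace model of the race I suspected above.
It is only a sketch under simplifying assumptions: a two-node zonelist, a
plain bitmask standing in for both the nodemask and the cpuset check, and
made-up helper names (next_allowed(), scan_from()) that do not exist in the
kernel. It shows how a starting point computed under the old mems_allowed can
hide the only zone allowed after the update. It does not claim this is what
actually happens in the failing test.

```c
#include <stdbool.h>

/* Toy stand-in for struct zoneref: only the node id matters here. */
struct zoneref {
    int node;   /* -1 terminates the zonelist */
};

/* Fallback order as seen by a task whose local node is 0. */
static const struct zoneref zonelist[] = { { 0 }, { 1 }, { -1 } };

#define MEMS_NODE0 (1u << 0)
#define MEMS_NODE1 (1u << 1)

static bool node_allowed(unsigned int mask, int node)
{
    return ((mask >> node) & 1u) != 0;
}

/*
 * Rough model of first_zones_zonelist(): return the first entry at or
 * after z whose node is in mask, or the terminator if there is none.
 */
static const struct zoneref *next_allowed(const struct zoneref *z,
                                          unsigned int mask)
{
    while (z->node >= 0 && !node_allowed(mask, z->node))
        z++;
    return z;
}

/*
 * Model of an allocation attempt that starts scanning at "start" and may
 * only use nodes in mask. Returns the chosen node, or -1 if none is found.
 */
static int scan_from(const struct zoneref *start, unsigned int mask)
{
    return next_allowed(start, mask)->node;
}
```

With mems_allowed = {1}, next_allowed(zonelist, MEMS_NODE1) points at the
node 1 entry. If cpuset.mems then changes to {0} and the scan still starts
from that stale entry, scan_from(stale, MEMS_NODE0) returns -1, although
scan_from(zonelist, MEMS_NODE0) finds node 0.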

> thanks
> Ganapat

