linux-mm.kvack.org archive mirror
From: xunlei <xlpang@linux.alibaba.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: memcg: Fix memcg reclaim soft lockup
Date: Wed, 26 Aug 2020 18:41:18 +0800
Message-ID: <99efed0e-050a-e313-46ab-8fe6228839d5@linux.alibaba.com>
In-Reply-To: <20200826081102.GM22869@dhcp22.suse.cz>

On 2020/8/26 4:11 PM, Michal Hocko wrote:
> On Wed 26-08-20 15:27:02, Xunlei Pang wrote:
>> We've met softlockup with "CONFIG_PREEMPT_NONE=y", when
>> the target memcg doesn't have any reclaimable memory.
> 
> Do you have any scenario when this happens or is this some sort of a
> test case?

It can happen in practice on tiny guests, such as our 1-vCPU instances.

> 
>> It can be easily reproduced as below:
>>  watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
>>  CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
>>  Call Trace:
>>   shrink_lruvec+0x49f/0x640
>>   shrink_node+0x2a6/0x6f0
>>   do_try_to_free_pages+0xe9/0x3e0
>>   try_to_free_mem_cgroup_pages+0xef/0x1f0
>>   try_charge+0x2c1/0x750
>>   mem_cgroup_charge+0xd7/0x240
>>   __add_to_page_cache_locked+0x2fd/0x370
>>   add_to_page_cache_lru+0x4a/0xc0
>>   pagecache_get_page+0x10b/0x2f0
>>   filemap_fault+0x661/0xad0
>>   ext4_filemap_fault+0x2c/0x40
>>   __do_fault+0x4d/0xf9
>>   handle_mm_fault+0x1080/0x1790
>>
>> It only happens on our 1-vcpu instances, because there's no chance
>> for oom reaper to run to reclaim the to-be-killed process.
>>
>> Add cond_resched() in such cases at the beginning of shrink_lruvec()
>> to give up the cpu to others.
> 
> I do agree that we need a cond_resched but I cannot say I would like
> this patch. The primary reason is that it doesn't catch all cases when
> the memcg is not reclaimable. For example it wouldn't reschedule if the
> memcg is protected by low/min. What do you think about this instead?
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 99e1796eb833..bbdc38b58cc5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2617,6 +2617,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
>  
>  		mem_cgroup_calculate_protection(target_memcg, memcg);
>  
> +		cond_resched();
> +
>  		if (mem_cgroup_below_min(memcg)) {
>  			/*
>  			 * Hard protection.
> 
> This should catch both cases. I even have a vague recollection that
> somebody has proposed something in that direction but I cannot remember
> what has happened with that patch.
> 

It's the endless "retry" loop in try_charge() that caused the soft lockup.
I think mem_cgroup_protected() will eventually return MEMCG_PROT_NONE, so
shrink_node_memcgs() will call shrink_lruvec() in the memcg self-reclaim
case, and the low/min case is not a problem here.

But adding cond_resched() one level up, in shrink_node_memcgs(), may also
eliminate similar potential issues, so I have no objection to this approach.
I tested it and it works well. I will send v2 later.
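
To make the difference between the two placements concrete, here is a
minimal user-space sketch. It is not kernel code: every toy_* name, the
struct fields, and the enum are hypothetical and exist only for this
illustration. It counts how many times a walk over the memcgs would reach
a rescheduling point when the point sits at the start of the per-lruvec
shrink (as in v1) versus at the top of the per-memcg loop (as suggested
above).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; only the field this sketch needs. */
struct toy_memcg {
	bool below_min;	/* models mem_cgroup_below_min() returning true */
};

/* What one walk over the memcgs did. */
struct walk_stats {
	size_t resched_points;	/* times a cond_resched()-like point was reached */
	size_t shrunk;		/* times the per-lruvec shrink was entered */
};

/* Where the rescheduling point is placed (assumption of this sketch). */
enum resched_site {
	RESCHED_IN_SHRINK_LRUVEC,	/* v1: at the start of the lruvec shrink */
	RESCHED_IN_MEMCG_LOOP		/* suggested: once per memcg in the walk */
};

/* Models shrink_lruvec(): only reached for memcgs that are not protected. */
static void toy_shrink_lruvec(struct walk_stats *st, enum resched_site site)
{
	if (site == RESCHED_IN_SHRINK_LRUVEC)
		st->resched_points++;
	st->shrunk++;
}

/* Models the loop in shrink_node_memcgs(): protected memcgs are skipped. */
struct walk_stats toy_shrink_node_memcgs(const struct toy_memcg *memcgs,
					 size_t n, enum resched_site site)
{
	struct walk_stats st = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		/* Loop-level point: reached for every memcg, protected or not. */
		if (site == RESCHED_IN_MEMCG_LOOP)
			st.resched_points++;

		if (memcgs[i].below_min)
			continue;	/* hard protection: never reaches the shrink */

		toy_shrink_lruvec(&st, site);
	}
	return st;
}
```

With three memcgs that are all below min, the v1 placement reaches no
rescheduling point at all, while the loop-level placement reaches one per
memcg. When no memcg is protected, both placements reach the same number
of points, which matches the self-reclaim case described above.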

