linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	David Rientjes <rientjes@google.com>,
	Hillf Danton <hillf.zj@alibaba-inc.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Joonsoo Kim <js1304@gmail.com>, Mel Gorman <mgorman@suse.de>,
	Vladimir Davydov <vdavydov@virtuozzo.com>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] mm,page_alloc: PF_WQ_WORKER should always sleep at should_reclaim_retry().
Date: Mon, 9 Jul 2018 09:57:31 +0200
Message-ID: <20180709075731.GB22049@dhcp22.suse.cz>
In-Reply-To: <1531046158-4010-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>

On Sun 08-07-18 19:35:58, Tetsuo Handa wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> should_reclaim_retry() should be a natural reschedule point. PF_WQ_WORKER
> is a special case which needs a stronger rescheduling policy. However,
> since schedule_timeout_uninterruptible(1) for PF_WQ_WORKER depends on
> __zone_watermark_ok() == true, PF_WQ_WORKER is currently counting on
> mutex_trylock(&oom_lock) == 0 in __alloc_pages_may_oom() which is a bad
> expectation.

I think the reference to oom_lock is more confusing than helpful.
I would simply reuse the following from your previous changelog [1]:
: should_reclaim_retry() should be a natural reschedule point. PF_WQ_WORKER
: is a special case which needs a stronger rescheduling policy. Doing that
: unconditionally seems more straightforward than depending on a zone being
: a good candidate for a further reclaim.
: 
: Thus, move the short sleep when we are waiting for the owner of oom_lock
: (which coincidentally also serves as a guaranteed sleep for PF_WQ_WORKER
: threads) to should_reclaim_retry().

> unconditionally seems more straightforward than depending on a zone being
> a good candidate for a further reclaim.

[1] http://lkml.kernel.org/r/1528369223-7571-2-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp

[Tetsuo: changelog]
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Joonsoo Kim <js1304@gmail.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>

Your Signed-off-by is still missing.

> ---
>  mm/page_alloc.c | 34 ++++++++++++++++++----------------
>  1 file changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1521100..f56cc09 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3922,6 +3922,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  {
>  	struct zone *zone;
>  	struct zoneref *z;
> +	bool ret = false;
>  
>  	/*
>  	 * Costly allocations might have made a progress but this doesn't mean
> @@ -3985,25 +3986,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  				}
>  			}
>  
> -			/*
> -			 * Memory allocation/reclaim might be called from a WQ
> -			 * context and the current implementation of the WQ
> -			 * concurrency control doesn't recognize that
> -			 * a particular WQ is congested if the worker thread is
> -			 * looping without ever sleeping. Therefore we have to
> -			 * do a short sleep here rather than calling
> -			 * cond_resched().
> -			 */
> -			if (current->flags & PF_WQ_WORKER)
> -				schedule_timeout_uninterruptible(1);
> -			else
> -				cond_resched();
> -
> -			return true;
> +			ret = true;
> +			goto out;
>  		}
>  	}
>  
> -	return false;
> +out:
> +	/*
> +	 * Memory allocation/reclaim might be called from a WQ
> +	 * context and the current implementation of the WQ
> +	 * concurrency control doesn't recognize that
> +	 * a particular WQ is congested if the worker thread is
> +	 * looping without ever sleeping. Therefore we have to
> +	 * do a short sleep here rather than calling
> +	 * cond_resched().
> +	 */
> +	if (current->flags & PF_WQ_WORKER)
> +		schedule_timeout_uninterruptible(1);
> +	else
> +		cond_resched();
> +	return ret;
>  }
>  
>  static inline bool
> -- 
> 1.8.3.1
> 
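
The control-flow change in the hunk above can be modelled in plain
userspace C. This is only a sketch under stated assumptions, not the
kernel code: struct task_model, reschedule_point(), retry_before() and
retry_after() are hypothetical names invented for illustration, the
zone walk is reduced to a single zone_ok flag, and the sleep and
cond_resched() calls are replaced by counters.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct task_model {
	bool is_wq_worker;	/* models current->flags & PF_WQ_WORKER */
	int sleeps;		/* models schedule_timeout_uninterruptible(1) calls */
	int yields;		/* models cond_resched() calls */
};

/* The reschedule point: workqueue workers sleep, everyone else yields. */
static void reschedule_point(struct task_model *t)
{
	if (t->is_wq_worker)
		t->sleeps++;
	else
		t->yields++;
}

/* Before the patch: the reschedule point is reached only on the retry path. */
static bool retry_before(struct task_model *t, bool zone_ok)
{
	if (zone_ok) {
		reschedule_point(t);
		return true;
	}
	return false;
}

/* After the patch: a single exit label makes the reschedule point unconditional. */
static bool retry_after(struct task_model *t, bool zone_ok)
{
	bool ret = false;

	if (zone_ok) {
		ret = true;
		goto out;
	}
out:
	reschedule_point(t);
	return ret;
}
```

In this model a workqueue worker calling retry_before() with
zone_ok == false returns without sleeping, while retry_after() records
a sleep on both the true and the false path.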

-- 
Michal Hocko
SUSE Labs

Thread overview: 5+ messages
2018-07-08 10:35 Tetsuo Handa
2018-07-09  7:57 ` Michal Hocko [this message]
2018-07-09 13:08   ` Tetsuo Handa
2018-07-09 13:13     ` Michal Hocko
2018-07-09 13:58     ` Matthew Wilcox
