Re: [PATCH RFC 1/3] mm/hugetlb: split alloc_fresh_huge_page_node into fast and slow path

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Michal Hocko <mhocko@kernel.org>
To: Jia He <hejianet@gmail.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>,
	zhong jiang <zhongjiang@huawei.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Vaishali Thakkar <vaishali.thakkar@oracle.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>,
	Minchan Kim <minchan@kernel.org>, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH RFC 1/3] mm/hugetlb: split alloc_fresh_huge_page_node into fast and slow path
Date: Tue, 24 Jan 2017 17:52:00 +0100	[thread overview]
Message-ID: <20170124165200.GB30832@dhcp22.suse.cz> (raw)
In-Reply-To: <1485244144-13487-2-git-send-email-hejianet@gmail.com>

On Tue 24-01-17 15:49:02, Jia He wrote:
> This patch split alloc_fresh_huge_page_node into 2 parts:
> - fast path without __GFP_REPEAT flag
> - slow path with __GFP_REPEAT flag
> 
> Thus, if there is a server with uneven numa memory layout:
> available: 7 nodes (0-6)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 6603 MB
> node 0 free: 91 MB
> node 1 cpus:
> node 1 size: 12527 MB
> node 1 free: 157 MB
> node 2 cpus:
> node 2 size: 15087 MB
> node 2 free: 189 MB
> node 3 cpus:
> node 3 size: 16111 MB
> node 3 free: 205 MB
> node 4 cpus: 8 9 10 11 12 13 14 15
> node 4 size: 24815 MB
> node 4 free: 310 MB
> node 5 cpus:
> node 5 size: 4095 MB
> node 5 free: 61 MB
> node 6 cpus:
> node 6 size: 22750 MB
> node 6 free: 283 MB
> node distances:
> node   0   1   2   3   4   5   6
>   0:  10  20  40  40  40  40  40
>   1:  20  10  40  40  40  40  40
>   2:  40  40  10  20  40  40  40
>   3:  40  40  20  10  40  40  40
>   4:  40  40  40  40  10  20  40
>   5:  40  40  40  40  20  10  40
>   6:  40  40  40  40  40  40  10
> 
> In this case node 5 has less memory and we will alloc the hugepages
> from these nodes one by one.
> After this patch, we will not trigger too early direct memory/kswap
> reclaim for node 5 if there are enough memory in other nodes.

This description is doesn't explain what is the problem, why it matters
and how the fix actually works. Moreover it does opposite what is
claims. Which brings me to another question. How has this been tested? 

> Signed-off-by: Jia He <hejianet@gmail.com>
> ---
>  mm/hugetlb.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c7025c1..f2415ce 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1364,10 +1364,19 @@ static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
>  {
>  	struct page *page;
>  
> +	/* fast path without __GFP_REPEAT */
>  	page = __alloc_pages_node(nid,
>  		htlb_alloc_mask(h)|__GFP_COMP|__GFP_THISNODE|
>  						__GFP_REPEAT|__GFP_NOWARN,
>  		huge_page_order(h));

this does opposite what the comment says.

> +
> +	/* slow path with __GFP_REPEAT*/
> +	if (!page)
> +		page = __alloc_pages_node(nid,
> +			htlb_alloc_mask(h)|__GFP_COMP|__GFP_THISNODE|
> +					__GFP_NOWARN,
> +			huge_page_order(h));
> +
>  	if (page) {
>  		prep_new_huge_page(h, page, nid);
>  	}
> -- 
> 2.5.5
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2017-01-24 16:52 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-24  7:49 [PATCH RFC 0/3] optimize kswapd when it does reclaim for hugepage Jia He
2017-01-24  7:49 ` [PATCH RFC 1/3] mm/hugetlb: split alloc_fresh_huge_page_node into fast and slow path Jia He
2017-01-24 16:52   ` Michal Hocko [this message]
2017-01-24  7:49 ` [PATCH RFC 2/3] mm, vmscan: limit kswapd loop if no progress is made Jia He
2017-01-24 16:54   ` Michal Hocko
2017-01-25  3:03     ` hejianet
2017-01-25  9:34       ` Michal Hocko
2017-01-24  7:49 ` [PATCH RFC 3/3] mm, vmscan: correct prepare_kswapd_sleep return value Jia He
2017-01-24 22:01   ` Rik van Riel
2017-01-25  2:24     ` hejianet
2017-01-24 16:46 ` [PATCH RFC 0/3] optimize kswapd when it does reclaim for hugepage Michal Hocko
2017-01-25  2:13   ` hejianet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170124165200.GB30832@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=gerald.schaefer@de.ibm.com \
    --cc=hannes@cmpxchg.org \
    --cc=hejianet@gmail.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mike.kravetz@oracle.com \
    --cc=minchan@kernel.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=riel@redhat.com \
    --cc=vaishali.thakkar@oracle.com \
    --cc=vbabka@suse.cz \
    --cc=zhongjiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox