linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Andrea Arcangeli <aarcange@redhat.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Alex Williamson <alex.williamson@redhat.com>,
	David Rientjes <rientjes@google.com>
Subject: Re: [PATCH 0/2] fix for "pathological THP behavior"
Date: Tue, 21 Aug 2018 17:30:11 +0200	[thread overview]
Message-ID: <6120e1b6-b4d2-96cb-2555-d8fab65c23c8@suse.cz> (raw)
In-Reply-To: <20180820151905.GB13047@redhat.com>

On 8/20/18 5:19 PM, Andrea Arcangeli wrote:
> Hi Kirill,
> 
> On Mon, Aug 20, 2018 at 02:58:18PM +0300, Kirill A. Shutemov wrote:
>> I personally prefer to prioritize NUMA locality over THP
>> (__GFP_COMPACT_ONLY variant), but I don't know page-alloc/compaction good
>> enough to Ack it.
> 
> If we go in this direction it'd be nice after fixing the showstopper
> bug, if we could then proceed with an orthogonal optimization by
> checking the watermarks and if the watermarks shows there are no
> PAGE_SIZEd pages available in the local node we should remove both
> __GFP_THISNODE and __GFP_COMPACT_ONLY.
> 
> If as opposed there's still PAGE_SIZEd free memory in the local node
> (not possible to compact for whatever reason), we should stick to
> __GFP_THISNODE | __GFP_COMPACT_ONLY.

If it's "not possible to compact" then the expected outcome of this is
to fail?

> It's orthogonal because the above addition would make sense also in
> the current (buggy) code.
> 
> The main implementation issue is that the watermark checking is not
> well done in mempolicy.c but the place to clear __GFP_THISNODE and
> __GFP_COMPACT_ONLY currently is there.

You could do that without calling watermark checking explicitly, but
it's rather complicated:

1. try alocating with __GFP_THISNODE and ~GFP_DIRECT_RECLAIM
2. if that fails, try PAGE_SIZE with same flags
3. if that fails, try THP size without __GFP_THISNODE
4. PAGE_SIZE without __GFP_THISNODE

Yeah, not possible in current alloc_pages_vma() which should return the
requested order. But the advantage is that it's not prone to races
between watermark checking and actual attempt.

> The case that the local node gets completely full and has not even 4k
> pages available should be totally common, because if you keep
> allocating and you allocate more than the size of a NUMA node
> eventually you will fill the local node with THP then consume all 4k
> pages and then you get into the case where the current code is totally
> unable to allocate THP from the other nodes and it would be totally
> possible to fix with the removal of __GFP_THISNODE |
> __GFP_COMPACT_ONLY, after the PAGE_SIZE watermark check.
> 
> I'm mentioning this optimization in this context, even if it's
> orthogonal, because the alternative patch that prioritizes THP over
> NUMA locality for MADV_HUGEPAGE and defer=always would solve that too
> with a one liner and there would be no need of watermark checking and
> flipping gfp bits whatsoever. Once the local node is full, THPs keeps
> being provided as expected.

Frankly, I would rather go with this option and assume that if someone
explicitly wants THP's, they don't care about NUMA locality that much.
(Note: I hate __GFP_THISNODE, it's an endless source of issues.)
Trying to be clever about "is there still PAGE_SIZEd free memory in the
local node" is imperfect anyway. If there isn't, is it because there's
clean page cache that we can easily reclaim (so it would be worth
staying local) or is it really exhausted? Watermark check won't tell...

Vlastimil

> Thanks,
> Andrea
> 

  reply	other threads:[~2018-08-21 15:32 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-20  3:22 Andrea Arcangeli
2018-08-20  3:22 ` [PATCH 1/2] mm: thp: consolidate policy_nodemask call Andrea Arcangeli
2018-08-20  3:22 ` [PATCH 2/2] mm: thp: fix transparent_hugepage/defrag = madvise || always Andrea Arcangeli
2018-08-20  3:26   ` [PATCH 0/1] fix for "pathological THP behavior" v2 Andrea Arcangeli
2018-08-20  3:26     ` [PATCH 1/1] mm: thp: fix transparent_hugepage/defrag = madvise || always Andrea Arcangeli
2018-08-20 12:35   ` [PATCH 2/2] " Zi Yan
2018-08-20 15:32     ` Andrea Arcangeli
2018-08-21 11:50   ` Michal Hocko
2018-08-21 21:40     ` Andrea Arcangeli
2018-08-22  9:02       ` Michal Hocko
2018-08-22 11:07         ` Michal Hocko
2018-08-22 14:24           ` Andrea Arcangeli
2018-08-22 14:45             ` Michal Hocko
2018-08-22 15:24               ` Andrea Arcangeli
2018-08-23 10:50                 ` Michal Hocko
2018-08-22 15:52         ` Andrea Arcangeli
2018-08-23 10:52           ` Michal Hocko
2018-08-28  7:53             ` Michal Hocko
2018-08-28  8:18               ` Michal Hocko
2018-08-28  8:54                 ` Stefan Priebe - Profihost AG
2018-08-29 11:11                   ` Stefan Priebe - Profihost AG
     [not found]                 ` <D5F4A33C-0A37-495C-9468-D6866A862097@cs.rutgers.edu>
2018-08-29 14:28                   ` Michal Hocko
2018-08-29 14:35                     ` Michal Hocko
2018-08-29 15:22                       ` Zi Yan
2018-08-29 15:47                         ` Michal Hocko
2018-08-29 16:06                           ` Zi Yan
2018-08-29 16:25                             ` Michal Hocko
2018-08-29 19:24                               ` [PATCH] mm, thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings Michal Hocko
2018-08-29 22:54                                 ` Zi Yan
2018-08-30  7:00                                   ` Michal Hocko
2018-08-30 13:22                                     ` Zi Yan
2018-08-30 13:45                                       ` Michal Hocko
2018-08-30 14:02                                         ` Zi Yan
2018-08-30 16:19                                           ` Stefan Priebe - Profihost AG
2018-08-30 16:40                                           ` Michal Hocko
2018-09-05  3:44                                             ` Andrea Arcangeli
2018-09-05  7:08                                               ` Michal Hocko
2018-09-06 11:10                                                 ` Vlastimil Babka
2018-09-06 11:16                                                   ` Vlastimil Babka
2018-09-06 11:25                                                     ` Michal Hocko
2018-09-06 12:35                                                       ` Zi Yan
2018-09-06 10:59                                   ` Vlastimil Babka
2018-09-06 11:17                                     ` Zi Yan
2018-08-30  6:47                                 ` Michal Hocko
2018-09-06 11:18                                   ` Vlastimil Babka
2018-09-06 11:27                                     ` Michal Hocko
2018-09-12 17:29                                 ` Mel Gorman
2018-09-17  6:11                                   ` Michal Hocko
2018-09-17  7:04                                     ` Stefan Priebe - Profihost AG
2018-09-17  9:32                                       ` Stefan Priebe - Profihost AG
2018-09-17 11:27                                       ` Michal Hocko
2018-08-20 11:58 ` [PATCH 0/2] fix for "pathological THP behavior" Kirill A. Shutemov
2018-08-20 15:19   ` Andrea Arcangeli
2018-08-21 15:30     ` Vlastimil Babka [this message]
2018-08-21 17:26       ` David Rientjes
2018-08-21 22:18         ` Andrea Arcangeli
2018-08-21 22:05       ` Andrea Arcangeli
2018-08-22  9:24       ` Michal Hocko
2018-08-22 15:56         ` Andrea Arcangeli
2018-08-20 19:06   ` Yang Shi
2018-08-20 23:24     ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6120e1b6-b4d2-96cb-2555-d8fab65c23c8@suse.cz \
    --to=vbabka@suse.cz \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox