From: Michal Hocko <mhocko@suse.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, Alex Williamson <alex.williamson@redhat.com>,
David Rientjes <rientjes@google.com>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 2/2] mm: thp: fix transparent_hugepage/defrag = madvise || always
Date: Wed, 22 Aug 2018 16:45:17 +0200
Message-ID: <20180822144517.GP29735@dhcp22.suse.cz>
In-Reply-To: <20180822142446.GL13047@redhat.com>
On Wed 22-08-18 10:24:46, Andrea Arcangeli wrote:
> On Wed, Aug 22, 2018 at 01:07:37PM +0200, Michal Hocko wrote:
> > On Wed 22-08-18 11:02:14, Michal Hocko wrote:
> > > On Tue 21-08-18 17:40:49, Andrea Arcangeli wrote:
> > > > On Tue, Aug 21, 2018 at 01:50:57PM +0200, Michal Hocko wrote:
> > > [...]
> > > > > I really detest a new gfp flag for one time semantic that is muddy as
> > > > > hell.
> > > >
> > > > Well there's no way to fix this other than to prevent reclaim to run,
> > > > if you still want to give a chance to page faults to obtain THP under
> > > > MADV_HUGEPAGE in the page fault without waiting minutes or hours for
> > > > khugepaged to catch up with it.
> > >
> > > I do not get that part. Why should caller even care about reclaim vs.
> > > compaction. How can you even make an educated guess what makes more
> > > sense? This should be fully controlled by the allocator path. The caller
> > > should only care about how hard to try. It's been some time since I've
> > > looked but we used to have a gfp flag to tell that for THP allocations
> > > as well.
> >
> > In other words, why do we even try to swap out when allocating costly
> > high order page for requests which do not insist to try really hard?
>
> Note that the testcase with vfio swaps nothing and writes nothing to
> disk. No memory at all is being swapped or freed because 100% of the
> node is pinned with GUP pins, so I'm dubious this could possibly move
> the needle for the reproducer that I used for the benchmark.

Now I am confused. How can compaction help at all then? I mean, if the
node is full of GUP pins then you can hardly do anything but fall back to
another node. Or how come your new GFP flag makes any difference?

> The swap storm I suggested to you as reproducer, because it's another
> way the bug would see the light of the day and it's easier to
> reproduce without requiring device assignment, but the badness is the
> fact reclaim is called when it shouldn't be and whatever fix must
> cover vfio too. The below I can't imagine how it could possibly have
> an effect on vfio, and even for the swap storm case you're converting
> a swap storm into a CPU waste, it'll still run just extremely slow
> allocations like with vfio.

It would still try to reclaim easy targets, as compaction requires. If you
do not reclaim at all, you can make the current implementation of
compaction a noop due to its own watermark checks, IIRC.

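To illustrate the watermark point, here is a rough userspace sketch of
the kind of check compaction does before it runs. It is a simplified
model, not the kernel code: struct zone_model and
compaction_suitable_model() are hypothetical names made up for this
example, and the real check also accounts for lowmem reserves, CMA and
fragmentation. The assumption is that compaction wants order-0 free
pages above a watermark plus a gap of 2 << order, so it has room for
migration targets.

```c
#include <stdbool.h>

/* Same value as the kernel constant discussed in this thread. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical, heavily simplified stand-in for a zone (units: pages). */
struct zone_model {
	unsigned long free_pages;
	unsigned long min_wmark;
	unsigned long low_wmark;
};

/* Free-page headroom compaction wants on top of the watermark. */
static unsigned long compact_gap(unsigned int order)
{
	return 2UL << order;
}

/*
 * Returns true if compaction would be attempted for this order and
 * false if it would be skipped for lack of free pages.
 */
static bool compaction_suitable_model(const struct zone_model *z,
				      unsigned int order)
{
	unsigned long wmark = order > PAGE_ALLOC_COSTLY_ORDER ?
			      z->low_wmark : z->min_wmark;

	wmark += compact_gap(order);
	return z->free_pages > wmark;
}
```

With these made-up numbers, a zone with 1000 free pages and a low
watermark of 160 fails the check for an order-9 (THP-sized) request,
because 160 + 1024 pages are needed. Unless something frees memory
first, compaction is skipped for that request.
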
> The effect of the below should be evaluated regardless of the issue
> we've been discussing in this thread and it's a new corner case for
> order > PAGE_ALLOC_COSTLY_ORDER. I don't like very much order >
> PAGE_ALLOC_COSTLY_ORDER checks, those are arbitrary numbers, the more
> checks are needed in various places for that, the more it's a sign the
> VM is bad and arbitrary and with one more corner case required to hide
> some badness. But again this will have effects unrelated to what we're
> discussing here and it will just convert I/O into CPU waste and have
> no effect on vfio.

Yeah, I agree about PAGE_ALLOC_COSTLY_ORDER being an arbitrary limit for
different behavior. But we already handle those orders specially, so it
kind of makes sense to me to expand on that.

--
Michal Hocko
SUSE Labs