linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Daniel Micay <danielmicay@gmail.com>,
	Aliaksey Kandratsenka <alkondratenko@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Shaohua Li <shli@fb.com>,
	linux-mm@kvack.org, linux-api@vger.kernel.org,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
	Mel Gorman <mel@csn.ul.ie>, Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	Andy Lutomirski <luto@amacapital.net>,
	"google-perftools@googlegroups.com"
	<google-perftools@googlegroups.com>
Subject: Re: [PATCH] mremap: add MREMAP_NOHOLE flag --resend
Date: Wed, 25 Mar 2015 17:22:24 +0100
Message-ID: <5512E0C0.6060406@suse.cz>
In-Reply-To: <550E6D9D.1060507@gmail.com>

On 03/22/2015 08:22 AM, Daniel Micay wrote:
> BTW, THP currently interacts very poorly with the jemalloc/tcmalloc
> madvise purging. The part where khugepaged assigns huge pages to dense
> spans of pages is *great*. The part where the kernel hands out a huge
> page for a fault in a 2M span can be awful. It causes the model
> inside the allocator of uncommitted vs. committed pages to break down.
>
> For example, the allocator might use 1M of a huge page and then start
> purging. The purging will split it into 4k pages, so there will be 1M of
> zeroed 4k pages that are considered purged by the allocator. Over time,
> this can cripple purging. Search for "jemalloc huge pages" and you'll
> find lots of horror stories about this.

I'm not sure I understand your description. The problem I know about is
where "purging" means madvise(MADV_DONTNEED) and khugepaged later
collapses the range into a new hugepage, which repopulates the purged
parts and increases memory usage. One can limit this via
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none. That
setting doesn't affect THP allocations at page fault time, but those
happen only in newly accessed hugepage-sized areas, not in partially
purged ones.
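
As a rough illustration of the knob mentioned above, the following sketch
reads the current max_ptes_none value from sysfs. The helper name
read_max_ptes_none() is made up for this example, and the sketch assumes
sysfs is mounted at /sys; it returns -1 when the file is missing, for
instance on a kernel built without THP.

```c
#include <stdio.h>

/* sysfs path as quoted in the text above. */
#define MAX_PTES_NONE_PATH \
    "/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none"

/*
 * Hypothetical helper (not an existing API): return the current
 * max_ptes_none value, or -1 if the file cannot be opened or parsed.
 */
long read_max_ptes_none(void)
{
    FILE *f = fopen(MAX_PTES_NONE_PATH, "r");
    long val = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &val) != 1)
        val = -1;
    fclose(f);
    return val;
}
```

Lowering the value (which needs root and a write to the same file) makes
khugepaged less willing to collapse ranges that contain purged, unmapped
pages.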

> I think a THP implementation that played well with purging would
> need to drop the page fault heuristic and rely on a significantly better
> khugepaged.

See here http://lwn.net/Articles/636162/ (the "Compaction" part)

The objection is that some short-lived workloads like gcc have to map
hugepages immediately if they are to benefit from them. I still plan to
improve khugepaged and allow admins to say that they don't want THP page
faults (and rely solely on khugepaged, which has more information to
judge additional memory usage), but I'm not sure it would be an
acceptable default behavior.

One workaround in the current state for jemalloc and friends could be to
use madvise(MADV_NOHUGEPAGE) on hugepage-sized/aligned areas where the
allocator wants to purge parts of them via madvise(MADV_DONTNEED). That
would mean the overhead of another syscall, plus tracking where this was
applied and when it makes sense to undo it and allow THP to be collapsed
again. It would also split VMAs.
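
A minimal sketch of that workaround, under stated assumptions: 2M huge
pages (x86-64), a caller whose mapping covers the whole huge-page-aligned
range around the purged span, and a made-up helper name
purge_without_thp(). It is not what jemalloc or tcmalloc actually do. An
EINVAL from MADV_NOHUGEPAGE is treated as "kernel has no THP support" and
ignored, which is a choice made for this example.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HPAGE_SIZE (2UL << 20) /* assumption: 2M huge pages */

/*
 * Hypothetical helper: opt the huge-page-aligned range enclosing
 * [addr, addr + len) out of THP, then purge [addr, addr + len).
 * addr must be page-aligned. Returns 0 on success, -1 on error.
 */
int purge_without_thp(void *addr, size_t len)
{
    uintptr_t start = (uintptr_t)addr & ~(HPAGE_SIZE - 1);
    uintptr_t end = ((uintptr_t)addr + len + HPAGE_SIZE - 1) &
                    ~(HPAGE_SIZE - 1);

    /* The extra syscall; it also splits the VMA at the range boundaries. */
    if (madvise((void *)start, end - start, MADV_NOHUGEPAGE) != 0 &&
        errno != EINVAL)
        return -1;

    /* The purge itself: anonymous pages in the range read back as zero. */
    return madvise(addr, len, MADV_DONTNEED);
}
```

The bookkeeping cost mentioned above is not shown here: the allocator
would still have to remember which ranges were marked and decide when to
allow THP on them again, presumably with madvise(MADV_HUGEPAGE) on the
same range.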

> This would mean faulting in a span of memory would no longer
> be faster. Having a flag to populate a range with madvise would help a

If it's newly mapped memory, there's mmap(MAP_POPULATE). There is also
madvise(MADV_WILLNEED), which sounds like what you want, but I don't
know what the implementation does exactly. It was apparently added for
paging in ahead of time, and maybe it ignores unpopulated anonymous
areas, but making it prepopulate those would probably be well within the
spirit of the flag.
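
For reference, the two calls mentioned above look roughly like this from
userspace. The helper names map_populated() and hint_willneed() are
invented for the example. The sketch does not assume that MADV_WILLNEED
populates anonymous memory; it only passes the hint and returns whatever
madvise() reports.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and ask the
 * kernel to pre-fault it at mmap time. Returns NULL on failure.
 */
void *map_populated(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: pass MADV_WILLNEED for an existing range.
 * Whether this prepopulates unpopulated anonymous pages is exactly the
 * open question in the text, so callers should treat it as a hint only.
 */
int hint_willneed(void *addr, size_t len)
{
    return madvise(addr, len, MADV_WILLNEED);
}
```

MAP_POPULATE only helps for fresh mappings; for a range that was mapped
earlier and later purged, a madvise-based populate hint would be the
natural fit.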

> lot though, since the allocator knows exactly how much it's going to
> clobber with the memcpy. There will still be a threshold where mremap
> gets significantly faster, but it would move it higher.
