From: Andrea Arcangeli <aarcange@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@techsingularity.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>, Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/1] mm: thp: Redefine default THP defrag behaviour disable it by default
Date: Wed, 2 Mar 2016 19:47:32 +0100
Message-ID: <20160302184732.GC4946@redhat.com>
In-Reply-To: <20160226103253.GA22450@node.shutemov.name>

On Fri, Feb 26, 2016 at 01:32:53PM +0300, Kirill A. Shutemov wrote:
> Could you elaborate on problems with rmap? I haven't looked into this
> deeply yet.
> 
> Do you see anything that would prevent the following basic scheme:
> 
>  - Identify series of small pages as candidate for collapsing into
>    a compound page. Not sure how difficult it would be. I guess it can be
>    done by looking for adjacent pages which belong to the same anon_vma.

Yes, just as if no other process were sharing them.

>  - Setup migration entries for pte which maps these pages.
> 
>  - Collapse small pages into compound page. IIUC, it only will be possible
>    if these pages are not pinned.
> 
>  - Replace migration entries with ptes which point to subpages of the new
>    compound page.
> 
>  - Scan over all vmas mapping this compound page, looking for VMA suitable
>    for huge page. We cannot collapse it right away due to lock inversion of
>    anon_vma->rwsem vs. mmap_sem.
> 
>  - For found VMAs, collapse page table into PMD one VMA at a time under
>    down_write(mmap_sem).
> 
> Even if we fail to create any PMDs, we would reduce LRU pressure by
> collapsing small pages into compound one.

I see how your new refcounting simplifies things, as we don't have to
create huge PMDs immediately, but we still have to modify all ptes of
all sharers, not just those belonging to the vma we collapsed (or
we'd effectively be copying-on-collapse, in turn losing the
sharing).
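
To make the copy-on-collapse point concrete, here is a toy userspace
model, not kernel code. Page tables are plain dicts and every name in
it (Page, collapse, still_shared) is hypothetical. It only shows that
repointing the ptes of one sharer leaves the other sharer on the old
4k pages, so the two no longer share memory.

```python
# Toy model only: a "page table" is a dict from virtual page index to a
# physical page object. None of these names are kernel APIs.

class Page:
    def __init__(self, name):
        self.name = name


def collapse(targets, nr_pages):
    """Repoint the first nr_pages entries of each target page table at
    the subpages of one freshly allocated compound page."""
    thp = [Page(f"thp-sub{i}") for i in range(nr_pages)]
    for page_table in targets:
        for i in range(nr_pages):
            page_table[i] = thp[i]
    return thp


def still_shared(pt_a, pt_b, nr_pages):
    """True if both page tables map the same physical page at every index."""
    return all(pt_a[i] is pt_b[i] for i in range(nr_pages))


NR = 4  # tiny stand-in for the 512 ptes behind a 2 MiB THP

# Case 1: only the collapsing sharer is updated.
small = [Page(f"4k-{i}") for i in range(NR)]
parent = {i: small[i] for i in range(NR)}
child = dict(parent)  # fork-like sharing of the same physical pages
collapse([parent], NR)
shared_after_partial = still_shared(parent, child, NR)

# Case 2: the ptes of all sharers are updated.
small2 = [Page(f"4k-{i}") for i in range(NR)]
parent2 = {i: small2[i] for i in range(NR)}
child2 = dict(parent2)
collapse([parent2, child2], NR)
shared_after_full = still_shared(parent2, child2, NR)

print(shared_after_partial, shared_after_full)
```

In this model sharing survives only in the second case, where every
sharer's ptes are rewritten.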

If we deferred it and temporarily left the new THP and the old 4k
pages both allocated and independently mapped, a process running on
the old ptes could gup_fast and a process on the new ptes could
gup_fast too, and we'd end up with double memory usage. So we'd need a
way to redirect gup_fast on the old pte to the new THP, so that future
pins always go to the new THP. Some new linkage between old ptes and
new ptes would also be needed to keep walking them slowly, and it
would have to be invalidated during COWs.
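
The double-usage problem can be sketched with another toy model, again
not kernel code. Frame, gup, pinned_kib and run are hypothetical names,
and the redirect dict stands in for whatever old-pte to new-THP linkage
would be needed. It assumes 512 old 4 KiB pages and one 2 MiB THP.

```python
# Toy model only: a "pte" is represented directly by the frame it maps.

class Frame:
    def __init__(self, name, size_kib):
        self.name = name
        self.size_kib = size_kib
        self.pins = 0


def gup(frame, redirect=None):
    """Pin the frame behind a pte. If a redirect table is given, pins
    aimed at an old frame are taken on its replacement instead."""
    if redirect is not None:
        frame = redirect.get(frame, frame)
    frame.pins += 1
    return frame


def pinned_kib(frames):
    """Memory that cannot be freed because at least one pin is held."""
    return sum(f.size_kib for f in frames if f.pins > 0)


def run(with_redirect):
    old = [Frame(f"4k-{i}", 4) for i in range(512)]
    thp = Frame("thp", 2048)
    redirect = {f: thp for f in old} if with_redirect else None
    for f in old:
        gup(f, redirect)   # process still running on the old ptes
    gup(thp, redirect)     # process already running on the new ptes
    return pinned_kib(old + [thp])


print(run(with_redirect=False))  # old pages and THP both pinned
print(run(with_redirect=True))   # every pin lands on the THP
```

Without the redirect the model keeps 4096 KiB pinned, with it only the
2048 KiB of the THP, which is the doubling described above.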

Doing it incrementally, without updating all ptes at once, wouldn't be
straightforward. Doing it non-incrementally would mean paying the cost
of updating (in the worst case) up to a hundred thousand ptes at full
CPU usage for a later gain we're not sure about. That said, I think
it's a worthy goal to achieve, especially if we remove compaction from
direct reclaim.
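
As a back-of-the-envelope check on that worst case, the arithmetic can
be sketched as below. It assumes x86-64 with 4 KiB base pages and a
2 MiB THP, so 512 ptes per sharer. The sharer counts are illustrative
assumptions, not figures from this thread.

```python
PAGE_SIZE = 4096                      # 4 KiB base page (assumed x86-64)
THP_SIZE = 2 * 1024 * 1024            # 2 MiB huge page (assumed x86-64)
PTES_PER_THP = THP_SIZE // PAGE_SIZE  # ptes one sharer uses to map the range


def ptes_to_update(sharers, ptes_per_thp=PTES_PER_THP):
    """Total ptes to rewrite if every sharer maps the whole range."""
    return sharers * ptes_per_thp


def sharers_for(target_ptes, ptes_per_thp=PTES_PER_THP):
    """Smallest number of sharers whose ptes add up to target_ptes."""
    return -(-target_ptes // ptes_per_thp)  # ceiling division


print(PTES_PER_THP)
print(ptes_to_update(200))
print(sharers_for(100_000))
```

Under these assumptions about two hundred sharers of a single 2 MiB
range are enough to reach a hundred thousand ptes.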

Thanks,
Andrea
