From: Rik van Riel <riel@redhat.com>
To: Michal Hocko <mhocko@kernel.org>,
Ebru Akagunduz <ebru.akagunduz@gmail.com>
Cc: linux-mm@kvack.org, hughd@google.com, akpm@linux-foundation.org,
kirill.shutemov@linux.intel.com, n-horiguchi@ah.jp.nec.com,
aarcange@redhat.com, iamjoonsoo.kim@lge.com, gorcunov@openvz.org,
linux-kernel@vger.kernel.org, mgorman@suse.de,
rientjes@google.com, vbabka@suse.cz,
aneesh.kumar@linux.vnet.ibm.com, hannes@cmpxchg.org,
boaz@plexistor.com
Subject: Re: [PATCH v4 2/2] mm, thp: avoid unnecessary swapin in khugepaged
Date: Tue, 22 Mar 2016 15:21:16 -0400 [thread overview]
Message-ID: <1458674476.24206.5.camel@redhat.com> (raw)
In-Reply-To: <20160321153637.GE21248@dhcp22.suse.cz>
[-- Attachment #1: Type: text/plain, Size: 3434 bytes --]
On Mon, 2016-03-21 at 16:36 +0100, Michal Hocko wrote:
> On Sun 20-03-16 20:07:39, Ebru Akagunduz wrote:
> >
> > Currently khugepaged makes swapin readahead to improve
> > THP collapse rate. This patch checks vm statistics
> > to avoid workload of swapin, if unnecessary. So that
> > when system under pressure, khugepaged won't consume
> > resources to swapin.
> OK, so you want to disable the optimization when under the memory
> pressure. That sounds like a good idea in general.
>
> > @@ -2493,7 +2494,14 @@ static void collapse_huge_page(struct
> > mm_struct *mm,
> > goto out;
> > }
> >
> > - __collapse_huge_page_swapin(mm, vma, address, pmd);
> > + swap = get_mm_counter(mm, MM_SWAPENTS);
> > + curr_allocstall = sum_vm_event(ALLOCSTALL);
> > + /*
> > + * When system under pressure, don't swapin readahead.
> > + * So that avoid unnecessary resource consuming.
> > + */
> > + if (allocstall == curr_allocstall && swap != 0)
> > + __collapse_huge_page_swapin(mm, vma, address,
> > pmd);
> this criteria doesn't really make much sense to me. So we are
> checking
> whether there was the direct reclaim invoked since some point in time
> (more on that below) and we take that as a signal of a strong memory
> pressure, right? What if that was quite some time ago? What if we
> didn't
> have a single direct reclaim but the kswapd was busy the whole time.
> Or
> what if the allocstall was from a different numa node?
Do you have a measure in mind that the code should test
against, instead?
I don't think we want page cache turnover to prevent
khugepaged collapsing THPs, but if the system gets
to the point where kswapd is doing pageout IO, or
swapout IO, or kswapd cannot keep up, we should
probably slow down khugepaged.
If another NUMA node is under significant memory
pressure, we probably want the programs from that
node to be able to do some allocations from this
node, rather than have khugepaged consume the memory.
> > anon_vma_lock_write(vma->anon_vma);
> >
> > @@ -2905,6 +2913,7 @@ static int khugepaged(void *none)
> > set_user_nice(current, MAX_NICE);
> >
> > while (!kthread_should_stop()) {
> > + allocstall = sum_vm_event(ALLOCSTALL);
> > khugepaged_do_scan();
> And this sounds even buggy AFAIU. I guess you want to snapshot before
> goint to sleep no? Otherwise you are comparing allocstall diff from a
> very short time period. Or was this an intention and you really want
> to
> watch for events while khugepaged is running? If yes a comment would
> be
> due here.
You are right, the snapshot should be taken after
khugepaged_do_work().
The memory pressure needs to be measured over the
longest time possible between khugepaged runs.
> That being said, is this actually useful in the real life? Basing
> your
> decision on something as volatile as the direct reclaim would lead to
> rather volatile results. E.g. how stable are the numbers during your
> test?
>
> Wouldn't it be better to rather do an optimistic swapin and back out
> if the direct reclaim is really required. I realize this will be a
> much
> bigger change but it would make more sense I guess.
That depends on how costly swap IO is.
Having khugepaged be on the conservative side is probably
a good idea, given how many systems out there still have
hard drives today.
--
All Rights Reversed.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
next prev parent reply other threads:[~2016-03-22 19:21 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-20 18:07 [PATCH v4 0/2] mm, thp: Fix unnecessarry resource consuming in swapin Ebru Akagunduz
2016-03-20 18:07 ` [PATCH v4 1/2] mm, vmstat: calculate particular vm event Ebru Akagunduz
2016-03-20 18:07 ` [PATCH v4 2/2] mm, thp: avoid unnecessary swapin in khugepaged Ebru Akagunduz
2016-03-21 15:36 ` Michal Hocko
2016-03-22 19:21 ` Rik van Riel [this message]
2016-03-23 12:45 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1458674476.24206.5.camel@redhat.com \
--to=riel@redhat.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=boaz@plexistor.com \
--cc=ebru.akagunduz@gmail.com \
--cc=gorcunov@openvz.org \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=rientjes@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox