From: "Huang\, Ying" <ying.huang@intel.com>
To: Yang Shi <yang.shi@linux.alibaba.com>
Cc: <hannes@cmpxchg.org>, <mhocko@suse.com>,
<mgorman@techsingularity.net>,
<kirill.shutemov@linux.intel.com>, <josef@toxicpanda.com>,
<hughd@google.com>, <shakeelb@google.com>,
<akpm@linux-foundation.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [v3 PATCH 2/2] mm: vmscan: correct some vmscan counters for THP swapout
Date: Wed, 22 May 2019 09:23:59 +0800 [thread overview]
Message-ID: <87ftp7cmds.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <1558431642-52120-2-git-send-email-yang.shi@linux.alibaba.com> (Yang Shi's message of "Tue, 21 May 2019 17:40:42 +0800")
Yang Shi <yang.shi@linux.alibaba.com> writes:
> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
> swapped out"), THP can be swapped out in a whole. But, nr_reclaimed
> and some other vm counters still get inc'ed by one even though a whole
> THP (512 pages) gets swapped out.
>
> This doesn't make too much sense to memory reclaim. For example, direct
> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP
> could fulfill it. But, if nr_reclaimed is not increased correctly,
> direct reclaim may just waste time to reclaim more pages,
> SWAP_CLUSTER_MAX * 512 pages in worst case.
>
> And, it may cause pgsteal_{kswapd|direct} is greater than
> pgscan_{kswapd|direct}, like the below:
>
> pgsteal_kswapd 122933
> pgsteal_direct 26600225
> pgscan_kswapd 174153
> pgscan_direct 14678312
>
> nr_reclaimed and nr_scanned must be fixed in parallel otherwise it would
> break some page reclaim logic, e.g.
>
> vmpressure: this looks at the scanned/reclaimed ratio so it won't
> change semantics as long as scanned & reclaimed are fixed in parallel.
>
> compaction/reclaim: compaction wants a certain number of physical pages
> freed up before going back to compacting.
>
> kswapd priority raising: kswapd raises priority if we scan fewer pages
> than the reclaim target (which itself is obviously expressed in order-0
> pages). As a result, kswapd can falsely raise its aggressiveness even
> when it's making great progress.
>
> Other than nr_scanned and nr_reclaimed, some other counters, e.g.
> pgactivate, nr_skipped, nr_ref_keep and nr_unmap_fail need to be fixed
> too since they are user visible via cgroup, /proc/vmstat or trace
> points, otherwise they would be underreported.
>
> When isolating pages from LRUs, nr_taken has been accounted in base
> page, but nr_scanned and nr_skipped are still accounted in THP. It
> doesn't make too much sense too since this may cause trace point
> underreport the numbers as well.
>
> So accounting those counters in base page instead of accounting THP as
> one page.
>
> This change may result in lower steal/scan ratio in some cases since
> THP may get split during page reclaim, then a part of tail pages get
> reclaimed instead of the whole 512 pages, but nr_scanned is accounted
> by 512, particularly for direct reclaim. But, this should be not a
> significant issue.
>
> Cc: "Huang, Ying" <ying.huang@intel.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
> v3: Removed Shakeel's Reviewed-by since the patch has been changed significantly
> Switched back to use compound_order per Matthew
> Fixed more counters per Johannes
> v2: Added Shakeel's Reviewed-by
> Use hpage_nr_pages instead of compound_order per Huang Ying and William Kucharski
>
> mm/vmscan.c | 40 ++++++++++++++++++++++++++++------------
> 1 file changed, 28 insertions(+), 12 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b65bc50..1044834 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1250,7 +1250,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> case PAGEREF_ACTIVATE:
> goto activate_locked;
> case PAGEREF_KEEP:
> - stat->nr_ref_keep++;
> + stat->nr_ref_keep += (1 << compound_order(page));
> goto keep_locked;
> case PAGEREF_RECLAIM:
> case PAGEREF_RECLAIM_CLEAN:
> @@ -1294,6 +1294,17 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> goto activate_locked;
> }
>
> + /*
> + * Account all tail pages when THP is added
> + * into swap cache successfully.
> + * The head page has been accounted at the
> + * first place.
> + */
> + if (PageTransHuge(page))
> + sc->nr_scanned +=
> + ((1 << compound_order(page)) -
> + 1);
> +
The "if" here could be changed to "else if" because if add_to_swap()
fails we don't need to call PageTransHuge() here. But this isn't a big
deal.
You have analyzed the code and found that nr_dirty, nr_unqueued_dirty,
nr_congested and nr_writeback are file cache related and not impacted by
THP swap out. How about add your findings in the patch description?
Best Regards,
Huang, Ying
next prev parent reply other threads:[~2019-05-22 1:24 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-21 9:40 [v3 PATCH 1/2] mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned Yang Shi
2019-05-21 9:40 ` [v3 PATCH 2/2] mm: vmscan: correct some vmscan counters for THP swapout Yang Shi
2019-05-21 16:00 ` Johannes Weiner
2019-05-22 3:25 ` Yang Shi
2019-05-22 1:23 ` Huang, Ying [this message]
2019-05-22 3:26 ` Yang Shi
2019-05-21 15:45 ` [v3 PATCH 1/2] mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ftp7cmds.fsf@yhuang-dev.intel.com \
--to=ying.huang@intel.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=josef@toxicpanda.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@suse.com \
--cc=shakeelb@google.com \
--cc=yang.shi@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox