From: Yang Shi <yang.shi@linux.alibaba.com>
To: John Hubbard <jhubbard@nvidia.com>,
ktkhai@virtuozzo.com, hughd@google.com, aarcange@redhat.com,
akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
Date: Wed, 30 Jan 2019 09:47:19 -0800 [thread overview]
Message-ID: <7cf16cfb-3190-dfbd-ce72-92a94d9277f5@linux.alibaba.com> (raw)
In-Reply-To: <82ba1395-baab-3b95-a3f7-47e219551881@nvidia.com>
On 1/29/19 11:14 PM, John Hubbard wrote:
> On 1/29/19 12:29 PM, Yang Shi wrote:
>> ksmd need search stable tree to look for the suitable KSM page, but the
>> KSM page might be locked for a while due to i.e. KSM page rmap walk.
>> Basically it is not a big deal since commit 2c653d0ee2ae
>> ("ksm: introduce ksm_max_page_sharing per page deduplication limit"),
>> since max_page_sharing limits the number of shared KSM pages.
>>
>> But it still sounds not worth waiting for the lock, the page can be
>> skip,
>> then try to merge it in the next scan to avoid potential stall if its
>> content is still intact.
>>
>> Introduce trylock mode to get_ksm_page() to not block on page lock, like
>> what try_to_merge_one_page() does. And, define three possible
>> operations (nolock, lock and trylock) as enum type to avoid stacking up
>> bools and make the code more readable.
>>
>> Return -EBUSY if trylock fails, since NULL means not find suitable KSM
>> page, which is a valid case.
>>
>> With the default max_page_sharing setting (256), there is almost no
>> observed change comparing lock vs trylock.
>>
>> However, with ksm02 of LTP, the reduced ksmd full scan time can be
>> observed, which has set max_page_sharing to 786432. With lock version,
>> ksmd may tak 10s - 11s to run two full scans, with trylock version ksmd
>> may take 8s - 11s to run two full scans. And, the number of
>> pages_sharing and pages_to_scan keep same. Basically, this change has
>> no harm >
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Suggested-by: John Hubbard <jhubbard@nvidia.com>
>> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>> Hi folks,
>>
>> This patch was with "mm: vmscan: skip KSM page in direct reclaim if
>> priority
>> is low" in the initial submission. Then Hugh and Andrea pointed out
>> commit
>> 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page
>> deduplication
>> limit") is good enough for limiting the number of shared KSM page to
>> prevent
>> from softlock when walking ksm page rmap. This commit does solve the
>> problem.
>> So, the series was dropped by Andrew from -mm tree.
>>
>> However, I thought the second patch (this one) still sounds useful.
>> So, I did
>> some test and resubmit it. The first version was reviewed by Krill
>> Tkhai, so
>> I keep his Reviewed-by tag since there is no change to the patch
>> except the
>> commit log.
>>
>> So, would you please reconsider this patch?
>>
>> v3: Use enum to define get_ksm_page operations (nolock, lock and
>> trylock) per
>> John Hubbard
>> v2: Updated the commit log to reflect some test result and latest
>> discussion
>>
>> mm/ksm.c | 46 ++++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 36 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index 6c48ad1..5647bc1 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct
>> stable_node *stable_node)
>> free_stable_node(stable_node);
>> }
>> +enum get_ksm_page_flags {
>> + GET_KSM_PAGE_NOLOCK,
>> + GET_KSM_PAGE_LOCK,
>> + GET_KSM_PAGE_TRYLOCK
>> +};
>> +
>> /*
>> * get_ksm_page: checks if the page indicated by the stable node
>> * is still its ksm page, despite having held no reference to it.
>> @@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct
>> stable_node *stable_node)
>> * a page to put something that might look like our key in
>> page->mapping.
>> * is on its way to being freed; but it is an anomaly to bear in mind.
>> */
>> -static struct page *get_ksm_page(struct stable_node *stable_node,
>> bool lock_it)
>> +static struct page *get_ksm_page(struct stable_node *stable_node,
>> + enum get_ksm_page_flags flags)
>> {
>> struct page *page;
>> void *expected_mapping;
>> @@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct
>> stable_node *stable_node, bool lock_it)
>> goto stale;
>> }
>> - if (lock_it) {
>> + if (flags == GET_KSM_PAGE_TRYLOCK) {
>> + if (!trylock_page(page)) {
>> + put_page(page);
>> + return ERR_PTR(-EBUSY);
>> + }
>> + } else if (flags == GET_KSM_PAGE_LOCK)
>> lock_page(page);
>> +
>> + if (flags != GET_KSM_PAGE_NOLOCK) {
>> if (READ_ONCE(page->mapping) != expected_mapping) {
>> unlock_page(page);
>> put_page(page);
>> @@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct
>> rmap_item *rmap_item)
>> struct page *page;
>> stable_node = rmap_item->head;
>> - page = get_ksm_page(stable_node, true);
>> + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK);
>> if (!page)
>> goto out;
>> @@ -863,7 +877,7 @@ static int remove_stable_node(struct
>> stable_node *stable_node)
>> struct page *page;
>> int err;
>> - page = get_ksm_page(stable_node, true);
>> + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK);
>> if (!page) {
>> /*
>> * get_ksm_page did remove_node_from_stable_tree itself.
>> @@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct
>> stable_node **_stable_node_dup,
>> * stable_node parameter itself will be freed from
>> * under us if it returns NULL.
>> */
>> - _tree_page = get_ksm_page(dup, false);
>> + _tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK);
>> if (!_tree_page)
>> continue;
>> nr += 1;
>> @@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct
>> stable_node **_stable_node_dup,
>> if (!is_stable_node_chain(stable_node)) {
>> if (is_page_sharing_candidate(stable_node)) {
>> *_stable_node_dup = stable_node;
>> - return get_ksm_page(stable_node, false);
>> + return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK);
>> }
>> /*
>> * _stable_node_dup set to NULL means the stable_node
>> @@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct
>> page *page)
>> * wrprotected at all times. Any will work
>> * fine to continue the walk.
>> */
>> - tree_page = get_ksm_page(stable_node_any, false);
>> + tree_page = get_ksm_page(stable_node_any,
>> + GET_KSM_PAGE_NOLOCK);
>> }
>> VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
>> if (!tree_page) {
>> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct
>> page *page)
>> * It would be more elegant to return stable_node
>> * than kpage, but that involves more changes.
>> */
>> - tree_page = get_ksm_page(stable_node_dup, true);
>> + tree_page = get_ksm_page(stable_node_dup,
>> + GET_KSM_PAGE_TRYLOCK);
>> +
>> + if (PTR_ERR(tree_page) == -EBUSY)
>> + return ERR_PTR(-EBUSY);
>
> or just:
>
> if (PTR_ERR(tree_page) == -EBUSY)
> return tree_page;
>
> right?
Either looks fine to me. Returning errno may look more explicit? Anyway
I really don't have preference.
>
>> +
>> if (unlikely(!tree_page))
>> /*
>> * The tree may have been rebalanced,
>> @@ -1842,7 +1862,8 @@ static struct stable_node
>> *stable_tree_insert(struct page *kpage)
>> * wrprotected at all times. Any will work
>> * fine to continue the walk.
>> */
>> - tree_page = get_ksm_page(stable_node_any, false);
>> + tree_page = get_ksm_page(stable_node_any,
>> + GET_KSM_PAGE_NOLOCK);
>> }
>> VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
>> if (!tree_page) {
>> @@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page
>> *page, struct rmap_item *rmap_item)
>> /* We first start with searching the page inside the stable
>> tree */
>> kpage = stable_tree_search(page);
>> +
>> + if (PTR_ERR(kpage) == -EBUSY)
>> + return;
>> +
>> if (kpage == page && rmap_item->head == stable_node) {
>> put_page(kpage);
>> return;
>> @@ -2242,7 +2267,8 @@ static struct rmap_item
>> *scan_get_next_rmap_item(struct page **page)
>> list_for_each_entry_safe(stable_node, next,
>> &migrate_nodes, list) {
>> - page = get_ksm_page(stable_node, false);
>> + page = get_ksm_page(stable_node,
>> + GET_KSM_PAGE_NOLOCK);
>> if (page)
>> put_page(page);
>> cond_resched();
>>
>
> Hi Yang,
>
> The patch looks correct as far doing what it claims to do. I'll leave it
> to others to decide if a trylock-based approach is really what you want,
> for KSM scans. It seems reasonable from my very limited knowledge of
> KSM: there shouldn't be any cases where you really *need* to wait for
> a page lock, because the whole system is really sort of an optimization
> anyway.
Thanks!
Yang
>
>
> thanks,
next prev parent reply other threads:[~2019-01-30 17:47 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-29 20:29 Yang Shi
2019-01-30 7:14 ` John Hubbard
2019-01-30 17:47 ` Yang Shi [this message]
2019-01-30 18:14 ` John Hubbard
2019-01-30 8:13 ` Kirill Tkhai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7cf16cfb-3190-dfbd-ce72-92a94d9277f5@linux.alibaba.com \
--to=yang.shi@linux.alibaba.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=jhubbard@nvidia.com \
--cc=ktkhai@virtuozzo.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox