From: Susheel Khiani <skhiani@codeaurora.org>
To: Hugh Dickins <hughd@google.com>
Cc: akpm@linux-foundation.org, peterz@infradead.org, neilb@suse.de,
dhowells@redhat.com, paulmcquad@gmail.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [Question] ksm: rmap_item pointing to some stale vmas
Date: Mon, 22 Jun 2015 10:49:14 +0530 [thread overview]
Message-ID: <55879AD2.30507@codeaurora.org> (raw)
In-Reply-To: <55772FDD.4040302@codeaurora.org>
On 6/9/2015 11:56 PM, Susheel Khiani wrote:
> On 4/30/2015 11:37 AM, Susheel Khiani wrote:
>>> But if I've misunderstood, and you think that what you're seeing
>>> fits with the transient forking bugs I've (not quite) described,
>>> and you can explain why even the transient case is important for
>>> you to have fixed, then I really ought to redouble my efforts.
>>>
>>> Hugh
>
> I was able to root cause the issue as we got few instances of same and
> was frequently getting reproducible on stress tests. The reason why it
> was important was because failure to unmap ksm page was resulting into
> CMA allocation failure for us.
>
> For cases like fork, what we observed is for private mapped file pages,
> stable_node pointed by KSM page won't cover all the mappings until ksmd
> completes one full scan. Only after ksmd scan, new rmap_items pointing
> to mappings in child process would come into existence. So in cases like
> CMA allocations where we can't wait for ksmd to complete one full cycle,
> we can traverse anon_vma tree from parent's anon_vma to find out all the
> pages wheres CMA is mapped.
>
> I have tested the following patch on 3.10 kernel and with this change I
> am able to avoid CMA allocation failure which we were otherwise
> frequently seeing because of not able to unmap KSM page.
>
> Please review and let me know the feedback.
>
>
>
> [PATCH] ksm: Traverse through parent's anon_vma while unmapping
>
> While doing try_to_unmap_ksm, we traverse through
> rmap_item list to find out all the anon_vmas from which
> page needs to be unmapped.
>
> Now as per the design of KSM, it builds up its data
> structures by looking into each mm, and comes back a cycle
> later to find out which data structures are now outdated and
> needs to be updated. So, for cases like fork, what we
> observe is for private mapped file pages stable_node
> pointed by KSM page won't cover all the mappings until
> ksmd completes one full scan. Only after ksmd scan, new
> rmap_items pointing to mappings in child process would come
> into existence.
>
> As a result unmapping of a stable page can't be done until
> ksmd has completed one full scan. This becomes an issue in
> case of CMA where we need to unmap and move a CMA page and
> can't wait for ksmd to complete one cycle. Because of
> new rmap_items for new mapping still not created we won't be
> able to unmap CMA page from all the vmas where it is mapped.
> This would result in frequent CMA allocation failures.
>
> So instead of just relying on rmap_items list which we know
> can contain incomplete list, we also scan anon_vma tree from
> parent's anon_vma to find out all the vmas where CMA page is
> mapped and thereby successfully unmap the page and move it
> to new page.
>
> Change-Id: I97cacf6a73734b10c7098362c20fb3f2d4040c76
> Signed-off-by: Susheel Khiani <skhiani@codeaurora.org>
> ---
> mm/ksm.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 55 insertions(+), 3 deletions(-)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 11f6293..10d5266 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -1956,6 +1956,7 @@ int page_referenced_ksm(struct page *page, struct
> mem_cgroup *memcg,
> unsigned int mapcount = page_mapcount(page);
> int referenced = 0;
> int search_new_forks = 0;
> + int search_from_root = 0;
>
> VM_BUG_ON(!PageKsm(page));
> VM_BUG_ON(!PageLocked(page));
> @@ -1968,9 +1969,20 @@ again:
> struct anon_vma *anon_vma = rmap_item->anon_vma;
> struct anon_vma_chain *vmac;
> struct vm_area_struct *vma;
> + struct rb_root rb_root;
> +
> + if (!search_from_root) {
> + if (anon_vma)
> + rb_root = anon_vma->rb_root;
> + }
> + else {
> + if (anon_vma && anon_vma->root) {
> + rb_root = anon_vma->root->rb_root;
> + }
> + }
>
> anon_vma_lock_read(anon_vma);
> - anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> + anon_vma_interval_tree_foreach(vmac, &rb_root,
> 0, ULONG_MAX) {
> vma = vmac->vma;
> if (rmap_item->address < vma->vm_start ||
> @@ -1999,6 +2011,11 @@ again:
> }
> if (!search_new_forks++)
> goto again;
> +
> + if (!search_from_root++) {
> + search_new_forks = 0;
> + goto again;
> + }
> out:
> return referenced;
> }
> @@ -2010,6 +2027,7 @@ int try_to_unmap_ksm(struct page *page, enum
> ttu_flags flags,
> struct rmap_item *rmap_item;
> int ret = SWAP_AGAIN;
> int search_new_forks = 0;
> + int search_from_root = 0;
>
> VM_BUG_ON(!PageKsm(page));
> VM_BUG_ON(!PageLocked(page));
> @@ -2028,9 +2046,20 @@ again:
> struct anon_vma *anon_vma = rmap_item->anon_vma;
> struct anon_vma_chain *vmac;
> struct vm_area_struct *vma;
> + struct rb_root rb_root;
> +
> + if (!search_from_root) {
> + if (anon_vma)
> + rb_root = anon_vma->rb_root;
> + }
> + else {
> + if (anon_vma && anon_vma->root) {
> + rb_root = anon_vma->root->rb_root;
> + }
> + }
>
> anon_vma_lock_read(anon_vma);
> - anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> + anon_vma_interval_tree_foreach(vmac, &rb_root,
> 0, ULONG_MAX) {
> vma = vmac->vma;
> if (rmap_item->address < vma->vm_start ||
> @@ -2056,6 +2085,11 @@ again:
> }
> if (!search_new_forks++)
> goto again;
> +
> + if(!search_from_root++) {
> + search_new_forks = 0;
> + goto again;
> + }
> out:
> return ret;
> }
> @@ -2068,6 +2102,7 @@ int rmap_walk_ksm(struct page *page, int
> (*rmap_one)(struct page *,
> struct rmap_item *rmap_item;
> int ret = SWAP_AGAIN;
> int search_new_forks = 0;
> + int search_from_root = 0;
>
> VM_BUG_ON(!PageKsm(page));
> VM_BUG_ON(!PageLocked(page));
> @@ -2080,9 +2115,21 @@ again:
> struct anon_vma *anon_vma = rmap_item->anon_vma;
> struct anon_vma_chain *vmac;
> struct vm_area_struct *vma;
> + struct rb_root rb_root;
> +
> + if (!search_from_root) {
> + if (anon_vma)
> + rb_root = anon_vma->rb_root;
> + }
> + else {
> + if (anon_vma && anon_vma->root) {
> + rb_root = anon_vma->root->rb_root;
> + }
> + }
> +
>
> anon_vma_lock_read(anon_vma);
> - anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
> + anon_vma_interval_tree_foreach(vmac, &rb_root,
> 0, ULONG_MAX) {
> vma = vmac->vma;
> if (rmap_item->address < vma->vm_start ||
> @@ -2107,6 +2154,11 @@ again:
> }
> if (!search_new_forks++)
> goto again;
> +
> + if (!search_from_root++) {
> + search_new_forks = 0;
> + goto again;
> + }
> out:
> return ret;
> }
Reminder Ping, did you get a chance to look into
the previous mail
--
Susheel Khiani
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
prev parent reply other threads:[~2015-06-22 5:19 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-09 14:05 Susheel Khiani
2015-04-10 17:56 ` Hugh Dickins
2015-04-14 7:01 ` Susheel Khiani
2015-04-15 6:22 ` Hugh Dickins
2015-04-30 6:07 ` Susheel Khiani
2015-06-09 18:26 ` Susheel Khiani
2015-06-22 5:19 ` Susheel Khiani [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55879AD2.30507@codeaurora.org \
--to=skhiani@codeaurora.org \
--cc=akpm@linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=neilb@suse.de \
--cc=paulmcquad@gmail.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox