From: Minchan Kim <minchan@kernel.org>
To: "Wang, Yalin" <Yalin.Wang@sonymobile.com>
Cc: 'Michal Hocko' <mhocko@suse.cz>,
'Andrew Morton' <akpm@linux-foundation.org>,
"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
"'linux-mm@kvack.org'" <linux-mm@kvack.org>,
'Rik van Riel' <riel@redhat.com>,
'Johannes Weiner' <hannes@cmpxchg.org>,
'Mel Gorman' <mgorman@suse.de>, 'Shaohua Li' <shli@kernel.org>,
Hugh Dickins <hughd@google.com>,
Cyrill Gorcunov <gorcunov@gmail.com>
Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
Date: Tue, 3 Mar 2015 12:25:51 +0900 [thread overview]
Message-ID: <20150303032537.GA25015@blaptop> (raw)
Could you separte this patch in this patchset thread?
It's tackling differnt problem.
As well, I had a question to previous thread about why shared page
has a problem now but you didn't answer and send a new patchset.
It makes reviewers/maintainer time waste/confuse. Please, don't
hurry to send a code. Before that, resolve reviewers's comments.
On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
> This patch add ClearPageDirty() to clear AnonPage dirty flag,
> if not clear page dirty for this anon page, the page will never be
> treated as freeable. We also make sure the shared AnonPage is not
> freeable, we implement it by dirty all copyed AnonPage pte,
> so that make sure the Anonpage will not become freeable, unless
> all process which shared this page call madvise_free syscall.
Please, spend more time to make description clear. I really doubt
who understand this description without code inspection. :(
Of course, I'm not a person to write description clear like native
, either but just I'm sure I spend a more time to write description
rather than coding, at least. :)
>
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
> mm/madvise.c | 16 +++++++++-------
> mm/memory.c | 12 ++++++++++--
> 2 files changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6d0fcb8..b61070d 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -297,23 +297,25 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
> continue;
>
> page = vm_normal_page(vma, addr, ptent);
> - if (!page)
> + if (!page || !trylock_page(page))
> continue;
>
> if (PageSwapCache(page)) {
> - if (!trylock_page(page))
> - continue;
> -
> if (!try_to_free_swap(page)) {
> unlock_page(page);
> continue;
> }
> -
> - ClearPageDirty(page);
> - unlock_page(page);
> }
>
> /*
> + * we clear page dirty flag for AnonPage, no matter if this
> + * page is in swapcahce or not, AnonPage not in swapcache also set
> + * dirty flag sometimes, this happened when a AnonPage is removed
> + * from swapcahce by try_to_free_swap()
> + */
> + ClearPageDirty(page);
> + unlock_page(page);
> + /*
Parent:
ptrP = malloc();
*ptrP = 'a';
fork(); -> child process pte has dirty by your patch
..
memory pressure -> So, swapped out the page.
..
..
Child: var = *ptrP; assert(var =='a') -> So, swapin happens and child has pte_clean
parent: var = *ptrP; aasert(var == 'a') -> So, swapin happens and parent has pte_clean
..
..
Parent:
madvise_free -> remove PageDirty
So, both parent and child has pte_clean and !PageDirty, which
is target for VM to discard a page.
..
VM discard the page by memory pressure.
..
Child: var = *ptrP: assert(var == 'a'); <---- oops.
And blindly ClearPageDirty makes duplicates swap out.
> * Some of architecture(ex, PPC) don't update TLB
> * with set_pte_at and tlb_remove_tlb_entry so for
> * the portability, remap the pte with old|clean
> diff --git a/mm/memory.c b/mm/memory.c
> index 8068893..3d949b3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -874,10 +874,18 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
> if (page) {
> get_page(page);
> page_dup_rmap(page);
> - if (PageAnon(page))
> + if (PageAnon(page)) {
> + /*
> + * we dirty the copyed pte for anon page,
> + * this is useful for madvise_free_pte_range(),
> + * this can prevent shared anon page freed by madvise_free
> + * syscall
> + */
> + pte = pte_mkdirty(pte);
It made every MADV_FREE hinted page void. IOW, if a process called MADV_FREE
calls fork, VM cannot discard pages if child doesn't free pages or calls madvise_free.
Then, if parent calls madvise_free before fork, we couldn't free those pages.
IOW, you are ignoring below example.
parent:
ptr1 = malloc(len);
-> allocator calls mmap(len);
memset(ptr1, 'a', len);
free(ptr1);
-> allocator calls madvise_free(ptr1, len);
fork();
..
..
-> VM discard hinted pages
child:
ptr2 = malloc(len)
-> allocator reuses the chunk allocated from parent.
so, child will see zero pages from ptr2 but he doesn't write
anything so garbage|zero page anything is okay to him.
As well, you are adding new instructions in fork which is very frequent syscall
so I'd like to find another way to avoid adding instructions in such hot path.
I will send different patch. Please review it.
So, my suggestion is below. It always makes pte dirty so let's Cc
Cyrill to take care of softdirty and Hugh who is Mr.Swap.
next reply other threads:[~2015-03-03 3:26 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-03 3:25 Minchan Kim [this message]
2015-03-03 3:59 ` Wang, Yalin
2015-03-03 4:14 ` Minchan Kim
2015-03-03 6:46 ` Wang, Yalin
2015-03-03 13:40 ` Minchan Kim
2015-03-05 15:35 ` Michal Hocko
2015-03-09 0:57 ` Minchan Kim
-- strict thread matches above, loose matches on Subject: below --
2015-02-24 8:18 [PATCH RFC 1/4] mm: throttle MADV_FREE Minchan Kim
2015-02-24 15:43 ` Michal Hocko
2015-02-25 0:08 ` Minchan Kim
2015-02-27 3:37 ` [RFC] mm: change mm_advise_free to clear page dirty Wang, Yalin
2015-02-27 21:02 ` Michal Hocko
2015-02-28 2:11 ` Wang, Yalin
2015-02-28 6:01 ` [RFC V2] " Wang, Yalin
2015-03-02 12:38 ` Michal Hocko
2015-03-03 2:06 ` [RFC V3] " Wang, Yalin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150303032537.GA25015@blaptop \
--to=minchan@kernel.org \
--cc=Yalin.Wang@sonymobile.com \
--cc=akpm@linux-foundation.org \
--cc=gorcunov@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=riel@redhat.com \
--cc=shli@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox