From: Hugh Dickins <hughd@google.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>, Linux-MM <linux-mm@kvack.org>,
David Gibson <david@gibson.dropbear.id.au>,
Ken Chen <kenchen@google.com>,
Cong Wang <xiyou.wangcong@gmail.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)
Date: Sun, 22 Jul 2012 21:04:33 -0700 (PDT) [thread overview]
Message-ID: <alpine.LSU.2.00.1207222033030.6810@eggly.anvils> (raw)
In-Reply-To: <20120720145121.GJ9222@suse.de>
On Fri, 20 Jul 2012, Mel Gorman wrote:
> On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote:
> > And here is my attempt for the fix (Hugh mentioned something similar
> > earlier but he suggested using special flags in ptes or VMAs). I still
> > owe doc. update and it hasn't been tested with too many configs and I
> > could missed some definition updates.
> > I also think that changelog could be much better, I will add (steal) the
> > full bug description if people think that this way is worth going rather
> > than the one suggested by Mel.
> > To be honest I am not quite happy how I had to pollute generic mm code with
> > something that is specific to a single architecture.
> > Mel hammered it with the test case and it survived.
>
> Tested-by: Mel Gorman <mgorman@suse.de>
>
> This approach looks more or less like what I was expecting. I like that
> the trick was applied to the page table page instead of using PTE tricks
> or by bodging it with a VMA flag like I was thinking so kudos for that. I
> also prefer this approach to trying to free the page tables on or near
> huge_pmd_unshare()
>
> In general I think this patch would execute better than mine because it is
> far less heavy-handed but I share your concern that it changes the core MM
> quite a bit for a corner case that only one architecture cares about. I am
> completely biased of course, but I still prefer my patch because other than
> an API change it keeps the bulk of the madness in arch/x86/mm/hugetlbpage.c
> . I am also not concerned with the scalability of how quickly we can setup
> page table sharing.
>
> Hugh, I'm afraid you get to choose :)
Thank you bestowing that honour upon me :) Seriously, though, you
were quite right to Cc me on this, it is one of those areas I ought
to know something about (unlike hugetlb reservations, for example).
Please don't be upset if I say that I don't like either of your patches.
Mainly for obvious reasons - I don't like Mel's because anything with
trylock retries and nested spinlocks worries me before I can even start
to think about it; and I don't like Michal's for the same reason as Mel,
that it spreads more change around in common paths than we would like.
But I didn't spend much time thinking through either of them, they just
seemed more complicated than should be needed. I cannot confirm or deny
whether they're correct - though I still do not understand how mmap_sem
can help you, Mel. I can see that it will help in your shmdt()ing test,
but if you leave the area mapped on exit, then mmap_sem is not taken in
the exit_mmap() path, so how does it help?
I spent hours trying to dream up a better patch, trying various
approaches. I think I have a nice one now, what do you think? And
more importantly, does it work? I have not tried to test it at all,
that I'm hoping to leave to you, I'm sure you'll attack it with gusto!
If you like it, please take it over and add your comments and signoff
and send it in. The second part won't come up in your testing, and could
be made a separate patch if you prefer: it's a related point that struck
me while I was playing with a different approach.
I'm sorely tempted to leave a dangerous pair of eyes off the Cc,
but that too would be unfair.
Subject-to-your-testing-
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/hugetlb.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
--- v3.5/mm/hugetlb.c 2012-07-21 13:58:29.000000000 -0700
+++ linux/mm/hugetlb.c 2012-07-22 20:28:59.858077817 -0700
@@ -2393,6 +2393,15 @@ void unmap_hugepage_range(struct vm_area
{
mutex_lock(&vma->vm_file->f_mapping->i_mmap_mutex);
__unmap_hugepage_range(vma, start, end, ref_page);
+ /*
+ * Clear this flag so that x86's huge_pmd_share page_table_shareable
+ * test will fail on a vma being torn down, and not grab a page table
+ * on its way out. We're lucky that the flag has such an appropriate
+ * name, and can in fact be safely cleared here. We could clear it
+ * before the __unmap_hugepage_range above, but all that's necessary
+ * is to clear it before releasing the i_mmap_mutex below.
+ */
+ vma->vm_flags &= ~VM_MAYSHARE;
mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
}
@@ -2959,9 +2968,14 @@ void hugetlb_change_protection(struct vm
}
}
spin_unlock(&mm->page_table_lock);
- mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
-
+ /*
+ * Must flush TLB before releasing i_mmap_mutex: x86's huge_pmd_unshare
+ * may have cleared our pud entry and done put_page on the page table:
+ * once we release i_mmap_mutex, another task can do the final put_page
+ * and that page table be reused and filled with junk.
+ */
flush_tlb_range(vma, start, end);
+ mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
}
int hugetlb_reserve_pages(struct inode *inode,
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-07-23 4:05 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-20 13:49 [PATCH] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables v2 Mel Gorman
2012-07-20 14:11 ` [PATCH] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend) Mel Gorman
2012-07-20 14:29 ` Michal Hocko
2012-07-20 14:37 ` Mel Gorman
2012-07-20 14:40 ` Michal Hocko
2012-07-20 14:36 ` [PATCH -alternative] " Michal Hocko
2012-07-20 14:51 ` Mel Gorman
2012-07-23 4:04 ` Hugh Dickins [this message]
2012-07-23 11:40 ` Mel Gorman
2012-07-24 1:08 ` Hugh Dickins
2012-07-24 8:32 ` Michal Hocko
2012-07-24 9:34 ` Mel Gorman
2012-07-24 10:04 ` Michal Hocko
2012-07-24 19:23 ` Hugh Dickins
2012-07-25 8:36 ` Mel Gorman
2012-07-26 17:42 ` Rik van Riel
2012-07-26 18:04 ` Larry Woodman
2012-07-27 8:42 ` Mel Gorman
2012-07-26 18:37 ` Rik van Riel
2012-07-26 21:03 ` Larry Woodman
2012-07-27 3:48 ` Larry Woodman
2012-07-27 10:10 ` Larry Woodman
2012-07-27 10:23 ` Mel Gorman
2012-07-27 10:36 ` Larry Woodman
2012-07-30 19:11 ` Larry Woodman
2012-07-31 12:16 ` Hillf Danton
2012-07-31 12:46 ` Mel Gorman
2012-07-31 13:07 ` Larry Woodman
2012-07-31 13:29 ` Mel Gorman
2012-07-31 13:21 ` Michal Hocko
2012-07-31 17:49 ` Larry Woodman
2012-07-31 20:06 ` Michal Hocko
2012-07-31 20:57 ` Larry Woodman
2012-08-01 2:45 ` Larry Woodman
2012-08-01 8:20 ` Michal Hocko
2012-08-01 12:32 ` Michal Hocko
2012-08-01 15:06 ` Larry Woodman
2012-08-02 7:19 ` Michal Hocko
2012-08-02 7:37 ` Mel Gorman
2012-08-02 12:36 ` Michal Hocko
2012-08-02 13:33 ` Mel Gorman
2012-08-02 13:53 ` Michal Hocko
2012-07-31 18:03 ` Rik van Riel
2012-07-26 18:31 ` Rik van Riel
2012-07-27 9:02 ` Michal Hocko
2012-07-26 16:01 ` [PATCH] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables v2 Larry Woodman
2012-07-27 8:47 ` Mel Gorman
2012-07-26 21:00 ` Rik van Riel
2012-07-26 21:54 ` Hugh Dickins
2012-07-27 8:52 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.00.1207222033030.6810@eggly.anvils \
--to=hughd@google.com \
--cc=david@gibson.dropbear.id.au \
--cc=kenchen@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox