From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx164.postini.com [74.125.245.164])
	by kanga.kvack.org (Postfix) with SMTP id D032C6B0095
	for ; Mon, 9 Jul 2012 19:57:53 -0400 (EDT)
Received: by yhr47 with SMTP id 47so14254954yhr.14
	for ; Mon, 09 Jul 2012 16:57:52 -0700 (PDT)
Date: Mon, 9 Jul 2012 16:57:14 -0700 (PDT)
From: Hugh Dickins
Subject: Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace
In-Reply-To: <20120709141324.GK7315@mudshark.cambridge.arm.com>
Message-ID:
References: <1341412376-6272-1-git-send-email-will.deacon@arm.com> <20120709122523.GC4627@tiehlicka.suse.cz> <20120709141324.GK7315@mudshark.cambridge.arm.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: Will Deacon
Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org

On Mon, 9 Jul 2012, Will Deacon wrote:
> On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote:
> > On Wed 04-07-12 15:32:56, Will Deacon wrote:
> > > When allocating and returning clear huge pages to userspace as a
> > > response to a fault, we may zero and return a mapping to a previously
> > > dirtied physical region (for example, it may have been written by
> > > a private mapping which was freed as a result of an ftruncate on the
> > > backing file). On architectures with Harvard caches, this can lead to
> > > I/D inconsistency since the zeroed view may not be visible to the
> > > instruction stream.
> > >
> > > This patch solves the problem by flushing the region after allocating
> > > and clearing a new huge page. Note that PowerPC avoids this issue by
> > > performing the flushing in their clear_user_page implementation to keep
> > > the loader happy, however this is closely tied to the semantics of the
> > > PG_arch_1 page flag which is architecture-specific.
> > >
> > > Acked-by: Catalin Marinas
> > > Signed-off-by: Will Deacon
> > > ---
> > >  mm/hugetlb.c |    1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index e198831..b83d026 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -2646,6 +2646,7 @@ retry:
> > >  			goto out;
> > >  		}
> > >  		clear_huge_page(page, address, pages_per_huge_page(h));
> > > +		flush_dcache_page(page);
> > >  		__SetPageUptodate(page);
> >
> > Does this have to be explicit in the arch independent code?
> > It seems that ia64 uses flush_dcache_page already in the clear_user_page
>
> It would match what is done in similar situations by cow_user_page (mm/memory.c)
> and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page
> flushing (DMA bounce, ksm) so I think this is the right place for it.

I am not at all sure if you are right or not: please let's consult
linux-arch about this - now Cc'ed.

If this hugetlb_no_page() were solely mapping the hugepage into that
userspace, I would say you are wrong. It's the job of clear_huge_page()
to take the mapped address into account, and pass it down to the
architecture-specific implementation, to do whatever flushing is
needed - you should be providing that in your architecture.

In particular, notice how clear_huge_page() goes round a loop of
clear_user_highpage()s: in your patch, you're expecting the implementation
of flush_dcache_page() to notice whether or not this is a hugepage, and
flush the appropriate size. Perhaps yours is the only architecture to
need this on huge, and your flush_dcache_page() implements it correctly;
but it does seem surprising.
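
To make the size concern concrete, here is a minimal userspace sketch in C.
It is not kernel code and not part of the patch under discussion: every
model_* name and the 512-subpage limit are assumptions made purely for
illustration. It models a flush routine that covers only one base page, and
compares a single call on the head page with a loop over every subpage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed upper bound for this model only; not a kernel constant. */
#define MODEL_MAX_SUBPAGES 512

/* Hypothetical stand-in for a huge page: one dirty flag per base page. */
struct model_huge_page {
	size_t nr_subpages;
	bool dirty_in_dcache[MODEL_MAX_SUBPAGES];
};

/* Model of clearing: zeroing each base page leaves it dirty in the D-cache. */
static void model_clear_huge_page(struct model_huge_page *hp, size_t nr_subpages)
{
	assert(nr_subpages <= MODEL_MAX_SUBPAGES);
	hp->nr_subpages = nr_subpages;
	for (size_t i = 0; i < nr_subpages; i++)
		hp->dirty_in_dcache[i] = true;
}

/* Model of a flush that is unaware of huge pages: it covers one base page. */
static void model_flush_base_page(struct model_huge_page *hp, size_t index)
{
	assert(index < hp->nr_subpages);
	hp->dirty_in_dcache[index] = false;
}

/* The "loop of flushes" alternative: one call per base page. */
static void model_flush_each_subpage(struct model_huge_page *hp)
{
	for (size_t i = 0; i < hp->nr_subpages; i++)
		model_flush_base_page(hp, i);
}

/* Count the base pages that are still dirty in the modelled D-cache. */
static size_t model_count_unflushed(const struct model_huge_page *hp)
{
	size_t unflushed = 0;

	for (size_t i = 0; i < hp->nr_subpages; i++)
		if (hp->dirty_in_dcache[i])
			unflushed++;
	return unflushed;
}
```

In this model, clearing N subpages and then calling model_flush_base_page()
once on index 0 leaves N - 1 subpages unflushed, while
model_flush_each_subpage() leaves none. Whether a real flush_dcache_page()
behaves like the base-page model depends on the architecture.
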

If I start to grep the architectures for non-empty flush_dcache_page(),
I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing
if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where the
naming suggests that I'm right, it's the architecture's responsibility
to arrange whatever flushing is needed in its copy and clear page functions.

But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts
the new page into page cache, making it accessible by other processes:
that may indeed be reason for flush_dcache_page() there - or a loop of
flush_dcache_page()s. But I worry then that in the !VM_MAYSHARE case
you would be duplicating expensive flushes: perhaps they should be
restricted to the VM_MAYSHARE block.

Hugh
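
The duplication worry in the !VM_MAYSHARE case can be sketched with a small
userspace model in C. This is an illustration only, not kernel code: the
model_* names are hypothetical, and the model assumes that the architecture's
clear routine already performs one flush per fault, which is the arrangement
argued for above and may not hold on every architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flush counters for one modelled fault path. */
struct model_flush_count {
	unsigned int in_arch_clear;     /* flushes done by the arch clear routine */
	unsigned int explicit_in_fault; /* flushes added in the generic fault path */
};

/*
 * Model of one no-page fault on a huge page.
 * Assumption: the arch clear routine always flushes once.
 * restrict_to_mayshare selects whether the explicit flush is issued
 * unconditionally or only when the mapping may be shared.
 */
static void model_no_page_fault(struct model_flush_count *count,
				bool vm_mayshare, bool restrict_to_mayshare)
{
	assert(count != NULL);
	count->in_arch_clear++;
	if (vm_mayshare || !restrict_to_mayshare)
		count->explicit_in_fault++;
}

/* Total flushes paid for by the modelled fault. */
static unsigned int model_total_flushes(const struct model_flush_count *count)
{
	return count->in_arch_clear + count->explicit_in_fault;
}
```

Under that assumption, a private fault costs two flushes with an
unconditional explicit flush and one when the explicit flush is restricted to
the shared case. A shared fault costs two either way.
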