From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3]) by kanga.kvack.org (Postfix) with ESMTP id 10D966B003D for ; Thu, 12 Mar 2009 18:08:54 -0400 (EDT) Subject: Re: [patch 2/2] fs: fix page_mkwrite error cases in core code and btrfs From: Trond Myklebust In-Reply-To: <20090311035503.GI16561@wotan.suse.de> References: <20090311035318.GH16561@wotan.suse.de> <20090311035503.GI16561@wotan.suse.de> Content-Type: text/plain Date: Thu, 12 Mar 2009 18:08:44 -0400 Message-Id: <1236895724.7179.71.camel@heimdal.trondhjem.org> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org To: Nick Piggin Cc: Andrew Morton , linux-fsdevel@vger.kernel.org, Linux Memory Management List , Chris Mason List-ID: On Wed, 2009-03-11 at 04:55 +0100, Nick Piggin wrote: > page_mkwrite is called with neither the page lock nor the ptl held. This > means a page can be concurrently truncated or invalidated out from underneath > it. Callers are supposed to prevent truncate races themselves, however > previously the only thing they can do in case they hit one is to raise a > SIGBUS. A sigbus is wrong for the case that the page has been invalidated > or truncated within i_size (eg. hole punched). Callers may also have to > perform memory allocations in this path, where again, SIGBUS would be wrong. > > The previous patch made it possible to properly specify errors. Convert > the generic buffer.c code and btrfs to return sane error values > (in the case of page removed from pagecache, VM_FAULT_NOPAGE will cause the > fault handler to exit without doing anything, and the fault will be retried > properly). > > This fixes core code, and converts btrfs as a template/example. All other > filesystems defining their own page_mkwrite should be fixed in a similar > manner. There appears to be another atomicity problem in the same area of code... The lack of locking between the call to ->page_mkwrite() and the subsequent call to set_page_dirty_balance() means that the filesystem may actually already have written out the page by the time you get round to calling set_page_dirty_balance(). How then is the filesystem supposed to guarantee that whatever structure it allocated in page_mkwrite() is still around when the page gets marked as dirty a second time? Trond -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org