From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wi0-f171.google.com (mail-wi0-f171.google.com [209.85.212.171])
        by kanga.kvack.org (Postfix) with ESMTP id D1BD76B0038
        for ; Mon, 30 Mar 2015 04:22:23 -0400 (EDT)
Received: by wixm2 with SMTP id m2so80173003wix.0
        for ; Mon, 30 Mar 2015 01:22:23 -0700 (PDT)
Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15])
        by mx.google.com with ESMTPS id v10si16763365wju.8.2015.03.30.01.22.21
        for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
        Mon, 30 Mar 2015 01:22:22 -0700 (PDT)
Date: Mon, 30 Mar 2015 10:22:18 +0200
From: Michal Hocko
Subject: Re: [PATCH] mm: Use GFP_KERNEL allocation for the page cache in page_cache_read
Message-ID: <20150330082218.GA3909@dhcp22.suse.cz>
References: <1426687766-518-1-git-send-email-mhocko@suse.cz>
        <55098F3B.7070000@redhat.com>
        <20150318145528.GK17241@dhcp22.suse.cz>
        <20150319071439.GE28621@dastard>
        <20150319124441.GC12466@dhcp22.suse.cz>
        <20150320034820.GH28621@dastard>
        <20150326095302.GA15257@dhcp22.suse.cz>
        <20150326214354.GG28129@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150326214354.GG28129@dastard>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Dave Chinner
Cc: Rik van Riel , Andrew Morton , Al Viro , Johannes Weiner ,
        Mel Gorman , Neil Brown , Tetsuo Handa , Sage Weil , Mark Fasheh ,
        linux-mm@kvack.org, LKML

On Fri 27-03-15 08:43:54, Dave Chinner wrote:
> On Thu, Mar 26, 2015 at 10:53:02AM +0100, Michal Hocko wrote:
> > On Fri 20-03-15 14:48:20, Dave Chinner wrote:
> > > On Thu, Mar 19, 2015 at 01:44:41PM +0100, Michal Hocko wrote:
> > [...]
> > > > Or did I miss your point? Are you concerned about some fs overloading
> > > > filemap_fault and do some locking before delegating to filemap_fault?
> > >
> > > The latter:
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/dgc/linux-xfs.git/commit/?h=xfs-mmap-lock&id=de0e8c20ba3a65b0f15040aabbefdc1999876e6b
> >
> > Hmm.
> > I am completely unfamiliar with the xfs code but my reading of
> > 964aa8d9e4d3..723cac484733 is that the newly introduced lock should be
> > OK from the reclaim recursion POV. It protects against truncate and
> > punch hole, right? Or are there any internal paths which I am missing
> > and would cause problems if we do GFP_FS with XFS_MMAPLOCK_SHARED held?
>
> It might be OK, but you're only looking at the example I gave you,
> not the fundamental issue it demonstrates. That is: filesystems may
> have *internal dependencies that are unknown to the page cache or mm
> subsystem*. Hence the page cache or mm allocations cannot
> arbitrarily ignore allocation constraints the filesystem assigns to
> mapping operations....

I fully understand that. I am just trying to understand what the real
requirements from filesystems are wrt. filemap_fault. The mapping gfp
mask is not much use for that because e.g. xfs has GFP_NOFS set for the
whole inode life AFAICS. And it seems that this context is not really
required even after the recent code changes.

We can add gfp_mask into struct vm_fault and initialize it to
mapping_gfp_mask | GFP_IOFS, and the .fault() callback might overwrite
it. This (the patch below) would be cleaner than unconditional gfp
manipulation.

But we are in a really hard position if the GFP_NOFS context is really
required here. We shouldn't really trigger the OOM killer because that
could be premature and way too disruptive. We can retry the page fault
or the allocation, but both of those sound suboptimal to me.

Do you have any other suggestions?

This hasn't been tested yet; it just shows the idea mentioned above.
---
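For anybody who wants to see the flow of the gfp_mask-in-vm_fault idea
without reading kernel code, here is a small userspace model. It is only
a sketch and it is not the patch: every sk_* name is a hypothetical
stand-in, the bit values are arbitrary rather than the kernel's, and the
two callbacks only mimic what a .fault() implementation could do with
the mask.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp bits; the values are arbitrary. */
typedef unsigned int sk_gfp_t;
#define SK_GFP_WAIT 0x1u
#define SK_GFP_IO   0x2u
#define SK_GFP_FS   0x4u
#define SK_GFP_IOFS (SK_GFP_IO | SK_GFP_FS)
#define SK_GFP_NOFS (SK_GFP_WAIT | SK_GFP_IO)

/* Model of a fault descriptor that carries the allocation mask. */
struct sk_vm_fault {
	sk_gfp_t gfp_mask;	/* mask a page cache allocation would use */
};

typedef void (*sk_fault_fn)(struct sk_vm_fault *vmf);

/* A filesystem without extra constraints keeps the default mask. */
static void sk_generic_fault(struct sk_vm_fault *vmf)
{
	(void)vmf;
}

/* A filesystem that cannot tolerate fs recursion clears the FS bit. */
static void sk_nofs_fault(struct sk_vm_fault *vmf)
{
	vmf->gfp_mask &= ~SK_GFP_FS;
}

/*
 * Model of the fault path: start from the mapping mask widened by
 * IO|FS, let the callback override it, and return the mask that the
 * allocation would end up using.
 */
static sk_gfp_t sk_handle_fault(sk_gfp_t mapping_mask, sk_fault_fn fault)
{
	struct sk_vm_fault vmf = { .gfp_mask = mapping_mask | SK_GFP_IOFS };

	assert(fault != NULL);
	fault(&vmf);
	return vmf.gfp_mask;
}
```

With a mapping mask of SK_GFP_NOFS, sk_handle_fault(SK_GFP_NOFS,
sk_generic_fault) returns a mask with the FS bit set, while passing
sk_nofs_fault gives SK_GFP_NOFS back. The model says nothing about what
should happen when the constrained allocation fails, which is the hard
part discussed above.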