Date: Fri, 29 Jul 2022 08:48:03 +1000
From: Dave Chinner
To: Matthew Wilcox
Cc: Jan Kara, Christoph Hellwig, Bob Peterson, Andreas Gruenbacher,
	"Darrick J. Wong", Damien Le Moal, Naohiro Aota, Johannes Thumshirn,
	cluster-devel@redhat.com, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Mel Gorman
Subject: Re: remove iomap_writepage v2
Message-ID: <20220728224803.GZ3861211@dread.disaster.area>
References: <20220719041311.709250-1-hch@lst.de>
	<20220728111016.uwbaywprzkzne7ib@quack3>

On Thu, Jul 28, 2022 at 03:18:03PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 28, 2022 at 01:10:16PM +0200, Jan Kara wrote:
> > Hi Christoph!
> >
> > On Tue 19-07-22 06:13:07, Christoph Hellwig wrote:
> > > this series removes iomap_writepage and it's callers, following what xfs
> > > has been doing for a long time.
> >
> > So this effectively means "no writeback from page reclaim for these
> > filesystems" AFAICT (page migration of dirty pages seems to be handled by
> > iomap_migrate_page()) which is going to make life somewhat harder for
> > memory reclaim when memory pressure is high enough that dirty pages are
> > reaching end of the LRU list. I don't expect this to be a problem on big
> > machines but it could have some undesirable effects for small ones
> > (embedded, small VMs). I agree per-page writeback has been a bad idea for
> > efficiency reasons for at least last 10-15 years and most filesystems
> > stopped dealing with more complex situations (like block allocation) from
> > ->writepage() already quite a few years ago without any bug reports AFAIK.
> > So it all seems like a sensible idea from FS POV but are MM people on board
> > or at least aware of this movement in the fs land?
>
> I mentioned it during my folio session at LSFMM, but didn't put a huge
> emphasis on it.
>
> For XFS, writeback should already be in progress on other pages if
> we're getting to the point of trying to call ->writepage() in vmscan.
> Surely this is also true for other filesystems?

Yes.
It's definitely true for btrfs, too, because btrfs_writepage does:

static int btrfs_writepage(struct page *page, struct writeback_control *wbc)
{
	struct inode *inode = page->mapping->host;
	int ret;

	if (current->flags & PF_MEMALLOC) {
		redirty_page_for_writepage(wbc, page);
		unlock_page(page);
		return 0;
	}
....

Like XFS, it rejects all calls to write dirty pages from memory
reclaim contexts.

ext4 will also reject writepage calls from memory allocation
contexts if block allocation is required (due to delayed allocation)
or unwritten extents need converting to written, i.e. if it has to
run blocking transactions.

So all three major filesystems will either partially or wholly
reject ->writepage calls from memory reclaim context.

IOWs, if memory reclaim is depending on ->writepage() to make
reclaim progress, it's not working as advertised on the vast
majority of production Linux systems....

The reality is that ->writepage is a relic of a bygone era of OS and
filesystem design. It was useful in the days when writing a dirty
page just involved looking up the bufferhead attached to the page to
get the disk mapping and then submitting it for IO.

Those days are long gone - filesystems now have complex IO submission
paths that have to handle delayed allocation, copy-on-write,
unwritten extents, unbounded memory demand, etc. All the
filesystems that support these 1990s era filesystem technologies
simply turn off ->writepage in memory reclaim contexts.

Hence for the vast majority of Linux users (i.e. everyone using
ext4, btrfs and XFS), ->writepage no longer plays any part in memory
reclaim on their systems.

So why should we try to maintain the fiction that ->writepage is
required functionality in a filesystem when it clearly isn't?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com