Date: Tue, 14 Mar 2023 14:51:03 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-afs@lists.infradead.org,
    ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org,
    linux-nfs@vger.kernel.org, linux-nilfs@vger.kernel.org,
    linux-ntfs-dev@lists.sourceforge.net, ntfs3@lists.linux.dev,
    ocfs2-devel@oss.oracle.com, devel@lists.orangefs.org,
    reiserfs-devel@vger.kernel.org, Evgeniy Dushistov
Subject: RFC: Filesystem metadata in HIGHMEM

TLDR: I think we should rip out support for fs metadata in highmem.

We want to support filesystems on devices with LBA size > PAGE_SIZE.
That's subtly different and slightly harder than fsblk size > PAGE_SIZE.
We can use large folios to read the blocks into, but reading/writing the
data in those folios is harder if it's in highmem.  The kmap family of
functions can only map a single page at a time (and changing that is
hard).
We could vmap, but that's slow and can't be used from atomic context.
Working a single page at a time can be tricky (e.g. consider an ext2
directory entry that spans a page boundary).

Many filesystems do not support having their metadata in highmem.
ext4 doesn't.  xfs doesn't.  f2fs doesn't.  afs, ceph, ext2, hfs, minix,
nfs, nilfs2, ntfs, ntfs3, ocfs2, orangefs, qnx6, reiserfs, sysv and ufs
do.

Originally, ext2 directories in the page cache were done by Al Viro
in 2001.  At that time, the important use-case was machines with tens of
gigabytes of highmem and ~800MB of lowmem.  Since then, x86 systems
have gone 64-bit and the only real uses for highmem are cheap systems
with ~8GB of memory total and 2-4GB of lowmem.  These systems really
don't need to keep directories in highmem; using highmem for file & anon
memory is enough to keep the system in balance.

So let's just rip out the ability to keep directories (and other fs
metadata) in highmem.  Many filesystems already don't support this,
and it makes supporting LBA size > PAGE_SIZE hard.

I'll turn this into an LSFMM topic if we don't reach resolution on the
mailing list, but I'm optimistic that everybody will just agree with me ;-)
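
For illustration only, here is a minimal userspace sketch of the
page-boundary problem mentioned above (a record such as a directory
entry that straddles two pages which cannot be mapped contiguously).
It is not kernel code: SKETCH_PAGE_SIZE and sketch_copy_from_pages()
are hypothetical names, and the array of page pointers merely models
"one mapping per page".  The point is that the record has to be
assembled piecewise into a bounce buffer instead of being accessed in
place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* hypothetical stand-in for PAGE_SIZE */

/*
 * Copy len bytes starting at byte offset pos out of an object made of
 * npages separate pages.  pages[i] models the address obtained by
 * mapping page i on its own; the pages are assumed NOT to be virtually
 * contiguous, so a record crossing a boundary needs two copies.
 */
void sketch_copy_from_pages(void *dst, unsigned char *const *pages,
			    size_t npages, size_t pos, size_t len)
{
	unsigned char *out = dst;

	/* The requested range must lie inside the object. */
	assert(pos + len <= npages * SKETCH_PAGE_SIZE);

	while (len > 0) {
		size_t idx = pos / SKETCH_PAGE_SIZE;	/* which page */
		size_t off = pos % SKETCH_PAGE_SIZE;	/* offset within it */
		size_t chunk = SKETCH_PAGE_SIZE - off;	/* bytes left in page */

		if (chunk > len)
			chunk = len;

		/* In the kernel, this is where one page would be mapped,
		 * copied from, and unmapped again. */
		memcpy(out, pages[idx] + off, chunk);

		out += chunk;
		pos += chunk;
		len -= chunk;
	}
}
```

A caller asking for 8 bytes that start 3 bytes before the end of page 0
gets 3 bytes from page 0 followed by 5 bytes from page 1.  With metadata
kept in lowmem, the same record could simply be dereferenced in place.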