From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sun, 21 Jan 2024 23:54:19 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Pasha Tatashin
Cc: David Rientjes, Pasha Tatashin, Sourav Panda,
 lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-block@vger.kernel.org, linux-ide@vger.kernel.org,
 linux-scsi@vger.kernel.org, linux-nvme@lists.infradead.org,
 bpf@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] State Of The Page
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

On Sun, Jan 21, 2024 at 06:31:48PM -0500, Pasha Tatashin wrote:
> On Sun, Jan 21, 2024 at 6:14 PM Matthew Wilcox wrote:
> > I can add a proposal for a topic on both the PCP and Buddy allocators
> > (I have a series of Thoughts on how the PCP allocator works in a
> > memdesc world that I haven't
> > written down & sent out yet).

> Interesting, given that pcp are mostly allocated by kmalloc and use
> vmalloc for large allocations, how memdesc can be different for them
> compared to regular kmalloc allocations given that they are sub-page?

Oh!  I don't mean the mm/percpu.c allocator.  I mean the pcp allocator
in mm/page_alloc.c.  I don't have any Thoughts on mm/percpu.c at this
time.  I'm vaguely aware that it exists ;-)

> > There's so much work to be done!  And it's mostly parallelisable and
> > almost trivial.  It's just largely on the filesystem-page cache
> > interaction, so it's not terribly interesting.  See, for example, the
> > ext2, ext4, gfs2, nilfs2, ufs and ubifs patchsets I've done over the
> > past few releases.  I have about half of an ntfs3 patchset ready to
> > send.
> >
> > There's a bunch of work to be done in DRM to switch from pages to
> > folios due to their use of shmem.  You can also grep for
> > 'page->mapping' (because fortunately we aren't too imaginative when
> > it comes to naming variables) and find 270 places that need to be
> > changed.  Some are comments, but those still need to be updated!
> >
> > Anything using lock_page(), get_page(), set_page_dirty(), using
> > &folio->page, any of the functions in mm/folio-compat.c needs
> > auditing.  We can make the first three of those work, but they're
> > good indicators that the code needs to be looked at.
> >
> > There is some interesting work to be done, and one of the things I'm
> > thinking hard about right now is how we're doing folio conversions
> > that make sense with today's code, and stop making sense when we get
> > to memdescs.  That doesn't apply to anything interacting with the
> > page cache (because those are folios now and in the future), but it
> > does apply to one spot in ext4 where it allocates memory from slab
> > and attaches a buffer_head to it ...
>
> There are many more drivers that would need the conversion.
> For example, IOMMU page tables can occupy gigabytes of space, have
> different implementations for AMD, X86, and several ARMs.  Conversion
> to memdesc and unifying the IO page table management implementation
> for these platforms would be beneficial.

Understood; there's a lot of code that can benefit from larger
allocations.  I was listing the impediments to shrinking struct page
rather than the places which would most benefit from switching to
larger allocations.  They're complementary to a large extent; you can
switch to compound allocations today and get the benefit later.  And
unifying implementations is always a worthy project.
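[Editor's illustration] "Switch to compound allocations today and get the
benefit later" looks roughly like the following kernel-style fragment.  It is
not taken from any real driver and is not runnable outside a kernel tree;
`order`, `pages[]` and `page` are hypothetical names.

```c
/* Before: N separate order-0 pages, each tracked individually. */
for (i = 0; i < (1 << order); i++)
        pages[i] = alloc_page(GFP_KERNEL);

/* After: one compound allocation.  A single head page (and eventually a
 * single memdesc) describes the whole range, so the conversion pays off
 * again when struct page shrinks. */
page = alloc_pages(GFP_KERNEL | __GFP_COMP, order);
```

The caller gets the locality win immediately; the memdesc work later removes
the per-tail-page metadata without touching the driver again.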