From: Kent Overstreet
To: "Paul E. McKenney"
Cc: Matthew Wilcox, Linus Torvalds, Al Viro, Luis Chamberlain,
    lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm, Daniel Gomez, Pankaj Raghav, Jens Axboe, Dave Chinner,
    Christoph Hellwig, Chris Mason, Johannes Weiner
Subject: Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO
Date: Tue, 27 Feb 2024 11:34:06 -0500
Message-ID: <4eibprmeehxnavkbjwvqdxecqk3b4l6lkc3hslbf3ggmxv5vxw@gprjhbny5rue>
In-Reply-To: <1f0d0536-c35b-46f9-9dfb-c8bc29e6956a@paulmck-laptop>
References: <5c6ueuv5vlyir76yssuwmfmfuof3ukxz6h5hkyzfvsm2wkncrl@7wvkfpmvy2gp>
    <49354148-4dea-4c89-b591-76b21ed4a5d1@paulmck-laptop>
    <6xpyltamnbd7q7nesntqspyfjfq3jexkmfyj2fekrk2mrhktcr@73vij67d5vne>
    <1f0d0536-c35b-46f9-9dfb-c8bc29e6956a@paulmck-laptop>

On Tue, Feb 27, 2024 at 08:21:29AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 27, 2024 at 03:54:23PM +0000, Matthew Wilcox wrote:
> > On Tue, Feb 27, 2024 at 07:32:32AM -0800, Paul E. McKenney wrote:
> > > At a ridiculously high level, reclaim is looking for memory to free.
> > > Some read-only memory can often be dropped immediately on the grounds
> > > that its data can be read back in if needed.  Other memory can only be
> > > dropped after being written out, which involves a delay.  There are of
> > > course many other complications, but this will do for a start.
> >
> > Hi Paul,
> >
> > I appreciate the necessity of describing what's going on at a very high
> > level, but there's a wrinkle that I'm not sure you're aware of which
> > may substantially change your argument.
> >
> > For anonymous memory, we do indeed wait until reclaim to start writing it
> > to swap.  That may or may not be the right approach given how anonymous
> > memory is used (and could be the topic of an interesting discussion
> > at LSFMM).
> >
> > For file-backed memory, we do not write back memory in reclaim.  If it
> > has got to the point of calling ->writepage in vmscan, things have gone
> > horribly wrong to the point where calling ->writepage will make things
> > worse.  This is why we're currently removing ->writepage from every
> > filesystem (only ->writepages will remain).  Instead, the page cache
> > is written back much earlier, once we get to balance_dirty_pages().
> > That lets us write pages in filesystem-friendly ways instead of in MM
> > LRU order.
>
> Thank you for the additional details.
>
> But please allow me to further summarize the point of my prior email
> that seems to be getting lost:
>
> 1.	RCU already does significant work prodding grace periods.
>
> 2.	There is no reasonable way to provide estimates of the
> 	memory sent to RCU via call_rcu(), and in many cases
> 	the bulk of the waiting memory will be call_rcu() memory.
>
> Therefore, if we cannot come up with a heuristic that does not need to
> know the bytes of memory waiting, we are stuck anyway.

That is a completely asinine argument.

> So perhaps the proper heuristic for RCU speeding things up is simply
> "Hey RCU, we are in reclaim!".

Because that's the wrong heuristic. There are important workloads for
which we're _always_ in reclaim, but as long as RCU grace periods are
happening at some steady rate, the amount of memory stranded will be
bounded and there's no reason to expedite grace periods.

If we start RCU freeing all pagecache folios we're going to be cycling
memory through RCU freeing at the rate of gigabytes per second, tens of
gigabytes per second on high end systems.

Do you put hard limits on how long we can go before an RCU grace period
that will limit the amount of memory stranded to something acceptable?

Yes or no?
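
The bound argued for above is simple arithmetic: if memory is handed to
RCU at a steady rate and each deferred free waits roughly one grace
period, the stranded memory is about rate times grace-period latency.
A minimal Python sketch of that estimate follows. The function names and
the figures used (10 GiB/s, 50 ms, a 1 GiB budget) are illustrative
assumptions, not measurements and not kernel interfaces.

```python
"""Back-of-the-envelope bound on memory stranded behind RCU grace periods.

All names and numbers here are hypothetical illustrations, not kernel APIs.
"""

GIB = 1 << 30
MIB = 1 << 20


def stranded_bytes(free_rate_bytes_per_sec: int, grace_period_ms: int) -> int:
    """Approximate bytes waiting on RCU when frees arrive at a steady rate.

    Assumes every deferred free waits roughly one full grace period.
    """
    return free_rate_bytes_per_sec * grace_period_ms // 1000


def max_grace_period_ms(free_rate_bytes_per_sec: int, budget_bytes: int) -> int:
    """Longest grace period (in ms) that keeps stranded memory within budget."""
    if free_rate_bytes_per_sec <= 0:
        raise ValueError("free rate must be positive")
    return budget_bytes * 1000 // free_rate_bytes_per_sec


if __name__ == "__main__":
    rate = 10 * GIB  # assumed: 10 GiB/s cycling through RCU freeing
    print(stranded_bytes(rate, 50) // MIB, "MiB stranded at a 50 ms grace period")
    print(max_grace_period_ms(rate, 1 * GIB), "ms limit for a 1 GiB budget")
```

Under these assumed figures, 10 GiB/s with a 50 ms grace period strands
about 512 MiB, and holding stranded memory under 1 GiB at that rate would
need a grace period no longer than about 100 ms. The point of the sketch
is only that a hard upper limit on grace-period latency translates
directly into a hard upper limit on stranded memory, without anyone
having to count the bytes queued via call_rcu().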