Date: Tue, 27 Feb 2024 02:21:59 -0500
From: Kent Overstreet
To: Linus Torvalds
Cc: Al Viro, Matthew Wilcox, Luis Chamberlain,
    lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm, Daniel Gomez, Pankaj Raghav, Jens Axboe, Dave Chinner,
    Christoph Hellwig, Chris Mason, Johannes Weiner
Subject: Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO

On Mon, Feb 26, 2024 at 03:48:35PM -0800, Linus Torvalds wrote:
> On Mon, 26 Feb 2024 at 14:46, Linus Torvalds wrote:
> >
> > I really haven't tested this AT ALL. I'm much too scared.
>
> "Courage is not the absence of fear, but acting in spite of it"
>          - Paddington Bear / Michal Scott
>
> It seems to actually boot here.
>
> That said, from a quick test with lots of threads all hammering on the
> same page - I'm still not entirely convinced it makes a difference.
> Sure, the kernel profile changes, but filemap_get_read_batch() wasn't
> very high up in the profile to begin with.
>
> I didn't do any actual performance testing, I just did a 64-byte pread
> at offset 0 in a loop in 64 threads on my 32c/64t machine.

Only rough testing, but this is looking like around a 25% performance
increase doing 4k random reads on a 1G file with fio, 8 jobs, on my
Ryzen 5950x - 16.7M -> 21.4M iops, very roughly.

fio's a pig and we're only spending half our cpu time in the kernel, so
the buffered read path is actually getting 40% or 50% faster.

So I'd say that's substantial.

RCU freeing of pagecache pages would be even better - I think that'd let
us completely get rid of the barrier & xarray recheck, and we wouldn't
have to do it as a silly special case.
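
For anyone who wants to check the arithmetic behind the "40% or 50% faster" estimate, here is a rough sketch. It is not how the numbers above were produced. It rests on two assumptions that are not stated in the message: that "half our cpu time in the kernel" describes the run before the patch, and that the userspace (fio) cost per read is unchanged by the patch. The function name and parameters are hypothetical, and the iops figures are the very rough ones quoted above.

```python
def kernel_path_speedup(iops_before, iops_after, kernel_fraction_before):
    """Amdahl-style estimate of how much faster the kernel read path got.

    Assumptions (illustrative, not from the original message):
      - kernel_fraction_before is the share of per-read CPU time spent
        in the kernel before the change;
      - the userspace share of per-read time is unchanged by the change.
    Returns (speedup_factor, fraction_of_kernel_time_saved).
    """
    t_before = 1.0 / iops_before          # average time per read, before
    t_after = 1.0 / iops_after            # average time per read, after
    kernel_before = kernel_fraction_before * t_before
    user = t_before - kernel_before       # assumed constant across runs
    kernel_after = t_after - user
    if kernel_after <= 0:
        raise ValueError("inputs imply non-positive kernel time after the change")
    return kernel_before / kernel_after, 1.0 - kernel_after / kernel_before


speedup, time_saved = kernel_path_speedup(16.7e6, 21.4e6, 0.5)
print(f"overall throughput gain: {21.4 / 16.7 - 1:.0%}")
print(f"kernel time per read reduced by: {time_saved:.0%}")
print(f"kernel path speedup factor: {speedup:.2f}x")
```

Under those assumptions the overall gain works out to about 28% and the kernel time per read drops by about 44%, which sits inside the 40% to 50% range quoted above. If the "half" instead describes the patched run, the reduction comes out lower, so treat this as a sanity check only.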