From: Linus Torvalds
Date: Sun, 25 Feb 2024 09:03:32 -0800
Subject: Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO
To: Matthew Wilcox
Cc: Kent Overstreet, Luis Chamberlain, lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm, Daniel Gomez, Pankaj Raghav, Jens Axboe, Dave Chinner, Christoph Hellwig, Chris Mason, Johannes Weiner

On Sun, 25 Feb 2024 at 05:10, Matthew Wilcox wrote:
>
> There's also the small random 64 byte read case that we haven't optimised
> for yet. That also bottlenecks on the page refcount atomic op.
>
> The proposed solution to that was double-copy; look up the page without
> bumping its refcount, copy to a buffer, look up the page again to be
> sure it's still there, copy from the buffer to userspace.

Please stop the cray-cray.

Yes, cache dirtying is expensive.
But you don't actually have cacheline ping-pong, because you don't have
lots of different CPUs hammering the same page cache page in any normal
circumstances. So the really expensive stuff just doesn't exist.

I think you've been staring at profiles too much. In instruction-level
profiles, the atomic ops stand out a lot. But that's at least partly
artificial - they are a serialization point on x86, so things get
accounted to them. So they tend to be the collection point for
everything around them in an OoO CPU.

Yes, atomics are bad. But double buffering is worse, and only looks
good if you have some artificial benchmark that does some single-byte
hot-cache read in a loop.

In fact, I get the strong feeling that the complaints come from people
who have looked at bad microbenchmarks a bit too much. People who have
artificially removed the *real* costs by putting their data on a
ramdisk, and then run a microbenchmark on this artificial setup.

So you have a make-believe benchmark on a make-believe platform, and
you may have started out with the best of intentions ("what are the
limits"), but at some point you took a wrong turn, and turned that
"what are the limits of performance" question into an instruction-level
profile and tried to mis-optimize the limits, instead of realizing that
that is NOT THE POINT of a "what are the limits" question.

The point of doing limit analysis is not to optimize the limit. It's
to see how close you are to that limit in real loads.

And I pretty much guarantee that you aren't close to those limits on
any real loads.

Before filesystem people start doing crazy things like double
buffering to do RCU reading of the page cache, you need to look
yourself in the mirror.

For example, the fact that Kent complains about the page cache and
talks about large folios is completely ludicrous. I've seen the
benchmarks of real loads. Kent - you're not close to any limits, you
are often a factor of two to five off other filesystems.
We're not talking "a few percent", and we're not talking "the atomics
are hurting".

So people: wake up and smell the coffee.

Don't optimize based off profiles of micro-benchmarks on made up
platforms. That's for seeing where the limits are. And YOU ARE NOT EVEN
CLOSE.

               Linus
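
For reference, the double-copy read quoted at the top of this message
can be sketched in user-space C roughly as below. This is only an
illustration under stated assumptions: every name in it
(sketch_page, sketch_cache, sketch_lookup, sketch_double_copy_read) is
hypothetical and none of it is kernel code. It shows the shape of the
scheme (lookup without a reference, copy to a bounce buffer, re-check,
copy out) and makes the extra memcpy visible. It does not handle page
lifetime, which a real implementation would have to get from something
like RCU, and it does not address pointer reuse or concurrent writes to
the page data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_NR_SLOTS  16

/* Hypothetical stand-in for a page cache page; not a kernel structure. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
};

/* Hypothetical stand-in for the page cache index: slot -> page pointer. */
_Atomic(struct sketch_page *) sketch_cache[SKETCH_NR_SLOTS];

/* Look up a page WITHOUT taking a reference on it. */
struct sketch_page *sketch_lookup(size_t idx)
{
	return atomic_load_explicit(&sketch_cache[idx], memory_order_acquire);
}

/*
 * Double-copy read of a small range:
 *   1. look up the page without bumping a refcount,
 *   2. copy the bytes into a local bounce buffer,
 *   3. look the page up again and retry if the slot changed,
 *   4. copy from the bounce buffer to the destination.
 *
 * Returns 0 on success, -1 if nothing is cached at idx.
 */
int sketch_double_copy_read(size_t idx, size_t off, void *dst, size_t len)
{
	unsigned char bounce[64];	/* sized for the small-read case */

	assert(idx < SKETCH_NR_SLOTS);
	assert(len <= sizeof(bounce));
	assert(off <= SKETCH_PAGE_SIZE - len);

	for (;;) {
		struct sketch_page *before = sketch_lookup(idx);

		if (!before)
			return -1;

		memcpy(bounce, before->data + off, len);	/* first copy */

		/* Order the data reads before the re-check of the slot. */
		atomic_thread_fence(memory_order_acquire);

		if (sketch_lookup(idx) != before)
			continue;	/* slot changed under us: retry */

		memcpy(dst, bounce, len);			/* second copy */
		return 0;
	}
}
```

A caller would publish a page into sketch_cache with atomic_store()
and then call sketch_double_copy_read(idx, off, buf, len). Note that
the sketch trades the refcount atomic for a second lookup and a second
copy on every read.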