From: Linus Torvalds
Date: Sun, 25 Feb 2024 15:45:47 -0800
Subject: Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO
To: Matthew Wilcox
Cc: Kent Overstreet, Luis Chamberlain, lsf-pc@lists.linux-foundation.org,
 linux-fsdevel@vger.kernel.org, linux-mm, Daniel Gomez, Pankaj Raghav,
 Jens Axboe, Dave Chinner, Christoph Hellwig, Chris Mason, Johannes Weiner

On Sun, 25 Feb 2024 at 13:14, Matthew Wilcox wrote:
>
> Not artificial; this was a real customer with a real workload.  I don't
> know how much about it I can discuss publically, but my memory of it was a
> system writing a log with 64 byte entries, millions of entries per second.
> Occasionally the system would have to go back and look at an entry in the
> last few seconds worth of data (so it would still be in the page cache).

Honestly, that should never hit any kind of contention on the page cache.
Unless they did something else odd, that load should be entirely
serialized by the POSIX "atomic write" requirements and the
"inode_lock(inode)" that writes take.

So it would end up literally being just one cache miss - and if you do
things across CPU's and have cachelines moving around, that inode lock
would be the bigger offender in that it is the one that would see any
contention.

Now, *that* is locking that I despise, much more than the page cache
lock. It serializes unrelated writes to different areas, and the
direct-IO people instead said "we don't care about POSIX" and did
concurrent writes without it.

That said, I do wonder if we could take advantage of the fact that we
have the inode lock, and just make page eviction take that lock too
(possibly in shared form). At that point, you really could just say
"no need to increment the reference count, because we can do writes
knowing that the mapping pages are stable".

Not pretty, but we could possibly at least take advantage of the
horrid other ugliness of the inode locking and POSIX rules that nobody
really wants.

                 Linus