From: Linus Torvalds
Date: Wed, 3 Jul 2024 09:47:32 -0700
Subject: Re: [linux-next:master] [lockref] d042dae6ad: unixbench.throughput -33.7% regression
To: Mateusz Guzik
Cc: Christian Brauner, kernel test robot, oe-lkp@lists.linux.dev, lkp@intel.com, Linux Memory Management List, linux-kernel@vger.kernel.org, ying.huang@intel.com, feng.tang@intel.com, fengwei.yin@intel.com
References: <3g3arsrwnyvv562v2rsfv2ms4ht4mk45vwdkvssxkrjhfjtpdz@umyx5tl2du7o> <6knlkefvujkry65gx6636u6e7rivqrn5kqjovs4ctjg7xtzrmo@2zd4wjx6zcym>

On Wed, 3 Jul 2024 at 06:53, Mateusz Guzik wrote:
>
> Now I'm confused mate. Based on the convo so far I would expect you
> would consider the xfs thing a no-go for the machinery.
>
> You were rightfully pointing out the relationship dentry<->inode is not
> stable and care is needed to grab the pointer, and even then the pointer
> may be wrong by the time one finishes the work.
>
> I presume you are also worried about callbacks not taking proper steps
> when looking at the inode itself -- after all they can be racing with
> teardown and have to handle it gracefully by returning an error.

No. I'm *assuming* the callbacks don't take proper steps. IOW, the
reason I think the callback model is the right model is exactly
because I do not believe any user will reasonably understand and get
all the RCU pathwalking rules right.

So my mental picture of the callback model is that it is entirely
speculative. It will *speculatively* fill in the stat data.

And it obviously will *not* fill it in in user space - because you
can't do user space accesses while in an RCU-locked region. So the
stat callback would purely fill in a speculative kernel buffer.

And then the path walking would confirm the sequence numbers *after*
calling the callback, and override the return to -ECHILD and finish
the path walk non-speculatively if the sequence numbers don't match.

> Inode changing identities adds potential trouble which does not need to
> be there.

I agree that the XFS stuff may be questionable, but I still don't see
the problem wrt any stat callback. The sequence number tests would be
EXACTLY THE SAME ones that we currently use for regular file open etc.
If they are wrong for one case, they'd be wrong for another one.

I think you are coming into this from a background that would do the
stat buffer _without_ doing proper validation afterwards.

> Suppose the inode got reused and is now representing a device, i_rdev is
> some funky value.

Tell me how the inode gets re-used with the sequence numbers still
matching, and then tell me why regular path lookup doesn't have this
issue.
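
As an illustration of the speculative model described above, here is a minimal single-threaded userspace sketch in C. It is not kernel code: `struct fake_inode`, `struct spec_stat`, `stat_speculative()`, `stat_locked()`, `do_stat()` and both callbacks are hypothetical names invented for this example, and the plain `seq` field is only a stand-in for the kernel's sequence counters (the real read-side helpers also carry memory barriers, which this model omits). The one point it tries to show is the ordering: the callback fills a private buffer, the sequence number is rechecked *after* the callback returns, and a mismatch becomes -ECHILD so the caller redoes the work non-speculatively.

```c
#include <errno.h>

/* Hypothetical stand-in for an inode protected by a sequence counter. */
struct fake_inode {
    unsigned seq;   /* even = stable; bumped by 2 on every change */
    unsigned mode;
    long size;
};

/* Private "kernel-side" buffer; nothing reaches user space from here. */
struct spec_stat {
    unsigned mode;
    long size;
};

/* Hypothetical callback type: assumed to be sloppy, must not block. */
typedef int (*getattr_rcu_fn)(struct fake_inode *, struct spec_stat *);

/* Well-behaved callback: just copies the fields. */
int simple_getattr_rcu(struct fake_inode *inode, struct spec_stat *out)
{
    out->mode = inode->mode;
    out->size = inode->size;
    return 0;
}

/* Callback that simulates a concurrent change landing mid-read. */
int racing_getattr_rcu(struct fake_inode *inode, struct spec_stat *out)
{
    out->mode = inode->mode;
    out->size = inode->size;
    inode->size += 1;   /* pretend a writer modified the inode ... */
    inode->seq += 2;    /* ... and bumped the sequence counter */
    return 0;
}

/* Speculative path: sample seq, run the callback, validate afterwards. */
int stat_speculative(struct fake_inode *inode, getattr_rcu_fn cb,
                     struct spec_stat *out)
{
    unsigned seq = inode->seq;
    struct spec_stat tmp;
    int err;

    if (seq & 1)
        return -ECHILD;     /* a change is in progress */
    err = cb(inode, &tmp);
    if (err)
        return err;         /* callback itself asked for the old path */
    if (inode->seq != seq)
        return -ECHILD;     /* data may be stale: discard it */
    *out = tmp;             /* only validated data is published */
    return 0;
}

/* Non-speculative path; imagine proper references and locks held here. */
int stat_locked(struct fake_inode *inode, struct spec_stat *out)
{
    out->mode = inode->mode;
    out->size = inode->size;
    return 0;
}

/* Try the speculative path first, fall back on -ECHILD. */
int do_stat(struct fake_inode *inode, getattr_rcu_fn cb,
            struct spec_stat *out)
{
    int err = stat_speculative(inode, cb, out);

    if (err == -ECHILD)
        err = stat_locked(inode, out);
    return err;
}
```

In this sketch the callback never needs to know the path-walking rules; whatever it copied is simply thrown away whenever the sequence check fails, which is the property the argument above relies on.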

> There is also potential trouble with security modules as they
> unfortunately have a hook for getattr.

Oh, absolutely. The stat security callbacks would need fixing.
Exactly the same way we had to fix 'permission()' for RCU lookup with
the whole MAY_NOT_BLOCK thing, where they just say "I can't do that",
and we'd have to fall back on the non-RCU case.

And yes, filesystems might disable RCU stat - the same way filesystems
can currently react to revalidate etc under RCU with -ECHILD.

This is all *EXACTLY* why I think that "callback with error cases"
makes sense. Because the callback may well say exactly "do the old
thing" because it can't handle the RCU case, and it will depend on
things that are outside the control of path walking or stat itself.

             Linus
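
The "I can't do that" fallback can be sketched in the same spirit. The following is a minimal userspace model, not the kernel's security API: `SKETCH_MAY_NOT_BLOCK`, `struct sketch_hook`, `sketch_security_getattr()` and `sketch_stat_permission()` are hypothetical names, and the flag value is arbitrary rather than the kernel's MAY_NOT_BLOCK. It assumes a hook that sometimes needs to sleep and shows it refusing with -ECHILD when called in non-blocking mode, after which the caller retries in blocking mode.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag, modeled loosely on the idea behind MAY_NOT_BLOCK. */
#define SKETCH_MAY_NOT_BLOCK 0x1u

/* Hypothetical security hook state. */
struct sketch_hook {
    bool needs_sleep;   /* would this check have to block? */
    bool allow;         /* verdict once the check can actually run */
};

/* Hook: refuses with -ECHILD if it would block but is not allowed to. */
int sketch_security_getattr(const struct sketch_hook *h, unsigned flags)
{
    if ((flags & SKETCH_MAY_NOT_BLOCK) && h->needs_sleep)
        return -ECHILD;     /* "I can't do that" in RCU mode */
    return h->allow ? 0 : -EACCES;
}

/* Caller: try non-blocking first, then do the old thing on -ECHILD. */
int sketch_stat_permission(const struct sketch_hook *h, int *slow_path_taken)
{
    int err = sketch_security_getattr(h, SKETCH_MAY_NOT_BLOCK);

    *slow_path_taken = 0;
    if (err == -ECHILD) {
        *slow_path_taken = 1;
        err = sketch_security_getattr(h, 0);    /* blocking is fine now */
    }
    return err;
}
```

A hook that never sleeps is answered on the fast path; one that does sleep costs an extra call but still produces the same verdict it would have produced before.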