From: Jonas Oberhauser
Date: Wed, 25 Sep 2024 14:19:47 +0200
Subject: Re: [RFC PATCH 1/4] hazptr: Add initial implementation of hazard pointers
To: Mathieu Desnoyers, Boqun Feng
Cc: linux-kernel@vger.kernel.org, rcu@vger.kernel.org, linux-mm@kvack.org, lkmm@lists.linux.dev, "Paul E. McKenney", Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Mark Rutland, Thomas Gleixner, Kent Overstreet, Linus Torvalds, Vlastimil Babka, maged.michael@gmail.com, Neeraj Upadhyay
Message-ID: <55ea84c8-92fd-4268-9732-6fac3a0e78b7@huaweicloud.com>
In-Reply-To: <4167e6f5-4ff9-4aaa-915e-c1e692ac785a@efficios.com>
References: <20240917143402.930114-1-boqun.feng@gmail.com> <20240917143402.930114-2-boqun.feng@gmail.com> <55975a55-302f-4c45-bfcc-192a8a1242e9@huaweicloud.com> <4167e6f5-4ff9-4aaa-915e-c1e692ac785a@efficios.com>

On 9/25/2024 at 1:59 PM, Mathieu Desnoyers wrote:
> On 2024-09-25 12:45, Boqun Feng wrote:
>> On Wed, Sep 25, 2024 at 12:11:52PM +0200, Jonas Oberhauser wrote:
>>>
>>>
>>> On 9/25/2024 at 12:02 PM, Boqun Feng wrote:
>>>> Hi Jonas,
>>>>
>>>> Of course, if we are really worried about compilers being too "smart"
>>>
>>> Ah, I see you know me better and better...
>>>
>>>> we can always do the comparison in asm code, then compilers don't know
>>>> anything of the equality between 'ptr' and 'head - head_offset'.
>>> Yes, but then a simple compiler barrier between the comparison and returning
>>> ptr would also do the trick, right? And maybe easier on the eyes.
>>>
>>
>> The thing about putting a compiler barrier is that it will prevent all
>> compiler reorderings, and some of the reordering may contribute to
>> better codegen. (I know in this case, we have a smp_mb(), but still
>> compilers can move unrelated code upto the second load for optimization
>> purpose). Asm comparison is cheaper in this way. But TBH, compilers
>> should provide a way to compare pointer values without using the result
>> for pointer equality proof, if "convert to unsigned long" doesn't work,
>> some other ways should work.
>>
>
> Based on Documentation/RCU/rcu_dereference.rst :
>
> -       Be very careful about comparing pointers obtained from
>         rcu_dereference() against non-NULL values.  As Linus Torvalds
>         explained, if the two pointers are equal, the compiler could
>         substitute the pointer you are comparing against for the pointer
>         obtained from rcu_dereference().  For example::
>
>                 p = rcu_dereference(gp);
>                 if (p == &default_struct)
>                         do_default(p->a);
>
>         Because the compiler now knows that the value of "p" is exactly
>         the address of the variable "default_struct", it is free to
>         transform this code into the following::
>
>                 p = rcu_dereference(gp);
>                 if (p == &default_struct)
>                         do_default(default_struct.a);
>
>         On ARM and Power hardware, the load from "default_struct.a"
>         can now be speculated, such that it might happen before the
>         rcu_dereference().  This could result in bugs due to misordering.
>
> So I am not only concerned about compiler proofs here, as it appears
> that the speculation done by the CPU can also cause issues on some
> architectures.

No, this is only possible in this example because the compiler first
does some other optimizations (like what I mentioned on Boqun's
original patch).

If you can ensure that the instruction sequence corresponds more or less to

    t = load p // again
    // on alpha: dep fence
    ...
    *t

then you can be sure that there is an address dependency which orders
the accesses. This is guaranteed by LKMM, or if you don't trust LKMM,
also by Arm, Power, Alpha etc.

The extra dep fence on Alpha is automatically inserted if you use
READ_ONCE as Boqun did (and I assumed your uatomic_load or whatever is
doing the same thing, but I didn't check).

Given that the hazard-pointer-protected object presumably is not a
single static non-freeable object, but some dynamically allocated
object, it is pretty much impossible for the compiler to guess the
address like in the example you shared above.

Note that inside the if, the code after the transformation is

    do_default(default_struct.a);

which accesses an address that is known to the hardware before it loads
from gp. That would not be the case here (if the compiler optimization
is ruled out).

jonas
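
For illustration, here is a minimal user-space sketch of the reload-and-compare pattern discussed in this thread, assuming GCC or Clang. Everything in it is an assumption made for the example rather than code from the patch: READ_ONCE and WRITE_ONCE are simplified stand-ins for the kernel macros (they do not add the Alpha dependency fence), atomic_thread_fence() stands in for smp_mb(), and struct obj, gp, hazard_slot, same_address() and hazptr_try_protect() are hypothetical names. The comparison runs on copies hidden from the optimizer by an empty asm, which is one possible way to keep the compiler from learning that the two pointers are equal.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel macros (assumptions). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct obj { int a; };          /* hypothetical protected object */

static struct obj *gp;          /* hypothetical shared pointer */
static struct obj *hazard_slot; /* hypothetical hazard pointer slot */

/*
 * Compare two addresses without giving the compiler an equality proof
 * about the caller's variables: the empty asm makes the local copies
 * opaque, so the result says nothing about 'again' or 'ptr' below.
 */
static int same_address(const void *a, const void *b)
{
	__asm__ ("" : "+r" (a));
	__asm__ ("" : "+r" (b));
	return a == b;
}

/* Returns the protected pointer, or NULL if gp was NULL or changed. */
static struct obj *hazptr_try_protect(void)
{
	struct obj *ptr = READ_ONCE(gp);
	struct obj *again;

	if (!ptr)
		return NULL;

	WRITE_ONCE(hazard_slot, ptr);              /* publish the hazard pointer */
	atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */

	again = READ_ONCE(gp);                     /* "t = load p // again" */
	if (!same_address(again, ptr)) {
		WRITE_ONCE(hazard_slot, NULL);     /* validation failed, retract */
		return NULL;
	}

	/*
	 * Return the re-loaded value, not 'ptr', so that a later "*t" is
	 * address-dependent on the second load.
	 */
	return again;
}
```

A caller would dereference only the returned pointer, for example `p = hazptr_try_protect(); if (p) use(p->a);` where use() is a placeholder. A single-threaded run can only check the functional behaviour; the ordering argument itself rests on the address dependency described above.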