From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <8aceaf4f-5578-4fca-8be7-3448d7b89721@efficios.com>
Date: Fri, 27 Sep 2024 12:59:33 +0200
Subject: Re: [RFC PATCH 1/4] hazptr: Add initial implementation of hazard pointers
From: Mathieu Desnoyers
To: Boqun Feng, Linus Torvalds
Cc: Jonas Oberhauser, linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
 linux-mm@kvack.org, lkmm@lists.linux.dev, "Paul E. McKenney",
 Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Josh Triplett,
 "Uladzislau Rezki (Sony)", rostedt, Lai Jiangshan, Zqiang,
 Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Mark Rutland,
 Thomas Gleixner, Kent Overstreet, Vlastimil Babka,
 maged.michael@gmail.com, Neeraj Upadhyay
References: <48992c9f-6c61-4716-977c-66e946adb399@efficios.com>
 <2b2aea37-06fe-40cb-8458-9408406ebda6@efficios.com>
 <55633835-242c-4d7f-875b-24b16f17939c@huaweicloud.com>
 <54487a36-f74c-46c3-aed7-fc86eaaa9ca2@huaweicloud.com>
 <0b262fe5-2fc5-478d-bf66-f208723238d5@efficios.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 2024-09-27 06:28, Boqun Feng wrote:
> On Fri, Sep 27, 2024 at 09:37:50AM +0800, Boqun Feng wrote:
>>
>>
>> On Fri, Sep 27, 2024, at 9:30 AM, Mathieu Desnoyers wrote:
>>> On 2024-09-27 02:01, Boqun Feng wrote:
>>>> #define ADDRESS_EQ(var, expr)                                           \
>>>> ({                                                                      \
>>>>     bool _____cmp_res = (unsigned long)(var) == (unsigned long)(expr);  \
>>>>                                                                         \
>>>>     OPTIMIZER_HIDE_VAR(var);                                            \
>>>>     _____cmp_res;                                                       \
>>>> })
>>>
>>> If the goal is to ensure gcc uses the register populated by the
>>> second, I'm afraid it does not work. AFAIU, "hiding" the dependency
>>> chain does not prevent the SSA GVN optimization from combining the
>
> Note it's not hiding the dependency, rather the equality,
>
>>> registers as being one and choosing one arbitrary source. "hiding"
>
> after OPTIMIZER_HIDE_VAR(var), compiler doesn't know whether 'var' is
> equal to 'expr' anymore, because OPTIMIZER_HIDE_VAR(var) uses "=r"(var)
> to indicate the output is overwritten. So when 'var' is referred later,
> compiler cannot use the register for a 'expr' value or any other
> register that has the same value, because 'var' may have a different
> value from the compiler's POV.
>
>>> the dependency chain before or after the comparison won't help here.
>>>
>>> int fct_hide_var_compare(void)
>>> {
>>>     int *a, *b;
>>>
>>>     do {
>>>         a = READ_ONCE(p);
>>>         asm volatile ("" : : : "memory");
>>>         b = READ_ONCE(p);
>>>     } while (!ADDRESS_EQ(a, b));
>>
>> Note that ADDRESS_EQ() only hide first parameter, so this should be ADDRESS_EQ(b, a).
>>
>
> I replaced ADDRESS_EQ(a, b) with ADDRESS_EQ(b, a), and the compile
> result shows it can prevent the issue:

I see, yes. It prevents the issue by making the compiler create
a copy of the value "modified" by the asm before doing the equality
comparison.

This means the compiler cannot derive the value for b from the first
load when b is used after the equality comparison.

The only downside of OPTIMIZER_HIDE_VAR() is that it adds an extra
"mov" instruction to move the content across registers. I don't think
it matters performance-wise though, so that solution is appealing
because it is arch-agnostic.

One small improvement over your proposed solution would be to apply
OPTIMIZER_HIDE_VAR() on both inputs. Because this is not a volatile
asm, it is simply optimized away if var1 or var2 is unused following
the equality comparison. It is more convenient to prevent either of
the two compared addresses from being replaced by the other than to
provide the guarantee for a single parameter only:

#define OPTIMIZER_HIDE_VAR(var)                 \
        __asm__ ("" : "+r" (var))

#define ADDRESS_EQ(var1, var2)                  \
({                                              \
        bool _____cmp_res = (var1) == (var2);   \
                                                \
        OPTIMIZER_HIDE_VAR(var1);               \
        OPTIMIZER_HIDE_VAR(var2);               \
        _____cmp_res;                           \
})

Thanks,

Mathieu

>
> gcc 14.2 x86-64:
>
> fct_hide_var_compare:
> .L2:
>         mov     rcx, QWORD PTR p[rip]
>         mov     rdx, QWORD PTR p[rip]
>         mov     rax, rdx
>         cmp     rcx, rdx
>         jne     .L2
>         mov     eax, DWORD PTR [rax]
>         ret
>
> gcc 14.2.0 ARM64:
>
> fct_hide_var_compare:
>         adrp    x2, p
>         add     x2, x2, :lo12:p
> .L2:
>         ldr     x3, [x2]
>         ldr     x1, [x2]
>         mov     x0, x1
>         cmp     x3, x1
>         bne     .L2
>         ldr     w0, [x0]
>         ret
>
> Link to godbolt:
>
>         https://godbolt.org/z/a7jsfzjxY

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
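
For anyone who wants to try the two-sided ADDRESS_EQ() outside the
kernel tree, here is a minimal userspace sketch. It is only an
illustration under stated assumptions: READ_ONCE() is a simplified
stand-in for the kernel macro, the global "p" and its target "value"
are hypothetical test data, and the trailing "return *b" is inferred
from the final load in the quoted assembly rather than taken from the
original test function. It needs GCC or Clang, since it relies on
statement expressions, __typeof__ and extended asm.

```c
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's READ_ONCE() (assumption). */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Empty asm with a read-write register operand: the compiler must assume
 * that var may hold a different value afterwards. */
#define OPTIMIZER_HIDE_VAR(var) \
	__asm__ ("" : "+r" (var))

/* Compare first, then hide both operands so that neither one can be
 * substituted for the other in later code. */
#define ADDRESS_EQ(var1, var2)			\
({						\
	bool _____cmp_res = (var1) == (var2);	\
						\
	OPTIMIZER_HIDE_VAR(var1);		\
	OPTIMIZER_HIDE_VAR(var2);		\
	_____cmp_res;				\
})

/* Hypothetical test data, not part of the original thread. */
static int value = 42;
static int *p = &value;

int fct_hide_var_compare(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		__asm__ __volatile__ ("" : : : "memory");
		b = READ_ONCE(p);
	} while (!ADDRESS_EQ(a, b));

	/* Dereference b, the result of the second load (assumed from the
	 * final load in the quoted assembly). */
	return *b;
}
```

Because both operands are hidden, the argument order no longer matters:
ADDRESS_EQ(a, b) and ADDRESS_EQ(b, a) should both keep the final
dereference tied to the register loaded for b. Checking the generated
assembly (for example with gcc -O2 -S) is the way to confirm this for
a given compiler and architecture.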