From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C883BC38159 for ; Wed, 18 Jan 2023 03:57:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 288226B0071; Tue, 17 Jan 2023 22:57:20 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2113D6B0072; Tue, 17 Jan 2023 22:57:20 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 08B376B0074; Tue, 17 Jan 2023 22:57:20 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id E84006B0071 for ; Tue, 17 Jan 2023 22:57:19 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id A1CE68026C for ; Wed, 18 Jan 2023 03:57:19 +0000 (UTC) X-FDA: 80366559798.29.AC86644 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf11.hostedemail.com (Postfix) with ESMTP id D8A9240002 for ; Wed, 18 Jan 2023 03:57:17 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=reC1fJKW; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf11.hostedemail.com: domain of "SRS0=DEr1=5P=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" designates 145.40.68.75 as permitted sender) smtp.mailfrom="SRS0=DEr1=5P=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1674014238; h=from:from:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=8musbzXhapxQfIbPErj9xxMQzS0jHXcpoOuYtrqwh2s=; b=iAONqWpSMZfC0RvfYIAZE8V7dViubIE5d5SYyfYbnKrCRJ1yjCMzWfzs/tUrBtdWS2rDik p7PpX5KVEOywkWIeI3FFKvAHLM/oOuR7oGUJYLi2baDerIhcBiPCcZgekbPY9zEaClKoM5 nGcCrU9Y7jvyaWJcZze1+rj0Z1SW/bc= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=reC1fJKW; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf11.hostedemail.com: domain of "SRS0=DEr1=5P=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" designates 145.40.68.75 as permitted sender) smtp.mailfrom="SRS0=DEr1=5P=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1674014238; a=rsa-sha256; cv=none; b=OFZSTG2+qyl7sNlFKWcGgl3nbaZD6NGVUNzsic6A8U/qEim4AxNRW/NvzVP7irLR6Sn6O3 zu9EQP8dWAiJotAGYqgPjPaXgTe+/HT1rhcuTp+/V3TOdhxapsk49InY/ciwuDbSIEXSUR warXz2TQOf/vjdx86rKm83V9lGXPkVI= Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 10A5DB81B05; Wed, 18 Jan 2023 03:57:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BE709C433EF; Wed, 18 Jan 2023 03:57:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674014234; bh=1aF9zC75Io5WiTMDXpCHOF1dTNQ5sBPwdAz07B2K/L0=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=reC1fJKW1aZuGExwTYhTEhs/8bMacyJ5hd4qNmlX5UyJ8Dap3Yxt+hF35Qb1l9gS9 CjBoldyx7YciVoGGHRNQdKnzeKE9agBGsKeG7lrUiDtdzRtCfL8V2lIH1/s1K8Tkm9 qQwMXV/75GHZajB9ITDCiOQKCV7/kKfd2gddpEvFO2ZUUu60nYYM+euGCM61lpokH0 HPK7fDy0icCQ1YYtBubE7F5lV2lD6Odz33+Y00ZuJQ288M6LtPyFtRVxVVne1mqa5N 1T+aPTJIEC1bnGnJ/46wPPs0ia39o5CYl/UQokOxJOz8EpWOyuBoOobqnlZY3QE9GB 4BUlNaMwq3ixw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 6D4575C1052; Tue, 17 Jan 2023 19:57:14 -0800 (PST) Date: Tue, 17 Jan 2023 19:57:14 -0800 From: "Paul E. McKenney" To: Yang Shi Cc: linux-mm@kvack.org Subject: Re: [QUESTION] Linux memory model: control dependency with bitfield Message-ID: <20230118035714.GU2948950@paulmck-ThinkPad-P17-Gen-1> Reply-To: paulmck@kernel.org References: <20230109173155.GS4028633@paulmck-ThinkPad-P17-Gen-1> <20230109231106.GZ4028633@paulmck-ThinkPad-P17-Gen-1> <20230110000418.GC4028633@paulmck-ThinkPad-P17-Gen-1> <20230112001511.GD4028633@paulmck-ThinkPad-P17-Gen-1> <20230114041532.GK4028633@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: D8A9240002 X-Stat-Signature: m76mctpzh4djia3stywx89nha4665rip X-HE-Tag: 1674014237-573833 X-HE-Meta: U2FsdGVkX19SIdMvDPni2llNL16+SHOghDOsTR4A/CpT+NZMEOfZv/IYorMKDTUaxHwJSrKIk78ikkXdPQpFNMj6PPTGmQUTe2trLEpZaryHAjgG0hbENp7moRrMyWCu/387gkmgM11bj3Eov0F7wlUjvo9RGRpJjhbg/GNanqP8I3bPilzZGObrZ7PpOSpmJu3I9cXwsaNPYisKbOIDh1PwuL4Og/WLGtD8ULrdOK4BwbIDtsbuyIO4bpvz9Y/KoQbWDyrUg+qRs4t1JHAsy/3r9EA6hQrGgn1OlHD6y4oMU7hmpfyszBq6MdT/WroI4oFpbsl7zl5e+rRixS+ah41jFFXQ3nAnbCneJo36eG3rWMlcq348o4UGC+X0PESwYAI7EpP/BOqNi+O80+9geG/yo76SDwXrAJs0QtPEPQ5YCWOf4EMb9LdTRVibiOov5JMKPnHk4Amtop3xg+RYLqeFpKGP231pWKsclpvvTJ0xww8kjcVxXtVgngiSm1rfGrwPHds+HT5exZq1KhHXgNVpoXVnQ5Pwj9lSsgEkj9iuSJhuXOfmWQuPw4Bd5ChbrbgLNfcSWl/BM+v8crP7qZp7p3RVCifVayJiwyH9U7kmhbQaq5wlC2iAdI/dSnrSinbsJLY3MWtwAVX4jubBKAgXPd1nMeHrT8OjgjfF9PK7G7x0I2ZvXkpXnkRSSVVNKy/XGBNdSKtKj8vrEMT35qqXE5qQMpXZ7lr1C5q4WSd1Kk5XflJcB1HlxwmZD5VbplJAQvq4Ya8jusF6qm4Ws8HNdSSteM/jBl06ZrNfrkkFCeWtwzQIA2NAvYujilSiD+rfpGIa8GGTmh0VlB0Fyd0UQCsDkWpKQrbS8/g5rVfHF4fF4Mcj/CJi6f3zQQJFx4CJnywYoIkvnA5QUuOsmuIqQ6QiJ9lCI4tXTGzFfFdF49kxQumMi6no4rtwqHE7x2R8W+LqR4+AZPYAs/S +37cQ3LT DCNDSACioobInaEFYPZECGUCR62/pnubf/Cscbaode3+2tr1xD/TBr9w7l6hZAJTIPgXl9PBAT3Qz57e0au4moxTga+BhSRT6iAPO0nUyRzWduhurkZDAdjiyiI3LqoNIPB+STNcWAMxwXCzRPkRm685A/JH8uHh+c3fXpGot699rLfQMfDfmhdWgd3CIUlT7sN6aDThhcRoF3DQPjDiS0ZnP78NFq6ddgebmZRlmoIwMUWSS/FowarqeDrYzffWZUe0o7bG1pELIcaInRfkzV97YyVgppDljLlAWNkPCtWK5n9s6tSkqofmoZBKVL9o8KxpCNgOtqHeTxLmszFqv4+13HcwL4NaQyxlwLF/2AUWp2J0leIq+nQj4gjOt8mtHxt2vMlSO0xEO3VggS40Z9MJoOrpErD5H6z+4SLWaeJakh08= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jan 17, 2023 at 01:28:19PM -0800, Yang Shi wrote: > On Fri, Jan 13, 2023 at 8:15 PM Paul E. McKenney wrote: > > > > On Fri, Jan 13, 2023 at 06:37:42PM -0800, Yang Shi wrote: > > > On Wed, Jan 11, 2023 at 4:15 PM Paul E. McKenney wrote: > > > > > > > > On Wed, Jan 11, 2023 at 04:01:34PM -0800, Yang Shi wrote: > > > > > On Mon, Jan 9, 2023 at 4:04 PM Paul E. McKenney wrote: > > > > > > > > > > > > On Mon, Jan 09, 2023 at 03:44:54PM -0800, Yang Shi wrote: > > > > > > > On Mon, Jan 9, 2023 at 3:11 PM Paul E. McKenney wrote: > > > > > > > > On Mon, Jan 09, 2023 at 02:08:35PM -0800, Yang Shi wrote: > > > > > > > > > On Mon, Jan 9, 2023 at 9:31 AM Paul E. McKenney wrote: > > > > > > > > > > On Mon, Jan 09, 2023 at 09:14:19AM -0800, Yang Shi wrote: > > > > > > > > > > > Hi Paul, > > > > > > > > > > > > > > > > > > > > > > Hope this email finds you are doing well. I recently ran into a > > > > > > > > > > > problem which might be related to control dependency of the memory > > > > > > > > > > > model. Conceptually, the code does (from copy_present_pte()): > > > > > > > > > > > > > > > > > > > > > > acquire mmap_lock > > > > > > > > > > > spin_lock > > > > > > > > > > > ... > > > > > > > > > > > clear bit (a bit in page flags) > > > > > > > > > > > ... > > > > > > > > > > > VM_BUG_ON(test bit) > > > > > > > > > > > ... > > > > > > > > > > > spin_unlock > > > > > > > > > > > release mmap_lock > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > IIUC there is control dependency between the "clear bit" and > > > > > > > > > > > "VM_BUG_ON" since VM_BUG_ON simply tests the bit then raises the BUG. > > > > > > > > > > > They do touch the overlapping address (the page flags from the same > > > > > > > > > > > struct page), but they are bit field operations. Per the memory model > > > > > > > > > > > documentation, the order is not guaranteed for bit field operations > > > > > > > > > > > IIRC. > > > > > > > > > > > > > > > > > > > > > > And there are not any implicit barriers between clear bit and test > > > > > > > > > > > bit, so the question is whether an explicit barrier, for example, > > > > > > > > > > > smp_mb__after_atomic() is required after clear bit to guarantee it > > > > > > > > > > > works as expected? > > > > > > > > > > > > > > > > > > > > I am not familiar with this code, so I will stick with LKMM > > > > > > > > > > clarifications. > > > > > > > > > > > > > > > > > > Yeah, sure. This is why I tried to generalize the code. > > > > > > > > > > > > > > > > > > > First, please don't forget any protection and ordering that might be > > > > > > > > > > provided by the two locks held across this code. > > > > > > > > > > > > > > > > > > Yes, but for this case I just care about the code between clear bit > > > > > > > > > and VM_BUG_ON. > > > > > > > > > > > > > > > > Fair enough! > > > > > > > > > > > > > > > > > > Second, a control dependency extends from a READ_ONCE() or stronger > > > > > > > > > > (clear_bit() included) to a later store. Please note "store", not > > > > > > > > > > "load". If you need to order an earlier READ_ONCE() or clear_bit() > > > > > > > > > > > > > > > > > > So you mean: > > > > > > > > > > > > > > > > > > clear bit > > > > > > > > > ... > > > > > > > > > if (test bit) { > > > > > > > > > load_1 > > > > > > > > > store_1 > > > > > > > > > load_2 > > > > > > > > > store_2 > > > > > > > > > } > > > > > > > > > > > > > > > > > > The dependency reaches to the first store? > > > > > > > > > > > > > > > > It reaches both stores, but neither load. > > > > > > > > > > > > > > > > That means that your example might well execute as if it had instead > > > > > > > > been written as follows: > > > > > > > > > > > > > > > > load_1 > > > > > > > > load_2 > > > > > > > > if (test bit) { > > > > > > > > store_1 > > > > > > > > store_2 > > > > > > > > } > > > > > > > > > > > > > > > > Assuming that you mean the test_bit() function. If you instead mean > > > > > > > > a C-language statement that tests a bit, then the compiler can do all > > > > > > > > sorts of things to you. The compiler can also do interesting things > > > > > > > > to you if the stores are plain C-language stores instead of something > > > > > > > > like WRITE_ONCE(). > > > > > > > > > > > > > > It is a test_bit() function. Is it possible clear_bit() is reordered > > > > > > > with test_bit(), or test_bit() doesn't see the result from > > > > > > > clear_bit()? > > > > > > > > > > > > If the various calls to test_bit() and clear_bit() are to the same > > > > > > location, then they will not be reordered with each other. > > > > > > > > > > > > If they are to different locations, they can be reordered. But in that > > > > > > case, they would not see each others' results anyway. > > > > > > > > > > Yeah, make sense. > > > > > > > > > > > > > > > > > > > > > with a later load, you will need acquire semantics (smp_load_acquire(), > > > > > > > > > > for example) or an explicit barrier such as smp_rmb(). Use of acquire > > > > > > > > > > semantics almost always gets you code that is more readable. > > > > > > > > > > > > > > > > > > Does the load acquire have to pair with a smp_store_release()? > > > > > > > > > smp_mb__after_stomic() is not needed because it is too strong and the > > > > > > > > > weaker barrier is good enough, right? > > > > > > > > > > > > > > > > It needs to pair with some type of applicable ordering, but yes, > > > > > > > > smp_store_release() is a good one. > > > > > > > > > > > > > > So, it should look like IIUC: > > > > > > > > > > > > > > clear_bit() > > > > > > > smp_load_acquire() > > > > > > > ... > > > > > > > if (test_bit()) { > > > > > > > smp_store_release() > > > > > > > load_1 > > > > > > > store_1 > > > > > > > load_2 > > > > > > > store_2 > > > > > > > } > > > > > > > > > > > > Just so you know, smp_load_acquire() does a load and smp_store_release() > > > > > > does a store. > > > > > > > > > > > > Also, is this code executed by a single CPU/task? If so, you need to > > > > > > also consider the corresponding code executed by some other CPU/task. > > > > > > > > > > The code could be executed by multiple tasks in parallel. > > > > > > > > It is hard for me to tell you what to write without more information > > > > about what you do and do not want to happen, but here is one possibility: > > > > > > > > clear_bit(5, &my_bits); > > > > if (test_bit_acquire(5, &my_bits)) { > > > > r1 = READ_ONCE(a); > > > > WRITE_ONCE(b, 1729); > > > > r2 = READ_ONCE(c); > > > > WRITE_ONCE(d, 65535); > > > > } > > > > > > > > This is of course a bit nonsensical because something somewhere would > > > > need to set bit 5 in my_bits for the body of the "if" statement to ever > > > > be executed. I am assuming that this happens somewhere else. > > > > > > > > The clear_bit() would be ordered before the test_bit_acquire() due to > > > > their both accessing the same location. The test_bit_acquire() would > > > > be orderd before the body of that "if" statement due to the _acquire() > > > > suffix. > > > > > > > > Is that what you are looking for? If not, what are you looking for? > > > > > > Sorry for the late reply. I think the _acquire() should be something > > > I'm looking for. I just figured out a stable reproducer for my > > > problem, so I will collect more debug information and try some debug > > > patch with _acquire(). Hopefully I could get more clear picture next > > > week. > > > > Glad it helped, but are you sure that it is just a memory-ordering > > problem as opposed to an algorithmic bug? I always have to ask... > > I did ask myself the same question. But AFAICT, all modifications to > the flag (set and clear) are protected by page table lock. Sounds good! At least until further notice. ;-) Thanx, Paul