Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment
From: Nadav Amit
Date: Sat, 29 Oct 2022 11:05:12 -0700
Cc: Peter Zijlstra, Jann Horn, John Hubbard, X86 ML, Matthew Wilcox,
 Andrew Morton, kernel list, Linux-MM, Andrea Arcangeli,
 "Kirill A . Shutemov", jroedel@suse.de, ubizjak@gmail.com, Alistair Popple
Message-Id: <47678198-C502-47E1-B7C8-8A12352CDA95@gmail.com>
References: <20221022111403.531902164@infradead.org>
 <20221022114424.515572025@infradead.org>
 <2c800ed1-d17a-def4-39e1-09281ee78d05@nvidia.com>
 <6C548A9A-3AF3-4EC1-B1E5-47A7FFBEB761@gmail.com>
To: Linus Torvalds

On Oct 28, 2022, at 5:42 PM, Linus Torvalds wrote:

> I think the proper fix (or at least _a_ proper fix) would be to
> actually carry the dirty bit along to the __tlb_remove_page() point,
> and actually treat it exactly the same way as the page pointer itself
> - set the page dirty after the TLB flush, the same way we can free the
> page after the TLB flush.
>
> We could easily hide said dirty bit in the low bits of the
> "batch->pages[]" array or something like that. We'd just have to add
> the 'dirty' argument to __tlb_remove_page_size() and friends.

Thank you for your quick response. I was slow to respond due to jet lag.

Anyhow, I am not sure whether the solution that you propose would work.
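Just to make sure we are talking about the same thing, this is how I read
the encoding part of your proposal. It is only a minimal userspace sketch
of the pointer-tagging trick, not the actual mmu_gather code; enc_page(),
batch_flush() and struct fake_page are made-up names for illustration. The
only thing it relies on is that struct page pointers are word-aligned, so
bit 0 is free to carry "the pte was dirty":

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DIRTY_BIT 0x1UL

struct fake_page { int id; };

/* Encode: stash the dirty bit in the low bit of the pointer. */
static uintptr_t enc_page(struct fake_page *page, bool dirty)
{
        return (uintptr_t)page | (dirty ? DIRTY_BIT : 0);
}

/* Decode side: runs only after the (stand-in) TLB flush. */
static void batch_flush(uintptr_t *pages, int nr)
{
        /* the actual TLB flush would happen here, before anything else */
        for (int i = 0; i < nr; i++) {
                struct fake_page *page =
                        (struct fake_page *)(pages[i] & ~DIRTY_BIT);
                bool dirty = pages[i] & DIRTY_BIT;

                if (dirty)
                        printf("set_page_dirty(page %d)\n", page->id);
                printf("release page %d\n", page->id);
        }
}

int main(void)
{
        struct fake_page a = { .id = 1 }, b = { .id = 2 };
        uintptr_t batch[2];

        batch[0] = enc_page(&a, true);   /* pte was dirty */
        batch[1] = enc_page(&b, false);  /* pte was clean */
        batch_flush(batch, 2);
        return 0;
}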
Please let me know if my understanding makes sense.

Let's assume that we do not call set_page_dirty() before we remove the
rmap, but only after we invalidate the page [*]. Let's assume that
shrink_page_list() is called after the page's rmap is removed and the
page is no longer mapped, but before set_page_dirty() was actually
called.

In such a case, shrink_page_list() would consider the page clean, and
would indeed keep the page (since __remove_mapping() would find an
elevated page refcount), which appears to give us a chance to mark the
page as dirty later.

However, IIUC, in this case shrink_page_list() might still call
filemap_release_folio() and release the buffers, so calling
set_page_dirty() afterwards - after the actual TLB invalidation took
place - would fail. (A toy sketch of this interleaving is at the end of
this mail.)

> Your idea of "do the page_remove_rmap() late instead" would also work,
> but the reason I think just squirrelling away the dirty bit is the
> "proper" fix is that it would get rid of the whole need for
> 'force_flush' in this area entirely. So we'd not only fix that race
> you noticed, we'd actually do so and reduce the number of TLB flushes
> too.

I'm all for reducing the number of TLB flushes, and your solution does
sound better in general. I proposed something that I considered to be
the path of least resistance (i.e., the least chance of breaking
something). I can do what you proposed, but I am not sure how to deal
with the buffers being removed.

One more note: this issue, I think, also affects
migrate_vma_collect_pmd(). Alistair recently addressed an issue there,
but in my prior feedback to him I missed this one.

[*] Note that for our scenario this would be pretty much the same if we
also called set_page_dirty() before removing the rmap, but the page was
cleaned while the TLB invalidation had still not been performed.
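To make the window I am worried about a bit more concrete, here is a toy
model of the interleaving. Again, this is a userspace sketch with made-up
names (toy_page, zap_first_half, reclaim, zap_second_half), not the real
zap/mmu_gather or reclaim code; it only tracks the state that matters for
the race:

#include <stdbool.h>
#include <stdio.h>

struct toy_page {
        bool mapped;        /* still has rmap / PTEs */
        bool has_buffers;   /* filesystem private data (buffers) */
        bool dirty;
        int  refcount;
};

/* Zap path, first half: PTE cleared, rmap removed, dirty bit deferred. */
static bool zap_first_half(struct toy_page *p)
{
        bool pte_was_dirty = true;  /* carried in batch->pages[] low bit */

        p->mapped = false;          /* page_remove_rmap() */
        p->refcount++;              /* reference held by the gather batch */
        return pte_was_dirty;
}

/* Reclaim racing in before the deferred set_page_dirty(). */
static void reclaim(struct toy_page *p)
{
        /* The page is unmapped and looks clean, so the buffers go away. */
        if (!p->mapped && !p->dirty && p->has_buffers)
                p->has_buffers = false;  /* filemap_release_folio() */

        /*
         * __remove_mapping() then fails because of the elevated refcount,
         * so the page itself is kept -- but its buffers are already gone.
         */
}

/* Zap path, second half: after the TLB flush. */
static void zap_second_half(struct toy_page *p, bool pte_was_dirty)
{
        /* the actual TLB flush would happen here */
        if (pte_was_dirty)
                p->dirty = true;    /* deferred set_page_dirty() */
        p->refcount--;              /* drop the batch reference */
}

int main(void)
{
        struct toy_page page = { .mapped = true, .has_buffers = true,
                                 .dirty = false, .refcount = 1 };

        bool dirty = zap_first_half(&page);
        reclaim(&page);             /* runs inside the window */
        zap_second_half(&page, dirty);

        printf("dirty=%d has_buffers=%d\n", page.dirty, page.has_buffers);
        /* dirty=1 has_buffers=0: dirty page whose buffers were released */
        return 0;
}

The end state - a dirty page whose buffers were already released - is
exactly the case I do not know how to handle.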