From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68FAAC5475B for ; Thu, 7 Mar 2024 00:23:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C2FC26B00C1; Wed, 6 Mar 2024 19:23:36 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BE0486B00C2; Wed, 6 Mar 2024 19:23:36 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AA8146B00C3; Wed, 6 Mar 2024 19:23:36 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 99B3C6B00C1 for ; Wed, 6 Mar 2024 19:23:36 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 6E8E0A023F for ; Thu, 7 Mar 2024 00:23:36 +0000 (UTC) X-FDA: 81868344432.15.FEF77D3 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf22.hostedemail.com (Postfix) with ESMTP id 3ACA0C001B for ; Thu, 7 Mar 2024 00:23:34 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=CGCNzgAp; spf=pass (imf22.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1709771014; a=rsa-sha256; cv=none; b=ZruuV0WXFethPg/2AL+xoCKRb4mqucfRIe+Rhpuu1KyOFTGahNZW1IgHZlU76B7ZFI2/yf bCmd6g43AYU6H5mIauxk7sI39XiKmTZXM7QpamFOihlQVOQVNmTbXcYBrU/nivCf9rcEwN ljKwZICNLdETNmmXpJyMwE/EQXzYISw= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=CGCNzgAp; spf=pass (imf22.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1709771014; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=OCES2kaXYWvEX1DXjcfkz3runp0pMNuB+6tHKRWiSvE=; b=h+3D85p5Gb867XARVJqCaxMdKU8Zy/6sJqmZEINYlEUQ0fRdJu/TCV8Ol28jlmhI2VEHvO AmWRXprfAK3xMs62R6h1YeE24ikhFszzM5+zR8ti1gY3awOfwa6Rzm0oV1byZrGSFGOX+9 yZUQJKMyLo3WNsdTpnU00lUjwcTMy2Q= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709771013; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OCES2kaXYWvEX1DXjcfkz3runp0pMNuB+6tHKRWiSvE=; b=CGCNzgApJyZ1Vudx09akenBdY+NvprT7dV4tCSHq16zBlw/guLjUqQWp2vBxIytDmMhsHW F2m6YGpvKTEwU/ImdV6YT/DOUrIT4v7bPzBfuRc1XBx9t8YDGkdwSgHJKOIvk1lnotlUQa vxFWc9BSNzz4RW93+AFjIh5MDuWS3SY= Received: from mail-pg1-f199.google.com (mail-pg1-f199.google.com [209.85.215.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-478-O8pOQDyzOrCJqaUD5Earyw-1; Wed, 06 Mar 2024 19:23:32 -0500 X-MC-Unique: O8pOQDyzOrCJqaUD5Earyw-1 Received: by mail-pg1-f199.google.com with SMTP id 41be03b00d2f7-5cfccde4a54so99982a12.1 for ; Wed, 06 Mar 2024 16:23:32 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709771011; x=1710375811; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=OCES2kaXYWvEX1DXjcfkz3runp0pMNuB+6tHKRWiSvE=; b=GYIkQeqj8lFnx6hcXmZT/v0jZxhEkeyoHw4+PGJrUHy05oowfN4uzJMDNIYts8HpAL 4f2shsRFo5h+Rd+k02Hg5Do/K2sWpdyYLNqfYfldzOxRyTTD0rVkEpZAi/YaHEqMm6Ra OBSoGbT0g/j9IlkKpTihx3qMwLD5hJZ7XtnM2FXW3PMcYKcdvCYD7IwCAF8hymdTQ4sJ S+r6KsE9NF+pPRKVJV8ebZWQk4iGU8VTUl10aGopiK7Q/6yIhGF9X1sWo+ZJT84JGU1g zPOrKU2aACKXqnl92bM+/AGh/FVCu3piwP3EPRATJLhpT0yWKlrsJw+NoFJxqPg7wjzG riQw== X-Forwarded-Encrypted: i=1; AJvYcCU9LLr2EQlsHY8be5GIw9oOphouwsFVauYZt+OlhfkJ55cXfbmCrG7bxihliqhwQ5mrPd5ZT4l8O5xajfgso2IX9mg= X-Gm-Message-State: AOJu0YyAdwe/Cg+4gr2DO2e9Jw0Sc9j+sMDj4KZ7XFgxKDXzRxQbqUtH Ie8G1Od8T/sTCaoj6qiIY6WrOUoePsMPt0IhMNZCuHy9m2XXweJCnJsWHP3HoKPQwOm6FYbHD+h NflpyJ/FWwbAJAQ43GAk/+Fp3eSxYuYUxS1IgHX2PeMq/aPYL X-Received: by 2002:a05:6a00:178e:b0:6e6:5574:17fe with SMTP id s14-20020a056a00178e00b006e6557417femr715258pfg.2.1709771010963; Wed, 06 Mar 2024 16:23:30 -0800 (PST) X-Google-Smtp-Source: AGHT+IE29oJb8+FlW3bA6qi0i5kybdj0/VJbREt+Tai6bog6Vub1H9UY05blOcWw9vPqOf2NBpHyeQ== X-Received: by 2002:a05:6a00:178e:b0:6e6:5574:17fe with SMTP id s14-20020a056a00178e00b006e6557417femr715237pfg.2.1709771010542; Wed, 06 Mar 2024 16:23:30 -0800 (PST) Received: from x1n ([43.228.180.230]) by smtp.gmail.com with ESMTPSA id j1-20020a62b601000000b006e64370ace9sm3065918pff.195.2024.03.06.16.23.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Mar 2024 16:23:30 -0800 (PST) Date: Thu, 7 Mar 2024 08:23:23 +0800 From: Peter Xu To: James Houghton Cc: Axel Rasmussen , Andrew Morton , Muchun Song , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] mm: Add an explicit smp_wmb() to UFFDIO_CONTINUE Message-ID: References: <20240306001511.932348-1-jthoughton@google.com> MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 3ACA0C001B X-Stat-Signature: dxfam1rf3x4bn1czydecq8rz8881gdj1 X-Rspam-User: X-HE-Tag: 1709771014-899190 X-HE-Meta: U2FsdGVkX18Yi+Fbr3/t+v+MjqCPZGzWj9s9XvHIeiWwNHrBbxYRe9dAS+wn4hX8cHfPhA8fHq1OERE46mkeEPgGstla7B2cp7SAUAnnbEU0KAjb1axFOt402xqwqwT0uy6QC2uBTC1xn2dZgN3KlEWOcqVAyUy+4cnVyNWTRtT7+vpo72mCz+Axv9bmWTV5CShBe8DStmD4wM6XnuSIA6CyPo9DJNB7KeCpMGh3cBFqlVBBwbFH2SahjX9zHaHxRvU7cYOj/Hk2SfcAD8BNCQHZVp94Z5drZ7KnnOWww+68qu06oGAcYNKYTSt4GqvqYgTsU2IaL9KOjS51X2OUmgWVz/KRsDepmNGzvUaGzZpLIuautyapYZ4HyUXWqI0a745egcNVSD8ZrMIoIz9rfmAppNaOJNHUBaRc7wVeo8SeOty7GkCZ79YQXT885Qxc5PTyKfUdq67O+I3b8tFTy6GNrIDHbun0UXNF/xEhHlflk7duMf63KyXdeMJbTc2XXS/c+FMPuyT+ovY27lsGBuzBnuCqiH1tho5PMQAeJwPMWy1MdFrthMpub4unpQcDF7ZB6vgSzjT+WA6U7hDFGT44+NwOxEUq4t/KYy+XMYtxlQMxa0tgMomXFHshoKeS2vlcXDuiiCEI2iYkh9RzAmthLtbfbyoKB8IBNRyVd+cUjYZrqwiqKS5nz6b4KYnpk/bJi01K3G3eZp6UlsbjNq4xnySD6OvIHS/w0uhyCARlIiOGbgEyNmigYVX7JH0jMaVzWlCVdu/QsZwxAqq9UHDOTrG6FYTiCHoGSudsL3uU6ph5lyU19urf4OaqX9tdve9ckkrFLIZ44z16wzXISXYIEJAW5uuvd2INuNl7oK+oQkrnsagJsxnJQ4r275nt/1Uyo5trZe8EC17uSXEqCa50nJ5c6YK1LQ77VgHdhCpsbsyYipcdvQbuUsy1Xom6RURFtTuraGiZPw88+Bj B8GlspST 0NvS+Eyq1K1sD74aM2i/Tu5LNwEYFn3lctXq0+cGelvMEA/UGw/ndJAHnbEVI7y+itn1wnKTJdCmLyaT8dKl4quTpADCfA50kqe5LfjCI9U1zJ0dSyhx7FV3xe5bBkvmAk/ko/reqkRsJs++rlFT8tH+6OuRdY4YVnjelzHK/nyYUXJlwX0OnD3q0ntiM4Zsie/tvXno0KDJw6GNeQ0MAye0aISK4oWhIv6bDEsrkFpisW0IMEHjbePGIlQj+C0oQ1jeYiwO3bWjvPRQcGUFhLME2s8kPqvBH7LUJ76qVF83OFmed+EvEqUzrqzm3G2jVeHXzaS3/FG1i/UKfyIdZTDRgmkw+65PqmE+2FLMXo5313Y8tFLXqv3Cn+Lqc5bE1mK6k9NWjUB2Qlc1/Kk85cAWSHK1t3jtMwXCc3uoLO0smcZ4CPzNwHOiu/NHFRYAVBhic5E52dzsCwi/AATqsoNN63U9C+jr8JtL3 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Mar 06, 2024 at 09:57:17AM -0800, James Houghton wrote: > On Tue, Mar 5, 2024 at 7:41 PM Peter Xu wrote: > > > > On Wed, Mar 06, 2024 at 12:15:10AM +0000, James Houghton wrote: > > > I've tried to see if I can legitimately get a user to read stale data, > > > and a few attempts with this test[2] have been unsuccessful. > > > > AFAICT that won't easily reproduce even if the problem existed, as we > > contain so many implict memory barriers here and there. E.g. right at the > > entry of ioctl(), mmget_not_zero() already contains a full ordering > > constraint: > > > > /** > > * atomic_inc_not_zero() - atomic increment unless zero with full ordering > > * @v: pointer to atomic_t > > Oh yes, of course. Thanks for pointing this out. So we definitely > don't need a Fixes. > > > > > I was expecting the syscall routine will guarantee an ordering already but > > indeed I can't find any. I also checked up Intel's spec and SYSCALL inst > > document only has one paragraph on ordering: > > > > Instruction ordering. Instructions following a SYSCALL may be > > fetched from memory before earlier instructions complete execution, > > but they will not execute (even speculatively) until all > > instructions prior to the SYSCALL have completed execution (the > > later instructions may execute before data stored by the earlier > > instructions have become globally visible). > > > > I guess it implies a hardware reordering is indeed possible in this case? > > That's my understanding as well. Let's also mention that in the commit message when repost? Just in case it'll answer other readers when they read this patch. > > > > > > > > > [1]: commit 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem") > > > [2]: https://gist.github.com/48ca/38d0665b0f1a6319a56507dc73a173f9 > > > > > > mm/hugetlb.c | 15 +++++++++------ > > > mm/userfaultfd.c | 18 ++++++++++++++++++ > > > 2 files changed, 27 insertions(+), 6 deletions(-) > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > index bb17e5c22759..533bf6b2d94d 100644 > > > --- a/mm/hugetlb.c > > > +++ b/mm/hugetlb.c > > > @@ -6779,12 +6779,15 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte, > > > } > > > } > > > > > > - /* > > > - * The memory barrier inside __folio_mark_uptodate makes sure that > > > - * preceding stores to the page contents become visible before > > > - * the set_pte_at() write. > > > - */ > > > - __folio_mark_uptodate(folio); > > > + if (!is_continue) { > > > + /* > > > + * The memory barrier inside __folio_mark_uptodate makes sure > > > + * that preceding stores to the page contents become visible > > > + * before the set_pte_at() write. > > > + */ > > > + __folio_mark_uptodate(folio); > > > > Can we move the comment above the "if", explaining both conditions? > > Yes, I'll do that. I think the explanation for > WARN_ON_ONCE(!folio_test_uptodate(folio)) is: > > We only need to `__folio_mark_uptodate(folio)` if we have > allocated a new folio, and HugeTLB pages will always be Uptodate if > they are in the pagecache. > > We could even drop the WARN_ON_ONCE. No strong opinions, keeping it still makes sense to me. Btw, it'll also be important to document "how is the ordering guaranteed for CONTINUE", then you can reference the new code you add, as readers can get confused on why CONTINUE doesn't need such ordering. Thanks, -- Peter Xu