Date: Tue, 3 Jun 2025 15:11:55 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Oscar Salvador, Andrew Morton, Muchun Song, James Houghton, Gavin Guo, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/3] mm, hugetlb: Clean up locking in hugetlb_fault and hugetlb_wp
References: <20250602141610.173698-1-osalvador@suse.de> <20250602141610.173698-2-osalvador@suse.de> <1602a87b-b1bc-4b53-abe7-dce8adddbe46@redhat.com>
In-Reply-To: <1602a87b-b1bc-4b53-abe7-dce8adddbe46@redhat.com>

On Tue, Jun 03, 2025 at 07:19:13PM +0200, David Hildenbrand wrote:
> > > As stated elsewhere, the mapcount check + folio_move_anon_rmap need the
> > > folio lock.
> >
> > Could you elaborate what would go wrong if we do folio_move_anon_rmap()
> > without folio lock here?  Just to make sure we're on the same page: we
> > already have pgtable lock held, and we decided to reuse an anonymous
> > hugetlb page.
>
> For now we have
>
> VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
>
> right at the beginning of folio_move_anon_rmap().
>
> That dates back to
>
> commit c44b674323f4a2480dbeb65d4b487fa5f06f49e0
> Author: Rik van Riel
> Date:   Fri Mar 5 13:42:09 2010 -0800
>
>     rmap: move exclusively owned pages to own anon_vma in do_wp_page()
>     When the parent process breaks the COW on a page, both the original which
>     is mapped at child and the new page which is mapped parent end up in that
>     same anon_vma.  Generally this won't be a problem, but for some workloads
>     it could preserve the O(N) rmap scanning complexity.
>     A simple fix is to ensure that, when a page which is mapped child gets
>     reused in do_wp_page, because we already are the exclusive owner, the page
>     gets moved to our own exclusive child's anon_vma.
>
>
> My recollection is that the folio lock protects folio->mapping.
> So relevant rmap walks

Yes, I had a similar impression, but only for file folios, as discussed in
the comment in rmap_walk_file().  For anonymous folios it was never clear
to me, as at least the anon rmap walk doesn't seem to need the folio lock
in some special paths like damon/page_idle/folio_referenced. [1]

> that hold the folio lock can assume that folio->mapping and
> thereby folio_anon_vma() cannot change.
>
> folio_lock_anon_vma_read() documents something regarding the folio lock protecting the
> anon_vma.

Right, I remember that change, though I expected the comment to be
referring to the assert(locked) above.  Unfortunately we didn't get more
clues about the folio lock from it, even though the change itself makes
perfect sense regardless, to double check that the anon_vma didn't change
(after UFFDIO_MOVE).

>
> I can only speculate that locking the folio is cheaper than locking the relevant anon_vma, and
> that rmap code depends on that.

I see this as two separate things to protect: folio->mapping, and the
anon_vma tree inside of it.

For now it looks like we're "almost" using the folio lock to protect
folio->mapping for anon; however, we could still read folio->mapping
without the folio lock, as discussed above [1].  The commit below should
be another sign of that, where Alex mentioned the WRITE_ONCE needed for
page_idle.  IOW, even with the folio lock held, something can be reading
folio->mapping, and further walking the anon_vma..

I had a long-standing feeling that it works out only because anon_vma
updates can only happen within parent/child processes, so they're
internally holding the same anon_vma lock (anon_vma_trylock_read, taking
the root lock).  Hence even if a race happened it'll still take the same
lock.

I think it means that as long as we've decided to reuse an anon page
(hugetlb or not, as long as we hold the pgtable lock), updating
folio->mapping with/without the folio lock should keep working..

But maybe I missed something.
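To make the "same root lock" argument above concrete, here is a minimal userspace sketch.  It is a toy model, not kernel code: every toy_* name is hypothetical, the "lock" is just a reader counter, and relaxed C11 atomics are assumed as a rough stand-in for WRITE_ONCE()/READ_ONCE().  It only shows that a reader sampling the mapping before or after the move ends up on the same root lock when both anon_vmas share a root.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the kernel structures. */
struct toy_lock {
	atomic_int readers;		/* counts read-side acquisitions */
};

struct toy_anon_vma {
	struct toy_anon_vma *root;	/* parent and child share one root */
	struct toy_lock lock;		/* only the root's lock is ever taken */
};

struct toy_folio {
	_Atomic(struct toy_anon_vma *) mapping;
};

/* Writer: one relaxed store (assumed analogue of WRITE_ONCE()). */
static void toy_move_anon_rmap(struct toy_folio *folio,
			       struct toy_anon_vma *new_av)
{
	atomic_store_explicit(&folio->mapping, new_av, memory_order_relaxed);
}

/* Lockless reader: load the mapping once, then take the root's lock. */
static struct toy_lock *toy_lock_anon_vma_read(struct toy_folio *folio)
{
	struct toy_anon_vma *av =
		atomic_load_explicit(&folio->mapping, memory_order_relaxed);

	assert(av != NULL);
	atomic_fetch_add(&av->root->lock.readers, 1);
	return &av->root->lock;
}

static void toy_unlock_anon_vma_read(struct toy_lock *lock)
{
	atomic_fetch_sub(&lock->readers, 1);
}

/* Returns 1 if readers before and after the move pick the same root lock. */
static int toy_demo(void)
{
	struct toy_anon_vma parent = { .root = &parent };
	struct toy_anon_vma child = { .root = &parent };
	struct toy_folio folio;
	struct toy_lock *before, *after;

	atomic_init(&folio.mapping, &parent);

	before = toy_lock_anon_vma_read(&folio);
	toy_unlock_anon_vma_read(before);

	/* Reuse path: move the folio to the child's own anon_vma. */
	toy_move_anon_rmap(&folio, &child);

	after = toy_lock_anon_vma_read(&folio);
	toy_unlock_anon_vma_read(after);

	return before == after && before == &parent.lock;
}
```

Calling toy_demo() from a main() exercises the scenario.  The sketch says nothing about memory ordering on real hardware or about anon_vma lifetime; it only illustrates the lock-identity part of the argument.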
And it may not be extremely important so far either; taking the lock
doesn't seem to be bad anyway.  It's only some confusion I never figured
out myself.

Thanks,

>
>
> I'll note that in the introducing commit we didn't use the WRITE_ONCE, though. That was added in
>
> commit 16f5e707d6f6f7644ff07e583b8f18c3dcc5499f
> Author: Alex Shi
> Date:   Tue Dec 15 12:33:42 2020 -0800
>
>     mm/rmap: stop store reordering issue on page->mapping
>
> But I don't think that the folio lock was a replacement to that WRITE_ONCE.

-- 
Peter Xu