From: Nico Pache
Date: Sat, 7 Jun 2025 06:55:57 -0600
Subject: Re: [PATCH v7 07/12] khugepaged: add mTHP support
To: Dev Jain
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 david@redhat.com, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com,
 corbet@lwn.net, rostedt@goodmis.org, mhiramat@kernel.org,
 mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
 baohua@kernel.org, willy@infradead.org, peterx@redhat.com,
 wangkefeng.wang@huawei.com, usamaarif642@gmail.com, sunnanyong@huawei.com,
 vishal.moola@gmail.com, thomas.hellstrom@linux.intel.com,
 yang@os.amperecomputing.com, kirill.shutemov@linux.intel.com,
 aarcange@redhat.com, raquini@redhat.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
 dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
 jglisse@google.com, surenb@google.com, zokeefe@google.com,
 hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
 rdunlap@infradead.org
References: <20250515032226.128900-1-npache@redhat.com>
 <20250515032226.128900-8-npache@redhat.com>
 <6f061c65-f3aa-42bb-ab70-b45afdcf2baf@arm.com>
In-Reply-To: <6f061c65-f3aa-42bb-ab70-b45afdcf2baf@arm.com>

On Sat, Jun 7, 2025 at 12:24 AM Dev Jain wrote:
>
>
> On 15/05/25 8:52 am, Nico Pache wrote:
> > Introduce the ability for khugepaged to collapse to different mTHP sizes.
> > While scanning PMD ranges for potential collapse candidates, keep track
> > of pages in KHUGEPAGED_MIN_MTHP_ORDER chunks via a bitmap.
> > Each bit represents a utilized region of order
> > KHUGEPAGED_MIN_MTHP_ORDER ptes. If mTHPs are enabled we remove the
> > restriction of max_ptes_none during the scan phase so we don't bail
> > out early and miss potential mTHP candidates.
> >
> > After the scan is complete we will perform binary recursion on the
> > bitmap to determine which mTHP size would be most efficient to collapse
> > to. max_ptes_none will be scaled by the attempted collapse order to
> > determine how full a THP must be to be eligible.
> >
> > If an mTHP collapse is attempted, but the range contains swapped-out or
> > shared pages, we don't perform the collapse.
> >
> > For non-PMD collapse we must leave the anon VMA write locked until after
> > we collapse the mTHP
>
> Why? I know that Hugh pointed out locking errors; I am yet to catch up
> on that thread, but you need to explain in the description why you do
> what you do.

I will add a better description in the next version. The reasoning is
that in the PMD case all the pages are isolated, but in the non-PMD
case this is not true, so we must keep the lock held to prevent the
changes that could occur if we dropped it. Another potential solution
is to isolate all the pages in the PMD, then undo it after we collapse
the mTHP.
-- Nico
>
> [--snip---]
>
> >
> > -
> > -       spin_lock(pmd_ptl);
> > -       BUG_ON(!pmd_none(*pmd));
> > -       folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
> > -       folio_add_lru_vma(folio, vma);
> > -       pgtable_trans_huge_deposit(mm, pmd, pgtable);
> > -       set_pmd_at(mm, address, pmd, _pmd);
> > -       update_mmu_cache_pmd(vma, address, pmd);
> > -       deferred_split_folio(folio, false);
> > -       spin_unlock(pmd_ptl);
> > +       if (order == HPAGE_PMD_ORDER) {
> > +               pgtable = pmd_pgtable(_pmd);
> > +               _pmd = folio_mk_pmd(folio, vma->vm_page_prot);
> > +               _pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma);
> > +
> > +               spin_lock(pmd_ptl);
> > +               BUG_ON(!pmd_none(*pmd));
> > +               folio_add_new_anon_rmap(folio, vma, _address, RMAP_EXCLUSIVE);
> > +               folio_add_lru_vma(folio, vma);
> > +               pgtable_trans_huge_deposit(mm, pmd, pgtable);
> > +               set_pmd_at(mm, address, pmd, _pmd);
> > +               update_mmu_cache_pmd(vma, address, pmd);
> > +               deferred_split_folio(folio, false);
> > +               spin_unlock(pmd_ptl);
> > +       } else { /* mTHP collapse */
> > +               mthp_pte = mk_pte(&folio->page, vma->vm_page_prot);
> > +               mthp_pte = maybe_mkwrite(pte_mkdirty(mthp_pte), vma);
> > +
> > +               spin_lock(pmd_ptl);
>
> Nico,
>
> I've noticed a few occasions where my review comments have not been acknowledged -
> for example, [1]. It makes it difficult to follow up and contributes to some
> frustration on my end. I'd appreciate if you could make sure to respond to
> feedback, even if you are disagreeing with my comments. Thanks!
>
>
> [1] https://lore.kernel.org/all/08d13445-5ed1-42ea-8aee-c1dbde24407e@arm.com/
>
>
> [---snip---]
>
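
For readers following the quoted patch description, the selection scheme it
outlines (a bitmap with one bit per KHUGEPAGED_MIN_MTHP_ORDER chunk, binary
recursion over that bitmap, and max_ptes_none scaled by the attempted order)
could be sketched in user-space C roughly as below. This is not the patch's
code. The constants (PMD order 9, minimum order 2), the helper names
scaled_max_ptes_none() and pick_orders(), the byte-per-chunk "bitmap", and the
choice to scale by a right shift of (PMD order - attempted order) are all
assumptions made for illustration.

```c
/* Assumed values for a 4 KiB page, 2 MiB PMD configuration. */
#define PMD_ORDER      9   /* 512 PTEs per PMD range */
#define MIN_MTHP_ORDER 2   /* one chunk covers 4 PTEs */
#define NR_CHUNKS      (1 << (PMD_ORDER - MIN_MTHP_ORDER))

/* One selected collapse: its order and the PTE offset inside the PMD range. */
struct collapse {
    int order;
    int pte_offset;
};

/*
 * Hypothetical scaling rule: shrink the PMD-level max_ptes_none in
 * proportion to the size of the attempted collapse.
 */
unsigned int scaled_max_ptes_none(unsigned int max_ptes_none, int order)
{
    return max_ptes_none >> (PMD_ORDER - order);
}

/*
 * Hypothetical binary recursion. chunk_used[] holds one byte per chunk
 * (non-zero means utilized), standing in for a real bitmap. Try the
 * current order first; if the region has too many empty PTEs, split it
 * in half and retry each half one order lower. Appends results to out[]
 * (capacity NR_CHUNKS) starting at index n and returns the new count.
 */
int pick_orders(const unsigned char *chunk_used, int first_chunk, int order,
                unsigned int max_ptes_none, struct collapse *out, int n)
{
    int nr_chunks = 1 << (order - MIN_MTHP_ORDER);
    int used = 0;

    for (int i = 0; i < nr_chunks; i++)
        used += chunk_used[first_chunk + i] != 0;

    /* Simplification: a used chunk is treated as fully populated. */
    int none_ptes = (nr_chunks - used) << MIN_MTHP_ORDER;

    if (used > 0 &&
        (unsigned int)none_ptes <= scaled_max_ptes_none(max_ptes_none, order)) {
        out[n].order = order;
        out[n].pte_offset = first_chunk << MIN_MTHP_ORDER;
        return n + 1;
    }

    if (order == MIN_MTHP_ORDER)
        return n;   /* smallest order reached, nothing to collapse here */

    n = pick_orders(chunk_used, first_chunk, order - 1,
                    max_ptes_none, out, n);
    return pick_orders(chunk_used, first_chunk + nr_chunks / 2, order - 1,
                       max_ptes_none, out, n);
}
```

Under these assumptions, with max_ptes_none = 0 and chunks 0-3 plus chunk 6
marked used, the sketch selects an order-4 collapse at PTE offset 0 and an
order-2 collapse at PTE offset 24. With max_ptes_none = 511 the same input
yields a single order-9 (PMD) collapse.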