From: Jann Horn
Date: Wed, 30 Apr 2025 22:51:46 +0200
Subject: Re: [PATCH v5 07/12] khugepaged: add mTHP support
To: Nico Pache
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	akpm@linux-foundation.org, corbet@lwn.net, rostedt@goodmis.org,
	mhiramat@kernel.org, mathieu.desnoyers@efficios.com, david@redhat.com,
	baohua@kernel.org, baolin.wang@linux.alibaba.com, ryan.roberts@arm.com,
	willy@infradead.org, peterx@redhat.com, ziy@nvidia.com,
	wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
	sunnanyong@huawei.com, vishal.moola@gmail.com,
	thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
	kirill.shutemov@linux.intel.com, aarcange@redhat.com,
	raquini@redhat.com, dev.jain@arm.com, anshuman.khandual@arm.com,
	catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
	dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
	jglisse@google.com, surenb@google.com, zokeefe@google.com,
	hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
	rdunlap@infradead.org, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com
References: <20250428181218.85925-1-npache@redhat.com>
	<20250428181218.85925-8-npache@redhat.com>
In-Reply-To: <20250428181218.85925-8-npache@redhat.com>

On Mon, Apr 28, 2025 at 8:12 PM Nico Pache wrote:
> Introduce the ability for khugepaged to collapse to different mTHP sizes.
> While scanning PMD ranges for potential collapse candidates, keep track
> of pages in KHUGEPAGED_MIN_MTHP_ORDER chunks via a bitmap. Each bit
> represents a utilized region of order KHUGEPAGED_MIN_MTHP_ORDER ptes. If
> mTHPs are enabled we remove the restriction of max_ptes_none during the
> scan phase so we dont bailout early and miss potential mTHP candidates.
>
> After the scan is complete we will perform binary recursion on the
> bitmap to determine which mTHP size would be most efficient to collapse
> to. max_ptes_none will be scaled by the attempted collapse order to
> determine how full a THP must be to be eligible.
>
> If a mTHP collapse is attempted, but contains swapped out, or shared
> pages, we dont perform the collapse.
[...]
> @@ -1208,11 +1211,12 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>  	vma_start_write(vma);
>  	anon_vma_lock_write(vma->anon_vma);
>
> -	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, address,
> -				address + HPAGE_PMD_SIZE);
> +	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, _address,
> +				_address + (PAGE_SIZE << order));
>  	mmu_notifier_invalidate_range_start(&range);
>
>  	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
> +
>  	/*
>  	 * This removes any huge TLB entry from the CPU so we won't allow
>  	 * huge and small TLB entries for the same virtual address to

It's not visible in this diff, but we're about to do a
pmdp_collapse_flush() here. pmdp_collapse_flush() tears down the
entire page table, meaning it tears down 2MiB of address space; and it
assumes that the entire page table exclusively corresponds to the
current VMA.

I think you'll need to ensure that the pmdp_collapse_flush() only
happens for full-size THP, and that mTHP only tears down individual
PTEs in the relevant range. (That code might get a bit messy, since
the existing THP code tears down PTEs in a detached page table, while
mTHP would have to do it in a still-attached page table.)
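
To make the size mismatch concrete, here is a user-space toy model (not
kernel code). It assumes 4KiB base pages and 512 PTEs per page table, as
on x86-64, and every toy_* name is hypothetical and exists only for this
illustration. It contrasts clearing a whole PTE table, which is the
granularity described above for pmdp_collapse_flush(), with clearing only
the 1 << order entries that an mTHP collapse actually covers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: 4KiB pages, 512 PTEs per table (x86-64). */
#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_SIZE    (1UL << TOY_PAGE_SHIFT)
#define TOY_PTRS_PER_PTE 512
#define TOY_PMD_ORDER    9	/* 512 pages == 2MiB */

/* Hypothetical stand-in for one PTE table; 0 means "no mapping". */
struct toy_pte_table {
	uint64_t pte[TOY_PTRS_PER_PTE];
};

/* Bytes covered by a collapse of the given order. */
static unsigned long toy_collapse_size(unsigned int order)
{
	return TOY_PAGE_SIZE << order;
}

/* Number of populated entries in the table. */
static size_t toy_count_present(const struct toy_pte_table *pt)
{
	size_t present = 0;

	for (size_t i = 0; i < TOY_PTRS_PER_PTE; i++)
		if (pt->pte[i])
			present++;
	return present;
}

/* PMD-granular teardown: every entry goes, whatever order was requested. */
static size_t toy_teardown_whole_table(struct toy_pte_table *pt)
{
	size_t cleared = 0;

	for (size_t i = 0; i < TOY_PTRS_PER_PTE; i++) {
		if (pt->pte[i]) {
			pt->pte[i] = 0;
			cleared++;
		}
	}
	return cleared;
}

/* Range teardown: only the naturally aligned 1 << order entries. */
static size_t toy_teardown_range(struct toy_pte_table *pt, size_t first_idx,
				 unsigned int order)
{
	size_t nr = (size_t)1 << order;
	size_t cleared = 0;

	assert(order <= TOY_PMD_ORDER);
	assert(first_idx % nr == 0);
	assert(first_idx + nr <= TOY_PTRS_PER_PTE);

	for (size_t i = first_idx; i < first_idx + nr; i++) {
		if (pt->pte[i]) {
			pt->pte[i] = 0;
			cleared++;
		}
	}
	return cleared;
}
```

With a fully populated table, toy_teardown_range(&pt, 32, 4) clears 16
entries (64KiB) and leaves the other 496 mapped, whereas
toy_teardown_whole_table() clears everything that is left, regardless of
the order that was asked for. The model says nothing about locking or
TLB flushing, which is where the attached-versus-detached page table
difference would matter in the real code.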