References: <20250515033857.132535-1-npache@redhat.com>
From: Yafang Shao
Date: Wed, 21 May 2025 19:35:35 +0800
Subject: Re: [PATCH v6 0/4] mm: introduce THP deferred setting
To: Nico Pache
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 rientjes@google.com, hannes@cmpxchg.org, lorenzo.stoakes@oracle.com,
 rdunlap@infradead.org, mhocko@suse.com, Liam.Howlett@oracle.com,
 zokeefe@google.com, surenb@google.com, jglisse@google.com, cl@gentwo.org,
 jack@suse.cz, dave.hansen@linux.intel.com, will@kernel.org, tiwai@suse.de,
 catalin.marinas@arm.com, anshuman.khandual@arm.com, dev.jain@arm.com,
 raquini@redhat.com, aarcange@redhat.com, kirill.shutemov@linux.intel.com,
 yang@os.amperecomputing.com, thomas.hellstrom@linux.intel.com,
 vishal.moola@gmail.com, sunnanyong@huawei.com, usamaarif642@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, shuah@kernel.org,
 peterx@redhat.com, willy@infradead.org, ryan.roberts@arm.com,
 baolin.wang@linux.alibaba.com, baohua@kernel.org, david@redhat.com,
 mathieu.desnoyers@efficios.com, mhiramat@kernel.org, rostedt@goodmis.org,
 corbet@lwn.net, akpm@linux-foundation.org

On Wed, May 21, 2025 at 6:19 PM Nico Pache wrote:
>
> On Tue, May 20, 2025 at 3:25 AM Yafang Shao wrote:
> >
> > On Thu, May 15, 2025 at 11:41 AM Nico Pache wrote:
> > >
> > > This series is a follow-up to [1], which adds mTHP support to khugepaged.
> > > mTHP khugepaged support is a "loose" dependency for the sysfs/sysctl
> > > configs to make sense. Without it, the global="defer" and mTHP="inherit"
> > > case is "undefined" behavior.
> > >
> > > We've seen cases where customers switching from RHEL7 to RHEL8 see a
> > > significant increase in the memory footprint for the same workloads.
> > >
> > > Through our investigations we found that a large contributing factor to
> > > the increase in RSS was an increase in THP usage.
> > >
> > > For workloads like MySQL, or when using allocators like jemalloc, it is
> > > often recommended to set /transparent_hugepages/enabled=never. This is
> > > in part due to performance degradations and increased memory waste.
> > >
> > > This series introduces enabled=defer; this setting acts as a middle
> > > ground between always and madvise. If the mapping is MADV_HUGEPAGE, the
> > > page fault handler will act normally, making a hugepage if possible. If
> > > the allocation is not MADV_HUGEPAGE, then the page fault handler will
> > > default to the base size allocation. The caveat is that khugepaged can
> > > still operate on pages that are not MADV_HUGEPAGE.
> > >
> > > This allows for three things... one, applications specifically designed to
> > > use hugepages will get them, and two, applications that don't use
> > > hugepages can still benefit from them without aggressively inserting
> > > THPs at every possible chance. This curbs the memory waste, and defers
> > > the use of hugepages to khugepaged. Khugepaged can then scan the memory
> > > for eligible collapsing. Lastly there is the added benefit for those who
> > > want THPs but experience higher latency PFs. Now you can get base page
> > > performance at the PF handler and Hugepage performance for those mappings
> > > after they collapse.
> > >
> > > Admins may want to lower max_ptes_none; if not, khugepaged may
> > > aggressively collapse single allocations into hugepages.
> > >
> > > TESTING:
> > > - Built for x86_64, aarch64, ppc64le, and s390x
> > > - selftests mm
> > > - In [1] I provided a script [2] that has multiple access patterns
> > > - lots of general use.
> > > - redis testing. This test was my original case for the defer mode. What I
> > >    was able to prove was that THP=always leads to increased max_latency
> > >    cases; hence why it is recommended to disable THPs for redis servers.
> > >    However with 'defer' we don't have the max_latency spikes and can still
> > >    get the system to utilize THPs. I further tested this with the mTHP
> > >    defer setting and found that redis (and probably other jemalloc users)
> > >    can utilize THPs via defer (+mTHP defer) without a large latency
> > >    penalty and some potential gains. I uploaded some mmtest results
> > >    here[3] which compares:
> > >        stock+thp=never
> > >        stock+(m)thp=always
> > >        khugepaged-mthp + defer (max_ptes_none=64)
> > >
> > >   The results show that (m)THPs can cause some throughput regression in
> > >   some cases, but also have gains in other cases. The mTHP+defer results
> > >   have more gains and fewer losses over the (m)THP=always case.
> > >
> > > V6 Changes:
> > > - nits
> > > - rebased dependent series and added review tags
> > >
> > > V5 Changes:
> > > - rebased dependent series
> > > - added reviewed-by tag on 2/4
> > >
> > > V4 Changes:
> > > - Minor Documentation fixes
> > > - rebased the dependent series [1] onto mm-unstable
> > >     commit 0e68b850b1d3 ("vmalloc: use atomic_long_add_return_relaxed()")
> > >
> > > V3 Changes:
> > > - Combined the documentation commits into one, and moved a section to the
> > >   khugepaged mthp patchset
> > >
> > > V2 Changes:
> > > - base changes on mTHP khugepaged support
> > > - Fix selftests parsing issue
> > > - add mTHP defer option
> > > - add mTHP defer Documentation
> > >
> > > [1] - https://lore.kernel.org/all/20250515032226.128900-1-npache@redhat.com/
> > > [2] - https://gitlab.com/npache/khugepaged_mthp_test
> > > [3] - https://people.redhat.com/npache/mthp_khugepaged_defer/testoutput2/output.html
> > >
> > > Nico Pache (4):
> > >   mm: defer THP insertion to khugepaged
> > >   mm: document (m)THP defer usage
> > >   khugepaged: add defer option to mTHP options
> > >   selftests: mm: add defer to thp setting parser
> > >
> > >  Documentation/admin-guide/mm/transhuge.rst | 31 +++++++---
> > >  include/linux/huge_mm.h                    | 18 +++++-
> > >  mm/huge_memory.c                           | 69 +++++++++++++++++++---
> > >  mm/khugepaged.c                            |  8 +--
> > >  tools/testing/selftests/mm/thp_settings.c  |  1 +
> > >  tools/testing/selftests/mm/thp_settings.h  |  1 +
> > >  6 files changed, 106 insertions(+), 22 deletions(-)
> > >
> > > --
> > > 2.49.0
> > >
> > >
> >
> > Hello Nico,
> >
> > Upon reviewing the series, it occurred to me that BPF could solve this
> > more cleanly. Adding a 'tva_flags' parameter to the BPF hook would
> > handle this case and future scenarios without requiring new modes. The
> > BPF mode could then serve as a unified solution.
> Hi Yafang,
>
> I don't see how this is the case? This would require users to
> modify/add functionality rather than configuring the system in this
> manner. What if BPF is not configured or being used? Having to use an
> additional technology that requires precise configuration doesn't seem
> cleaner.

The core challenge remains: while certain tasks benefit from this new
mode, others see no improvement, or may even regress. For that reason,
implementing it globally seems unwise; per-task control would be far
more effective.

-- 
Regards
Yafang