References: <20250520060504.20251-1-laoar.shao@gmail.com>
 <746e8123-2332-41c8-851b-787cb8c144a1@redhat.com>
 <849decad-ab38-4a1a-8532-f518a108d8c6@lucifer.local>
 <9b44fe43-155d-457d-81ce-a2c1fb86521a@redhat.com>
In-Reply-To: <9b44fe43-155d-457d-81ce-a2c1fb86521a@redhat.com>
From: Yafang Shao
Date: Wed, 21 May 2025 12:02:09 +0800
Subject: Re: [RFC PATCH v2 0/5] mm, bpf: BPF based THP adjustment
To: David Hildenbrand
Cc: Lorenzo Stoakes, akpm@linux-foundation.org, ziy@nvidia.com,
 baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com, npache@redhat.com,
 ryan.roberts@arm.com, dev.jain@arm.com, hannes@cmpxchg.org,
 usamaarif642@gmail.com, gutierrez.asier@huawei-partners.com,
 willy@infradead.org, ast@kernel.org, daniel@iogearbox.net,
 andrii@kernel.org, bpf@vger.kernel.org, linux-mm@kvack.org

On Tue, May 20, 2025 at 11:54 PM David Hildenbrand wrote:
>
> >> I totally agree with you that the key point here is how to define the
> >> API. As I replied to David, I believe we have two fundamental
> >> principles to adjust the THP policies:
> >> 1. Selective Benefit: Some tasks benefit from THP, while others do not.
> >> 2. Conditional Safety: THP allocation is safe under certain conditions
> >> but not others.
> >>
> >> Therefore, I believe we can define these APIs based on the established
> >> principles - everything else constitutes implementation details, even
> >> if core MM internals need to change.
> >
> > But if we're looking to make the concept of THP go away, we really need to
> > go further than this.
>
> Yeah. I might be wrong, but I also don't think doing control on a
> per-process level etc would be the right solution long-term.

The reality is that achieving truly 'automatic' THP behavior requires
process-level control. Given that THP provides no benefit for certain
workloads, there's no justification for incurring the overhead of
allocating higher-order pages in those cases.

>
> In a world where we do stuff automatically ("auto" mode), we would be
> much smarter about where to place a (m)THP, and which size we would use.

We still have considerable ground to cover before reaching this goal.

>
> One might use bpf to control the allocation policy. But I don't think
> this would be per-process or even per-VMA etc. Sure, we might give
> hints, but placement decisions should happen on another level (e.g.,
> during page faults, during khugepaged etc).

Nico has proposed introducing a new 'defer' mode to address this.
However, I argue that we could achieve the same functionality through
BPF instead of adding a dedicated policy mode. [0]

[0]. https://lore.kernel.org/linux-mm/CALOAHbAa7DY6+hO4RJtjg-MS+cnUmsiPXX8KS1MKSfgy6HLYAQ@mail.gmail.com/

>
> >
> > The second we have 'bpf program that figures out whether THP should be
> > used' we are permanently tied to the idea of THP on/off being a thing.
> >
> > I mean any future stuff that makes THP more automagic will probably involve
> > having new modes for the legacy THP
> > /sys/kernel/mm/transparent_hugepage/enabled and
> > /sys/kernel/mm/transparent_hugepage/hugepages-xxkB/enabled
>
> Yeah, the plan is to have "auto" in
> /sys/kernel/mm/transparent_hugepage/enabled and just have all other
> sizes "inherit" that option. And have a Kconfig that just enables that
> as default. Once we're there, just phase out the interface long-term.
>
> That's the plan. Now we "only" have to figure out how to make the
> placement actually better ;)
>
> >
> > But if people are super reliant on this stuff it's potentially really
> > limiting.
> >
> > I think you said in another post here that you were toying with the notion
> > of exposing somehow the madvise() interface and having that be the 'stable
> > API' of sorts?
> >
> > That definitely sounds more sensible than something that very explicitly
> > interacts with THP.
> >
> > Of course we have Usama's series and my proposed series for extending
> > process_madvise() along those lines also.
>
> Yes.
>

--
Regards
Yafang