From: Yafang Shao
To: akpm@linux-foundation.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org
Cc: bpf@vger.kernel.org, linux-mm@kvack.org, Yafang Shao
Subject: [RFC PATCH 0/4] mm, bpf: BPF based THP adjustment
Date: Tue, 29 Apr 2025 10:41:35 +0800
Message-Id: <20250429024139.34365-1-laoar.shao@gmail.com>

In our container environment, we aim to enable THP selectively, allowing
specific services to use it while restricting others. This approach is
driven by the following considerations:

1. Memory Fragmentation
   THP can lead to increased memory fragmentation, so we want to limit its
   use across services.
2. Performance Impact
   Some services see no benefit from THP, making its usage unnecessary.
3. Performance Gains
   Certain workloads, such as machine learning services, experience
   significant performance improvements with THP, so we enable it for them
   specifically.

Since multiple services run on a single host in a containerized
environment, enabling THP globally is not ideal. Previously, we set THP to
madvise, allowing selected services to opt in via MADV_HUGEPAGE. However,
this approach had a limitation:

- Some services inadvertently used madvise(MADV_HUGEPAGE) through
  third-party libraries, bypassing our restrictions.

To address this issue, we initially hooked the __x64_sys_madvise() syscall,
which is error-injectable, to blacklist unwanted services. While this
worked, it was error-prone and ineffective for services needing always
mode, as modifying their code to use madvise was impractical.

To achieve finer-grained control, we introduced an fmod_ret-based solution.
Now, we dynamically adjust THP settings per service by hooking
hugepage_global_{enabled,always}() via BPF. This allows us to enable or
disable THP on a per-service basis without global impact.

The hugepage_global_{enabled,always}() functions currently share the same
BPF hook, which limits THP configuration to either always or never. While
this suffices for our specific use cases, full support for all three modes
(always, madvise, and never) would require splitting them into separate
hooks.

This is the initial RFC patch; feedback is welcome!
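
For readers less familiar with the madvise mode mentioned above, the opt-in
that a service (or a third-party library on its behalf) performs looks
roughly like the sketch below. It is illustrative only and not part of this
series: alloc_thp_hint() and REGION_SIZE are hypothetical names, and the
4 MiB size is an assumption chosen to span at least one 2 MiB huge page on
x86-64.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical size: 4 MiB, large enough to hold a 2 MiB huge page. */
#define REGION_SIZE (4UL << 20)

/*
 * Map an anonymous region and hint that it should be backed by THP.
 * Returns the mapping, or NULL if mmap() fails. *advised is set to 1
 * when madvise(MADV_HUGEPAGE) succeeded and 0 otherwise; a failed hint
 * is not fatal, the region is simply served with base pages.
 */
static void *alloc_thp_hint(size_t len, int *advised)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;

	*advised = (madvise(p, len, MADV_HUGEPAGE) == 0);
	return p;
}
```

A caller would use alloc_thp_hint(REGION_SIZE, &advised) and release the
region with munmap(). Because any library linked into a service can issue
the same madvise() call, the madvise mode alone cannot keep a service from
using THP, which is the limitation described above.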

Yafang Shao (4):
  mm: move hugepage_global_{enabled,always}() to internal.h
  mm: pass VMA parameter to hugepage_global_{enabled,always}()
  mm: add BPF hook for THP adjustment
  selftests/bpf: Add selftest for THP adjustment

 include/linux/huge_mm.h                       |  54 +-----
 mm/Makefile                                   |   3 +
 mm/bpf.c                                      |  36 ++++
 mm/bpf.h                                      |  21 +++
 mm/huge_memory.c                              |  50 ++++-
 mm/internal.h                                 |  21 +++
 mm/khugepaged.c                               |  18 +-
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/thp_adjust.c     | 176 ++++++++++++++++++
 .../selftests/bpf/progs/test_thp_adjust.c     |  32 ++++
 10 files changed, 344 insertions(+), 68 deletions(-)
 create mode 100644 mm/bpf.c
 create mode 100644 mm/bpf.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/thp_adjust.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_thp_adjust.c

-- 
2.43.5