From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yang Shi
Date: Tue, 1 Jul 2025 08:40:09 -0700
Subject: Re: [PATCH] mm: limit THP alignment – performance gain observed in AI inference workloads
To: siddhartha@kenip.in
Cc: Dev Jain, Lorenzo Stoakes, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, mgorman@suse.de, Vlastimil Babka,
 Rik van Riel
In-Reply-To: <787639a1e6a27c0f3b0e3ae658e1b8e7@kenip.in>
References: <4990838b-660d-46a2-b21c-67adcba61ff9@lucifer.local>
 <19714cae-6b73-43ec-af7a-1455196561d1@arm.com>
 <3ee2e7fea6f263aa884e3e715632b09f@kenip.in>
 <5816677a-705e-4a8f-b598-d74ff6198a02@arm.com>
 <80b849d4-faf3-47a9-8b8c-e8053299cfb2@arm.com>
 <2e99712b-8dac-4762-9fc5-fe3ef569b65e@lucifer.local>
 <787639a1e6a27c0f3b0e3ae658e1b8e7@kenip.in>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

>
> 🤖 3. How does this impact AI workloads like Hugging Face Transformers?
> Tokenization and dynamic batching create non-deterministic memory
> allocation patterns:
>
> Models like BERT and T5 dynamically allocate intermediate buffers per
> token-length, batch size, and attention window.
>
> Hugging Face + ONNX Runtime uses multiple small-ish anonymous mmap()s,
> often 512KB–1.8MB.

If I remember correctly, Rik's patch should just force PMD alignment
when the allocation size is greater than PMD size. Such VMA
fragmentation should be caused by allocations greater than 2M but not
PMD aligned, so they create 2M PMD + a bunch of 4K PTEs. Less than 2M
allocations should be right next to each other and mergeable. Did I
miss something?

Thanks,
Yang

>
> These allocations come in bursts -- but due to forced alignment, the
> kernel was placing them with artificial gaps, defeating THP eligibility
> entirely.
>
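
To make the arithmetic behind the reply above concrete, here is a minimal
userspace sketch in C. It is not kernel code. It assumes 4K base pages and
2M PMD mappings, and it models the alignment rule only as recalled in the
reply (mappings of at least PMD size get PMD alignment; how the exact-2M
boundary case is treated is an assumption). The names wants_pmd_alignment()
and split_mapping() are hypothetical helpers introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes for illustration: 4K base pages, 2M PMD mappings. */
#define PAGE_SIZE_BYTES (4UL * 1024)
#define PMD_SIZE_BYTES  (2UL * 1024 * 1024)

/* How a mapping that starts on a PMD boundary could be covered. */
struct coverage {
	size_t pmds;	/* number of full 2M PMD mappings */
	size_t ptes;	/* number of trailing 4K PTE mappings */
};

/*
 * Hypothetical helper: the rule as recalled in the reply, i.e. only
 * mappings of at least PMD size are PMD aligned. Treating exactly 2M
 * as "large" is an assumption of this sketch.
 */
static bool wants_pmd_alignment(size_t len)
{
	return len >= PMD_SIZE_BYTES;
}

/*
 * Hypothetical helper: split a PMD-aligned mapping of 'len' bytes into
 * full PMD-sized chunks plus the 4K pages needed for the remainder
 * (rounded up to whole pages).
 */
static struct coverage split_mapping(size_t len)
{
	struct coverage c;

	c.pmds = len / PMD_SIZE_BYTES;
	c.ptes = (len % PMD_SIZE_BYTES + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
	return c;
}
```

Under these assumptions a 3M mapping splits into one 2M PMD plus 256 4K
PTEs, which is the "2M PMD + a bunch of 4K PTEs" case described above,
while a 512KB mapping falls below the threshold, so
wants_pmd_alignment() returns false and no alignment gap is forced.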