Message-ID: <52024f6e-22e5-495d-ae15-683a50b2ac49@linux.alibaba.com>
Date: Wed, 11 Jun 2025 14:59:09 +0800
Subject: Re: [PATCH v2 1/2] mm: huge_memory: disallow hugepages if the system-wide THP sysfs settings are disabled
From: Baolin Wang
To: Lorenzo Stoakes
Cc: akpm@linux-foundation.org, hughd@google.com, david@redhat.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, ziy@nvidia.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <8eefb0809c598fadaa4a022634fba5689a4f3257.1749109709.git.baolin.wang@linux.alibaba.com> <998a069c-9be5-4a10-888c-ba8269eaa333@lucifer.local>

On 2025/6/9 23:17, Lorenzo Stoakes wrote:
> On Mon, Jun 09, 2025 at 02:10:12PM +0800, Baolin Wang wrote:
>>
>>
>> On 2025/6/7 19:55, Lorenzo Stoakes wrote:
>>> Not related to your patch at all, but man this whole thing (thp allowed orders)
>>> needs significant improvement, it seems always perversely complicated for a
>>> relatively simple operation.
>>>
>>> Overall I LOVE what you're doing here, but I feel we can clarify things a
>>> little while we're at it to make it clear exactly what we're doing.
>>>
>>> This is a very important change so forgive my fiddling about here but I'm
>>> hoping we can take the opportunity to make things a little simpler!
>>>
>>> On Thu, Jun 05, 2025 at 04:00:58PM +0800, Baolin Wang wrote:
>>>> The MADV_COLLAPSE will ignore the system-wide Anon THP sysfs settings, which
>>>> means that even though we have disabled the Anon THP configuration, MADV_COLLAPSE
>>>> will still attempt to collapse into an Anon THP. This violates the rule we have
>>>> agreed upon: never means never.
>>>>
>>>> Another rule for madvise, referring to David's suggestion: “allowing for collapsing
>>>> in a VM without VM_HUGEPAGE in the "madvise" mode would be fine”.
>>>
>>> I'm generally not sure it's worth talking only about MADV_COLLAPSE here when
>>> you're changing what THP is permitted across the board, I may have missed some
>>> discussion and forgive me if so, but what is special about MADV_COLLAPSE's use
>>> of thp_vma_allowable_orders() that makes it ignore 'never's more so than other
>>> users?
>>
>> We found that MADV_COLLAPSE ignores the THP configuration, meaning that even
>> when THP is set to 'never', MADV_COLLAPSE can still collapse into THPs (and
>> mTHPs in the future). This is because when MADV_COLLAPSE calls
>> thp_vma_allowable_orders(), it does not set the TVA_ENFORCE_SYSFS flag,
>> which means it ignores the system-wide Anon THP sysfs settings.
>>
>> So this patch set aims to fix the THP policy for MADV_COLLAPSE.
>>
>
> Yeah of course, and this is exactly why, but what I mean is, the patch
> doesn't explicitly address MADV_COLLAPSE, it addresses a case that
> MADV_COLLAPSE uses (which is as you say the motivating cause for the
> change).
>
> So I think the commit message should rather open something like:
>
>     If, when invoking thp_vma_allowable_orders(), the TVA_ENFORCE_SYSFS
>     flag is not specified, we ignore sysfs THP settings.
>
>     Whilst it makes sense for the callers who do not specify this flag,
>     it creates an odd and surprising situation where a sysadmin
>     specifying 'never' for all THP sizes still observes THP pages
>     being allocated and used on the system.
>
>     The motivating case for this is MADV_COLLAPSE,

:) OK. Will update the commit message. Thanks.
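
For readers following the thread, the intended policy can be sketched as
a small userspace toy. This is not the kernel code and not the patch
itself: every toy_/TOY_ name is made up for illustration, bit 9 merely
stands in for a PMD-sized order, and the "inherit" setting, the global
enabled knob and non-anonymous VMAs are left out. It only models the two
rules quoted above: "never means never" regardless of the caller's
flags, and a caller that does not enforce sysfs (as MADV_COLLAPSE does
not) may still collapse in "madvise" mode without VM_HUGEPAGE.

```c
#include <assert.h>

/* Hypothetical stand-ins; none of these are kernel definitions. */
#define TOY_ENFORCE_SYSFS 0x1UL /* models the role of TVA_ENFORCE_SYSFS */
#define TOY_VM_HUGEPAGE   0x1UL /* models VM_HUGEPAGE set on the VMA */
#define TOY_PMD_ORDER_BIT (1UL << 9) /* assumed PMD-sized order */

struct toy_sysfs {
	unsigned long always;  /* orders configured as "always" */
	unsigned long madvise; /* orders configured as "madvise" */
};

/* Returns the subset of 'orders' this toy policy would allow. */
static unsigned long toy_allowable_orders(const struct toy_sysfs *s,
					  unsigned long vm_flags,
					  unsigned long tva_flags,
					  unsigned long orders)
{
	unsigned long enabled = s->always | s->madvise;
	unsigned long mask;

	/* Never means never: nothing is enabled for the requested orders. */
	if (!(enabled & orders))
		return 0;

	/* Callers that do not enforce sysfs skip the per-VMA filtering. */
	if (!(tva_flags & TOY_ENFORCE_SYSFS))
		return orders;

	/* Enforcing callers honour "madvise" only with VM_HUGEPAGE set. */
	mask = s->always;
	if (vm_flags & TOY_VM_HUGEPAGE)
		mask |= s->madvise;
	return orders & mask;
}

/* Usage: the cases discussed in this thread, checked with assert(). */
static void toy_demo(void)
{
	struct toy_sysfs never = { 0, 0 };
	struct toy_sysfs madv = { 0, TOY_PMD_ORDER_BIT };

	/* All sizes 'never': refused even without the enforce flag. */
	assert(toy_allowable_orders(&never, 0, 0, TOY_PMD_ORDER_BIT) == 0);
	/* "madvise" mode, no VM_HUGEPAGE, non-enforcing caller: allowed. */
	assert(toy_allowable_orders(&madv, 0, 0, TOY_PMD_ORDER_BIT) ==
	       TOY_PMD_ORDER_BIT);
	/* Same VMA, but an enforcing caller: refused. */
	assert(toy_allowable_orders(&madv, 0, TOY_ENFORCE_SYSFS,
				    TOY_PMD_ORDER_BIT) == 0);
}
```

The early return is the point of the sketch: the 'never' check sits
before the flag test, so skipping sysfs enforcement can no longer bypass
it.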