From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45285C61CE8 for ; Fri, 13 Jun 2025 02:07:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8D5226B007B; Thu, 12 Jun 2025 22:07:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 885CA6B0089; Thu, 12 Jun 2025 22:07:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 79CDC6B008A; Thu, 12 Jun 2025 22:07:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 5A85D6B007B for ; Thu, 12 Jun 2025 22:07:31 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 91A4D100EA5 for ; Fri, 13 Jun 2025 02:07:29 +0000 (UTC) X-FDA: 83548740618.23.B859A3C Received: from out30-124.freemail.mail.aliyun.com (out30-124.freemail.mail.aliyun.com [115.124.30.124]) by imf10.hostedemail.com (Postfix) with ESMTP id 74DBFC000E for ; Fri, 13 Jun 2025 02:07:26 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux.alibaba.com header.s=default header.b=IZgsldGx; spf=pass (imf10.hostedemail.com: domain of baolin.wang@linux.alibaba.com designates 115.124.30.124 as permitted sender) smtp.mailfrom=baolin.wang@linux.alibaba.com; dmarc=pass (policy=none) header.from=linux.alibaba.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1749780447; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=p+rRa3QKmOhYw/Lt12EnBbPLQ9TK+8gj2nP9q+FYN8w=; b=WIpeV1Vw0a22WES/Hs4nKIEKy+b/18XkNWCQAmVoUdDUZ8/lnF6ff4NIFIFPYxiguQlPVi I9YiNISiSXW7ezcRAFxJ2bXY9WPej7bW89mJosMg91IrKVg5m0dws6HPysaG5VLVK1s0Cr cMypZICfcLY35sA2naNlu2jmvZKIilY= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1749780447; a=rsa-sha256; cv=none; b=7EPOyRgdZ5WQZYM60xTSujThnaVPCrwvbAEZYZ/skTnMzrjufezoGB5IpG6OR9kKLjHrh0 4nYGzvfBj2DlodgloQCKlT44dchxvbAfwRAFSZHtDWk4qiGzB05BDih8b6liMsmD4+xia7 UU60CqlGkaIfp42Sg5v0QxroxfnLlZE= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=linux.alibaba.com header.s=default header.b=IZgsldGx; spf=pass (imf10.hostedemail.com: domain of baolin.wang@linux.alibaba.com designates 115.124.30.124 as permitted sender) smtp.mailfrom=baolin.wang@linux.alibaba.com; dmarc=pass (policy=none) header.from=linux.alibaba.com DKIM-Signature:v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1749780443; h=Message-ID:Date:MIME-Version:Subject:To:From:Content-Type; bh=p+rRa3QKmOhYw/Lt12EnBbPLQ9TK+8gj2nP9q+FYN8w=; b=IZgsldGxvkN5nJdmb5UHJcH16R8EPgMRQClDR3m61HiKBWLy4AfdCNqxFG0CNQQzwDNGcQ6vHV0KbZw8E7Bo6kFAKeccXBcuykhuZJW+E9hRkq23BGvm09GPWbo9mUld4Ab0tYXYk7hoV+sCBs14opy3ljoDH8ZyKp2/CeNOpiI= Received: from 30.74.144.147(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0WdiglX0_1749780441 cluster:ay36) by smtp.aliyun-inc.com; Fri, 13 Jun 2025 10:07:21 +0800 Message-ID: <7c911fff-aaf1-430c-89f2-5e7925e90a04@linux.alibaba.com> Date: Fri, 13 Jun 2025 10:07:20 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2 1/2] mm: huge_memory: disallow hugepages if the system-wide THP sysfs settings are disabled To: Lorenzo Stoakes , David Hildenbrand Cc: akpm@linux-foundation.org, hughd@google.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, ziy@nvidia.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <8eefb0809c598fadaa4a022634fba5689a4f3257.1749109709.git.baolin.wang@linux.alibaba.com> <1ec368c4-c4d8-41ea-b8a3-7d1fdb3ec358@redhat.com> <2ff65f37-efa9-4e96-9cdf-534d63ff154e@linux.alibaba.com> <953596b2-8749-493d-97eb-a5d8995d9ef8@redhat.com> <97a67b74-d473-455e-a05e-c85fe45da008@linux.alibaba.com> <815f383d-8636-490d-8994-486be51f3123@lucifer.local> From: Baolin Wang In-Reply-To: <815f383d-8636-490d-8994-486be51f3123@lucifer.local> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspam-User: X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 74DBFC000E X-Stat-Signature: xzgn4d7pnuazcqqe6wm71nja494q4dgd X-HE-Tag: 1749780446-435220 X-HE-Meta: U2FsdGVkX19f2ZS4GXxKTSF4eViDsu7ytrw9kwBuSSknut0YKeWBeBmo7xEhlX/IO5lKo19fd90/oFt8PX1vSAeAEeAl45HjdDNMpxqqujIc5+K6LD/q3P96S7skeA2VJdpKQ6PRjTGuukkL7lHeX0oGeTuexfpPyZq3zfYVz5PKbEaYDvxBu4OdVsMGkO7LWlAtetTvFlNysdKFJy04BSSBOzcuz+CzaQZ/yuYgod+7zoa05JGlZQAs7iiiiN7DWzXHeOsR9Oe8JatOHCVfSSWdE/IhS+tDIMbfgCAH2cd0Isx2fU0cOaz47yVAfFNeePfFUh5J9VfjTfgRb7+jP768mQa+qAsZOwgxPMYxM1nDA6EZe0QJTkiQxBNw8hDin3ei5LjaG8UGYC0wBuMoyN7RHCqp+5Oedufah+yRj2Xkh8MNh8FFY1i8W2JYsPxufkxMy9vpjpjcOijv9TKkiHHJ0+Kc/mnAuQz+LZXXAg7JC/16SB9rP+AT5AJofuW4cs7XJ/ronaVi0CIjVfgsnBohRZ2JKQZImC1XttpyzdC9YMEr/zp8qq5K4bnKjc80/H8bfas34s5x8N98uBBmdW9iOYAsPO3reQhALMz1ECD4tZoEiENoF8mtWNwIR5DhNbmdtb6YJmM6OFtrhGFdWoF6alzw854vcFm689a1BzpjPbhII26VqgeTqsPAl9GFb+YCLc878kZ4BOXElRJHjLpg/WEGNFhgUUowmJggKBG4zrgrrFGlsbdPo23qPr8b1ul6KWRbjK1Q1tj9966H/rvzxjIlg37XeFxRu/vi9tiljKuBWh3tKopQShIEjZlTsgw/87qAUtpV1/BEgQo4gc29w84a+Wq6P3mQ/KVIO7J8zMAGycqvVgJJ5dejqlPGOn29VelpjcmQq4jVJfJGr0S6Am+MHGLKqRlZ/1ecNYwR32NlO+08j+OIsrRSiM/jWIBAQvVpEdYJvODBnco RgiH9NmI 0H1PkTJmk4geYDHHx78gyaROBOjJ4Db9tq5Wb+gs0QYCP/llKajvo46GOIKbVSYSgeCKEB+YHuCxko0of3QNRMYsYiwCloZENVIM/lfKLVVp5OJqsH49kgME1rAoW/oFQECqoda+5ho6pm+8oRNyCUq8X4GyZLFU1pbeggPz5vx5V7lXghSl/5Aeqdov6d/TWbuH4Am5cqQFwiKf1ACeGcF8XjFzc6jjvDsf6ACaZVdHFjUvLHfHbMhU6SBQ4JlcSbOpfeW+SDviiJqqWViUzR5km0w7XveLub8OkmH9AnBffNvgW1+fIPFZTdkkeaBBxJSDw+xqo65++kmz58H1v/RyQNaI1LcZMjL1cWwA0GqhjMenoqFD9iYck/Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2025/6/12 22:49, Lorenzo Stoakes wrote: > On Thu, Jun 12, 2025 at 04:09:27PM +0200, David Hildenbrand wrote: >> >> >>>> @@ -265,6 +265,42 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, >>>> unsigned long tva_flags, >>>> unsigned long orders); >>>> +/* Strictly mask requested anonymous orders according to sysfs settings. */ >>>> +static inline unsigned long __thp_mask_anon_orders(unsigned long vm_flags, >>>> + unsigned long tva_flags, unsigned long orders) >>>> +{ >>>> + const unsigned long always = READ_ONCE(huge_anon_orders_always); >>>> + const unsigned long madvise = READ_ONCE(huge_anon_orders_madvise); >>>> + const unsigned long inherit = READ_ONCE(huge_anon_orders_inherit); >>>> + const unsigned long never = ~(always | madvise | inherit); >>>> + >>>> + /* Disallow orders that are set to NEVER directly ... */ >>>> + orders &= ~never; >>>> + >>>> + /* ... or through inheritance (global == NEVER). */ >>>> + if (!hugepage_global_enabled()) >>>> + orders &= ~inherit; >>>> + >>>> + /* >>>> + * Otherwise, we only enforce sysfs settings if asked. In addition, >>>> + * if the user sets a sysfs mode of madvise and if TVA_ENFORCE_SYSFS >>>> + * is not set, we don't bother checking whether the VMA has VM_HUGEPAGE >>>> + * set. >>>> + */ >>>> + if (!(tva_flags & TVA_ENFORCE_SYSFS)) >>>> + return orders; >>> >>> This implicitly does a & mask as per suggested previous version, which I think >>> is correct but worth pointing out. >> >> Yes. >> >>> >>>> + >>>> + if (!(vm_flags & VM_HUGEPAGE)) { >>> >>> Don't love this sort of mega negation here. I read this as _does_ have huge >>> page... >> >> Well, it's very common to do that, but not objecting to something that is >> clearer ;) >> >> I assume you spotted the >> >> if (!(tva_flags & TVA_ENFORCE_SYSFS)) >> >> :P > > Lol yeah I know I know, I just think I guess in this case because you're > negating elsewhere it makes it harder... > >> >> if (vm_flags & VM_HUGEPAGE) >> return orders; >> >> >> Would have been easier. >> >>> >>>> + /* Disallow orders that are set to MADVISE directly ... */ >>>> + orders &= ~madvise; >>>> + >>>> + /* ... or through inheritance (global == MADVISE). */ >>>> + if (!hugepage_global_always()) >>>> + orders &= ~inherit; >>> >>> I hate this implicit 'not hugepage global always so this means either never or >>> madvise and since we cleared orders for never this means madvise' mental >>> gymnastics required here. >>> >>> Yeah I feel this is a bridge too far, we're getting into double negation and I >>> think that's more confusiong. >> >> >> Same here ... I think we should just have hugepage_global_madvise(). :) > > Ideally in future not have these stupid globals all over the place and rework > this whole damn thing... > >> >>> >>> >>>> + } >>> >>> I propose a compromise as I rather like your 'exclude never' negation bit. >>> >>> So: >>> >>> /* Strictly mask requested anonymous orders according to sysfs settings. */ >>> static inline unsigned long __thp_mask_anon_orders(unsigned long vm_flags, >>> unsigned long tva_flags, unsigned long orders) >>> { >>> const unsigned long always = READ_ONCE(huge_anon_orders_always); >>> const unsigned long madvise = READ_ONCE(huge_anon_orders_madvise); >>> const unsigned long inherit = READ_ONCE(huge_anon_orders_inherit);; >>> const unsigned long never = ~(always | madvise | inherit); >>> const bool inherit_enabled = hugepage_global_enabled(); >> >> Can we just have hugepage_global_never/disabled() to use instead? > > This would be nice! > > Could be a follow up... though again would be nice to somehow do away with all > this crap altogether. > >> >>> >>> /* Disallow orders that are set to NEVER directly ... */ >>> orders &= ~never; >>> >>> /* ... or through inheritance (global == NEVER). */ >>> if (!inherit_enabled) >>> orders &= ~inherit;> >>> /* >>> * Otherwise, we only enforce sysfs settings if asked. In addition, >>> * if the user sets a sysfs mode of madvise and if TVA_ENFORCE_SYSFS >>> * is not set, we don't bother checking whether the VMA has VM_HUGEPAGE >>> * set. >>> */ >>> if (!(tva_flags & TVA_ENFORCE_SYSFS)) >>> return orders; >>> >>> if (hugepage_global_always()) >>> return orders & (always | inherit); >>> >>> /* We already excluded never inherit above. */ >>> if (vm_flags & VM_HUGEPAGE) >>> return orders & (always | madvise | inherit); >>> >>> return orders & always; >>> } >>> >>> What do you think? >> >> With the fixup, it would work for me. No magical "mask" variables :D > > Thanks! Fair enough. You both prefer the 'exclude never' logic, and I will follow that logic. >>>> + return orders; >>>> +} >>>> + >>>> /** >>>> * thp_vma_allowable_orders - determine hugepage orders that are allowed for vma >>>> * @vma: the vm area to check >>>> @@ -287,16 +323,8 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma, >>>> unsigned long orders) >>>> { >>>> /* Optimization to check if required orders are enabled early. */ >>>> - if ((tva_flags & TVA_ENFORCE_SYSFS) && vma_is_anonymous(vma)) { >>>> - unsigned long mask = READ_ONCE(huge_anon_orders_always); >>>> - >>>> - if (vm_flags & VM_HUGEPAGE) >>>> - mask |= READ_ONCE(huge_anon_orders_madvise); >>>> - if (hugepage_global_always() || >>>> - ((vm_flags & VM_HUGEPAGE) && hugepage_global_enabled())) >>>> - mask |= READ_ONCE(huge_anon_orders_inherit); >>>> - >>>> - orders &= mask; >>>> + if (vma_is_anonymous(vma)) { >>>> + orders = __thp_mask_anon_orders(vm_flags, tva_flags, orders); >>>> if (!orders) >>>> return 0; >>> >>> I pointed out to Baolin that __thp_vma_allowable_orders() handles the orders == >>> 0 case almost immediately so there's no need to do this, it just makes the code >>> noisier. >> >> The reason we added it in the first place was to not do the (expensive) >> function call. > > Ack point taken! > >> >> >> -- >> Cheers, >> >> David / dhildenb >> > > For convenience, I enclose the fixed version + tweaked the inherit local bool to > be inherit_never to be clearer: > > /* Strictly mask requested anonymous orders according to sysfs settings. */ > static inline unsigned long __thp_mask_anon_orders(unsigned long vm_flags, > unsigned long tva_flags, unsigned long orders) > { > const unsigned long always = READ_ONCE(huge_anon_orders_always); > const unsigned long madvise = READ_ONCE(huge_anon_orders_madvise); > const unsigned long inherit = READ_ONCE(huge_anon_orders_inherit);; > const unsigned long never = ~(always | madvise | inherit); > const bool inherit_never = !hugepage_global_enabled(); > > /* Disallow orders that are set to NEVER directly ... */ > orders &= ~never; > > /* ... or through inheritance (global == NEVER). */ > if (inherit_never) > orders &= ~inherit; > > /* > * Otherwise, we only enforce sysfs settings if asked. In addition, > * if the user sets a sysfs mode of madvise and if TVA_ENFORCE_SYSFS > * is not set, we don't bother checking whether the VMA has VM_HUGEPAGE > * set. > */ > if (!(tva_flags & TVA_ENFORCE_SYSFS)) > return orders; > > /* We already excluded never inherit above. */ > if (vm_flags & VM_HUGEPAGE) > return orders & (always | madvise | inherit); > > if (hugepage_global_always()) > return orders & (always | inherit); > > return orders & always; > } Thanks Lorenzo. Let me follow this logic and do some testing.