From: Baolin Wang
Date: Tue, 24 Jun 2025 17:57:33 +0800
Subject: Re: [PATCH v3 1/2] mm: huge_memory: disallow hugepages if the system-wide THP sysfs settings are disabled
To: Dev Jain, akpm@linux-foundation.org, hughd@google.com, david@redhat.com
Cc: ziy@nvidia.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, baohua@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <8912e179-601a-4677-b2f6-14f40d488d98@arm.com>

On 2025/6/24 16:41, Dev Jain wrote:
>
> On 23/06/25 1:58 pm, Baolin Wang wrote:
>> When
>> invoking thp_vma_allowable_orders(), if the TVA_ENFORCE_SYSFS flag is not
>> specified, we will ignore the THP sysfs settings. Whilst this makes sense for
>> the callers who do not specify this flag, it creates an odd and surprising
>> situation where a sysadmin specifying 'never' for all THP sizes still observes
>> THP pages being allocated and used on the system.
>>
>> The motivating case for this is MADV_COLLAPSE. MADV_COLLAPSE will ignore the
>> system-wide Anon THP sysfs settings, which means that even though we have
>> disabled the Anon THP configuration, MADV_COLLAPSE will still attempt to
>> collapse into an Anon THP. This violates the rule we have agreed upon: never
>> means never.
>>
>> Currently, besides MADV_COLLAPSE not setting TVA_ENFORCE_SYSFS, there is only
>> one other instance where TVA_ENFORCE_SYSFS is not set, which is in the
>> collapse_pte_mapped_thp() function, but I believe this is reasonable from its
>> comments:
>>
>> "
>> /*
>>  * If we are here, we've succeeded in replacing all the native pages
>>  * in the page cache with a single hugepage. If a mm were to fault-in
>>  * this memory (mapped by a suitably aligned VMA), we'd get the hugepage
>>  * and map it by a PMD, regardless of sysfs THP settings. As such, let's
>>  * analogously elide sysfs THP settings here.
>>  */
>> if (!thp_vma_allowable_order(vma, vma->vm_flags, 0, PMD_ORDER))
>
> So the behaviour now is: First check whether THP settings converge to never.
> Then, if enforce_sysfs is not set, return immediately. So in this khugepaged
> code will it be better to call __thp_vma_allowable_orders()? If the sysfs
> settings are changed to never before hitting collapse_pte_mapped_thp(), then
> right now we will return SCAN_VMA_CHECK from here, whereas the comment says
> "regardless of sysfs THP settings", which should include "regardless of
> whether the sysfs settings say never".

Sounds reasonable to me. Thanks.
I will change thp_vma_allowable_order() to __thp_vma_allowable_orders() in
the collapse_pte_mapped_thp() function to maintain consistency with the
original logic.

Lorenzo and David, what do you think? Thanks.
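
To make the difference concrete, here is a small userspace model of the two
call paths, based only on the behaviour described in this thread. It is a
sketch, not the kernel code: every model_* and MODEL_* name is hypothetical,
the sysfs state is reduced to a single bitmask of enabled orders, and the
value 9 for the PMD order is an assumption (4K base pages with 2M PMDs).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not kernel definitions. */
#define MODEL_TVA_ENFORCE_SYSFS (1u << 0)
#define MODEL_PMD_ORDER 9 /* assumed: 4K base pages, 2M PMD */

/* Simplified sysfs state: bit N set means order N is not 'never'. */
struct model_sysfs {
	unsigned long enabled_orders;
};

/*
 * Models the inner helper: sysfs settings only filter the requested
 * orders when the caller asks for enforcement.
 */
static unsigned long model_inner_orders(const struct model_sysfs *s,
					unsigned int tva_flags,
					unsigned long orders)
{
	if (tva_flags & MODEL_TVA_ENFORCE_SYSFS)
		orders &= s->enabled_orders;
	return orders;
}

/*
 * Models the wrapper as described above: first check whether the
 * settings converge to 'never' and bail out, then defer to the inner
 * helper, where a caller without the enforce flag is not filtered.
 */
static unsigned long model_wrapper_orders(const struct model_sysfs *s,
					  unsigned int tva_flags,
					  unsigned long orders)
{
	if (s->enabled_orders == 0)
		return 0; /* never means never */
	return model_inner_orders(s, tva_flags, orders);
}

/* True if the given order survives the wrapper path. */
static bool model_wrapper_allows(const struct model_sysfs *s,
				 unsigned int tva_flags, int order)
{
	return model_wrapper_orders(s, tva_flags, 1UL << order) != 0;
}
```

In this model, a caller that passes no flags while everything is set to
'never' is refused by the wrapper but still gets the PMD order back from the
inner helper, which is the "regardless of sysfs THP settings" behaviour that
the comment in collapse_pte_mapped_thp() describes.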