From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com, david@redhat.com
Cc: ziy@nvidia.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
    npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
    baohua@kernel.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
Date: Wed, 25 Jun 2025 09:40:08 +0800

When invoking thp_vma_allowable_orders(), if the TVA_ENFORCE_SYSFS flag
is not specified, we ignore the THP sysfs settings. Whilst this makes
sense for the callers that do not specify this flag, it creates an odd
and surprising situation where a sysadmin who specifies 'never' for all
THP sizes still observes THP pages being allocated and used on the
system.

MADV_COLLAPSE is an example of such a case: it does not set
TVA_ENFORCE_SYSFS when calling thp_vma_allowable_orders(). As we
discussed in the previous thread [1], MADV_COLLAPSE ignores the
system-wide anon/shmem THP sysfs settings, which means that even though
we have disabled the anon/shmem THP configuration, MADV_COLLAPSE will
still attempt to collapse into an anon/shmem THP.

This violates the rule we have agreed upon: never means never. System
administrators who have disabled THP everywhere clearly do not want THP
to be used, for whatever reason. Individual programs being able to
quietly override this is very surprising and likely to cause headaches
for those who do not want this to happen on their systems.

This patch set addresses the MADV_COLLAPSE issue.

Test
====
1. Ran the mm selftests and found no regressions.
2. With different anon mTHP settings toggled, allocation and madvise
   collapse for anonymous pages work well.
3. With different shmem mTHP settings toggled, allocation and madvise
   collapse for shmem work well.
4. Tested large-order allocation for tmpfs; it works as expected.

[1] https://lore.kernel.org/all/1f00fdc3-a3a3-464b-8565-4c1b23d34f8d@linux.alibaba.com/

Changes from v3:
 - Collect reviewed tags. Thanks.
 - Update the commit message, per David.

Changes from v2:
 - Update the commit message and cover letter, per Lorenzo. Thanks.
 - Simplify the logic in thp_vma_allowable_orders(), per Lorenzo and
   David. Thanks.

Changes from v1:
 - Update the commit message, per Zi.
 - Add Zi's reviewed tag. Thanks.
 - Update the shmem logic.
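
For readers who want the "never means never" rule described above in
concrete form, here is a minimal userspace sketch. It is not the code
from these patches: the enum, both function names and the single
per-VMA "madvised" boolean are hypothetical stand-ins, and the real
thp_vma_allowable_orders() works on per-order bitmasks rather than one
setting. The handling of the 'madvise' setting without the enforce flag
is likewise an assumption of this model.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one THP sysfs setting; not a kernel type. */
enum thp_setting { THP_NEVER, THP_MADVISE, THP_ALWAYS };

/*
 * Model of the behaviour the cover letter describes as surprising:
 * when the caller does not ask for sysfs enforcement, the sysfs
 * setting is not consulted at all.
 */
static bool thp_allowed_old(enum thp_setting setting, bool enforce_sysfs,
			    bool vma_madvised)
{
	if (!enforce_sysfs)
		return true;	/* sysfs ignored, even when set to 'never' */
	if (setting == THP_ALWAYS)
		return true;
	return setting == THP_MADVISE && vma_madvised;
}

/* Model of the intended rule: 'never' wins regardless of the flag. */
static bool thp_allowed_new(enum thp_setting setting, bool enforce_sysfs,
			    bool vma_madvised)
{
	if (setting == THP_NEVER)
		return false;	/* never means never */
	return thp_allowed_old(setting, enforce_sysfs, vma_madvised);
}
```

In this model a MADV_COLLAPSE-style caller corresponds to
enforce_sysfs == false: thp_allowed_old(THP_NEVER, false, false) returns
true, while thp_allowed_new() returns false for the same arguments.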

Baolin Wang (2):
  mm: huge_memory: disallow hugepages if the system-wide THP sysfs
    settings are disabled
  mm: shmem: disallow hugepages if the system-wide shmem THP sysfs
    settings are disabled

 include/linux/huge_mm.h                 | 51 ++++++++++++++++++-------
 mm/shmem.c                              |  6 +--
 tools/testing/selftests/mm/khugepaged.c |  8 +---
 3 files changed, 43 insertions(+), 22 deletions(-)

-- 
2.43.5