From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06844C30658 for ; Tue, 2 Jul 2024 16:33:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 594A66B0089; Tue, 2 Jul 2024 12:33:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 544896B00A2; Tue, 2 Jul 2024 12:33:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 40CF66B00A4; Tue, 2 Jul 2024 12:33:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 233AE6B00A2 for ; Tue, 2 Jul 2024 12:33:36 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id BB1DE1A03CE for ; Tue, 2 Jul 2024 16:33:35 +0000 (UTC) X-FDA: 82295358390.19.E1A12BB Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf16.hostedemail.com (Postfix) with ESMTP id 0B63418001D for ; Tue, 2 Jul 2024 16:33:32 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=none; spf=pass (imf16.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1719937983; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YKHeidiC8EC/JCFO53cagyuS4dgdg6HJkQK4ZKsaKXE=; b=I6ZsXf7hswoUIxlVedMg6XLTiRAZdwonPIYxhnQy+m7xIji5m1FCRcG6mW1YDub2LUqlr3 gljQQLbMnCBloFLOufuYJCSCxRUhnNqD3fz2cLdcXHFaGsHedZdZkGXik50IyDxhedX/6Q eYn2xkoltduEGjTS7qVFFRGneKVqyiY= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=none; spf=pass (imf16.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1719937983; a=rsa-sha256; cv=none; b=8nSqEmEVvmyFlIdASh/M3KH/m8YoXNAJkZqHekDcBJYQQ/KGFopTfQfANCWKo1y/RnZ3ou id/Ph8WoT9wj8GKNdcZ5wFOW2SwHwxz2ZoCUmdBFd4OAOF9cgARtTFfuXqLHImYpqbZcY9 abBoF4vh9Fj69i1Lr9i6mYMu+/TG5qo= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E7677339; Tue, 2 Jul 2024 09:33:56 -0700 (PDT) Received: from [10.1.32.193] (XHFQ2J9959.cambridge.arm.com [10.1.32.193]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9FDDF3F73B; Tue, 2 Jul 2024 09:33:30 -0700 (PDT) Message-ID: <46fb005b-30e4-4340-b1e2-8789833f952c@arm.com> Date: Tue, 2 Jul 2024 17:33:29 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v1] mm: Fix khugepaged activation policy Content-Language: en-GB To: David Hildenbrand , Andrew Morton , Jonathan Corbet , Barry Song , Baolin Wang , Lance Yang , Yang Shi Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org References: <20240702144617.2291480-1-ryan.roberts@arm.com> <686d3f32-45b0-4dd5-8954-e75748b9c946@redhat.com> From: Ryan Roberts In-Reply-To: <686d3f32-45b0-4dd5-8954-e75748b9c946@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 0B63418001D X-Stat-Signature: jocia3quzfk1tizqqkraeft3enuumfya X-Rspam-User: X-HE-Tag: 1719938012-597150 X-HE-Meta: U2FsdGVkX1+ELGmNO7fqyzROwWB1fIFfqtC4IXHz9OcxxdxiLwZ4GN+EuBWg7A/MkhJQINPZOY+YsUxuUxi0/vZghVeIH4cSqI51itfoKg86xu+mq2EusxQeQfhmcULl+IWRlmaSCDzH3jre3f+XUdPSKQFAMkEDQ/sdvBhasdjjQ69dN9IdHkNghm8utCaeBj2pLrK6h3BplPYQmJaUCurRH6kETDEH2mr6Na0J1jua6ZrZYqzfQilAWDbNWUGnBTXZfBjQF8WJfPfbMhj5TD0vkaxx4qNOrHFYlV9+mLVpVLCmYTeNgTIEi1/wI6QeBKe6KraxaNz9qGAoGJwm52B+kW+lHRKkck1k5zSxRY9YLfrz9wSYUMWdjM6gPjEmMFiijxhZruYOabhpJLl9n4Dkr0TLx+O5ZIbggpyis3w5nWa4Spw26OClFWbty60uZc64/XCIx1zW+JdgOgV5vf7ozEuxeEQYRfrRypfg3MUXqOtW7I2EEDDL9uGCkQWKj1EmMCak/XFW1FWd9B4bZY+lzs+JSbugFcvQQsifKtcMwKQPrCjufQRTbmkXKgOp5c5oe6zv5dXCNMNRVcY3+qTA4wqMeHb5tOh52b2ku1AXV23F6dBKJXPDZvc8qVuAQC5ixgOT0e1lCsSsK4UlsYrYt+R8QFVTYqmL0LZ6Mxmw373cbDuloPp1QEyKGJ7uPfR+FbcIy4Rq3PALit1GagEFdG+20Iu7Hc2sBPJw/3e9ew4neTrGQVXDEUmXsIMcFIa1oeyVAa4k1p65A45Zeur9UDy0RP2CyJQw4fsFGwTuk0KDkvpelAjoxNNlwx9wVgXmdlYA0Id9A3xXvNK09n7sq/N+sikXOD69d5ji7/7wChfoAwDQhXSblTFEWI6MuDrUi8CndBO9gpkB4CAunIaph/Pa/BjGvkCp62dTFbPW9/MrpNG5vE1xhJDj6+0I+5wnOd9U7vZ08/XLOWY tOI7dRyJ IkJ7rFc2MEvBapNi30MUfOOZSn9mErmA7rgKENDMsoRXKdqsqAcvYWz4VRjrvy5HqE9NNkVrNDtOQ+FeRbyiOi1zjbyGAXyk1qDAmx+KDPI0UHfUFygMgH/Z0zWhwQUNOoBEHr1ieJ6x00pAvFMIhWuLotmJm57QSoQdtgjHLDVX88jM2y5OT2Q/ndqqD78FZzPtGDULipL+AJn4aiQ7t0jMmTWO30Hnjw82wIRnjpn9ihYSE49u4G66MpbptEONSfj2/O8ao868mUkUiwxrN2yP62TxYru9jwhGcy2Se1bpnhOSPlCKzKX5hnw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 02/07/2024 16:38, David Hildenbrand wrote: > On 02.07.24 17:29, Ryan Roberts wrote: >> On 02/07/2024 15:57, David Hildenbrand wrote: >>> On 02.07.24 16:46, Ryan Roberts wrote: >>>> Since the introduction of mTHP, the docuementation has stated that >>>> khugepaged would be enabled when any mTHP size is enabled, and disabled >>>> when all mTHP sizes are disabled. There are 2 problems with this; 1. >>>> this is not what was implemented by the code and 2. this is not the >>>> desirable behavior. >>>> >>>> Desirable behavior is for khugepaged to be enabled when any PMD-sized >>>> THP is enabled, anon or file. (Note that file THP is still controlled by >>>> the top-level control so we must always consider that, as well as the >>>> PMD-size mTHP control for anon). khugepaged only supports collapsing to >>>> PMD-sized THP so there is no value in enabling it when PMD-sized THP is >>>> disabled. So let's change the code and documentation to reflect this >>>> policy. >>>> >>>> Further, per-size enabled control modification events were not >>>> previously forwarded to khugepaged to give it an opportunity to start or >>>> stop. Consequently the following was resulting in khugepaged eroneously >>>> not being activated: >>>> >>>>     echo never > /sys/kernel/mm/transparent_hugepage/enabled >>>>     echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled >>>> >>>> Signed-off-by: Ryan Roberts >>>> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface") >>>> Closes: >>>> https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/ >>>> Cc: stable@vger.kernel.org >>>> --- >>>> >>>> Hi All, >>>> >>>> Applies on top of today's mm-unstable (9bb8753acdd8). No regressions >>>> observed in >>>> mm selftests. >>>> >>>> When fixing this I also noticed that khugepaged doesn't get (and never has >>>> been) >>>> activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how >>>> to collapse shmem - perhaps it should be activated in this case? >>>> >>> >>> Call me confused. >>> >>> khugepaged_scan_mm_slot() and madvise_collapse() only all >>> hpage_collapse_scan_file() with ... IS_ENABLED(CONFIG_SHMEM) ? >> >> Looks like khugepaged_scan_mm_slot() was converted from: >> >>    if (shmem_file(vma->vm_file)) { >> >> to: >> >>    if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) { CONFIG_READ_ONLY_THP_FOR_FS depends on CONFIG_SHMEM in Kconfig, so I think this is all correct/safe. Although I'm not really sure what the need for the dependency is. >> >> By 99cb0dbd47a15d395bf3faa78dc122bc5efe3fc0 which adds THP collapse support for >> non-shmem files. Clearly that looks wrong, but I guess never spotted in practice >> because noone disables shemem? >> >> I guess madvise_collapse() was a copy/paste? >> > > Likely. > >>> >>> collapse_file() is only called by hpage_collapse_scan_file() ... and there we >>> check "shmem_file(file)". >>> >>> So why is the IS_ENABLED(CONFIG_SHMEM) check in there if collapse_file() seems >>> to "collapse filemap/tmpfs/shmem pages into huge one". >>> >>> Anyhow, we certainly can collapse shmem (that's how it all started IIUC). >> >> Yes, thanks for pointing me at it. Should have just searched "shmem" in >> khugepaged.c :-/ >> >>> >>> Besides that, khugepaged only seems to collapse !shmem with >>>    VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem); >> >> That makes sense. I guess I could use IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) to >> tighen the (non-shmem) file THP check in hugepage_pmd_enabled() (currently I'm >> unconditionally using the top-level enabled setting as a "is THP enabled for >> files" check). I'll do a v2 with this addition if you agree? static inline bool hugepage_pmd_enabled(void) { /* * We cover both the anon and the file-backed case here; for * file-backed, we must return true if globally enabled, regardless of * the anon pmd size control status. */ return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && hugepage_global_enabled()) || test_bit(PMD_ORDER, &huge_anon_orders_always) || test_bit(PMD_ORDER, &huge_anon_orders_madvise) || (test_bit(PMD_ORDER, &huge_anon_orders_inherit) && hugepage_global_enabled()); } >> >> But back to my original question, I think hugepage_pmd_enabled() should also be >> explicitly checking the appropriate shmem_enabled controls and ORing in the >> result? Otherwise in a situation where only shmem is THP enabled (and file/anon >> THP is disabled) khugepaged won't run. > > I think so. I'll do this part as a separate change since it's fixing a separate bug. > >> >>> >>> The thp_vma_allowable_order() check tests if we are allowed to collapse a >>> PMD_ORDER in that VMA. >> >> I don't follow the relevance of this statement. > > On whatever VMA we indicate true, we will try to collapse in khugepaged. > Regarding the question, what khugepaged will try to collapse. Oh I see. Yes agreed. But we don't have a VMA when enabling/disabling khugepaged. We just want to know "are there vmas for which khugepaged would attempt to collapse to PMD size?"