Date: Tue, 12 Dec 2023 15:32:29 +0000
From: Ryan Roberts
Subject: Re: [PATCH v9 03/10] mm: thp: Introduce multi-size THP sysfs interface
To: David Hildenbrand, Andrew Morton, Matthew Wilcox, Yin Fengwei, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov", John Hubbard,
    David Rientjes, Vlastimil Babka, Hugh Dickins, Kefeng Wang,
    Barry Song <21cnbao@gmail.com>, Alistair Popple
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, Barry Song
References: <20231207161211.2374093-1-ryan.roberts@arm.com>
    <20231207161211.2374093-4-ryan.roberts@arm.com>

On 12/12/2023 14:54, David Hildenbrand wrote:
> On 07.12.23 17:12, Ryan Roberts wrote:
>> In preparation for adding support for anonymous multi-size THP,
>> introduce new sysfs structure that will be used to control the new
>> behaviours. A new directory is added under transparent_hugepage for each
>> supported THP size, and contains an `enabled` file, which can be set to
>> "inherit" (to inherit the global setting), "always", "madvise" or
>> "never". For now, the kernel still only supports PMD-sized anonymous
>> THP, so only 1 directory is populated.
>>
>> The first half of the change converts transhuge_vma_suitable() and
>> hugepage_vma_check() so that they take a bitfield of orders for which
>> the user wants to determine support, and the functions filter out all
>> the orders that can't be supported, given the current sysfs
>> configuration and the VMA dimensions. The resulting functions are
>> renamed to thp_vma_suitable_orders() and thp_vma_allowable_orders()
>> respectively. Convenience functions that take a single, unencoded order
>> and return a boolean are also defined as thp_vma_suitable_order() and
>> thp_vma_allowable_order().
>>
>> The second half of the change implements the new sysfs interface. It has
>> been done so that each supported THP size has a `struct thpsize`, which
>> describes the relevant metadata and is itself a kobject.
>> This is pretty minimal for now, but should make it easy to add new
>> per-thpsize files to the interface if needed in future (e.g. per-size
>> defrag). Rather than keep the `enabled` state directly in the struct
>> thpsize, I've elected to directly encode it into
>> huge_anon_orders_[always|madvise|inherit] bitfields since this reduces
>> the amount of work required in thp_vma_allowable_orders() which is
>> called for every page fault.
>>
>> See Documentation/admin-guide/mm/transhuge.rst, as modified by this
>> commit, for details of how the new sysfs interface works.
>>
>> Reviewed-by: Barry Song
>> Tested-by: Kefeng Wang
>> Tested-by: John Hubbard
>> Signed-off-by: Ryan Roberts
>> ---
>
> [...]
>
>> +
>> +static ssize_t thpsize_enabled_store(struct kobject *kobj,
>> +                     struct kobj_attribute *attr,
>> +                     const char *buf, size_t count)
>> +{
>> +    int order = to_thpsize(kobj)->order;
>> +    ssize_t ret = count;
>> +
>> +    if (sysfs_streq(buf, "always")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_always);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "inherit")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_inherit);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "madvise")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        set_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "never")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>
> Why not perform lock/unlock only once in surrounding code? :)

I was nervous that sysfs_streq() may be unhappy in atomic context... Unfounded?

>
>
> Much better
>
> Acked-by: David Hildenbrand
>
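
On the lock/unlock question above: one way to sidestep the atomic-context
concern altogether would be to do the string matching before taking the lock,
so that only the bit updates sit inside the critical section and the lock is
taken exactly once. What follows is only a rough userspace sketch of that
shape, not the patch itself. Everything in it is a stand-in: streq_nl()
imitates the trailing-newline tolerance of sysfs_streq(), orders_lock() and
orders_unlock() are hypothetical stubs in place of the spinlock, and the
-EINVAL return for unknown input is an assumption, since the quoted hunk is
trimmed before that branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a userspace model, not kernel code. */
enum thp_mode { MODE_INVALID = -1, MODE_NEVER, MODE_ALWAYS, MODE_INHERIT, MODE_MADVISE };

static unsigned long orders_always, orders_inherit, orders_madvise;
static int lock_acquisitions;	/* counts how often the stub lock is taken */

/* Stand-ins for spin_lock()/spin_unlock() on huge_anon_orders_lock. */
static void orders_lock(void) { lock_acquisitions++; }
static void orders_unlock(void) { }

/* Compare two strings, treating a single trailing newline as insignificant. */
static bool streq_nl(const char *a, const char *b)
{
	while (*a && *a == *b) {
		a++;
		b++;
	}
	if (*a == *b)
		return true;
	if (!*a && *b == '\n' && !b[1])
		return true;
	if (!*b && *a == '\n' && !a[1])
		return true;
	return false;
}

/* All string matching happens here, before any lock is held. */
static enum thp_mode parse_mode(const char *buf)
{
	if (streq_nl(buf, "always"))
		return MODE_ALWAYS;
	if (streq_nl(buf, "inherit"))
		return MODE_INHERIT;
	if (streq_nl(buf, "madvise"))
		return MODE_MADVISE;
	if (streq_nl(buf, "never"))
		return MODE_NEVER;
	return MODE_INVALID;
}

/* Model of the store handler: parse first, then one lock/unlock pair. */
static long enabled_store(int order, const char *buf, size_t count)
{
	enum thp_mode mode = parse_mode(buf);
	unsigned long bit;

	assert(order >= 0 && order < (int)(8 * sizeof(unsigned long)));
	if (mode == MODE_INVALID)
		return -EINVAL;	/* assumed error path */

	bit = 1UL << order;
	orders_lock();
	/* Clear the order everywhere, then set it in at most one bitfield. */
	orders_always &= ~bit;
	orders_inherit &= ~bit;
	orders_madvise &= ~bit;
	if (mode == MODE_ALWAYS)
		orders_always |= bit;
	else if (mode == MODE_INHERIT)
		orders_inherit |= bit;
	else if (mode == MODE_MADVISE)
		orders_madvise |= bit;
	orders_unlock();

	return (long)count;
}
```

With this shape, calling enabled_store(9, "always\n", 7) leaves bit 9 set only
in orders_always, and invalid input is rejected without the lock ever being
taken. Whether the extra enum is worth it compared with simply wrapping the
whole if/else chain in a single lock/unlock pair is a matter of taste.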