From: Barry Song <21cnbao@gmail.com>
Date: Tue, 29 Oct 2024 06:35:57 +0800
Subject: Re: [PATCH 3/3] mm: shmem: override mTHP shmem default with a kernel parameter
To: Maíra Canal
Cc: Jonathan Corbet, Andrew Morton, Hugh Dickins, David Hildenbrand, Ryan Roberts, Baolin Wang, Lance Yang, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-dev@igalia.com
References: <20241027175743.1056710-1-mcanal@igalia.com> <20241027175743.1056710-4-mcanal@igalia.com> <2505d52c-3454-4892-8c90-e3d9b2f0c84f@igalia.com>

On Mon, Oct 28, 2024 at 7:34 PM Maíra Canal wrote:
>
> Hi Barry,
>
> On 28/10/24 08:09, Barry Song wrote:
> > On Mon, Oct 28, 2024 at 6:10 PM Maíra Canal wrote:
> >>
> >> Hi Barry,
> >>
> >> On 27/10/24 18:54, Barry Song wrote:
> >>> On Mon, Oct 28, 2024 at 6:58 AM Maíra Canal wrote:
> >>>>
> >>>> Add the ``thp_shmem=`` kernel command line to allow specifying the
> >>>> default
> >>>> policy of each supported shmem hugepage size. The kernel parameter
> >>>> accepts the following format:
> >>>>
> >>>> thp_shmem=[KMG],[KMG]:;[KMG]-[KMG]:
> >>>>
> >>>> For example,
> >>>>
> >>>> thp_shmem=16K-64K:always;128K,512K:inherit;256K:advise;1M-2M:never;4M-8M:within_size
> >>>>
> >>>> By configuring the default policy of several shmem huge pages, the user
> >>>> can take advantage of mTHP before it's been configured through sysfs.
> >>>>
> >>>> Signed-off-by: Maíra Canal
> >>>> ---
> >>>>  .../admin-guide/kernel-parameters.txt         |  10 ++
> >>>>  Documentation/admin-guide/mm/transhuge.rst    |  17 +++
> >>>>  mm/shmem.c                                    | 109 +++++++++++++++++-
> >>>>  3 files changed, 135 insertions(+), 1 deletion(-)
> >>>>
> >>>
> >>> Hi Maíra,
> >>>
> >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >>>> index acabb04d0dd4..595fa096e28b 100644
> >>>> --- a/Documentation/admin-guide/kernel-parameters.txt
> >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >>>> @@ -6700,6 +6700,16 @@
> >>>>                         Force threading of all interrupt handlers except those
> >>>>                         marked explicitly IRQF_NO_THREAD.
> >>>>
> >>>> +       shmem_anon=     [KNL]
> >>>> +                       Format: [KMG],[KMG]:;[KMG]-[KMG]:
> >>>> +                       Control the default policy of each hugepage size for the
> >>>> +                       internal shmem mount. is one of policies available
> >>>> +                       for the shmem mount ("always", "inherit", "never", "within_size",
> >>>> +                       and "advise").
> >>>> +                       It can be used multiple times for multiple shmem THP sizes.
> >>>> +                       See Documentation/admin-guide/mm/transhuge.rst for more
> >>>> +                       details.
> >>>
> >>> I'm not sure this is the right name. How about "thp_shmem"?
> >>
> >> Oops, sorry about that.
> >>
> >>>
> >>>> +
> >>>>         topology=       [S390,EARLY]
> >>>>                         Format: {off | on}
> >>>>                         Specify if the kernel should make use of the cpu
> >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> >>>> index 9b5b02c4d1ab..47e7fc30e22d 100644
> >>>> --- a/Documentation/admin-guide/mm/transhuge.rst
> >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> >>>> @@ -332,6 +332,23 @@ allocation policy for the internal shmem mount by using the kernel parameter
> >>>>  seven valid policies for shmem (``always``, ``within_size``, ``advise``,
> >>>>  ``never``, ``deny``, and ``force``).
> >>>>
> >>>> +In the same manner as ``thp_anon`` controls each supported anonymous THP
> >>>> +size, ``thp_shmem`` controls each supported shmem THP size. ``thp_shmem``
> >>>> +has the same format as ``thp_anon``, but also supports the policy
> >>>> +``within_size``.
> >>>> +
> >>>> +``thp_shmem=`` may be specified multiple times to configure all THP sizes
> >>>> +as required. If ``thp_shmem=`` is specified at least once, any shmem THP
> >>>> +sizes not explicitly configured on the command line are implicitly set to
> >>>> +``never``.
> >>>> +
> >>>> +``transparent_hugepage_shmem`` setting only affects the global toggle. If
> >>>> +``thp_shmem`` is not specified, PMD_ORDER hugepage will default to
> >>>> +``inherit``. However, if a valid ``thp_shmem`` setting is provided by the
> >>>> +user, the PMD_ORDER hugepage policy will be overridden. If the policy for
> >>>> +PMD_ORDER is not defined within a valid ``thp_shmem``, its policy will
> >>>> +default to ``never``.
> >>>> +
> >>>>  Hugepages in tmpfs/shmem
> >>>>  ========================
> >>>>
> >>>> diff --git a/mm/shmem.c b/mm/shmem.c
> >>>> index 24cdeafd8260..0a7a7d04f725 100644
> >>>> --- a/mm/shmem.c
> >>>> +++ b/mm/shmem.c
> >
> > [...]
> >
> >>>>  static int __init setup_transparent_hugepage_shmem(char *str)
> >>>>  {
> >>>>         int huge, ret = 0;
> >>>> @@ -5206,6 +5228,91 @@ static int __init setup_transparent_hugepage_shmem(char *str)
> >>>>  }
> >>>>  __setup("transparent_hugepage_shmem=", setup_transparent_hugepage_shmem);
> >>>>
> >>>> +static char str_dup[PAGE_SIZE] __initdata;
> >>>> +static int __init setup_thp_shmem(char *str)
> >>>> +{
> >>>> +       char *token, *range, *policy, *subtoken;
> >>>> +       unsigned long always, inherit, madvise, within_size;
> >>>> +       char *start_size, *end_size;
> >>>> +       int start, end, nr;
> >>>> +       char *p;
> >>>> +
> >>>> +       if (!str || strlen(str) + 1 > PAGE_SIZE)
> >>>> +               goto err;
> >>>> +       strcpy(str_dup, str);
> >>>> +
> >>>> +       always = huge_shmem_orders_always;
> >>>> +       inherit = huge_shmem_orders_inherit;
> >>>> +       madvise = huge_shmem_orders_madvise;
> >>>> +       within_size = huge_shmem_orders_within_size;
> >>>> +       p = str_dup;
> >>>> +       while ((token = strsep(&p, ";")) != NULL) {
> >>>> +               range = strsep(&token, ":");
> >>>> +               policy = token;
> >>>> +
> >>>> +               if (!policy)
> >>>> +                       goto err;
> >>>> +
> >>>> +               while ((subtoken = strsep(&range, ",")) != NULL) {
> >>>> +                       if (strchr(subtoken, '-')) {
> >>>> +                               start_size = strsep(&subtoken, "-");
> >>>> +                               end_size = subtoken;
> >>>> +
> >>>> +                               start = get_order_from_str(start_size);
> >>>> +                               end = get_order_from_str(end_size);
> >>>> +                       } else {
> >>>> +                               start = end = get_order_from_str(subtoken);
> >>>> +                       }
> >>>> +
> >>>> +                       if (start < 0 || end < 0 || start > end)
> >>>> +                               goto err;
> >>>> +
> >>>> +                       nr = end - start + 1;
> >>>> +                       if (!strcmp(policy, "always")) {
> >>>> +                               bitmap_set(&always, start, nr);
> >>>> +                               bitmap_clear(&inherit, start, nr);
> >>>> +                               bitmap_clear(&madvise, start, nr);
> >>>> +                               bitmap_clear(&within_size, start, nr);
> >>>> +                       } else if (!strcmp(policy, "advise")) {
> >>>> +                               bitmap_set(&madvise, start, nr);
> >>>> +                               bitmap_clear(&inherit, start, nr);
> >>>> +                               bitmap_clear(&always, start, nr);
> >>>> +                               bitmap_clear(&within_size, start, nr);
> >>>> +                       } else if (!strcmp(policy, "inherit")) {
> >>>> +                               bitmap_set(&inherit, start, nr);
> >>>> +                               bitmap_clear(&madvise, start, nr);
> >>>> +                               bitmap_clear(&always, start, nr);
> >>>> +                               bitmap_clear(&within_size, start, nr);
> >>>> +                       } else if (!strcmp(policy, "within_size")) {
> >>>> +                               bitmap_set(&within_size, start, nr);
> >>>> +                               bitmap_clear(&inherit, start, nr);
> >>>> +                               bitmap_clear(&madvise, start, nr);
> >>>> +                               bitmap_clear(&always, start, nr);
> >>>> +                       } else if (!strcmp(policy, "never")) {
> >>>> +                               bitmap_clear(&inherit, start, nr);
> >>>> +                               bitmap_clear(&madvise, start, nr);
> >>>> +                               bitmap_clear(&always, start, nr);
> >>>> +                               bitmap_clear(&within_size, start, nr);
> >>>> +                       } else {
> >>>> +                               pr_err("invalid policy %s in thp_shmem boot parameter\n", policy);
> >>>> +                               goto err;
> >>>> +                       }
> >>>> +               }
> >>>> +       }
> >>>> +
> >>>> +       huge_shmem_orders_always = always;
> >>>> +       huge_shmem_orders_madvise = madvise;
> >>>> +       huge_shmem_orders_inherit = inherit;
> >>>> +       huge_shmem_orders_within_size = within_size;
> >>>> +       shmem_orders_configured = true;
> >>>> +       return 1;
> >>>> +
> >>>> +err:
> >>>> +       pr_warn("thp_shmem=%s: error parsing string, ignoring setting\n", str);
> >>>> +       return 0;
> >>>> +}
> >>>
> >>> Can we share source code with thp_anon since there's a lot of duplication?
> >>
> >> I'm not a regular mm contributor and I'm most usually around drivers, so
> >> I don't know exactly here I could add shared code. Should I add the
> >> headers to "internal.h"?
> >
> > My comment isn't related to drivers or memory management. It's solely about
> > avoiding code duplication. For example, we could create a shared function to
> > handle both controls, reducing redundant code :-)
>
> Let me rephrase it.
>
> I completely agree that we should avoid code duplication. I'm asking
> where is the best place to add the headers of the shared functions.
> "linux/shmem_fs.h" doesn't look appropriate to me, so I believe the
> remaining options would be "linux/huge_mm.h" or "internal.h".

Both locations seem quite odd. I have a feeling that these boot command
elements are purely internal, yet internal.h contains something that is
actually 'external' to mm. The shared code isn't 'external' enough to belong
in internal.h.

I didn't realize that shmem has placed these controls in its own file;
I thought they were also located in mm/huge_memory.c. Given the current
situation, I would prefer to keep the code as it is and tolerate the
code duplication.

Unless we are going to place controls for shmem and other thp controls in
one place, I feel your code is better than having a shared function either in
internal.h or linux/huge_mm.h.

>
> I would like to know your opinion about those two options.
>
> Best Regards,
> - Maíra
>
> >
> >>
> >> Best Regards,
> >> - Maíra
> >>
> >>>
> >>>> +__setup("thp_shmem=", setup_thp_shmem);
> >>>> +
> >>>>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> >>>>
> >>>>  #else /* !CONFIG_SHMEM */
> >>>> --
> >>>> 2.46.2
> >>>>
> >>>

Thanks
Barry
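
For reference, the kind of shared helper discussed in this thread could look
roughly like the user-space sketch below. It is only an illustration of the
idea, not code from the patch: struct thp_orders, order_mask() and
thp_orders_apply() are hypothetical names, and plain bit masks stand in for
the bitmap_set()/bitmap_clear() calls the patch uses. The spelling of the
madvise-style policy is passed in by the caller (the patch uses "advise"),
and the within_size policy is gated by a flag since the patch describes it
as shmem-only.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical container: one bitmask of page orders per policy. */
struct thp_orders {
	unsigned long always;
	unsigned long inherit;
	unsigned long madvise;
	unsigned long within_size;	/* only meaningful for shmem */
};

/* Build a mask with bits start..end (inclusive) set. */
static unsigned long order_mask(int start, int end)
{
	int nr = end - start + 1;
	int width = (int)(sizeof(unsigned long) * CHAR_BIT);

	assert(start >= 0 && start <= end && end < width);
	return (nr == width ? ~0UL : (1UL << nr) - 1) << start;
}

/*
 * Move orders start..end to the named policy and clear them from all
 * other policies. advise_name is the spelling the caller accepts for
 * the madvise-style policy; allow_within_size gates the shmem-only one.
 * Returns 0 on success, -1 for an unknown or disallowed policy, in
 * which case nothing is modified.
 */
static int thp_orders_apply(struct thp_orders *o, const char *policy,
			    int start, int end,
			    const char *advise_name, int allow_within_size)
{
	unsigned long mask = order_mask(start, end);
	unsigned long *target = NULL;

	if (!strcmp(policy, "always"))
		target = &o->always;
	else if (!strcmp(policy, "inherit"))
		target = &o->inherit;
	else if (!strcmp(policy, advise_name))
		target = &o->madvise;
	else if (allow_within_size && !strcmp(policy, "within_size"))
		target = &o->within_size;
	else if (strcmp(policy, "never"))
		return -1;

	/* Clear the range in every set, then set it in the chosen one. */
	o->always &= ~mask;
	o->inherit &= ~mask;
	o->madvise &= ~mask;
	o->within_size &= ~mask;
	if (target)
		*target |= mask;	/* "never" leaves target NULL */
	return 0;
}
```

With something like this, each setup function would keep its own string
splitting and only call the helper once per parsed range, so the five nearly
identical if/else branches would exist in one place. Where such a helper
should live is exactly the open question above.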