From: Zi Yan
To: David Rientjes
CC: Alex Shi, Hugh Dickins, Andrea Arcangeli, "Kirill A. Shutemov", Song Liu, Michal Hocko, Matthew Wilcox, Minchan Kim, Vlastimil Babka, Chris Kennelly
Subject: Re: [RFC] Hugepage collapse in process context
Date: Wed, 17 Feb 2021 10:49:22 -0500

On 16 Feb 2021, at 23:24, David Rientjes wrote:

> Hi everybody,
>
> Khugepaged is slow by default, it scans at most 4096 pages every 10s.
> That's normally fine as a system-wide setting, but some applications would
> benefit from a more aggressive approach (as long as they are willing to
> pay for it).
>
> Instead of adding priorities for eligible ranges of memory to khugepaged,
> temporarily speeding khugepaged up for the whole system, or sharding its
> work for memory belonging to a certain process, one approach would be to
> allow userspace to induce hugepage collapse.
>
> The benefit to this approach would be that this is done in process context
> so its cpu is charged to the process that is inducing the collapse.
> Khugepaged is not involved.
>
> Idea was to allow userspace to induce hugepage collapse through the new
> process_madvise() call.  This allows us to collapse hugepages on behalf of
> current or another process for a vectored set of ranges.
>
> This could be done through a new process_madvise() mode *or* it could be a
> flag to MADV_HUGEPAGE since process_madvise() allows for a flag parameter
> to be passed.  For example, MADV_F_SYNC.
>
> When done, this madvise call would allocate a hugepage on the right node
> and attempt to do the collapse in process context just as khugepaged would
> otherwise do.
>
> This would immediately be useful for a malloc implementation, for example,
> that has released its memory back to the system using MADV_DONTNEED and
> will subsequently refault the memory.  Rather than wait for khugepaged to
> come along 30m later, for example, and collapse this memory into a
> hugepage (which could take a much longer time on a very large system), an
> alternative would be to use this process_madvise() mode to induce the
> action up front.
> In other words, say "I'm returning this memory to the
> application and it's going to be hot, so back it by a hugepage now rather
> than waiting until later."
>
> It would also be useful for read-only file-backed mappings for text
> segments.  Khugepaged should be happy, it's just less work done by generic
> kthreads that gets charged as an overall tax to everybody.
>
> Thoughts?

The idea sounds great to me.

One question on how it interacts with khugepaged: will the process be
excluded from khugepaged if this process_madvise() is used on it? That could
save khugepaged some additional scanning work if someone is already actively
collapsing hugepages for this process.

--
Best Regards,
Yan Zi
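
For concreteness, here is a rough userspace sketch of what the proposed call might look like from a malloc implementation. Nothing in it is an agreed interface: MADV_F_SYNC is only the name floated in the RFC, the flag value below is a placeholder, and the helper names hugepage_aligned_range() and request_sync_collapse() are made up for illustration. The sketch also assumes that collapse only makes sense on hugepage-aligned ranges (2 MiB on x86-64), so it trims the range first. The fallback values for MADV_HUGEPAGE and the syscall number are the common ones and may differ on some architectures.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/* HYPOTHETICAL: the RFC only suggests the name MADV_F_SYNC; this value is a placeholder. */
#define MADV_F_SYNC_HYPOTHETICAL 0x1u

#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14            /* common value; assumption for older headers */
#endif
#ifndef __NR_process_madvise
#define __NR_process_madvise 440    /* common value; assumption for older headers */
#endif

/*
 * Trim [addr, addr + len) to the largest sub-range whose start and end are
 * both multiples of huge_size (which must be a power of two).
 * Returns 1 and fills *out if at least one whole huge page fits, else 0.
 */
int hugepage_aligned_range(uintptr_t addr, size_t len, size_t huge_size,
                           struct iovec *out)
{
    assert(huge_size != 0 && (huge_size & (huge_size - 1)) == 0);

    uintptr_t mask  = (uintptr_t)huge_size - 1;
    uintptr_t start = (addr + mask) & ~mask;   /* round start up   */
    uintptr_t end   = (addr + len) & ~mask;    /* round end down   */

    if (end <= start)
        return 0;

    out->iov_base = (void *)start;
    out->iov_len  = end - start;
    return 1;
}

/*
 * Ask the kernel to collapse the given ranges of the process referred to by
 * pidfd, in the caller's context. With the hypothetical flag this is expected
 * to fail (likely with EINVAL) on kernels that do not implement such a mode.
 */
long request_sync_collapse(int pidfd, const struct iovec *vec, size_t vlen)
{
    return syscall(__NR_process_madvise, pidfd, vec, vlen,
                   MADV_HUGEPAGE, MADV_F_SYNC_HYPOTHETICAL);
}
```

An allocator would call hugepage_aligned_range() on each span it is about to hand back to the application, collect the results into an iovec array, and pass that to request_sync_collapse() with a pidfd for the target process. Whether khugepaged should then skip those ranges is exactly the open question above.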