From: Pasha Tatashin
Date: Fri, 22 Sep 2023 10:42:44 -0400
Subject: Re: [v4 0/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
To: Usama Arif
Cc: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org, linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com
References: <20230906112605.2286994-1-usama.arif@bytedance.com>
In-Reply-To: <20230906112605.2286994-1-usama.arif@bytedance.com>


On Wed, Sep 6, 2023 at 7:26 AM Usama Arif wrote:
>
> This series moves the boot time initialization of tail struct pages of a
> gigantic page to later on in the boot. Only the
> HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> are initialized at the start. If HVO is successful, then no more tail struct
> pages need to be initialized. For a 1G hugepage, this series avoids
> initialization of 262144 - 63 = 262081 struct pages per hugepage.
>
> When tested on a 512G system (which can allocate max 500 1G hugepages), the
> kexec-boot time with HVO and DEFERRED_STRUCT_PAGE_INIT enabled without this
> patch series to running init is 3.9 seconds. With this patch it is 1.2 seconds.
> This represents an approximately 70% reduction in boot time and will
> significantly reduce server downtime when using a large number of
> gigantic pages.

My use case is different, but this patch series benefits it. I have a
virtual machine with a large number of hugetlb pages. The RSS size of
the VM after boot is much smaller with this series:

Before: 9G
After: 600M

The VM has 500 1G pages and 512G total RAM. I would add to the
description that this series can help reduce the VM overhead and improve
boot performance for those who are using hugetlb pages in VMs.

Also, DEFERRED_STRUCT_PAGE_INIT is a requirement for this series to
work, and that should be added to the documentation.

Pasha

> Thanks,
> Usama
>
> [v3->v4]:
> - rebase on top of patch "hugetlb: set hugetlb page flag before optimizing vmemmap".
> - freeze head struct page ref count.
> - Change order of operations to initialize head struct page -> initialize
>   the necessary tail struct pages -> attempt HVO -> initialize the rest of the
>   tail struct pages if HVO fails.
> - (Mike Rapoport and Muchun Song) remove "_vmemmap" suffix from memblock reserve
>   noinit flags and functions.
>
> [v2->v3]:
> - (Muchun Song) skip prep of struct pages backing gigantic hugepages
>   at boot time only.
> - (Muchun Song) move initialization of tail struct pages to after
>   HVO is attempted.
>
> [v1->v2]:
> - (Mike Rapoport) Code quality improvements (function names, arguments,
>   comments).
>
> [RFC->v1]:
> - (Mike Rapoport) Change from passing hugepage_size in
>   memblock_alloc_try_nid_raw for skipping struct page initialization to
>   using MEMBLOCK_RSRV_NOINIT flag
>
> Usama Arif (4):
>   mm: hugetlb_vmemmap: Use nid of the head page to reallocate it
>   memblock: pass memblock_type to memblock_setclr_flag
>   memblock: introduce MEMBLOCK_RSRV_NOINIT flag
>   mm: hugetlb: Skip initialization of gigantic tail struct pages if
>     freed by HVO
>
>  include/linux/memblock.h |  9 ++++++
>  mm/hugetlb.c             | 61 ++++++++++++++++++++++++++++++++++------
>  mm/hugetlb_vmemmap.c     |  4 +--
>  mm/hugetlb_vmemmap.h     |  9 +++---
>  mm/internal.h            |  3 ++
>  mm/memblock.c            | 48 ++++++++++++++++++++++---------
>  mm/mm_init.c             |  2 +-
>  7 files changed, 107 insertions(+), 29 deletions(-)
>
> --
> 2.25.1
>
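
As a rough sanity check of the figures quoted in this thread, the arithmetic can be sketched as below. The 4 KiB base page size and the 64-byte struct page are assumptions (typical for x86-64), as is HUGETLB_VMEMMAP_RESERVE_SIZE being one base page; none of these values is stated in the thread, although together they do reproduce the "63" and "262081" from the cover letter.

```python
# Back-of-the-envelope check of the numbers quoted in the thread.
# ASSUMPTIONS (not stated in the thread): 4 KiB base pages and a
# 64-byte struct page, as is common on x86-64.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
HUGEPAGE_SIZE = 1 << 30  # 1G gigantic page

# ASSUMPTION: the reserve size is one base page, which is consistent
# with the "63" tail struct pages mentioned in the cover letter.
HUGETLB_VMEMMAP_RESERVE_SIZE = PAGE_SIZE

# Struct pages describing one 1G hugepage, and how many tail struct
# pages are still initialized up front.
struct_pages_per_hugepage = HUGEPAGE_SIZE // PAGE_SIZE
tail_pages_initialized_early = HUGETLB_VMEMMAP_RESERVE_SIZE // STRUCT_PAGE_SIZE - 1
skipped = struct_pages_per_hugepage - tail_pages_initialized_early

# Memory occupied by the struct pages of one hugepage, and of the 500
# hugepages used in both the bare-metal and the VM setup.
vmemmap_bytes_per_hugepage = struct_pages_per_hugepage * STRUCT_PAGE_SIZE
vmemmap_total_gib = 500 * vmemmap_bytes_per_hugepage / (1 << 30)

# Relative boot time reduction reported in the cover letter.
boot_reduction = (3.9 - 1.2) / 3.9

print(f"struct pages skipped per hugepage: {skipped}")
print(f"vmemmap per hugepage: {vmemmap_bytes_per_hugepage >> 20} MiB")
print(f"vmemmap for 500 hugepages: {vmemmap_total_gib:.2f} GiB")
print(f"boot time reduction: {boot_reduction:.0%}")
```

Under these assumptions the struct pages for 500 gigantic pages add up to a little under 8 GiB, which is the same order of magnitude as the RSS difference reported above (9G before, 600M after). Reading this as "the guest no longer touches that memory during boot" is an interpretation, not something the thread states outright.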