Date: Thu, 31 Aug 2023 10:33:06 +0300
From: Mike Rapoport
To: Usama Arif
Cc: Muchun Song, Linux-MM, Mike Kravetz, LKML, Muchun Song, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, mgorman@techsingularity.net, akpm@linux-foundation.org
Subject: Re: [External] Re: [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
Message-ID: <20230831073306.GE3223@kernel.org>
References: <20230825111836.1715308-1-usama.arif@bytedance.com> <20230825111836.1715308-5-usama.arif@bytedance.com> <486CFF93-3BB1-44CD-B0A0-A47F560F2CAE@linux.dev>

On Wed, Aug 30, 2023 at 11:27:42AM +0100, Usama Arif wrote:
>
> On 28/08/2023 12:33, Muchun Song wrote:
> >
> >
> > > On Aug 25, 2023, at 19:18, Usama Arif wrote:
> > >
> > > The new boot flow when it comes to initialization of gigantic pages
> > > is as follows:
> > > - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> > >   the region after the first struct page is marked as noinit.
> > > - This results in only the first struct page being
> > >   initialized in reserve_bootmem_region. As the tail struct pages are
> > >   not initialized at this point, there can be a significant saving
> > >   in boot time if HVO succeeds later on.
> > > - Later on in the boot, HVO is attempted. If it's successful, only the first
> > >   HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> > >   after the head struct page are initialized. If it is not successful,
> > >   then all of the tail struct pages are initialized.
> > >
> > > Signed-off-by: Usama Arif
> >
> > This edition is simpler than ever before, thanks for your work.
> >
> > There is a premise that other subsystems do not access vmemmap pages
> > before the initialization of vmemmap pages associated with HugeTLB
> > pages allocated from bootmem for your optimization. However, IIUC, the
> > compacting path could access arbitrary struct pages when memory fails
> > to be allocated via the buddy allocator. So we should make sure that
> > those struct pages are not referenced in this routine. And I know
> > if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> > the same issue, but I don't find any code to prevent this from
> > happening. I need more time to confirm this, if someone already knows,
> > please let me know, thanks. So I think HugeTLB should adopt a similar
> > way to prevent this.
> >
> > Thanks.
> >
>
> Thanks for the reviews.
>
> So if I understand it correctly, the uninitialized pages due to the
> optimization in this patch and due to DEFERRED_STRUCT_PAGE_INIT should be
> treated in the same way during compaction. I see that in isolate_freepages
> during compaction there is a check to see if the PageBuddy flag is set and also
> there are calls like __pageblock_pfn_to_page to check if the pageblock is
> valid.
>
> But if the struct pages are uninitialized then they would contain random data
> and these checks could pass if certain bits were set?
>
> Compaction is done on the free list. I think the uninitialized struct pages
> at least from DEFERRED_STRUCT_PAGE_INIT would be part of the free list, so I think
> their pfn would be considered for compaction.
>
> Could someone more familiar with DEFERRED_STRUCT_PAGE_INIT and compaction
> confirm how the uninitialized struct pages are handled when compaction
> happens? Thanks!

I'm not familiar enough with compaction to confirm it only touches pages on the
free lists, but DEFERRED_STRUCT_PAGE_INIT makes sure the struct page is
initialized before it's put on a free list.

> Usama

-- 
Sincerely yours,
Mike.