Date: Tue, 10 Jun 2025 08:44:45 +0300
From: Mike Rapoport
To: Pasha Tatashin
Cc: Pratyush Yadav, Alexander Graf, Changyuan Lyu, Andrew Morton,
	Baoquan He, kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] kho: initialize tail pages for higher order folios properly
References: <20250605171143.76963-1-pratyush@kernel.org>

On Mon, Jun 09, 2025 at 04:07:50PM -0400, Pasha Tatashin wrote:
> On Mon, Jun 9, 2025 at 3:36 PM Mike Rapoport wrote:
> >
> > Hi Pratyush,
> >
> > On Fri, Jun 06, 2025 at 06:23:06PM +0200, Pratyush Yadav wrote:
> > > Hi Mike,
> > >
> > > On Fri, Jun 06 2025, Mike Rapoport wrote:
> > >
> > > > On Thu, Jun 05, 2025 at 07:11:41PM +0200, Pratyush Yadav wrote:
> > > >> From: Pratyush Yadav
> > > >>
> > > >> --- a/kernel/kexec_handover.c
> > > >> +++ b/kernel/kexec_handover.c
> > > >> @@ -157,11 +157,21 @@ static int __kho_preserve_order(struct kho_mem_track *track, unsigned long pfn,
> > > >>  }
> > > >>
> > > >>  /* almost as free_reserved_page(), just don't free the page */
> > > >> -static void kho_restore_page(struct page *page)
> > > >> +static void kho_restore_page(struct page *page, unsigned int order)
> > > >>  {
> > > >> -	ClearPageReserved(page);
> > > >
> > > > So now we don't clear PG_Reserved even on order-0 pages? ;-)
> > >
> > > We don't need to. As I mentioned in the commit message as well,
> > > PG_Reserved is never set for KHO pages since they are reserved with
> > > MEMBLOCK_RSRV_NOINIT, so memmap_init_reserved_pages() skips over them.
> >
> > You are right, I missed it.
> >
> > > That said, while reading through some of the code, I noticed another
> > > bug: because KHO reserves the preserved pages as NOINIT, with
> > > CONFIG_DEFERRED_STRUCT_PAGE_INIT == n, all the pages get initialized
> > > when memmap_init_range() is called from setup_arch (paging_init() on
> > > x86). This happens before kho_memory_init(), so the KHO-preserved pages
> > > are not marked as reserved to memblock yet.
> > >
> > > With deferred page init, some pages might not get initialized early, and
> > > get initialized after kho_memory_init(), by which time the KHO-preserved
> > > pages are marked as reserved. So, deferred_init_maxorder() will skip
> > > over those pages and leave them uninitialized.
> > >
> > > So we need to either also call init_deferred_page(), or remove the
> > > memblock_reserved_mark_noinit() call in deserialize_bitmap(). And TBH, I
> > > am not sure why KHO pages even need to be marked noinit in the first
> > > place. Probably the only benefit would be if a large chunk of memory is
> > > KHO-preserved, the pages can be initialized later on-demand, reducing
> > > bootup time a bit.
> >
> > One benefit is performance indeed, because in the non-deferred case the
> > initialization of reserved pages in memmap_init_reserved_pages() is really
> > excessive.
> >
> > But more importantly, if we remove memblock_reserved_mark_noinit(), with
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT we'd lose page->private because the
> > struct page will be cleared after kho_mem_deserialize().
> >
> > > What do you think? Should we drop noinit or call init_deferred_page()?
> > > FWIW, my preference is to drop noinit, since init_deferred_page() is
> > > __meminit and we would have to make sure it doesn't go away after boot.
> >
> > We can't drop noinit, and calling init_deferred_page() after boot just won't
> > work because it uses memblock to find the page's node and memblock is gone
> > after init.
> >
> > The simplest short-term solution is to disable KHO when
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT is set and then find an efficient way to
> > make it all work together.
>
> This is what I've done in LUOv3 WIP:
> https://github.com/soleen/linux/commit/3059f38ac0a39a397873759fb429bd5d1f8ea681

I think it should be the other way around: KHO should depend on
!DEFERRED_STRUCT_PAGE_INIT.

> We will need to teach KHO to work with deferred struct page init. I
> suspect we could init preserved struct pages and then skip over them
> during deferred init.

We could, but that would mean we'd run this before SMP, which is not
desirable. Also, init_deferred_page() for a random page requires finding
its node with early_pfn_to_nid(), which is also suboptimal.

> Pasha
>
> >
> > --
> > Sincerely yours,
> > Mike.

-- 
Sincerely yours,
Mike.
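
For reference, a minimal sketch of the dependency direction suggested above
("KHO should depend on !DEFERRED_STRUCT_PAGE_INIT"). It assumes the KHO option
is the KEXEC_HANDOVER symbol; the symbol name and prompt text are assumptions
and are not taken from this thread, and any other existing "depends on" or
"select" lines are left out. Untested:

```kconfig
# Hypothetical fragment: symbol name and prompt are assumed, other
# attributes of the option are omitted.
config KEXEC_HANDOVER
	bool "kexec handover"
	# Make KHO unavailable whenever struct page init is deferred.
	depends on !DEFERRED_STRUCT_PAGE_INIT
```

With the dependency on the KHO side, a configuration that enables
DEFERRED_STRUCT_PAGE_INIT simply cannot select KHO, instead of KHO turning
deferred struct page init off.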