Date: Wed, 30 Aug 2023 11:27:42 +0100
Subject: Re: [External] Re: [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
From: Usama Arif
To: Muchun Song
Cc: Linux-MM, Mike Kravetz, Mike Rapoport, LKML, Muchun Song, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, mgorman@techsingularity.net, akpm@linux-foundation.org
References: <20230825111836.1715308-1-usama.arif@bytedance.com> <20230825111836.1715308-5-usama.arif@bytedance.com> <486CFF93-3BB1-44CD-B0A0-A47F560F2CAE@linux.dev>
In-Reply-To: <486CFF93-3BB1-44CD-B0A0-A47F560F2CAE@linux.dev>

On 28/08/2023 12:33, Muchun Song wrote:
>
>
>> On Aug 25, 2023, at 19:18, Usama Arif wrote:
>>
>> The new boot flow when it comes to initialization of gigantic pages
>> is as follows:
>> - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
>> the region after the first struct page is marked as noinit.
>> - This results in only the first struct page to be
>> initialized in reserve_bootmem_region. As the tail struct pages are
>> not initialized at this point, there can be a significant saving
>> in boot time if HVO succeeds later on.
>> - Later on in the boot, HVO is attempted. If its successful, only the first
>> HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
>> after the head struct page are initialized. If it is not successful,
>> then all of the tail struct pages are initialized.
>>
>> Signed-off-by: Usama Arif
>
> This edition is simpler than before ever, thanks for your work.
>
> There is premise that other subsystems do not access vmemmap pages
> before the initialization of vmemmap pages associated withe HugeTLB
> pages allocated from bootmem for your optimization. However, IIUC, the
> compacting path could access arbitrary struct page when memory fails
> to be allocated via buddy allocator. So we should make sure that
> those struct pages are not referenced in this routine. And I know
> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> the same issue, but I don't find any code to prevent this from
> happening. I need more time to confirm this, if someone already knows,
> please let me know, thanks. So I think HugeTLB should adopt the similar
> way to prevent this.
>
> Thanks.
>

Thanks for the reviews.

So if I understand correctly, the struct pages left uninitialized by the
optimization in this patch and those left uninitialized by
DEFERRED_STRUCT_PAGE_INIT should be treated in the same way during
compaction.

I see that isolate_freepages checks during compaction whether the
PageBuddy flag is set, and there are also calls like
__pageblock_pfn_to_page to check whether the pageblock is valid. But if
a struct page is uninitialized, it would contain random data, and these
checks could pass if certain bits happened to be set?

Compaction works on the free list. I think the uninitialized struct
pages, at least those from DEFERRED_STRUCT_PAGE_INIT, would be part of
the free list, so I think their pfns would be considered for compaction.

Could someone more familiar with DEFERRED_STRUCT_PAGE_INIT and
compaction confirm how uninitialized struct pages are handled when
compaction happens?

Thanks!
Usama
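
For a sense of scale, the tail-page count in the boot flow quoted above
can be worked through in a small user-space C sketch. This is only an
illustration, not kernel code: the EX_* constants and ex_* helpers are
hypothetical names, and the values are assumptions (4 KiB base pages, a
64-byte struct page, a 1 GiB gigantic page, and
HUGETLB_VMEMMAP_RESERVE_SIZE equal to one base page). Other
configurations would give different numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed values for illustration only; none of these come from the patch. */
#define EX_PAGE_SIZE            4096UL        /* base page size */
#define EX_STRUCT_PAGE_SIZE     64UL          /* assumed sizeof(struct page) */
#define EX_GIGANTIC_PAGE_SIZE   (1UL << 30)   /* 1 GiB gigantic page */
#define EX_VMEMMAP_RESERVE_SIZE EX_PAGE_SIZE  /* assumed HUGETLB_VMEMMAP_RESERVE_SIZE */

/* Struct pages describing one gigantic page: the head plus all tails. */
static unsigned long ex_total_struct_pages(void)
{
    return EX_GIGANTIC_PAGE_SIZE / EX_PAGE_SIZE;
}

/* Tail struct pages initialized when HVO succeeds (formula from the patch description). */
static unsigned long ex_tails_init_hvo(void)
{
    assert(EX_VMEMMAP_RESERVE_SIZE >= EX_STRUCT_PAGE_SIZE);
    return EX_VMEMMAP_RESERVE_SIZE / EX_STRUCT_PAGE_SIZE - 1;
}

/* Tail struct pages initialized when HVO fails: every tail. */
static unsigned long ex_tails_init_no_hvo(void)
{
    return ex_total_struct_pages() - 1;
}

/* Tail struct pages whose initialization is skipped when HVO succeeds. */
static unsigned long ex_tails_skipped(void)
{
    return ex_tails_init_no_hvo() - ex_tails_init_hvo();
}

/* Print the per-gigantic-page numbers under the assumptions above. */
static void ex_print_summary(void)
{
    printf("struct pages per gigantic page: %lu\n", ex_total_struct_pages());
    printf("tails initialized, HVO ok:      %lu\n", ex_tails_init_hvo());
    printf("tails initialized, HVO failed:  %lu\n", ex_tails_init_no_hvo());
    printf("tails skipped, HVO ok:          %lu\n", ex_tails_skipped());
}
```

Calling ex_print_summary() from a main() prints the four counts. Under
these assumptions nearly all tail struct pages of each gigantic page are
never initialized when HVO succeeds, which is where the boot-time saving
comes from and also why any path that reads those struct pages early
(such as the compaction question above) matters.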