Date: Tue, 11 Jul 2023 19:19:03 +0200
From: David Hildenbrand <david@redhat.com>
Organization: Red Hat
To: "Aneesh Kumar K.V", linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador, Michal Hocko, Vishal Verma
Subject: Re: [PATCH v3 4/7] mm/hotplug: Allow pageblock alignment via altmap reservation
In-Reply-To: <20230711044834.72809-5-aneesh.kumar@linux.ibm.com>
References: <20230711044834.72809-1-aneesh.kumar@linux.ibm.com> <20230711044834.72809-5-aneesh.kumar@linux.ibm.com>
On 11.07.23 06:48, Aneesh Kumar K.V wrote:
> Add a new kconfig option that can be selected if we want to allow
> pageblock alignment by reserving pages in the vmemmap altmap area.
> This implies we will be reserving some pages for every memory block.
> This also allows the memmap on memory feature to be widely useful
> with different memory block size values.

"reserving pages" is a nice way of saying "wasting memory". :) Let's
spell that out.
I think we have to find a better name for this, and I think we should
have a toggle similar to memory_hotplug.memmap_on_memory. This should
be an admin decision, not some kernel config option.

memory_hotplug.force_memmap_on_memory

"Enable the memmap on memory feature even if it could result in memory
waste due to memmap size limitations. For example, if the memmap for a
memory block requires 1 MiB, but the pageblock size is 2 MiB, 1 MiB of
hotplugged memory will be wasted. Note that there are still cases where
the feature cannot be enforced: for example, if the memmap is smaller
than a single page, or if the architecture does not support the forced
mode in all configurations."

Thoughts?

> 
> Signed-off-by: Aneesh Kumar K.V
> ---
>  mm/Kconfig          |  9 +++++++
>  mm/memory_hotplug.c | 59 +++++++++++++++++++++++++++++++++++++--------
>  2 files changed, 58 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 932349271e28..88a1472b2086 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -570,6 +570,15 @@ config MHP_MEMMAP_ON_MEMORY
>  	depends on MEMORY_HOTPLUG && SPARSEMEM_VMEMMAP
>  	depends on ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
>  
> +config MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY
> +	bool "Allow reserving pages for pageblock alignment"
> +	depends on MHP_MEMMAP_ON_MEMORY
> +	help
> +	  This option allows the memmap on memory feature to be more useful
> +	  with different memory block sizes. This is achieved by marking some
> +	  pages in each memory block as reserved so that we can get pageblock
> +	  alignment for the remaining pages.
> +
> endif # MEMORY_HOTPLUG
> 
> config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 07c99b0cc371..f36aec1f7626 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1252,15 +1252,17 @@ static inline bool arch_supports_memmap_on_memory(unsigned long size)
>  {
>  	unsigned long nr_vmemmap_pages = size >> PAGE_SHIFT;
>  	unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
> -	unsigned long remaining_size = size - vmemmap_size;
>  
> -	return IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
> -	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
> +	return IS_ALIGNED(vmemmap_size, PMD_SIZE);
>  }
>  #endif
>  
>  static bool mhp_supports_memmap_on_memory(unsigned long size)
>  {
> +	unsigned long nr_vmemmap_pages = size >> PAGE_SHIFT;
> +	unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
> +	unsigned long remaining_size = size - vmemmap_size;
> +
>  	/*
>  	 * Besides having arch support and the feature enabled at runtime, we
>  	 * need a few more assumptions to hold true:
> @@ -1287,9 +1289,30 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
>  	 * altmap as an alternative source of memory, and we do not exactly
>  	 * populate a single PMD.
>  	 */
> -	return mhp_memmap_on_memory() &&
> -	       size == memory_block_size_bytes() &&
> -	       arch_supports_memmap_on_memory(size);
> +	if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
> +		return false;
> +	/*
> +	 * Without page reservation, the remaining pages should be
> +	 * pageblock aligned.
> +	 */
> +	if (!IS_ENABLED(CONFIG_MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY) &&
> +	    !IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)))
> +		return false;
> +
> +	return arch_supports_memmap_on_memory(size);
> +}
> +
> +static inline unsigned long memory_block_align_base(unsigned long size)
> +{
> +	if (IS_ENABLED(CONFIG_MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY)) {
> +		unsigned long align;
> +		unsigned long nr_vmemmap_pages = size >> PAGE_SHIFT;
> +		unsigned long vmemmap_size;
> +
> +		vmemmap_size = (nr_vmemmap_pages * sizeof(struct page)) >> PAGE_SHIFT;
> +		align = pageblock_align(vmemmap_size) - vmemmap_size;

We should probably have a helper to calculate

a) the unaligned vmemmap size, for example used in
   arch_supports_memmap_on_memory()

b) the pageblock-aligned vmemmap size.

-- 
Cheers,

David / dhildenb