Date: Wed, 26 Jul 2023 11:41:59 +0200
To: "Aneesh Kumar K.V" , linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador , Michal Hocko , Vishal Verma
References: <20230725100212.531277-1-aneesh.kumar@linux.ibm.com> <20230725100212.531277-7-aneesh.kumar@linux.ibm.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH v5 6/7] mm/hotplug: Embed vmem_altmap details in memory block
In-Reply-To: <20230725100212.531277-7-aneesh.kumar@linux.ibm.com>

On 25.07.23 12:02, Aneesh Kumar K.V wrote:
> With memmap on memory, some architecture needs more details w.r.t altmap
> such as base_pfn, end_pfn, etc to unmap vmemmap memory. Instead of
> computing them again when we remove a memory block, embed vmem_altmap
> details in struct memory_block if we are using memmap on memory block
> feature.
> 
> No functional change in this patch
> 
> Signed-off-by: Aneesh Kumar K.V
> ---

[...]

> 
>  static int add_memory_block(unsigned long block_id, unsigned long state,
> -			    unsigned long nr_vmemmap_pages,
> +			    struct vmem_altmap *altmap,
>  			    struct memory_group *group)
>  {
>  	struct memory_block *mem;
> @@ -744,7 +751,14 @@ static int add_memory_block(unsigned long block_id, unsigned long state,
>  	mem->start_section_nr = block_id * sections_per_block;
>  	mem->state = state;
>  	mem->nid = NUMA_NO_NODE;
> -	mem->nr_vmemmap_pages = nr_vmemmap_pages;
> +	if (altmap) {
> +		mem->altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL);
> +		if (!mem->altmap) {
> +			kfree(mem);
> +			return -ENOMEM;
> +		}
> +		memcpy(mem->altmap, altmap, sizeof(*altmap));
> +	}

I'm wondering if we should instead let the caller do the alloc/free. So we
would alloc in the caller and would only store the pointer here.

Before removing the memory block, we would clear the pointer and free it in
the caller. IOW, when removing a memory block and we still have an altmap
set, something would be wrong.

See below on try_remove_memory() handling.
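For the add side, I'm thinking of something like the following in
add_memory_resource() (rough, untested sketch). It assumes
create_memory_block_devices() is changed to take the vmem_altmap pointer
instead of nr_vmemmap_pages; the other names are from the current tree, so
they might look a bit different on top of this series:

	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
		if (mhp_supports_memmap_on_memory(size)) {
			/* ... set up mhp_altmap as this series already does ... */

			/*
			 * The caller owns the allocation; the memory blocks
			 * would only store the pointer.
			 */
			params.altmap = kmemdup(&mhp_altmap, sizeof(mhp_altmap),
						GFP_KERNEL);
			if (!params.altmap)
				goto error;
		}
		/* fallback to not using altmap */
	}

	/* call arch's memory hotadd */
	ret = arch_add_memory(nid, start, size, &params);
	if (ret < 0) {
		kfree(params.altmap);
		goto error;
	}

	/*
	 * Assumes create_memory_block_devices() is changed to take the
	 * altmap pointer and simply store it in each new memory block.
	 */
	ret = create_memory_block_devices(start, size, params.altmap, group);
	if (ret) {
		arch_remove_memory(start, size, params.altmap);
		kfree(params.altmap);
		goto error;
	}

That keeps the allocation and the kfree in one place, and a memory block
that still has mem->altmap set at removal time would clearly indicate a bug.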
[...]

> -static int get_nr_vmemmap_pages_cb(struct memory_block *mem, void *arg)
> +static int get_vmemmap_altmap_cb(struct memory_block *mem, void *arg)
>  {
> +	struct vmem_altmap *altmap = (struct vmem_altmap *)arg;
>  	/*
> -	 * If not set, continue with the next block.
> +	 * If we have any pages allocated from altmap
> +	 * return the altmap details and break callback.
>  	 */
> -	return mem->nr_vmemmap_pages;
> +	if (mem->altmap) {
> +		memcpy(altmap, mem->altmap, sizeof(struct vmem_altmap));
> +		return 1;
> +	}
> +	return 0;
>  }
>  
>  static int check_cpu_on_node(int nid)
> @@ -2146,9 +2152,8 @@ EXPORT_SYMBOL(try_offline_node);
>  
>  static int __ref try_remove_memory(u64 start, u64 size)
>  {
> -	struct vmem_altmap mhp_altmap = {};
> -	struct vmem_altmap *altmap = NULL;
> -	unsigned long nr_vmemmap_pages;
> +	int ret;
> +	struct vmem_altmap mhp_altmap, *altmap = NULL;
>  	int rc = 0, nid = NUMA_NO_NODE;
>  
>  	BUG_ON(check_hotplug_memory_range(start, size));
> @@ -2171,24 +2176,15 @@ static int __ref try_remove_memory(u64 start, u64 size)
>  	 * the same granularity it was added - a single memory block.
>  	 */
>  	if (mhp_memmap_on_memory()) {
> -		nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
> -						      get_nr_vmemmap_pages_cb);
> -		if (nr_vmemmap_pages) {
> +		ret = walk_memory_blocks(start, size, &mhp_altmap,
> +					 get_vmemmap_altmap_cb);
> +		if (ret) {
>  			if (size != memory_block_size_bytes()) {
>  				pr_warn("Refuse to remove %#llx - %#llx,"
>  					"wrong granularity\n",
>  					start, start + size);
>  				return -EINVAL;
>  			}
> -
> -			/*
> -			 * Let remove_pmd_table->free_hugepage_table do the
> -			 * right thing if we used vmem_altmap when hot-adding
> -			 * the range.
> -			 */
> -			mhp_altmap.base_pfn = PHYS_PFN(start);
> -			mhp_altmap.free = nr_vmemmap_pages;
> -			mhp_altmap.alloc = nr_vmemmap_pages;
>  			altmap = &mhp_altmap;
>  		}

Instead of that, I suggest (whitespace damage expected):

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3f231cf1b410..f6860df64549 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1956,12 +1956,19 @@ static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
 	return 0;
 }
 
-static int get_nr_vmemmap_pages_cb(struct memory_block *mem, void *arg)
+static int test_has_altmap_cb(struct memory_block *mem, void *arg)
 {
-	/*
-	 * If not set, continue with the next block.
-	 */
-	return mem->nr_vmemmap_pages;
+	struct memory_block **mem_ptr = (struct memory_block **)arg;
+
+	if (mem->altmap) {
+		/*
+		 * We're not taking a reference on the memory block; it
+		 * cannot vanish while we're about to remove that memory ourselves.
+		 */
+		*mem_ptr = mem;
+		return 1;
+	}
+	return 0;
 }
 
 static int check_cpu_on_node(int nid)
@@ -2036,9 +2043,7 @@ EXPORT_SYMBOL(try_offline_node);
 
 static int __ref try_remove_memory(u64 start, u64 size)
 {
-	struct vmem_altmap mhp_altmap = {};
 	struct vmem_altmap *altmap = NULL;
-	unsigned long nr_vmemmap_pages;
 	int rc = 0, nid = NUMA_NO_NODE;
 
 	BUG_ON(check_hotplug_memory_range(start, size));
@@ -2061,9 +2066,9 @@ static int __ref try_remove_memory(u64 start, u64 size)
 	 * the same granularity it was added - a single memory block.
 	 */
 	if (mhp_memmap_on_memory()) {
-		nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
-						      get_nr_vmemmap_pages_cb);
-		if (nr_vmemmap_pages) {
+		struct memory_block *mem;
+
+		if (walk_memory_blocks(start, size, &mem, test_has_altmap_cb)) {
 			if (size != memory_block_size_bytes()) {
 				pr_warn("Refuse to remove %#llx - %#llx,"
 					"wrong granularity\n",
@@ -2072,12 +2077,11 @@ static int __ref try_remove_memory(u64 start, u64 size)
 			}
 
 			/*
-			 * Let remove_pmd_table->free_hugepage_table do the
-			 * right thing if we used vmem_altmap when hot-adding
-			 * the range.
+			 * Clear the altmap from the memory block before we
+			 * remove it; we'll take care of freeing the altmap.
 			 */
-			mhp_altmap.alloc = nr_vmemmap_pages;
-			altmap = &mhp_altmap;
+			altmap = mem->altmap;
+			mem->altmap = NULL;
 		}
 	}
 
@@ -2094,6 +2098,9 @@ static int __ref try_remove_memory(u64 start, u64 size)
 
 	arch_remove_memory(start, size, altmap);
 
+	if (altmap)
+		kfree(altmap);
+
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
 		memblock_phys_free(start, size);
 		memblock_remove(start, size);

-- 
Cheers,

David / dhildenb