Message-ID: <1a35cb1c-5be5-3fba-d59f-132b36863312@linux.ibm.com>
Date: Thu, 6 Jul 2023 21:36:52 +0530
Subject: Re: [PATCH v2 1/5] mm/hotplug: Embed vmem_altmap details in memory block
To: David Hildenbrand, linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador, Michal Hocko, Vishal Verma
References: <20230706085041.826340-1-aneesh.kumar@linux.ibm.com>
 <20230706085041.826340-2-aneesh.kumar@linux.ibm.com>
 <72488b8a-8f1e-c652-ab48-47e38290441f@redhat.com>
 <996e226a-2835-5b53-2255-2005c6335f98@linux.ibm.com>
 <9ca978e7-5c09-6d92-7983-03a731549b25@linux.ibm.com>
 <256bd2f0-1b77-26dc-6393-b26dd363912f@redhat.com>
From: Aneesh Kumar K V
In-Reply-To: <256bd2f0-1b77-26dc-6393-b26dd363912f@redhat.com>

On 7/6/23 6:29 PM, David Hildenbrand wrote:
> On 06.07.23 14:32, Aneesh Kumar K V wrote:
>> On 7/6/23 4:44 PM, David Hildenbrand wrote:
>>> On 06.07.23 11:36, Aneesh Kumar K V wrote:
>>>> On 7/6/23 2:48 PM, David Hildenbrand wrote:
>>>>> On 06.07.23 10:50, Aneesh Kumar K.V wrote:
>>>>>> With memmap on memory, some architecture needs more details w.r.t altmap
>>>>>> such as base_pfn, end_pfn, etc to unmap vmemmap memory.
>>>>>
>>>>> Can you elaborate why ppc64 needs that and x86-64 + aarch64 don't?
>>>>>
>>>>> IOW, why can't ppc64 simply allocate the vmemmap from the start of the memblock (-> base_pfn) and use the stored number of vmemmap pages to calculate the end_pfn?
>>>>>
>>>>> To rephrase: if the vmemmap is not at the beginning and doesn't cover full pageblocks, memory onlining/offlining would be broken.
>>>>>
>>>>> [...]
>>>>
>>>>
>>>> With ppc64 and 64K pagesize and different memory block sizes, we can end up allocating vmemmap backing memory from outside altmap because
>>>> a single page vmemmap can cover 1024 pages (64 * 1024 / sizeof(struct page)), and that can point to pages outside the dev_pagemap range.
>>>> So on free we check
>>>
>>> So you end up with a mixture of altmap and ordinarily-allocated vmemmap pages? That sounds wrong (and is counter-intuitive to the feature in general, where we *don't* want to allocate the vmemmap from outside the altmap).
>>>
>>> (64 * 1024) / sizeof(struct page) -> 1024 pages
>>>
>>> 1024 pages * 64k = 64 MiB.
>>>
>>> What's the memory block size on these systems? If it's >= 64 MiB the vmemmap of a single memory block fits into a single page and we should be fine.
>>>
>>> Smells like you want to disable the feature on a 64k system.
>>>
>>
>> But that part of vmemmap_free is common for both dax, dax kmem and the new memmap on memory feature. i.e., ppc64 vmemmap_free has checks which require
>> a full altmap structure with all the details in. So for memmap on memory to work on ppc64 we do require a similar altmap struct. Hence the idea
>> of adding vmem_altmap to struct memory_block
>
> I'd suggest making sure that for the memmap_on_memory case you really *always* allocate from the altmap (that's what the feature is about after all), and otherwise block the feature (i.e., arch_mhp_supports_... should reject it).
>

Sure. How about?

bool mhp_supports_memmap_on_memory(unsigned long size)
{
	unsigned long nr_pages = size >> PAGE_SHIFT;
	unsigned long vmemmap_size = nr_pages * sizeof(struct page);

	if (!radix_enabled())
		return false;
	/*
	 * memmap on memory only supported with memory block size add/remove
	 */
	if (size != memory_block_size_bytes())
		return false;
	/*
	 * Also make sure the vmemmap allocation is fully contained
	 * so that we always allocate vmemmap memory from altmap area.
	 */
	if (!IS_ALIGNED(vmemmap_size, PAGE_SIZE))
		return false;
	/*
	 * The pageblock alignment requirement is met by using
	 * reserve blocks in altmap.
	 */
	return true;
}

> Then, you can reconstruct the altmap layout trivially
>
> base_pfn: start of the range to unplug
> end_pfn: base_pfn + nr_vmemmap_pages
>
> and pass that to the removal code, which will do the right thing, no?
>
>
> Sure, remembering the altmap might be a potential cleanup (eventually?), but the basic reasoning why this is required as patch #1 IMHO is wrong: if you say you support memmap_on_memory for a configuration, then you should also properly support it (allocate from the hotplugged memory), not silently fall back to something else.

I guess you want to keep the altmap introduction as a later patch in the series and not the preparatory patch? Or are you ok with just adding the additional check I mentioned above w.r.t size value and keep this patch as patch 1 as a generic cleanup (avoiding the recomputation of altmap->alloc/base_pfn/end_pfn)?

-aneesh
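
For illustration only, the arithmetic behind the alignment check and the base_pfn/end_pfn reconstruction discussed above can be sketched in user space. This is not kernel code: the sketch_* names, the 64K page size and the 64-byte struct page are assumptions chosen to match the numbers quoted in the thread, and struct sketch_altmap is a hypothetical stand-in that carries only the two fields mentioned.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumptions for illustration: 64K base pages, 64-byte struct page. */
#define SKETCH_PAGE_SHIFT       16UL
#define SKETCH_PAGE_SIZE        (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_STRUCT_PAGE_SIZE 64UL

/* Hypothetical stand-in holding only the two fields named in the thread. */
struct sketch_altmap {
	unsigned long base_pfn;
	unsigned long end_pfn;
};

/* Bytes of vmemmap needed to describe a hotplugged range of `size` bytes. */
static unsigned long sketch_vmemmap_size(unsigned long size)
{
	return (size >> SKETCH_PAGE_SHIFT) * SKETCH_STRUCT_PAGE_SIZE;
}

/*
 * True when the vmemmap occupies whole pages, i.e. no vmemmap page is
 * shared with struct pages that describe memory outside the range.
 */
static bool sketch_vmemmap_fully_contained(unsigned long size)
{
	return (sketch_vmemmap_size(size) % SKETCH_PAGE_SIZE) == 0;
}

/* Rebuild the layout from the range start and the vmemmap page count. */
static struct sketch_altmap sketch_reconstruct(unsigned long start_pfn,
					       unsigned long nr_vmemmap_pages)
{
	struct sketch_altmap altmap = {
		.base_pfn = start_pfn,
		.end_pfn = start_pfn + nr_vmemmap_pages,
	};

	return altmap;
}

/* Small self-check using the numbers from the discussion. */
static void sketch_selfcheck(void)
{
	/* 64 MiB block: 1024 pages * 64 bytes = exactly one 64K vmemmap page. */
	assert(sketch_vmemmap_size(64UL << 20) == SKETCH_PAGE_SIZE);
	assert(sketch_vmemmap_fully_contained(64UL << 20));
	/* 16 MiB block: 16K of vmemmap, only part of a 64K page. */
	assert(!sketch_vmemmap_fully_contained(16UL << 20));
}
```

Under these assumptions a 16 MiB range would be rejected by the alignment check, while 64 MiB and 256 MiB ranges would pass, using one and four vmemmap pages respectively.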