Date: Wed, 21 May 2025 16:24:13 +0200
From: Sumanth Korikkar
To: David Hildenbrand
Cc: linux-mm, Andrew Morton, Oscar Salvador, Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, linux-s390, LKML
Subject: Re: [RFC PATCH 1/4] mm/memory_hotplug: Add interface for runtime (de)configuration of memory
References: <20241202082732.3959803-1-sumanthk@linux.ibm.com> <20241202082732.3959803-2-sumanthk@linux.ibm.com> <3151b9a0-3e96-4820-b6af-9f9ec4996ee1@redhat.com> <1b9285ba-4118-4572-9392-42ec6ba6728c@redhat.com> <496e6707-bdc9-4ad2-88e2-51236549b5f2@redhat.com>
In-Reply-To: <496e6707-bdc9-4ad2-88e2-51236549b5f2@redhat.com>

> So, the same as /sys/devices/system/memory/block_size_bytes ?
>
> In a future where we could have variable sized memory blocks, what would be
> the granularity here?

I wasn't aware of variable-sized memory blocks. In that case, should we either
introduce a block_size_bytes attribute inside each memoryX/ directory, or add
it only once variable-sized memory block support is implemented?

> I assume, because that is assumed to be the smallest granularity in which we
> can add_memory().
>
> And the memory block size is currently always at least the storage increment
> size, correct?
>
> >
> > As I understand it, add_memory() operates on memory block granularity,
> > and this is enforced by check_hotplug_memory_range(), which ensures the
> > requested range aligns with the memory block size.
>
> Yes. I was rather wondering, if we could have storage increment size >
> memory block size.

I tried the following:

* Config1 (z/VM, 8GB online + 4GB standby)
  vmcp q v store
  STORAGE = 8320M MAX = 2T INC = 16M STANDBY = 3968M RESERVED = 0

  The increment size is 16MB in this case and the block size is 128MB.

* Config2 (z/VM, 512M online + 512M standby)
  vmcp q v storage
  STORAGE = 512M MAX = 2T INC = 1M STANDBY = 512M RESERVED = 0

However, memory_block_size_bytes() returns
max(increment_size, MIN_MEMORY_BLOCK_SIZE), so in both cases the memory block
size is 128MB.

On the other hand, on one of the LPARs I checked, the increment size is 2GB,
which is greater than MIN_MEMORY_BLOCK_SIZE. Hence the memory block size is
2GB there.

> > I was wondering about the following practical scenario:
> >
> > When online memory is nearly full, the user can add a standby memory
> > block with memmap_on_memory enabled. This allows the system to avoid
> > consuming already scarce online memory for metadata.
>
> Right, that's the use case I mentioned. But we're talking about ~ 2/4 MiB on
> s390x for a single memory block. There are other things we have to allocate
> memory for when onlining memory, so there is no guarantee that it would work
> with memmap_on_memory either.
>
> It makes it more likely to succeed :)

You're right, I wasn't precise.

> > After enabling and bringing that standby memory online, the user now
> > has enough free online memory to add additional memory blocks without
> > memmap_on_memory. These later blocks can provide physically contiguous
> > memory, which is important for workloads or devices requiring continuous
> > physical address space.
> >
> > If my interpretation is correct, I see good potential for this to be
> > useful.
>
> Again, I think only in the case where we don't have 2/4 MiB for the
> memmap.

I don't think it is 2/4 MiB in every use case. On my LPAR, the increment size
is 2GB, which means 32MB of struct page metadata per memory block.

> > As you pointed out, how about having something similar to
> > 73954d379efd ("dax: add a sysfs knob to control memmap_on_memory behavior")
>
> Right. But here, the use case is usually (a) to add a gigantic amount of
> memory using add_memory(), not small blocks like on s390x (b) consume the
> memmap from (slow) special-purpose memory as well.
>
> Regarding (a), the memmap could be so big that add_memory() might never
> really work (not just because of some temporary low-memory situation).

Sorry, I did not quite understand this. Regarding (a): if add_memory() is
performed with memmap_on_memory, the altmap metadata should fit into the
added memory itself, right?

> > 1) To configure/deconfigure a memory block
> > /sys/firmware/memory/memoryX/config
> >
> > 1 -> configure
> > 0 -> deconfigure
> >
> > 2) Determine whether memory block should have memmap_on_memory or not.
> > /sys/firmware/memory/memoryX/memmap_on_memory
> > 1 -> with altmap
> > 0 -> without altmap
> >
> > This attribute must be set before the memoryX is configured. Or else, it
> > will default to CONFIG_MHP_MEMMAP_ON_MEMORY / memmap_on_memory parameter.
>
> I don't have anything against that option. Just a thought if we really have
> to introduce this right now.

If there are no objections to this design, I'm happy to start exploring it
further.

Thank you
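
For reference, the block-size and memmap figures discussed in this mail can be
reproduced with a few lines of arithmetic. This is only a sketch: the 4 KiB
page size and the 64-byte struct page are assumptions that are not stated in
the thread, the 128MB minimum is taken from the block size observed above, and
the function names are made up for illustration (they are not kernel symbols).

```python
# Sketch only. Assumed values, not taken from kernel sources:
#   PAGE_SIZE = 4 KiB, sizeof(struct page) = 64 bytes,
#   MIN_MEMORY_BLOCK_SIZE = 128 MiB (the block size observed in the mail).
MIB = 1024 * 1024
GIB = 1024 * MIB

PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
MIN_MEMORY_BLOCK_SIZE = 128 * MIB


def memory_block_size(increment_size: int) -> int:
    """Block size as described in the mail: max(increment, minimum)."""
    return max(increment_size, MIN_MEMORY_BLOCK_SIZE)


def memmap_size(block_size: int) -> int:
    """Bytes of struct page metadata needed to describe one memory block."""
    return (block_size // PAGE_SIZE) * STRUCT_PAGE_SIZE


if __name__ == "__main__":
    # The three storage increments mentioned in the mail.
    for label, inc in (("z/VM INC=16M", 16 * MIB),
                       ("z/VM INC=1M", 1 * MIB),
                       ("LPAR INC=2G", 2 * GIB)):
        block = memory_block_size(inc)
        print(f"{label}: block={block // MIB} MiB, "
              f"memmap={memmap_size(block) // MIB} MiB")
```

Under these assumptions a 128MB block needs 2 MiB of memmap and a 2GB block
needs 32 MiB, which lines up with the ~2 MiB and 32MB figures mentioned above.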