Date: Wed, 25 Sep 2024 12:28:56 +0300
From: Mike Rapoport
To: Bruno Faccini
Cc: "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "akpm@linux-foundation.org", Zi Yan, Timur Tabi, John Hubbard
Subject: Re: [PATCH] mm/fake-numa: per-phys node fake size
References: <20240921081348.10016-1-bfaccini@nvidia.com>

Hi Bruno,

Please reply inline to the mails on Linux kernel mailing lists.

On Tue, Sep 24, 2024 at 03:27:52PM +0000, Bruno Faccini wrote:
> On 24/09/2024 12:43, "Mike Rapoport" wrote:
> > On Sat, Sep 21, 2024 at 01:13:49AM -0700, Bruno Faccini wrote:
> > > Determine fake numa node size on a per-phys node basis to
> > > handle cases where there are big differences of reserved
> > > memory size inside physical nodes, this will allow to get
> > > the expected number of nodes evenly interleaved.
> > >
> > > Consider a system with 2 physical Numa nodes where almost
> > > all reserved memory sits into a single node, computing the
> > > fake-numa nodes (fake=N) size as the ratio of all
> > > available/non-reserved memory can cause the inability to
> > > create N/2 fake-numa nodes in the physical node.
> > >
>
> > I'm not sure I understand the problem you are trying to solve.
> > Can you provide more specific example?
>
> I will try to be more precise about the situation I have encountered with
> your original set of patches and how I thought it could be solved.
>
> On a system with 2 physical Numa nodes each with 480GB local memory,
> where the biggest part of reserved memory (~ 309MB) is from node 0 with a
> small part (~ 51MB) from node 1, leading to the fake node size of ~<120GB
> being determined.
>
> But when allocating fake nodes from physical nodes, with let say fake=8
> boot parameter being used, we ended with less (7) than expected, because
> there was not enough room to allocate 8/2 fake nodes in physical node 0,
> due to too big size evaluation.

The ability to split a physical node into emulated nodes depends not only
on the node sizes and hole sizes, but also on where the holes are located
inside the nodes, and it's quite possible that for some memory layouts
split_nodes_interleave() will fail to create the requested number of
emulated nodes.

> I don't think that fake=N allocation method is intended to get fake nodes
> with equal size, but to get this exact number of nodes. This is why I
> think we should use a per-phys node size for the fake nodes it will host.

IMO your change adds too much complexity for a feature that by definition
should be used only for debugging.

Also, there is a variation numa=fake=<N>U of the numa=fake parameter that
divides each node into N emulated nodes.

> Hope this clarifies the reason and intent for my patch, have a good day,
> Bruno
>
>
> > Signed-off-by: Bruno Faccini
> > ---
> > mm/numa_emulation.c | 66 ++++++++++++++++++++++++++-------------------
> > 1 file changed, 39 insertions(+), 27 deletions(-)

-- 
Sincerely yours,
Mike.
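
The shortfall reported in the thread (7 emulated nodes instead of 8) can be reproduced with plain arithmetic. The sketch below is not the kernel's split_nodes_interleave(); it is a simplified model that assumes each physical node holds 480 GiB including its reserved memory, ignores where the holes sit inside a node, and skips any alignment rounding. All function names are hypothetical and exist only for this illustration.

```python
# Illustrative arithmetic only; this is NOT the kernel's split_nodes_interleave().
# Assumptions: 480 GiB per physical node including reserved memory, hole
# placement ignored, no alignment or minimum-size rounding.
MIB_PER_GIB = 1024


def usable_mib(total_gib, reserved_mib):
    """Non-reserved memory of one physical node, in MiB."""
    return total_gib * MIB_PER_GIB - reserved_mib


def count_with_global_size(usable, nr_fake):
    """Fake nodes that fit per physical node when one global size is used."""
    size = sum(usable) // nr_fake  # one size derived from all usable memory
    return [u // size for u in usable]


def sizes_per_phys_node(usable, nr_fake):
    """Fake node size per physical node when each node is sized separately."""
    per_node = nr_fake // len(usable)  # e.g. 8 fake nodes over 2 nodes -> 4 each
    return [u // per_node for u in usable]


# Numbers quoted in the thread: ~309 MB reserved on node 0, ~51 MB on node 1.
usable = [usable_mib(480, 309), usable_mib(480, 51)]

fits = count_with_global_size(usable, 8)
print(fits, sum(fits))                 # [3, 4] 7
print(sizes_per_phys_node(usable, 8))  # [122802, 122867]
```

With a single global size of 122835 MiB, node 0 has room for only three emulated nodes, so the total is 7. Sizing per physical node gives four slightly smaller emulated nodes on node 0 and four on node 1. Under the same assumptions, booting with numa=fake=4U would also request four emulated nodes per physical node.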