From: "Huang, Ying"
To: Gregory Price
Cc: Gregory Price
Subject: Re: [RFC 1/1] mm/mempolicy: introduce system default interleave weights
In-Reply-To: (Gregory Price's message of "Tue, 27 Feb 2024 00:36:56 -0500")
References: <20240220202529.2365-1-gregory.price@memverge.com>
 <20240220202529.2365-2-gregory.price@memverge.com>
 <87wmqxht4c.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87sf1jh7es.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87edcyeo78.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 27 Feb 2024 13:59:26 +0800
Message-ID: <87a5nme9c1.fsf@yhuang6-desk2.ccr.corp.intel.com>

Gregory Price writes:

> On Tue, Feb 27, 2024 at 08:38:19AM +0800, Huang, Ying wrote:
>> Gregory Price writes:
>> > Where are the 100 nodes coming from?
>>
>> If you have a really large machine with more than 100 nodes, and some of
>> them are CXL memory nodes, then it's possible that most nodes will have
>> interleave weight "1" because the sum of all interleave weights is
>> "100".  Then, even if you use only one socket, the interleave weights of
>> DRAM and CXL MEM could all be "1", leading to a useless default value.
>> So, I suggest not capping the sum of interleave weights.
>
> I have to press this issue: Is this an actual, practical, concern?

I don't know who has a large machine like that.  But I guess that it's
possible in the long run.

> It seems to me in this type of scenario, there are larger, more complex
> numa topology issues that make the use of the general, global weighted
> mempolicy system entirely impractical.  This is a bit outside the scope

It's possible to solve the problem step by step.  For example, we could
add per-task interleave weights at some point.

>> > So, long-winded way of saying:
>> > - Could we use a larger default number? Yes.
>> > - Does that actually help us? Not really, we want smaller numbers.
>>
>> The larger number will be reduced after GCD.
>>
>
> I suppose another strategy is to calculate the interleave weights
> un-bounded from the raw bandwidth - but continuously force reductions
> (through some yet-undefined algorithm) until at least one node reaches a
> weight of `1`.  This suffers from the opposite problem: what if the top
> node has a value greater than 255?  Do we just cap it at 255?  That seems
> the opposite form of problematic.
>
> (Large numbers are quite pointless, as it is essentially the antithesis
> of interleave)

Yes.  So I suggest using a relatively small number as the default weight
for normal DRAM to start with.  The weights of the other nodes are then
derived relative to it, so we will have to round them down or up
(floor/ceiling).

--
Best Regards,
Huang, Ying
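
A rough userspace illustration of the two schemes discussed above.  This
is only a sketch, not code from the patch: the function names, the
bandwidth numbers, and the DRAM baseline weight of 4 are all hypothetical,
and the 255 cap is taken from the per-node limit mentioned in the thread.

```python
from functools import reduce
from math import gcd

MAX_WEIGHT = 255  # per-node cap mentioned in the thread


def capped_sum_weights(bandwidth, total=100):
    """Hypothetical scheme: normalise so the weights sum to about `total`."""
    bw_sum = sum(bandwidth.values())
    # Integer share of `total`, floored, with a minimum weight of 1.
    return {node: max(1, bw * total // bw_sum) for node, bw in bandwidth.items()}


def default_weights(bandwidth, dram_node, dram_weight=4):
    """Hypothetical scheme: a small baseline weight for DRAM, no sum cap.

    Other nodes are scaled relative to the DRAM node, rounded to the
    nearest integer, clamped to [1, MAX_WEIGHT], then reduced by the GCD.
    """
    base = bandwidth[dram_node]
    weights = {}
    for node, bw in bandwidth.items():
        w = (dram_weight * bw + base // 2) // base  # round half up
        weights[node] = max(1, min(MAX_WEIGHT, w))
    g = reduce(gcd, weights.values())
    return {node: w // g for node, w in weights.items()}


# 128 nodes: 64 DRAM nodes (300 units of bandwidth) and 64 CXL nodes (100).
big = {n: (300 if n < 64 else 100) for n in range(128)}

# With the sum capped at 100, every node collapses to weight 1.
print(set(capped_sum_weights(big).values()))

# With a DRAM baseline, one DRAM node and one CXL node keep their ratio.
print(default_weights({0: 300, 64: 100}, dram_node=0))
```

With the capped sum, the DRAM and CXL nodes become indistinguishable; with
the baseline approach the rounding error is bounded by the choice of
`dram_weight`, at the cost of having to clamp very fast nodes at 255.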