Date: Mon, 25 Sep 2023 15:46:31 -0700
From: Andi Kleen
To: Hugh Dickins
Cc: Andrew Morton, Christoph Lameter, Matthew Wilcox, Mike Kravetz,
    David Hildenbrand, Suren Baghdasaryan, Yang Shi, Sidhartha Kumar,
    Vishal Moola, Kefeng Wang, Greg Kroah-Hartman, Tejun Heo, Mel Gorman,
    Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 01/12] hugetlbfs: drop shared NUMA mempolicy pretence
References: <2d872cef-7787-a7ca-10e-9d45a64c80b4@google.com> <47a562a-6998-4dc6-5df-3834d2f2f411@google.com>
In-Reply-To: <47a562a-6998-4dc6-5df-3834d2f2f411@google.com>

On Mon, Sep 25, 2023 at 01:21:10AM -0700, Hugh Dickins wrote:
> hugetlbfs_fallocate() goes through the motions of pasting a shared NUMA
> mempolicy onto its pseudo-vma, but how could there ever be a shared NUMA
> mempolicy for this file? hugetlb_vm_ops has never offered a set_policy
> method, and hugetlbfs_parse_param() has never supported any mpol options
> for a mount-wide default policy.
>
> It's just an illusion: clean it away so as not to confuse others, giving
> us more freedom to adjust shmem's set_policy/get_policy implementation.
> But hugetlbfs_inode_info is still required, just to accommodate seals.
>
> Yes, shared NUMA mempolicy support could be added to hugetlbfs, with a
> set_policy method and/or mpol mount option (Andi's first posting did
> include an admitted-unsatisfactory hugetlb_set_policy()); but it seems
> that nobody has bothered to add that in the nineteen years since v2.6.7
> made it possible, and there is at least one company that has invested
> enough into hugetlbfs, that I guess they have learnt well enough how to
> manage its NUMA, without needing shared mempolicy.

TBH I'm not sure people in general rely on shared mempolicy. The
original use case for it was to modify the NUMA policy of non-anonymous
shared memory files without modifying the program (e.g. Oracle
database's shared memory segments).

But I don't think that particular usage model ever got any real
traction: at least I haven't seen any real usage of it outside my
tests.

I suspect people are either fine with just process policy, or they
modify the program, in which case it's not a big burden to modify every
user, so process policy or VMA-based mbind policy works fine.

Maybe it would be an interesting experiment to disable it everywhere
with some flag and see if anybody complains. On the other hand, it
might be Hyrum'ed by now.

-Andi