From: "Ho-Ren (Jack) Chuang"
To: "Hao Xiang", "Gregory Price", aneesh.kumar@linux.ibm.com,
 mhocko@suse.com, tj@kernel.org, john@jagalactic.com,
 "Eishan Mirakhur", "Vinicius Tavares Petrucci", "Ravis OpenSrc",
 "Alistair Popple", "Rafael J. Wysocki", Len Brown, Andrew Morton,
 Dave Jiang, Dan Williams, Jonathan Cameron, Huang Ying,
 "Ho-Ren (Jack) Chuang", linux-acpi@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: "Ho-Ren (Jack) Chuang", "Ho-Ren (Jack) Chuang",
 linux-cxl@vger.kernel.org, qemu-devel@nongnu.org
Subject: [PATCH v1 0/1] Improved Memory Tier Creation for CPUless NUMA Nodes
Date: Fri, 1 Mar 2024 08:22:44 +0000
Message-Id: <20240301082248.3456086-1-horenchuang@bytedance.com>

The memory tiering component in the kernel is functionally useless for
CPUless memory/non-DRAM devices like CXL 1.1 type 3 memory because the
nodes are lumped together in the DRAM tier.
https://lore.kernel.org/linux-mm/PH0PR08MB7955E9F08CCB64F23963B5C3A860A@PH0PR08MB7955.namprd08.prod.outlook.com/T/

This patchset resolves the issue automatically. It delays the
initialization of memory tiers for CPUless NUMA nodes until they obtain
HMAT information at boot time, eliminating the need for user
intervention. If no HMAT is specified, it falls back to using
`default_dram_type`.

Example use case: We have CXL memory on the host, and we create VMs with
a new system memory device backed by host CXL memory. We inject CXL
memory performance attributes through QEMU, and the guest now sees
memory nodes with performance attributes in HMAT. With this change, we
enable the guest kernel to construct the correct memory tiering for the
memory nodes.

Ho-Ren (Jack) Chuang (1):
  memory tier: acpi/hmat: create CPUless memory tiers after obtaining
    HMAT info

 drivers/acpi/numa/hmat.c     |  3 ++
 include/linux/memory-tiers.h |  6 +++
 mm/memory-tiers.c            | 76 ++++++++++++++++++++++++++++++++----
 3 files changed, 77 insertions(+), 8 deletions(-)

-- 
Hao Xiang and Ho-Ren (Jack) Chuang
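
For readers who want the ordering spelled out, the deferred tier assignment described in the cover letter can be modeled roughly as below. This is a user-space sketch and not the patch itself: `struct node`, `early_init()`, `late_init()`, the two-tier enum, and the 100 ns DRAM reference latency are all hypothetical names and values chosen for illustration. Only the fallback to `default_dram_type` comes from the cover letter, and here it is represented by the plain DRAM tier.

```c
#include <stdbool.h>

/* Hypothetical tier identifiers, for illustration only. */
enum tier { TIER_UNSET = -1, TIER_DRAM = 0, TIER_SLOW = 1 };

/* Hypothetical model of a NUMA node; not a kernel structure. */
struct node {
    int id;
    bool has_cpu;          /* node has CPUs attached */
    bool has_hmat;         /* firmware supplied HMAT performance data */
    unsigned read_lat_ns;  /* read latency from HMAT, valid if has_hmat */
    enum tier tier;
};

/* Assumed DRAM reference latency; an illustrative value only. */
#define DRAM_READ_LAT_NS 100u

/* Early boot: place nodes that have CPUs, defer CPUless nodes. */
void early_init(struct node *nodes, int n)
{
    for (int i = 0; i < n; i++)
        nodes[i].tier = nodes[i].has_cpu ? TIER_DRAM : TIER_UNSET;
}

/* After HMAT has been parsed: place the deferred CPUless nodes. */
void late_init(struct node *nodes, int n)
{
    for (int i = 0; i < n; i++) {
        struct node *nd = &nodes[i];

        if (nd->tier != TIER_UNSET)
            continue; /* already placed during early init */

        if (nd->has_hmat && nd->read_lat_ns > DRAM_READ_LAT_NS)
            nd->tier = TIER_SLOW;
        else
            nd->tier = TIER_DRAM; /* stands in for default_dram_type */
    }
}
```

In this model a CPUless node stays unplaced after `early_init()` and only receives a tier in `late_init()`. A node without HMAT data ends up in the DRAM tier, which mirrors the fallback behavior the cover letter describes.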