From: kaiyang2@cs.cmu.edu
To: linux-mm@kvack.org, cgroups@vger.kernel.org
Cc: roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org, mhocko@kernel.org, nehagholkar@meta.com, abhishekd@meta.com, hannes@cmpxchg.org, weixugc@google.com, rientjes@google.com, gourry@gourry.net, Kaiyang Zhao
Subject: Re: [RFC PATCH 0/4] memory tiering fairness by per-cgroup control of promotion and demotion
Date: Fri, 8 Nov 2024 19:01:51 +0000
Message-ID: <20241108190152.3587484-1-kaiyang2@cs.cmu.edu>
In-Reply-To: <20240920221202.1734227-1-kaiyang2@cs.cmu.edu>

From: Kaiyang Zhao

Adding some performance results from testing on a *real* system with CXL memory to demonstrate the values of the patches. The system has 256GB local DRAM + 64GB CXL memory. We stack two workloads together in two cgroups. One is a microbenchmark that allocates memory and accesses it at tunable hotness levels.
It allocates 256GB of memory and accesses it in sequential passes with a very hot access pattern (~1 second per pass). The other workload is 64 instances of 520.omnetpp_r from SPEC CPU 2017, which use about 14GB of memory in total. We apply memory bandwidth limits (1 Gbps memory bandwidth per logical core) and mitigate LLC contention by setting cpuset for each cgroup.

Case 1: omnetpp running without the microbenchmark. It is able to use all local memory, with no resource contention. This is the optimal case.
Avg rate reported by SPEC = 84.7

Case 2: Running the two workloads stacked without the fairness patches, starting the microbenchmark first.
Avg = 62.7 (-25.9%)

Case 3: Set memory.low = 19GB for both workloads. This is enough local memory low protection to cover the entire memory usage of omnetpp.
Avg = 75.3 (-11.1%)

Analysis: omnetpp still uses significant CXL memory (up to 3GB) by the time it finishes, because hint faults for it only trigger for a few seconds in the ~20 minute runtime. Due to the short runtime of the workload and how tiering currently works, it finishes before its memory usage converges to the point where all of it is local. However, this still represents a significant improvement over case 2.

Case 4: Set memory.low = 19GB for both workloads. Set memory.high = 257GB for the microbenchmark.
Avg = 84.0 (<1% difference from case 1)

Analysis: by setting both memory.low and memory.high, the local memory usage of the microbenchmark is essentially provisioned. Therefore, even if the microbenchmark starts first, omnetpp can get all the local memory it needs from the very beginning when it starts, and achieves performance close to the non-colocated case.

We’re working on getting performance data from Meta’s production workloads. Stay tuned for more results.
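
For anyone who wants to try a setup similar to case 4, the limits can be written roughly as in the sketch below. It is only an illustration, not the scripts used for the numbers above. It assumes cgroup v2 is mounted at /sys/fs/cgroup, that the two cgroups already exist under the hypothetical names "omnetpp" and "microbench", and that "GB" is read as GiB. memory.low and memory.high are the standard cgroup v2 interface files; their effect on local memory as described in this thread depends on the patch series being applied.

```python
from pathlib import Path

GIB = 1024 ** 3  # assumption: the "GB" figures in the mail are treated as GiB


def apply_memory_limits(cgroup_dir, low_bytes, high_bytes=None):
    """Write memory.low (and optionally memory.high) for one cgroup v2 directory."""
    cgroup = Path(cgroup_dir)
    (cgroup / "memory.low").write_text(f"{low_bytes}\n")
    if high_bytes is not None:
        (cgroup / "memory.high").write_text(f"{high_bytes}\n")


def configure_case4(root="/sys/fs/cgroup"):
    """Apply the case 4 settings to two hypothetical cgroups under root."""
    root = Path(root)
    # memory.low = 19GB for both workloads
    apply_memory_limits(root / "omnetpp", low_bytes=19 * GIB)
    # memory.high = 257GB only for the microbenchmark
    apply_memory_limits(root / "microbench", low_bytes=19 * GIB, high_bytes=257 * GIB)
```

Calling configure_case4() on a real system needs root privileges. Passing a scratch directory as root allows a dry run that only creates plain files.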