From: Yang Shi
Date: Tue, 15 Jun 2021 17:17:33 -0700
Subject: Re: [LSF/MM TOPIC] Tiered memory accounting and management
To: Tim Chen
Cc: lsf-pc@lists.linux-foundation.org, Linux MM, Michal Hocko, Dan Williams, Dave Hansen, Shakeel Butt, David Rientjes
In-Reply-To: <475cbc62-a430-2c60-34cc-72ea8baebf2c@linux.intel.com>

On Mon, Jun 14, 2021 at 2:51 PM Tim Chen wrote:
>
> From: Tim Chen
>
> Tiered memory accounting and management
> ------------------------------------------------------------
> Traditionally, all RAM is DRAM.  Some DRAM might be closer/faster
> than others, but a byte of media has about the same cost whether it
> is close or far.  But, with new memory tiers such as High-Bandwidth
> Memory or Persistent Memory, there is a choice between fast/expensive
> and slow/cheap.  But, the current memory cgroups still live in the
> old model.  There is only one set of limits, and it implies that all
> memory has the same cost.  We would like to extend memory cgroups to
> comprehend different memory tiers to give users a way to choose a mix
> between fast/expensive and slow/cheap.
>
> To manage such memory, we will need to account memory usage and
> impose limits for each kind of memory.
>
> A couple of approaches to partitioning the memory between the cgroups
> have been discussed previously; they are listed below.  We would like
> to use the LSF/MM session to come to a consensus on the approach to
> take.
>
> 1. Per NUMA node limit and accounting for each cgroup.
>    We can assign higher limits on better-performing memory nodes for
>    higher-priority cgroups.
>
>    There are some loose ends here that warrant further discussion:
>    (1) A user-friendly interface for such limits.  Would a proportional
>        weight for the cgroup that translates to an actual absolute
>        limit be more suitable?
>    (2) Memory misconfigurations can occur more easily, as the admin
>        has a much larger number of limits spread among the cgroups
>        to manage.  Over-restrictive limits can lead to underutilized
>        and wasted memory and hurt performance.
>    (3) OOM behavior when a cgroup hits its limit.
>
> 2. Per memory tier limit and accounting for each cgroup.
>    We can assign higher limits on memory in better-performing
>    memory tiers for higher-priority cgroups.  I previously
>    prototyped a soft-limit-based implementation to demonstrate the
>    tiered limit idea.
>
>    There are also a number of issues here:
>    (1) The advantage is that we have fewer limits to deal with, which
>        simplifies configuration.  However, a number of people have
>        raised doubts about whether we can really properly classify
>        the NUMA nodes into memory tiers.  There could still be
>        significant performance differences between NUMA nodes even
>        for the same kind of memory.  We would also not have the
>        fine-grained control and flexibility that come with a per
>        NUMA node limit.
>    (2) Would a memory hierarchy defined by the promotion/demotion
>        relationship between memory nodes be a viable approach for
>        defining memory tiers?
>
> These issues related to the management of systems with multiple kinds
> of memory can be ironed out in this session.

Thanks for suggesting this topic. I'm interested in it and would like
to attend.

Other than the above points, I'm wondering whether we should discuss
"Migrate Pages in lieu of discard" as well. Dave Hansen is driving the
development, and I have been involved in the early development and
review, but it seems there are still some open questions according to
the latest review feedback.

Some other folks may be interested in this topic as well, so I have
CC'ed them on this thread.