From: Wei Xu
Date: Thu, 26 May 2022 13:30:16 -0700
Subject: Re: RFC: Memory Tiering Kernel Interfaces (v2)
To: Jonathan Cameron
Cc: Aneesh Kumar K V, Ying Huang, Andrew Morton, Greg Thelen, Yang Shi,
    Linux Kernel Mailing List, Jagdish Gediya, Michal Hocko, Tim C Chen,
    Dave Hansen, Alistair Popple, Baolin Wang, Feng Tang, Davidlohr Bueso,
    Dan Williams, David Rientjes, Linux MM, Brice Goglin, Hesham Almatary
In-Reply-To: <20220526103211.000001ad@Huawei.com>

On Thu, May 26, 2022 at 2:32 AM Jonathan Cameron wrote:
>
> On Wed, 25 May 2022 10:27:42 -0700
> Wei Xu wrote:
>
> > On Wed, May 25, 2022 at 3:01 AM Aneesh Kumar K V
> > wrote:
> > >
> > > On 5/25/22 2:33 PM, Ying Huang wrote:
> > > > On Tue, 2022-05-24 at 22:32 -0700, Wei Xu wrote:
> > > >> On Tue, May 24, 2022 at 1:24 AM Ying Huang wrote:
> > > >>>
> > > >>> On Tue, 2022-05-24 at 00:04 -0700, Wei Xu wrote:
> > > >>>> On Thu, May 19, 2022 at 8:06 PM Ying Huang wrote:
> > > >>>>>
> > >
> > > ...
> > >
> > > >
> > > > OK.  Just to confirm.  Does this mean that we will have fixed device IDs,
> > > > for example,
> > > >
> > > > GPU                   memtier255
> > > > DRAM (with CPU)       memtier0
> > > > PMEM                  memtier1
> > > >
> > > > When we add a new memtier, can it be memtier254 or memtier2?  The rank
> > > > value will determine the real demotion order.
> > > >
> > > > I think you may need to send v3 to make sure everyone is on the same
> > > > page.
> > > >
> > >
> > > What we have implemented, and will send as an RFC shortly, is below.
> > >
> > > kvaneesh@ubuntu-guest:~$ cd /sys/devices/system/
> > > kvaneesh@ubuntu-guest:/sys/devices/system$ pwd
> > > /sys/devices/system
> > > kvaneesh@ubuntu-guest:/sys/devices/system$ ls
> > > clockevents  clocksource  container  cpu  edac  memory  memtier  mpic
> > > node  power
> > > kvaneesh@ubuntu-guest:/sys/devices/system$ cd memtier/
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier$ pwd
> > > /sys/devices/system/memtier
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier$ ls
> > > default_rank  max_rank  memtier1  power  uevent
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier$ cat default_rank
> > > 1
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier$ cat max_rank
> > > 3
> >
> > For flexibility, we don't want max_rank to be interpreted as the
> > number of memory tiers.  Also, we want to leave gaps between rank values
> > so that new memtiers can be inserted when needed.  So I'd suggest
> > making max_rank a much larger value (e.g. 255).
> >
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier$ cd memtier1/
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier/memtier1$ ls
> > > nodelist  power  rank  subsystem  uevent
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier/memtier1$ cat nodelist
> > > 0-3
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier/memtier1$ cat rank
> > > 1
> > > kvaneesh@ubuntu-guest:/sys/devices/system/memtier/memtier1$ cd ../../node/node1/
> > > kvaneesh@ubuntu-guest:/sys/devices/system/node/node1$ cat memtier
> > > 1
> > > kvaneesh@ubuntu-guest:/sys/devices/system/node/node1$
> > > root@ubuntu-guest:/sys/devices/system/node/node1# echo 0 > memtier
> > > root@ubuntu-guest:/sys/devices/system/node/node1# cat memtier
> > > 0
> > > root@ubuntu-guest:/sys/devices/system/node/node1# cd ../../memtier/
> > > root@ubuntu-guest:/sys/devices/system/memtier# ls
> > > default_rank  max_rank  memtier0  memtier1  power  uevent
> > > root@ubuntu-guest:/sys/devices/system/memtier# cd memtier0/
> > > root@ubuntu-guest:/sys/devices/system/memtier/memtier0# cat nodelist
> > > 1
> > > root@ubuntu-guest:/sys/devices/system/memtier/memtier0# cat rank
> > > 0
> >
> > It looks like the example here demonstrates the dynamic creation of
> > memtier0.  If so, how is the rank of memtier0 determined?  If we want
> > to support creating new memtiers at runtime, I think an explicit
> > interface that specifies both device ID and rank is preferred to avoid
> > implicit dependencies between device IDs and ranks.
>
> Why make device ID explicit - it's meaningless I think?
> How about a creation interface that is simply writing the rank value
> to create a new one?  The only race I can see would be to get
> two parallel attempts to create a new tier with the same rank.
> That seems unlikely to matter unless we support changing rank later.
>
> Two attempts to create the same device ID tier seem more likely to
> cause fiddly races.

That's right: a device ID is not needed when creating a new memtier.
Providing only a rank value should be enough.

> Jonathan
>
> > >
> > > root@ubuntu-guest:/sys/devices/system/memtier/memtier0# echo 4 > rank
> > > bash: rank: Permission denied
> > > root@ubuntu-guest:/sys/devices/system/memtier/memtier0#
> >
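
To make the rank-only creation idea above concrete, here is a rough
user-space toy model in Python.  It is not kernel code and not the
interface anyone in this thread has implemented.  The class name
MemTierRegistry, its methods, and the choice of FileExistsError are all
hypothetical.  It also assumes that a higher rank means a higher (faster)
tier, which this thread excerpt does not state.  It only illustrates three
points: the caller supplies just a rank, the device ID is allocated
internally, and a second attempt to create the same rank is rejected under
a lock.

```python
import threading


class MemTierRegistry:
    """Toy model of rank-keyed memory tiers (hypothetical, not kernel code)."""

    def __init__(self, max_rank=255):
        # A large max_rank leaves gaps so tiers can be inserted later.
        self.max_rank = max_rank
        self._lock = threading.Lock()
        self._tiers = {}  # rank -> internally allocated device ID
        self._next_id = 0

    def create(self, rank):
        """Create a tier from a rank alone and return its device ID."""
        if not 0 <= rank <= self.max_rank:
            raise ValueError(f"rank {rank} outside 0..{self.max_rank}")
        with self._lock:
            # Two parallel creators of the same rank: exactly one wins.
            if rank in self._tiers:
                raise FileExistsError(f"rank {rank} already has a tier")
            dev_id = self._next_id
            self._next_id += 1
            self._tiers[rank] = dev_id
            return dev_id

    def demotion_order(self):
        """List tiers from highest rank to lowest (assumed direction)."""
        with self._lock:
            ordered = sorted(self._tiers.items(), reverse=True)
        return [f"memtier{dev_id}" for _rank, dev_id in ordered]


registry = MemTierRegistry()
registry.create(100)  # device ID 0
registry.create(50)   # device ID 1
registry.create(75)   # device ID 2, inserted between the two ranks above
order = registry.demotion_order()  # ["memtier0", "memtier2", "memtier1"]
```

In this model the device ID carries no ordering information: memtier2 sits
between memtier0 and memtier1 purely because of its rank.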