From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89BA9C77B62 for ; Tue, 4 Apr 2023 17:58:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D77796B0071; Tue, 4 Apr 2023 13:58:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D01096B0074; Tue, 4 Apr 2023 13:58:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B52656B0075; Tue, 4 Apr 2023 13:58:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 9F5866B0071 for ; Tue, 4 Apr 2023 13:58:12 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 64E9AC04FB for ; Tue, 4 Apr 2023 17:58:12 +0000 (UTC) X-FDA: 80644467624.28.B77577D Received: from mailout2.w2.samsung.com (mailout2.w2.samsung.com [211.189.100.12]) by imf02.hostedemail.com (Postfix) with ESMTP id C7F5C80019 for ; Tue, 4 Apr 2023 17:58:08 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=samsung.com header.s=mail20170921 header.b=cRkI8DNp; dmarc=pass (policy=none) header.from=samsung.com; spf=pass (imf02.hostedemail.com: domain of a.manzanares@samsung.com designates 211.189.100.12 as permitted sender) smtp.mailfrom=a.manzanares@samsung.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680631088; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GyNCiN5NyYtYGx6ZpPHXpBvWu4FEC16e+2G8ooYWrNY=; b=N07SaDIOzkJi4pzqp27IAuEyDHzyJc5MRtg1tDoowplayMITte2Vgf6EOqdGckW7JgaSou djdq9y/4tYKJni2YPWqchyAurt0VViXcqKApXApb/kE5PuUa/3e02bOY+kgQHmA95c4E5Z MS5Jbqzfg6xxZ8iBFpsWIFzCknZq7KY= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=samsung.com header.s=mail20170921 header.b=cRkI8DNp; dmarc=pass (policy=none) header.from=samsung.com; spf=pass (imf02.hostedemail.com: domain of a.manzanares@samsung.com designates 211.189.100.12 as permitted sender) smtp.mailfrom=a.manzanares@samsung.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680631088; a=rsa-sha256; cv=none; b=vkvtpM4ZtoLrXj1f2SRP42WyscvBNAB1xC/XozXOoWERSJPBnLYhuieAkye0huTsP2VKEH GmS7TFWFFk/qMIToey8mKd7QkwlW/jhguKyykBTlGgYAyZrVDTjdHb7EkDztwi9v+iOwej quNDE42E/JzMUIwGjeG8W7yYX1uYPZk= Received: from uscas1p2.samsung.com (unknown [182.198.245.207]) by mailout2.w2.samsung.com (KnoxPortal) with ESMTP id 20230404175807usoutp02d61751c12250436338936df1139a8e1e~SzRHVbHJQ2513825138usoutp02T; Tue, 4 Apr 2023 17:58:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout2.w2.samsung.com 20230404175807usoutp02d61751c12250436338936df1139a8e1e~SzRHVbHJQ2513825138usoutp02T DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1680631087; bh=GyNCiN5NyYtYGx6ZpPHXpBvWu4FEC16e+2G8ooYWrNY=; h=From:To:CC:Subject:Date:In-Reply-To:References:From; b=cRkI8DNp1c0w2lIi3fxfUCDEV2ERJd39HzjmelUiGxokRJaBiNzhZNA2qTcrFuZZA rUMW84MxSHI2AwfgivhswAyGNAfnmHb0j1n4Zshw9RdltmV6eWOy8JcqkhsXvHX3fi SFVS3Rlqp9h9hw6rVJOnkDx2waJhP35j5T08bbDo= Received: from ussmges2new.samsung.com (u111.gpu85.samsung.co.kr [203.254.195.111]) by uscas1p1.samsung.com (KnoxPortal) with ESMTP id 20230404175807uscas1p1bd70cf779e5118432059284714c217a4~SzRHHnift0080200802uscas1p1s; Tue, 4 Apr 2023 17:58:07 +0000 (GMT) Received: from uscas1p1.samsung.com ( [182.198.245.206]) by ussmges2new.samsung.com (USCPEMTA) with SMTP id D8.49.09670.F256C246; Tue, 4 Apr 2023 13:58:07 -0400 (EDT) Received: from ussmgxs3new.samsung.com (u92.gpu85.samsung.co.kr [203.254.195.92]) by uscas1p1.samsung.com (KnoxPortal) with ESMTP id 20230404175806uscas1p123e671c3c0a5b2e536c161f91a47331f~SzRG04Atx0078500785uscas1p1i; Tue, 4 Apr 2023 17:58:06 +0000 (GMT) X-AuditID: cbfec36f-6f9ff700000025c6-9f-642c652f005a Received: from SSI-EX3.ssi.samsung.com ( [105.128.2.146]) by ussmgxs3new.samsung.com (USCPEXMTA) with SMTP id 41.68.09515.E256C246; Tue, 4 Apr 2023 13:58:06 -0400 (EDT) Received: from SSI-EX3.ssi.samsung.com (105.128.2.228) by SSI-EX3.ssi.samsung.com (105.128.2.228) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.1.2375.24; Tue, 4 Apr 2023 10:58:05 -0700 Received: from SSI-EX3.ssi.samsung.com ([105.128.5.228]) by SSI-EX3.ssi.samsung.com ([105.128.5.228]) with mapi id 15.01.2375.024; Tue, 4 Apr 2023 10:58:05 -0700 From: Adam Manzanares To: Mike Rapoport CC: Kyungsan Kim , "lsf-pc@lists.linux-foundation.org" , "linux-mm@kvack.org" , "linux-fsdevel@vger.kernel.org" , "linux-cxl@vger.kernel.org" , "viacheslav.dubeyko@bytedance.com" , "dan.j.williams@intel.com" , "seungjun.ha@samsung.com" , "wj28.lee@samsung.com" Subject: Re: RE: RE(2): FW: [LSF/MM/BPF TOPIC] SMDK inspired MM changes for CXL Thread-Topic: RE: RE(2): FW: [LSF/MM/BPF TOPIC] SMDK inspired MM changes for CXL Thread-Index: AQHZRZWYKsKibCrYtEW5v18Ffh90dK7oooD8gB42sICAASUvgIAA1oqAgAR8YoCACCVygIAGEwwAgACeYQA= Date: Tue, 4 Apr 2023 17:58:05 +0000 Message-ID: <20230404175754.GA633356@bgt-140510-bm01> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [105.128.2.176] Content-Type: text/plain; charset="us-ascii" Content-ID: <28C9E57282D5264F96CD90AC7651B607@ssi.samsung.com> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-CFilter-Loop: Reflected X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFtrAKsWRmVeSWpSXmKPExsWy7djXc7r6qTopBm/vc1lMn3qB0eL8rFMs Fnv2nmSxuLfmP6vFvtd7mS2OrN/OZNGx4Q2jA7vHvxNr2DwW73nJ5LFpVSebx6ZPk9g9Jt9Y zujxeZNcAFsUl01Kak5mWWqRvl0CV8bPxkbWggnKFRM7TjI3MB6X7mLk5JAQMJG43PSYtYuR i0NIYCWjxINHO1kgnFYmiZYbv9hgqm68WgpVtZZRYvauI0wQzkdGiTXT3rGAVAkJLGWUWPRT E8RmEzCQ+H18IzOILSKgLPF98T5mkAZmgXYWiYNHzoONFRYIlFjw5TcbRFGQxJXPTexdjBxA dpLEp99SIGEWARWJ1wsngs3nFTCV2PrtGjuIzSmgJfFq9VVWEJtRQEzi+6k1TCA2s4C4xK0n 85kgrhaUWDR7DzOELSbxb9dDqG8UJe5/f8kOUa8jsWD3JzYI205i+a6/rBC2tsSyha+ZIfYK Spyc+YQFoldS4uCKG1D2Ew6JNzO8IGwXiXsLG6DmS0tMX3MZqiZfYlfbFagbKiSuvu6GqrGW WPhnPdMERpVZSM6eheSkWUhOmoXkpFlITlrAyLqKUby0uDg3PbXYKC+1XK84Mbe4NC9dLzk/ dxMjMFGd/nc4fwfj9Vsf9Q4xMnEwHmKU4GBWEuFV7dJKEeJNSaysSi3Kjy8qzUktPsQozcGi JM5raHsyWUggPbEkNTs1tSC1CCbLxMEp1cBkuPzO4XesD08cZnzpeul6rcitu4sXe1nO/WFS rvYmTXrSluPn+oq/PD1zWd+JN4B/8p+zW5qqW15OqhRaO/PzgcM/T6bfq/T8nHZhKrv3uzvf 9ayObXiwQMDds9pBnOmSZE3IqaI2s5obRoaf2XtLIvu3294pzfuic+Hrtp3TzCpZ5vVf0vMq PCh6ZMf/DRvk6n63lni85PJeqhcp7tsj+db/hHNm5d0dgmUmST/vKgZFy5zJSRR5bKaduHv5 baZdP2um1wetW3x6VQDf17R/D6bl2nIci01d5fxx/gLmB6fKv81bw3/pY+BrOeXOBKllTX+r 3L4qJ7x7b8lwdYWm4uHb63TvB5t4vivfp3h7pRJLcUaioRZzUXEiAAhRvA/DAwAA X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrHIsWRmVeSWpSXmKPExsWS2cA0SVcvVSfF4N8xQ4vpUy8wWpyfdYrF Ys/ekywW99b8Z7XY93ovs8WR9duZLDo2vGF0YPf4d2INm8fiPS+ZPDat6mTz2PRpErvH5BvL GT0+b5ILYIvisklJzcksSy3St0vgyvjZ2MhaMEG5YmLHSeYGxuPSXYycHBICJhI3Xi1l7WLk 4hASWM0ocezYGxYI5yOjxJ8dF1lAqoQEljJKzJvIBmKzCRhI/D6+kRnEFhFQlvi+eB8zSAOz QDuLRMuqyWANwgKBEgu+/AZq4AAqCpJo+yoOYSZJfPotBVLBIqAi8XrhRLBqXgFTia3frrFD 7L3AKLH44kZWkASngJbEq9VXwWxGATGJ76fWMIHYzALiEreezGeC+EBAYsme88wQtqjEy8f/ WCFsRYn731+yQ9TrSCzY/YkNwraTWL7rLyuErS2xbOFrZogjBCVOznzCAtErKXFwxQ2WCYwS s5Csm4Vk1Cwko2YhGTULyagFjKyrGMVLi4tz0yuKjfNSy/WKE3OLS/PS9ZLzczcxAqP89L/D MTsY7936qHeIkYmD8RCjBAezkgivapdWihBvSmJlVWpRfnxRaU5q8SFGaQ4WJXFej9iJ8UIC 6YklqdmpqQWpRTBZJg5OqQYmhiKFusny79da+KdG7mAI2RrLU6/rG/Xnc86KK4XPCif2O+8O tZw487d9XPBB3Zuf7vzybTkXf7juvpP2ZtkSjZQl8Ru2lm0/kyp4yfhekdi0N/Gfdr003vm3 93/g1NQewTd1pW2eizKPmM72ZtfIOfxM+Wv/oiA9oSWv/xy/O0P3wjGz1JfPEi6/F4lfkhJ7 xL+A5Yrk7X1/DvwTran0z3VJ6NtUvC115gfVD2c7yt9Hr9RxOiibl/yb68xV/Wqtb5tnnTpS 3MBpOOvi8tXP13+dFufOvul4X+fOxPOqnzWufd/lMl/he3zkDVkOjeNWLILTuTbN1n5icSBc iTPt3vH/Qb4Gb73WijxifviKSYmlOCPRUIu5qDgRAKiR6MthAwAA X-CMS-MailID: 20230404175806uscas1p123e671c3c0a5b2e536c161f91a47331f CMS-TYPE: 301P X-CMS-RootMailID: 20230331114526epcas2p2b6f1d4c8c1c0b2e3c12a425b6e48c0d8 References: <20230331114525.400375-1-ks0204.kim@samsung.com> X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: C7F5C80019 X-Stat-Signature: po4q69d8oxzzrr6odidy93747wqpqppw X-HE-Tag: 1680631088-7035 X-HE-Meta: U2FsdGVkX19IMt7mVB1sQABo/VLa+tVj8KrDNBPX2TO9faWlqaqSaU+1AHoLy8SIdsZ4NAdQo1J5baZ0N1cu61kGjS6a2TzmTWzRm49pcd7lcEZWFQLsdvrqCEqiSs660TRy8RbmjJWBFZwxOaN0MlsY7kbBEfEuPm9X1ptsrnWhIP//XrmPEhTKLxMEFcjtH2ZUTsfrmSL/zq1MaUSpmQqXfvxgBATjvKqdFiBdxLaxvZbAHxiAXiCC6gqUCNyJJZvoVAw4239iYqggdnVXT68C0bCT0QprONVdS8meQRRxKRfyDFWCF5coPYfPEAFOnjHB43nmYn0d4qXRIJhmtVDWVt1tf8YzAxuGYnPSE4n2XLhnKCGboeZhFGtuCMKA6ftk/qyGnjIpyBbqOD1JLeOnwaWqa1qv7MuLniGpiVo5ebdzwPlXJypnIhzGw8Q/y/LocWJR9gJ19RMtoQjEBYAnXjv22G7ycmAUMchmvnFpzDviVoYkRKhjmXn44AFSaWxoTOU8D6MyRiSW7KXbvwHTvFwPkKinDFo5sgXZmzITct/Y0vzvUNwRu/LTHpffANHH130iq+5PuAih6USMG5ESDXfKfJooJyv7954e6KvbyOo1d7ATv1WfhR4p9aEbQOMTGu2L5dW4jVsdYmNXaYAACZW/ezGpIlpsUX+dOfrBW7z+ERqmlrch8RowBH04ikISjRZsWhwEYJK8v8ui3clF1fmH/xbpV14IqydSF6m5+LyWtGkd1vdVzYIFQsKCZsJYhnijQYQ+rRD5b7/hMPba28FVSxmBxfdrVFrPKe3M96Si9cInI6sxLEEZdvU7VR/Pmi2xJ8TjanRrec573JweK9JgZo1X1xokaZXw+/MQdQFD/MJ9UqDDDDBhDjLNDtklfH17/RzLAEOWVtq7bzS0fUwAuf+kc8CFV1eOn8vaiGu6rHwWMcs7ZZm5hP3HJk/Ym7KNjJScROv94nn IEpnH6k7 Beu51JdErYGLUL2jysKbD5SdnqELiWc1Bgqb/rg0FRh1ulroEw0UM29Ql5GBLuLAzWMN7 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Apr 04, 2023 at 11:31:08AM +0300, Mike Rapoport wrote: > On Fri, Mar 31, 2023 at 08:45:25PM +0900, Kyungsan Kim wrote: > > Thank you Mike Rapoport for participating discussion and adding your th= ought. > >=20 > > >Hi, > > > > > >On Thu, Mar 23, 2023 at 07:51:05PM +0900, Kyungsan Kim wrote: > > >> I appreciate dan for the careful advice. > > >>=20 > > >> >Kyungsan Kim wrote: > > >> >[..] > > >> >> >In addition to CXL memory, we may have other kind of memory in t= he > > >> >> >system, for example, HBM (High Bandwidth Memory), memory in FPGA= card, > > >> >> >memory in GPU card, etc. I guess that we need to consider them > > >> >> >together. Do we need to add one zone type for each kind of memo= ry? > > >> >>=20 > > >> >> We also don't think a new zone is needed for every single memory > > >> >> device. Our viewpoint is the sole ZONE_NORMAL becomes not enough= to > > >> >> manage multiple volatile memory devices due to the increased devi= ce > > >> >> types. Including CXL DRAM, we think the ZONE_EXMEM can be used t= o > > >> >> represent extended volatile memories that have different HW > > >> >> characteristics. > > >> > > > >> >Some advice for the LSF/MM discussion, the rationale will need to b= e > > >> >more than "we think the ZONE_EXMEM can be used to represent extende= d > > >> >volatile memories that have different HW characteristics". It needs= to > > >> >be along the lines of "yes, to date Linux has been able to describe= DDR > > >> >with NUMA effects, PMEM with high write overhead, and HBM with impr= oved > > >> >bandwidth not necessarily latency, all without adding a new ZONE, b= ut a > > >> >new ZONE is absolutely required now to enable use case FOO, or addr= ess > > >> >unfixable NUMA problem BAR." Without FOO and BAR to discuss the cod= e > > >> >maintainability concern of "fewer degress of freedom in the ZONE > > >> >dimension" starts to dominate. > > >>=20 > > >> One problem we experienced was occured in the combination of hot-rem= ove and kerelspace allocation usecases. > > >> ZONE_NORMAL allows kernel context allocation, but it does not allow = hot-remove because kernel resides all the time. > > >> ZONE_MOVABLE allows hot-remove due to the page migration, but it onl= y allows userspace allocation. > > >> Alternatively, we allocated a kernel context out of ZONE_MOVABLE by = adding GFP_MOVABLE flag. > > >> In case, oops and system hang has occasionally occured because ZONE_= MOVABLE can be swapped. > > >> We resolved the issue using ZONE_EXMEM by allowing seletively choice= of the two usecases. > > >> As you well know, among heterogeneous DRAM devices, CXL DRAM is the = first PCIe basis device, which allows hot-pluggability, different RAS, and = extended connectivity. > > >> So, we thought it could be a graceful approach adding a new zone and= separately manage the new features. > > > > > >This still does not describe what are the use cases that require havin= g > > >kernel allocations on CXL.mem.=20 > > > > > >I believe it's important to start with explanation *why* it is importa= nt to > > >have kernel allocations on removable devices. > > >=20 > >=20 > > In general, a memory system with DDR/CXL DRAM will have near/far memory= . > > And, we think kernel already includes memory tiering solutions - Meta T= PP, zswap, and pagecache. > > Some kernel contexts would prefer fast memory. For example, a hot data = with time locality or a data for fast processing such as metadata or indexi= ng. > > Others would enough with slow memory. For example, a zswap page which i= s being used while swapping.=20 >=20 > The point of zswap IIUC is to have small and fast swap device and > compression is required to better utilize DRAM capacity at expense of CPU > time. >=20 > Presuming CXL memory will have larger capacity than DRAM, why not skip th= e > compression and use CXL as a swap device directly? I like to shy away from saying CXL memory should be used for swap. I see a= =20 swap device as storing pages in a manner that is no longer directly address= able by the cpu.=20 Migrating pages to a CXL device is a reasonable approach and I believe we have the ability to do this in the page reclaim code.=20 >=20 > And even supposing there's an advantage in putting zswap on CXL memory, > why that can be done with node-based APIs but warrants a new zone? >=20 > --=20 > Sincerely yours, > Mike.=