From: Yosry Ahmed
Date: Wed, 18 Oct 2023 15:26:36 -0700
Subject: Re: [PATCH v3 1/5] mm: list_lru: allow external numa node and cgroup tracking
To: Nhat Pham
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, cerasuolodomenico@gmail.com, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com, mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, shuah@kernel.org
References: <20231017232152.2605440-1-nphamcs@gmail.com> <20231017232152.2605440-2-nphamcs@gmail.com>
In-Reply-To: <20231017232152.2605440-2-nphamcs@gmail.com>

On Tue, Oct 17, 2023 at 4:21 PM Nhat Pham wrote:
>
> The interface of list_lru is based on the assumption that objects are
> allocated on the correct node/memcg, with this change it is introduced the
> possibility to explicitly specify numa node and memcgroup when adding and
> removing objects. This is so that users of list_lru can track node/memcg
> of the items outside of the list_lru, like in zswap, where the allocations
> can be made by kswapd for data that's charged to a different cgroup.
>
> Signed-off-by: Nhat Pham

I prefer what Johannes suggested: making list_lru_add() and friends take
in the memcg and nid, and adding list_lru_add_obj() (or similar) and
friends that assume the object is on the right node and memcg. This is
clearer and more explicit imo.

I am not very familiar with list_lrus though, so I'll leave this to
folks who actually are.

> ---
>  include/linux/list_lru.h | 38 +++++++++++++++++++++++++++++++++++
>  mm/list_lru.c            | 43 +++++++++++++++++++++++++++++++++++-----
>  2 files changed, 76 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
> index b35968ee9fb5..0f5f39cacbbb 100644
> --- a/include/linux/list_lru.h
> +++ b/include/linux/list_lru.h
> @@ -89,6 +89,24 @@ void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *paren
>   */
>  bool list_lru_add(struct list_lru *lru, struct list_head *item);
>
> +/**
> + * __list_lru_add: add an element to a specific sublist.
> + * @list_lru: the lru pointer
> + * @item: the item to be added.
> + * @memcg: the cgroup of the sublist to add the item to.
> + * @nid: the node id of the sublist to add the item to.
> + *
> + * This function is similar to list_lru_add(), but it allows the caller to
> + * specify the sublist to which the item should be added. This can be useful
> + * when the list_head node is not necessarily in the same cgroup and NUMA node
> + * as the data it represents, such as zswap, where the list_head node could be
> + * from kswapd and the data from a different cgroup altogether.
> + *
> + * Return value: true if the list was updated, false otherwise
> + */
> +bool __list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
> +                   struct mem_cgroup *memcg);
> +
>  /**
>   * list_lru_del: delete an element to the lru list
>   * @list_lru: the lru pointer
> @@ -102,6 +120,18 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item);
>   */
>  bool list_lru_del(struct list_lru *lru, struct list_head *item);
>
> +/**
> + * __list_lru_del: delete an element from a specific sublist.
> + * @list_lru: the lru pointer
> + * @item: the item to be deleted.
> + * @memcg: the cgroup of the sublist to delete the item from.
> + * @nid: the node id of the sublist to delete the item from.
> + *
> + * Return value: true if the list was updated, false otherwise.
> + */
> +bool __list_lru_del(struct list_lru *lru, struct list_head *item, int nid,
> +                   struct mem_cgroup *memcg);
> +
>  /**
>   * list_lru_count_one: return the number of objects currently held by @lru
>   * @lru: the lru pointer.
> @@ -136,6 +166,14 @@ static inline unsigned long list_lru_count(struct list_lru *lru)
>  void list_lru_isolate(struct list_lru_one *list, struct list_head *item);
>  void list_lru_isolate_move(struct list_lru_one *list, struct list_head *item,
>                             struct list_head *head);
> +/*
> + * list_lru_putback: undo list_lru_isolate.
> + *
> + * Since we might have dropped the LRU lock in between, recompute list_lru_one
> + * from the node's id and memcg.
> + */
> +void list_lru_putback(struct list_lru *lru, struct list_head *item, int nid,
> +                     struct mem_cgroup *memcg);
>
>  typedef enum lru_status (*list_lru_walk_cb)(struct list_head *item,
>                 struct list_lru_one *list, spinlock_t *lock, void *cb_arg);
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index a05e5bef3b40..63b75163c6ad 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -119,13 +119,22 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
>  bool list_lru_add(struct list_lru *lru, struct list_head *item)
>  {
>         int nid = page_to_nid(virt_to_page(item));
> +       struct mem_cgroup *memcg = list_lru_memcg_aware(lru) ?
> +               mem_cgroup_from_slab_obj(item) : NULL;
> +
> +       return __list_lru_add(lru, item, nid, memcg);
> +}
> +EXPORT_SYMBOL_GPL(list_lru_add);
> +
> +bool __list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
> +                   struct mem_cgroup *memcg)
> +{
>         struct list_lru_node *nlru = &lru->node[nid];
> -       struct mem_cgroup *memcg;
>         struct list_lru_one *l;
>
>         spin_lock(&nlru->lock);
>         if (list_empty(item)) {
> -               l = list_lru_from_kmem(lru, nid, item, &memcg);
> +               l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
>                 list_add_tail(item, &l->list);
>                 /* Set shrinker bit if the first element was added */
>                 if (!l->nr_items++)
> @@ -138,17 +147,27 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item)
>         spin_unlock(&nlru->lock);
>         return false;
>  }
> -EXPORT_SYMBOL_GPL(list_lru_add);
> +EXPORT_SYMBOL_GPL(__list_lru_add);
>
>  bool list_lru_del(struct list_lru *lru, struct list_head *item)
>  {
>         int nid = page_to_nid(virt_to_page(item));
> +       struct mem_cgroup *memcg = list_lru_memcg_aware(lru) ?
> +               mem_cgroup_from_slab_obj(item) : NULL;
> +
> +       return __list_lru_del(lru, item, nid, memcg);
> +}
> +EXPORT_SYMBOL_GPL(list_lru_del);
> +
> +bool __list_lru_del(struct list_lru *lru, struct list_head *item, int nid,
> +                   struct mem_cgroup *memcg)
> +{
>         struct list_lru_node *nlru = &lru->node[nid];
>         struct list_lru_one *l;
>
>         spin_lock(&nlru->lock);
>         if (!list_empty(item)) {
> -               l = list_lru_from_kmem(lru, nid, item, NULL);
> +               l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
>                 list_del_init(item);
>                 l->nr_items--;
>                 nlru->nr_items--;
> @@ -158,7 +177,7 @@ bool list_lru_del(struct list_lru *lru, struct list_head *item)
>         spin_unlock(&nlru->lock);
>         return false;
>  }
> -EXPORT_SYMBOL_GPL(list_lru_del);
> +EXPORT_SYMBOL_GPL(__list_lru_del);
>
>  void list_lru_isolate(struct list_lru_one *list, struct list_head *item)
>  {
> @@ -175,6 +194,20 @@ void list_lru_isolate_move(struct list_lru_one *list, struct list_head *item,
>  }
>  EXPORT_SYMBOL_GPL(list_lru_isolate_move);
>
> +void list_lru_putback(struct list_lru *lru, struct list_head *item, int nid,
> +                     struct mem_cgroup *memcg)
> +{
> +       struct list_lru_one *list =
> +               list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
> +
> +       if (list_empty(item)) {
> +               list_add_tail(item, &list->list);
> +               if (!list->nr_items++)
> +                       set_shrinker_bit(memcg, nid, lru_shrinker_id(lru));
> +       }
> +}
> +EXPORT_SYMBOL_GPL(list_lru_putback);
> +
>  unsigned long list_lru_count_one(struct list_lru *lru,
>                                   int nid, struct mem_cgroup *memcg)
>  {
> --
> 2.34.1
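
To make the naming alternative raised in this reply concrete, here is a
minimal userspace sketch of the split between an explicit add/del that
takes the node id and memcg, and an "_obj" wrapper that derives them from
the object itself. Everything in it is an assumption for illustration:
the toy_* names, the fixed-size counter table, and the alloc_nid and
alloc_memcg fields are hypothetical stand-ins, not the kernel's list_lru
types, and locking and the shrinker bit are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: sizes and types are invented for this sketch. */
#define TOY_NODES  2
#define TOY_MEMCGS 4

struct toy_item {
	bool on_list;    /* stands in for !list_empty(item) */
	int alloc_nid;   /* node the item's own memory lives on */
	int alloc_memcg; /* cgroup the item's own memory is charged to */
};

struct toy_lru {
	/* one counter per (node, memcg) sublist */
	unsigned long nr_items[TOY_NODES][TOY_MEMCGS];
};

/* Explicit variant: the caller names the sublist. */
static bool toy_lru_add(struct toy_lru *lru, struct toy_item *item,
			int nid, int memcg)
{
	assert(nid >= 0 && nid < TOY_NODES);
	assert(memcg >= 0 && memcg < TOY_MEMCGS);

	if (item->on_list)
		return false; /* already tracked, list not updated */

	item->on_list = true;
	lru->nr_items[nid][memcg]++;
	return true;
}

/* "_obj" variant: assume the item sits on the right node and memcg. */
static bool toy_lru_add_obj(struct toy_lru *lru, struct toy_item *item)
{
	return toy_lru_add(lru, item, item->alloc_nid, item->alloc_memcg);
}

/* Explicit delete: must be given the same sublist used for the add. */
static bool toy_lru_del(struct toy_lru *lru, struct toy_item *item,
			int nid, int memcg)
{
	assert(nid >= 0 && nid < TOY_NODES);
	assert(memcg >= 0 && memcg < TOY_MEMCGS);

	if (!item->on_list)
		return false; /* not tracked, list not updated */

	assert(lru->nr_items[nid][memcg] > 0);
	item->on_list = false;
	lru->nr_items[nid][memcg]--;
	return true;
}

static bool toy_lru_del_obj(struct toy_lru *lru, struct toy_item *item)
{
	return toy_lru_del(lru, item, item->alloc_nid, item->alloc_memcg);
}
```

In this toy, a zswap-like caller whose entry was allocated on one node
and cgroup but represents data charged elsewhere would call
toy_lru_add(&lru, &entry, nid, memcg) with the values it tracks itself,
while ordinary callers would keep using toy_lru_add_obj(&lru, &entry).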