From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Mon, 22 Jul 2024 08:32:23 -0700
Subject: Re: [RFC v11 08/14] mm: page_frag: some minor refactoring before adding new API
To: Yunsheng Lin
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton, linux-mm@kvack.org
References: <20240719093338.55117-1-linyunsheng@huawei.com>
	<20240719093338.55117-9-linyunsheng@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Mon, Jul 22, 2024 at 5:55 AM Yunsheng Lin wrote:
>
> On 2024/7/22 7:40, Alexander H Duyck wrote:
> > On Fri, 2024-07-19 at 17:33 +0800, Yunsheng Lin wrote:
> >> Refactor common codes from __page_frag_alloc_va_align()
> >> to __page_frag_cache_refill(), so that the new API can
> >> make use of them.
> >>
> >> CC: Alexander Duyck
> >> Signed-off-by: Yunsheng Lin
> >> ---
> >>  include/linux/page_frag_cache.h |  2 +-
> >>  mm/page_frag_cache.c            | 93 +++++++++++++++++-----------
> >>  2 files changed, 49 insertions(+), 46 deletions(-)
> >>
> >> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
> >> index 12a16f8e8ad0..5aa45de7a9a5 100644
> >> --- a/include/linux/page_frag_cache.h
> >> +++ b/include/linux/page_frag_cache.h
> >> @@ -50,7 +50,7 @@ static inline void *encoded_page_address(unsigned long encoded_va)
> >>
> >>  static inline void page_frag_cache_init(struct page_frag_cache *nc)
> >>  {
> >> -	nc->encoded_va = 0;
> >> +	memset(nc, 0, sizeof(*nc));
> >>  }
> >>
> >
> > I do not like requiring the entire structure to be reset as a part of
> > init. If encoded_va is 0 then we have reset the page and the flags.
> > There shouldn't be anything else we need to reset, as remaining and bias
> > will be reset when we reallocate.
>
> The argument is about avoiding one check in the fast path by doing the
> memset in the slow path, which you might already know according to your
> comment on the previous version.
>
> It is just sometimes hard to understand your preference for maintainability
> over performance here, as sometimes your comments seem to prefer performance
> over maintainability, like the LEA trick you mentioned and the offset
> count-down before this patchset. It would be good to be more consistent
> about this, otherwise it is sometimes confusing when doing the refactoring.

The use of a negative offset is arguably more maintainable in my mind
rather than being a performance trick. Essentially if you use the
negative value you can just mask off the upper bits and it is the
offset in the page. As such it is actually easier for me to read than
"remaining", which is an offset from the end of the page, assuming you
read the offset in hex anyway.
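To make the comparison concrete, here is a toy userspace demo of the
two bookkeeping styles (illustration only, not the kernel code; the
two styles happen to carve fragments from opposite ends of the page
here, which is incidental to the readability point):

	#include <stdio.h>

	#define PAGE_SIZE 4096u

	int main(void)
	{
		unsigned int fragsz = 320;

		/* count-down offset: after the decrement, masking with
		 * (PAGE_SIZE - 1) is directly the fragment's offset in
		 * the page
		 */
		unsigned int offset = PAGE_SIZE;
		offset = (offset - fragsz) & (PAGE_SIZE - 1);
		printf("offset style:    frag at 0x%x\n", offset);

		/* "remaining" tracks bytes left relative to the end of
		 * the page: recovering the in-page offset takes an
		 * extra subtraction against the page size
		 */
		unsigned int remaining = PAGE_SIZE;
		remaining -= fragsz;
		printf("remaining style: frag at 0x%x\n",
		       PAGE_SIZE - remaining - fragsz);

		return 0;
	}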
> >
> >>  static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
> >> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> >> index 7928e5d50711..d9c9cad17af7 100644
> >> --- a/mm/page_frag_cache.c
> >> +++ b/mm/page_frag_cache.c
> >> @@ -19,6 +19,28 @@
> >>  #include <...>
> >>  #include "internal.h"
> >>
> >> +static struct page *__page_frag_cache_recharge(struct page_frag_cache *nc)
> >> +{
> >> +	unsigned long encoded_va = nc->encoded_va;
> >> +	struct page *page;
> >> +
> >> +	page = virt_to_page((void *)encoded_va);
> >> +	if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
> >> +		return NULL;
> >> +
> >> +	if (unlikely(encoded_page_pfmemalloc(encoded_va))) {
> >> +		VM_BUG_ON(compound_order(page) !=
> >> +			  encoded_page_order(encoded_va));
> >> +		free_unref_page(page, encoded_page_order(encoded_va));
> >> +		return NULL;
> >> +	}
> >> +
> >> +	/* OK, page count is 0, we can safely set it */
> >> +	set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
> >> +
> >> +	return page;
> >> +}
> >> +
> >>  static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> >>  					     gfp_t gfp_mask)
> >>  {
> >> @@ -26,6 +48,14 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> >>  	struct page *page = NULL;
> >>  	gfp_t gfp = gfp_mask;
> >>
> >> +	if (likely(nc->encoded_va)) {
> >> +		page = __page_frag_cache_recharge(nc);
> >> +		if (page) {
> >> +			order = encoded_page_order(nc->encoded_va);
> >> +			goto out;
> >> +		}
> >> +	}
> >> +
> >
> > This code has no business here. This is refill; you just dropped
> > recharge in here, which will make a complete mess of the ordering and be
> > confusing to say the least.
> >
> > The expectation was that if we are calling this function it is going to
> > overwrite the virtual address to NULL on failure, so we discard the old
> > page if there is one present. This changes that behaviour. What you
> > effectively did is make __page_frag_cache_refill into the recharge
> > function.
>
> The idea is to reuse the below for both __page_frag_cache_refill() and
> __page_frag_cache_recharge(), which seems to be about maintainability,
> to avoid having duplicated code. If there is a better idea to avoid that
> duplicated code while keeping the old behaviour, I am happy to change
> it.
>
> 	/* reset page count bias and remaining to start of new frag */
> 	nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> 	nc->remaining = PAGE_SIZE << order;
>

The only piece that is really reused here is the pagecnt_bias
assignment. What is obfuscated away is that the order is gotten
through one of two paths. Really, "order" isn't order here, it is
size, which should have been fetched already. What you end up doing
with this change is duplicating a bunch of code throughout the
function: you end up having to fetch size multiple times and in
multiple ways. Here you are generating it with order; then you have to
turn around and get it again at the start of the function, and again
after calling this function in order to pull it back out.
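If the goal is just to avoid duplicating those two assignments, one
option (a sketch against the quoted patch's context; the helper name
here is made up) is to pull the reset into a small helper that both
refill and recharge call at their end, so the two paths can stay
separate functions:

	/* reset page count bias and remaining to start of new frag */
	static void page_frag_cache_reset(struct page_frag_cache *nc,
					  unsigned int order)
	{
		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
		nc->remaining = PAGE_SIZE << order;
	}

Refill would call it with the order of the freshly allocated page, and
recharge with encoded_page_order(nc->encoded_va), so neither path has
to re-derive the other's state.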
> >
> >>  #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> >>  	gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) | __GFP_COMP |
> >>  		   __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
> >> @@ -35,7 +65,7 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> >>  	if (unlikely(!page)) {
> >>  		page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
> >>  		if (unlikely(!page)) {
> >> -			nc->encoded_va = 0;
> >> +			memset(nc, 0, sizeof(*nc));
> >>  			return NULL;
> >>  		}
> >>
> >
> > The memset will take a few more instructions than the existing code
> > did. I would prefer to keep this as is if at all possible.
>
> It will not take more instructions for arm64, as it has the 'stp'
> instruction for when __HAVE_ARCH_MEMSET is set.
> Is there something similar for x64?

x64 does not, last I knew, without getting into the SSE/AVX type
stuff. This becomes two separate 8B store instructions.

> >
> >> @@ -45,6 +75,16 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> >>  	nc->encoded_va = encode_aligned_va(page_address(page), order,
> >>  					   page_is_pfmemalloc(page));
> >>
> >> +	/* Even if we own the page, we do not use atomic_set().
> >> +	 * This would break get_page_unless_zero() users.
> >> +	 */
> >> +	page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
> >> +
> >> +out:
> >> +	/* reset page count bias and remaining to start of new frag */
> >> +	nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> >> +	nc->remaining = PAGE_SIZE << order;
> >> +
> >>  	return page;
> >>  }
> >>
> >
> > Why bother returning a page at all? It doesn't seem like you use it
> > anymore. It looks like the use cases you have for it in patches 11/12
> > all appear to be broken from what I can tell, as you are adding page as
> > a variable when we don't need to be passing internal details to the
> > callers of the function, when just a simple error return code would do.
>
> It would be good to be more specific about the 'broken' part here.

We are passing internals to the caller. Basically this is generally
frowned upon in many implementations, as the general idea is that the
internal page we are using should be a pseudo-private value. I
understand that you have one or two callers that need it for the use
cases you have in patches 11/12, but it also seems like you are just
passing it regardless. For example, I noticed in a few cases you added
the page pointer in 12 to handle the return value, but then just used
it to check for NULL.

My thought would be that rather than returning the page here, you
would be better off just returning 0 or an error and then doing the
virt_to_page translation for all the cases where the page is actually
needed, since you have to go that route for a cached page anyway.
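Roughly what I have in mind (an untested sketch, simplified from the
quoted patch; the page_frag_cache_page() helper name is made up for
illustration):

	static int __page_frag_cache_refill(struct page_frag_cache *nc,
					    gfp_t gfp_mask)
	{
		/* ... allocate or recharge as in the quoted patch ... */
		return nc->encoded_va ? 0 : -ENOMEM;
	}

	/* the rare caller that actually needs the struct page derives
	 * it from the cached virtual address; this is the same path a
	 * cached page has to take anyway
	 */
	static struct page *page_frag_cache_page(struct page_frag_cache *nc)
	{
		return virt_to_page(encoded_page_address(nc->encoded_va));
	}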