linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH net-next v15 08/13] mm: page_frag: use __alloc_pages() to replace alloc_pages_node()
Date: Tue, 27 Aug 2024 08:30:55 -0700	[thread overview]
Message-ID: <CAKgT0UdiDfL++rC_g8guhChRFsNhKeax8697O5+zfi01Y=iEeg@mail.gmail.com> (raw)
In-Reply-To: <67c7c28d-bbfa-457d-a5bb-cb06806e5433@huawei.com>

On Tue, Aug 27, 2024 at 5:07 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2024/8/27 1:00, Alexander Duyck wrote:
> > On Mon, Aug 26, 2024 at 5:46 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >>
> >> It seems there is about 24Bytes binary size increase for
> >> __page_frag_cache_refill() after refactoring in arm64 system
> >> with 64K PAGE_SIZE. By doing the gdb disassembling, It seems
> >> we can have more than 100Bytes decrease for the binary size
> >> by using __alloc_pages() to replace alloc_pages_node(), as
> >> there seems to be some unnecessary checking for nid being
> >> NUMA_NO_NODE, especially when page_frag is part of the mm
> >> system.
> >>
> >> CC: Alexander Duyck <alexander.duyck@gmail.com>
> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> >> ---
> >>  mm/page_frag_cache.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> >> index bba59c87d478..e0ad3de11249 100644
> >> --- a/mm/page_frag_cache.c
> >> +++ b/mm/page_frag_cache.c
> >> @@ -28,11 +28,11 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> >>  #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> >>         gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) |  __GFP_COMP |
> >>                    __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
> >> -       page = alloc_pages_node(NUMA_NO_NODE, gfp_mask,
> >> -                               PAGE_FRAG_CACHE_MAX_ORDER);
> >> +       page = __alloc_pages(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER,
> >> +                            numa_mem_id(), NULL);
> >>  #endif
> >>         if (unlikely(!page)) {
> >> -               page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
> >> +               page = __alloc_pages(gfp, 0, numa_mem_id(), NULL);
> >>                 if (unlikely(!page)) {
> >>                         nc->encoded_page = 0;
> >>                         return NULL;
> >
> > I still think this would be better served by fixing alloc_pages_node
> > to drop the superfluous checks rather than changing the function. We
> > would get more gain by just addressing the builtin constant and
> > NUMA_NO_NODE case there.
>
> I am supposing by 'just addressing the builtin constant and NUMA_NO_NODE
> case', it meant the below change from the previous discussion:
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 01a49be7c98d..009ffb50d8cd 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -290,6 +290,9 @@ struct folio *__folio_alloc_node_noprof(gfp_t gfp, unsigned int order, int nid)
>  static inline struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask,
>                                                    unsigned int order)
>  {
> +       if (__builtin_constant_p(nid) && nid == NUMA_NO_NODE)
> +               return __alloc_pages_noprof(gfp_mask, order, numa_mem_id(), NULL);
> +
>         if (nid == NUMA_NO_NODE)
>                 nid = numa_mem_id();
>
>
> Actually it does not seem to get more gain by judging from binary size
> changing as below, vmlinux.org is the image after this patchset, and
> vmlinux is the image after this patchset with this patch reverted and
> with above change applied.
>
> [linyunsheng@localhost net-next]$ ./scripts/bloat-o-meter vmlinux.org vmlinux
> add/remove: 0/2 grow/shrink: 16/12 up/down: 432/-340 (92)
> Function                                     old     new   delta
> new_slab                                     808    1124    +316
> its_probe_one                               2860    2908     +48

...

> alloc_slab_page                              284       -    -284
> Total: Before=30534822, After=30534914, chg +0.00%

Well considering that alloc_slab_page was marked to be "inline" as per
the qualifier applied to it I would say the shrinking had an effect,
but it was just enough to enable the "inline" qualifier to kick in. It
could be argued that the change exposed another issue in that the
alloc_slab_page function is probably large enough that it should just
be "static" and not "static inline". If you can provide you config I
could probably look into this further but I suspect just dropping the
inline for that one function should result in net savings.

The only other big change I see is in its_probe_one which I am not
sure why it would be impacted since it is not passing a constant in
the first place, it is passing its->numa_node. I'd be curious what the
disassembly shows in terms of the change that caused it to increase in
size.

Otherwise the rest of the size changes seem more like code shifts than
anything else likely due to the functions shifting around slightly due
to a few dropping in size.


  reply	other threads:[~2024-08-27 15:31 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240826124021.2635705-1-linyunsheng@huawei.com>
2024-08-26 12:40 ` [PATCH net-next v15 01/13] mm: page_frag: add a test module for page_frag Yunsheng Lin
2024-08-26 16:30   ` Alexander Duyck
2024-08-26 12:40 ` [PATCH net-next v15 02/13] mm: move the page fragment allocator from page_alloc into its own file Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 03/13] mm: page_frag: use initial zero offset for page_frag_alloc_align() Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 04/13] mm: page_frag: avoid caller accessing 'page_frag_cache' directly Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 06/13] mm: page_frag: reuse existing space for 'size' and 'pfmemalloc' Yunsheng Lin
2024-08-26 16:46   ` Alexander Duyck
2024-08-27 12:06     ` Yunsheng Lin
2024-08-27 18:16       ` Alexander Duyck
2024-08-28 12:11         ` Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 07/13] mm: page_frag: some minor refactoring before adding new API Yunsheng Lin
2024-08-27 16:00   ` Alexander Duyck
2024-08-28 12:12     ` Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 08/13] mm: page_frag: use __alloc_pages() to replace alloc_pages_node() Yunsheng Lin
2024-08-26 17:00   ` Alexander Duyck
2024-08-27 12:06     ` Yunsheng Lin
2024-08-27 15:30       ` Alexander Duyck [this message]
2024-08-28 12:12         ` Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 10/13] mm: page_frag: introduce prepare/probe/commit API Yunsheng Lin
2024-08-26 12:40 ` [PATCH net-next v15 12/13] mm: page_frag: update documentation for page_frag Yunsheng Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKgT0UdiDfL++rC_g8guhChRFsNhKeax8697O5+zfi01Y=iEeg@mail.gmail.com' \
    --to=alexander.duyck@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linyunsheng@huawei.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox