From: Eric Dumazet <edumazet@google.com>
To: Barry Song <21cnbao@gmail.com>
Cc: corbet@lwn.net, davem@davemloft.net, hannes@cmpxchg.org,
	horms@kernel.org, jackmanb@google.com, kuba@kernel.org,
	kuniyu@google.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linyunsheng@huawei.com, mhocko@suse.com, netdev@vger.kernel.org,
	pabeni@redhat.com, surenb@google.com, v-songbaohua@oppo.com,
	vbabka@suse.cz, willemb@google.com, zhouhuacai@oppo.com,
	ziy@nvidia.com
Subject: Re: [RFC PATCH] mm: net: disable kswapd for high-order network buffer allocation
Date: Tue, 14 Oct 2025 01:25:05 -0700
Message-ID: <CANn89iK0OWswFFHH10PLzFdcFxZXodWorR5YJSdPq+P6+Qsu1Q@mail.gmail.com>
In-Reply-To: <CAGsJ_4xC=5nCSOv9P7ySONeXwdXN-YK2V+4OZ2zdCOeYiQHvzQ@mail.gmail.com>

On Tue, Oct 14, 2025 at 1:17 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Tue, Oct 14, 2025 at 3:01 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Oct 13, 2025 at 11:43 PM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > > >
> > > > > A problem with the existing sysctl is that it only covers the TX path;
> > > > > for the RX path, we also observe that kswapd consumes significant power.
> > > > > I could add the patch below to make it support the RX path, but it feels
> > > > > like a bit of a layer violation, since the RX path code resides in mm
> > > > > and is intended to serve generic users rather than networking, even
> > > > > though the current callers are primarily network-related.
> > > >
> > > > You might have a buggy driver.
> > >
> > > We are observing the RX path as follows:
> > >
> > > do_softirq
> > >     tasklet_hi_action
> > >        kalPacketAlloc
> > >            __netdev_alloc_skb
> > >                page_frag_alloc_align
> > >                    __page_frag_cache_refill
> > >
> > > This appears to be a fairly common stack.
> > >
> > > So it is a buggy driver?
> >
> > No idea, kalPacketAlloc is not in upstream trees.
> >
> > It apparently needs high order allocations. It will fail at some point.
> >
> > >
> > > >
> > > > High performance drivers use order-0 allocations only.
> > > >
> > >
> > > Do you have an example of high-performance drivers that use only order-0 memory?
> >
> > About all drivers using XDP, and/or using napi_get_frags()
> >
> > XDP has been using order-0 pages from the very beginning.
>
> Thanks! But there are still many drivers using netdev_alloc_skb()—we
> shouldn’t overlook them, right?
>
> net % git grep netdev_alloc_skb | wc -l
>      359

Only the ones that are using 16KB allocations like some WAN drivers :)

Some networks use MTU=9000

If the hardware does not provide SG (scatter-gather) support on receive,
a kmalloc()-based allocation will use 16KB of memory per buffer.

By using a frag allocator, we can pack 3 allocations per 32KB instead of 2.

TCP can go 50% faster.

If memory is short, it will fail no matter what.
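
The packing arithmetic above can be sketched as follows. This is only a
back-of-the-envelope model, not kernel code: the 10240-byte per-packet
footprint (MTU 9000 plus headroom and skb metadata) is a hypothetical
figure chosen for illustration, and rounding up to the next power of two
is an approximation of how kmalloc() sizes large requests. The helper
names are made up for this sketch.

```python
# Rough model of the "3 allocations per 32KB instead of 2" argument.
# All sizes below are illustrative assumptions, not values from the thread.

BLOCK_32K = 32 * 1024  # one 32KB backing block used by a frag allocator


def kmalloc_footprint(size: int) -> int:
    """Approximate kmalloc() by rounding up to the next power of two."""
    footprint = 1
    while footprint < size:
        footprint *= 2
    return footprint


def buffers_per_block(buf_size: int, block: int = BLOCK_32K) -> int:
    """How many buffers of buf_size fit into one block."""
    return block // buf_size


# Hypothetical RX buffer: MTU 9000 plus headroom and metadata (assumption).
rx_buf = 10240

via_kmalloc = buffers_per_block(kmalloc_footprint(rx_buf))  # 16KB each
via_frag = buffers_per_block(rx_buf)                        # packed back to back
gain = (via_frag - via_kmalloc) / via_kmalloc * 100

print(f"kmalloc footprint: {kmalloc_footprint(rx_buf)} bytes")  # 16384 bytes
print(f"buffers per 32KB via kmalloc: {via_kmalloc}")           # 2
print(f"buffers per 32KB via frag allocator: {via_frag}")       # 3
print(f"extra buffers for the same memory: {gain:.0f}%")        # 50%
```

Under these assumptions the same 32KB holds three receive buffers
instead of two, which matches the 3-versus-2 ratio and the 50% figure
quoted above. The real gain depends on the driver's actual buffer size.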


