From: Mina Almasry <almasrymina@google.com>
To: Byungchul Park <byungchul@sk.com>
Cc: willy@infradead.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel_team@skhynix.com, kuba@kernel.org,
ilias.apalodimas@linaro.org, harry.yoo@oracle.com,
hawk@kernel.org, akpm@linux-foundation.org, davem@davemloft.net,
john.fastabend@gmail.com, andrew+netdev@lunn.ch,
asml.silence@gmail.com, toke@redhat.com, tariqt@nvidia.com,
edumazet@google.com, pabeni@redhat.com, saeedm@nvidia.com,
leon@kernel.org, ast@kernel.org, daniel@iogearbox.net,
david@redhat.com, lorenzo.stoakes@oracle.com,
Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, horms@kernel.org,
linux-rdma@vger.kernel.org, bpf@vger.kernel.org,
vishal.moola@gmail.com
Subject: Re: [PATCH 01/18] netmem: introduce struct netmem_desc struct_group_tagged()'ed on struct net_iov
Date: Tue, 27 May 2025 20:47:54 -0700 [thread overview]
Message-ID: <CAHS8izMvRrG2wpE7HEyK3t544-wN_h3SC8nGabCoPWj1qCv_ag@mail.gmail.com> (raw)
In-Reply-To: <20250528012152.GA2986@system.software.com>
On Tue, May 27, 2025 at 6:22 PM Byungchul Park <byungchul@sk.com> wrote:
>
> On Tue, May 27, 2025 at 01:03:32PM -0700, Mina Almasry wrote:
> > On Mon, May 26, 2025 at 7:50 PM Byungchul Park <byungchul@sk.com> wrote:
> > >
> > > On Fri, May 23, 2025 at 12:25:52PM +0900, Byungchul Park wrote:
> > > > To simplify struct page, the page pool members of struct page should be
> > > > moved to other, allowing these members to be removed from struct page.
> > > >
> > > > Introduce a network memory descriptor to store the members, struct
> > > > netmem_desc, reusing struct net_iov that already mirrored struct page.
> > > >
> > > > While at it, relocate _pp_mapping_pad to group struct net_iov's fields.
> > > >
> > > > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > > > ---
> > > > include/linux/mm_types.h | 2 +-
> > > > include/net/netmem.h | 43 +++++++++++++++++++++++++++++++++-------
> > > > 2 files changed, 37 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > > > index 56d07edd01f9..873e820e1521 100644
> > > > --- a/include/linux/mm_types.h
> > > > +++ b/include/linux/mm_types.h
> > > > @@ -120,13 +120,13 @@ struct page {
> > > > unsigned long private;
> > > > };
> > > > struct { /* page_pool used by netstack */
> > > > + unsigned long _pp_mapping_pad;
> > > > /**
> > > > * @pp_magic: magic value to avoid recycling non
> > > > * page_pool allocated pages.
> > > > */
> > > > unsigned long pp_magic;
> > > > struct page_pool *pp;
> > > > - unsigned long _pp_mapping_pad;
> > > > unsigned long dma_addr;
> > > > atomic_long_t pp_ref_count;
> > > > };
> > > > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > > > index 386164fb9c18..08e9d76cdf14 100644
> > > > --- a/include/net/netmem.h
> > > > +++ b/include/net/netmem.h
> > > > @@ -31,12 +31,41 @@ enum net_iov_type {
> > > > };
> > > >
> > > > struct net_iov {
> > > > - enum net_iov_type type;
> > > > - unsigned long pp_magic;
> > > > - struct page_pool *pp;
> > > > - struct net_iov_area *owner;
> > > > - unsigned long dma_addr;
> > > > - atomic_long_t pp_ref_count;
> > > > + /*
> > > > + * XXX: Now that struct netmem_desc overlays on struct page,
> > > > + * struct_group_tagged() should cover all of them. However,
> > > > + * a separate struct netmem_desc should be declared and embedded,
> > > > + * once struct netmem_desc is no longer overlayed but it has its
> > > > + * own instance from slab. The final form should be:
> > > > + *
> > > > + * struct netmem_desc {
> > > > + * unsigned long pp_magic;
> > > > + * struct page_pool *pp;
> > > > + * unsigned long dma_addr;
> > > > + * atomic_long_t pp_ref_count;
> > > > + * };
> > > > + *
> > > > + * struct net_iov {
> > > > + * enum net_iov_type type;
> > > > + * struct net_iov_area *owner;
> > > > + * struct netmem_desc;
> > > > + * };
> > > > + */
> > > > + struct_group_tagged(netmem_desc, desc,
> > >
> > > So.. For now, this is the best option we can pick. We can do all that
> > > you told me once struct netmem_desc has it own instance from slab.
> > >
> > > Again, it's because the page pool fields (or netmem things) from struct
> > > page will be gone by this series.
> > >
> > > Mina, thoughts?
> > >
> >
> > Can you please post an updated series with the approach you have in
> > mind? I think this series as-is seems broken vis-a-vie the
> > _pp_padding_map param move that looks incorrect. Pavel and I have also
> > commented on patch 18 that removing the ASSERTS seems incorrect as
> > it's breaking the symmetry between struct page and struct net_iov.
>
> I told you I will fix it. I will send the updated series shortly for
> *review*. However, it will be for review since we know this work can be
> completed once the next works have been done:
>
> https://lore.kernel.org/all/20250520205920.2134829-2-anthony.l.nguyen@intel.com/
> https://lore.kernel.org/all/1747950086-1246773-9-git-send-email-tariqt@nvidia.com/
>
> > It's not clear to me if the fields are being removed from struct page,
> > where are they going... the approach ptdesc for example has taken is
>
> They are going to struct net_iov.
Oh. I see. My gut reaction is I'm not sure moving the page_pool fields
to struct net_iov will work.
struct net_iov shares some fields with struct page, but abstractly
it's very different.
struct page is allocated by the mm stack via things like alloc_pages
and can be passed to mm apis such as put_page() (called from
skb_frag_ref) and vm_insert_batch (called from
tcp_zerocopy_vm_insert_batch_error).
struct net_iov is kvmalloced by networking code (see
net_devmem_bind_dmabuf for example), and *must not* be passed to any
mm apis as it's not a struct page at all. Accidentally calling
vm_insert_batch on a struct net_iov will cause a kernel crash or some
memory corruption.
Thus abstractly different things maybe should not share the same
in-kernel struct.
One thing that maybe could work is if struct net_iov has a field in it
which tells us whether it's actually a struct page that can be passed
to mm apis, or not a struct page which cannot be passed to mm apis.
> Or I should introduce another struct
maybe introducing another struct is the answer. I'm not sure. The net
stack today already supports struct page and struct net_iov, with
netmem_ref acting as an abstraction over both. Adding a 3rd struct and
adding more checks to test if page or net_iov or something new will
add overhead.
An additional problem is that there are probably hundreds or thousands
of references to 'page' in the net stack and drivers. I'm not sure
what you're going to do about those. Are you converting all those to
netmem or netmem_desc?
--
Thanks,
Mina
next prev parent reply other threads:[~2025-05-28 3:48 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-23 3:25 [PATCH 00/18] Split netmem from struct page Byungchul Park
2025-05-23 3:25 ` [PATCH 01/18] netmem: introduce struct netmem_desc struct_group_tagged()'ed on struct net_iov Byungchul Park
2025-05-23 9:01 ` Toke Høiland-Jørgensen
2025-05-26 0:56 ` Byungchul Park
2025-05-23 17:00 ` Mina Almasry
2025-05-26 1:15 ` Byungchul Park
2025-05-27 2:50 ` Byungchul Park
2025-05-27 20:03 ` Mina Almasry
2025-05-28 1:21 ` Byungchul Park
2025-05-28 3:47 ` Mina Almasry [this message]
2025-05-28 5:03 ` Byungchul Park
2025-05-28 7:43 ` Pavel Begunkov
2025-05-28 8:17 ` Byungchul Park
2025-05-28 7:38 ` Pavel Begunkov
2025-05-23 3:25 ` [PATCH 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
2025-05-23 3:25 ` [PATCH 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
2025-05-23 3:25 ` [PATCH 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_large_netmem() Byungchul Park
2025-05-23 3:25 ` [PATCH 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
2025-05-23 3:25 ` [PATCH 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
2025-05-28 3:18 ` Mina Almasry
2025-05-23 3:25 ` [PATCH 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
2025-05-23 3:25 ` [PATCH 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
2025-05-23 3:26 ` [PATCH 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
2025-05-23 3:26 ` [PATCH 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
2025-05-23 3:26 ` [PATCH 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
2025-05-23 3:26 ` [PATCH 12/18] page_pool: use netmem APIs to access page->pp_magic in page_pool_page_is_pp() Byungchul Park
2025-05-23 8:58 ` Toke Høiland-Jørgensen
2025-05-23 17:21 ` Mina Almasry
2025-05-26 2:23 ` Byungchul Park
2025-05-26 2:36 ` Byungchul Park
2025-05-26 8:40 ` Toke Høiland-Jørgensen
2025-05-26 9:43 ` Byungchul Park
2025-05-26 9:54 ` Toke Høiland-Jørgensen
2025-05-26 10:01 ` Byungchul Park
2025-05-28 5:14 ` Byungchul Park
2025-05-28 7:35 ` Toke Høiland-Jørgensen
2025-05-28 8:15 ` Byungchul Park
2025-05-28 7:51 ` Pavel Begunkov
2025-05-28 8:14 ` Byungchul Park
2025-05-28 9:07 ` Pavel Begunkov
2025-05-28 9:14 ` Byungchul Park
2025-05-28 9:20 ` Pavel Begunkov
2025-05-28 9:33 ` Byungchul Park
2025-05-28 9:51 ` Pavel Begunkov
2025-05-28 10:44 ` Byungchul Park
2025-05-28 10:54 ` Pavel Begunkov
2025-05-23 3:26 ` [PATCH 13/18] mlx5: use netmem descriptor and APIs for page pool Byungchul Park
2025-05-23 17:13 ` Mina Almasry
2025-05-26 3:08 ` Byungchul Park
2025-05-26 8:12 ` Byungchul Park
2025-05-26 18:00 ` Mina Almasry
2025-05-23 3:26 ` [PATCH 14/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
2025-05-23 17:14 ` Mina Almasry
2025-05-23 3:26 ` [PATCH 15/18] netmem: remove __netmem_get_pp() Byungchul Park
2025-05-23 3:26 ` [PATCH 16/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
2025-05-23 3:26 ` [PATCH 17/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
2025-05-23 3:26 ` [PATCH 18/18] mm, netmem: remove the page pool members in struct page Byungchul Park
2025-05-23 17:16 ` kernel test robot
2025-05-23 17:55 ` Mina Almasry
2025-05-26 1:37 ` Byungchul Park
2025-05-26 16:58 ` Pavel Begunkov
2025-05-26 17:33 ` Mina Almasry
2025-05-27 1:02 ` Byungchul Park
2025-05-27 1:31 ` Byungchul Park
2025-05-27 5:30 ` Pavel Begunkov
2025-05-27 17:38 ` Mina Almasry
2025-05-28 1:31 ` Byungchul Park
2025-05-28 7:21 ` Pavel Begunkov
2025-05-23 6:20 ` [PATCH 00/18] Split netmem from " Taehee Yoo
2025-05-23 7:47 ` Byungchul Park
2025-05-23 17:47 ` SeongJae Park
2025-05-26 1:16 ` Byungchul Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHS8izMvRrG2wpE7HEyK3t544-wN_h3SC8nGabCoPWj1qCv_ag@mail.gmail.com \
--to=almasrymina@google.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=andrew+netdev@lunn.ch \
--cc=asml.silence@gmail.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=byungchul@sk.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=david@redhat.com \
--cc=edumazet@google.com \
--cc=harry.yoo@oracle.com \
--cc=hawk@kernel.org \
--cc=horms@kernel.org \
--cc=ilias.apalodimas@linaro.org \
--cc=john.fastabend@gmail.com \
--cc=kernel_team@skhynix.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-rdma@vger.kernel.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=rppt@kernel.org \
--cc=saeedm@nvidia.com \
--cc=surenb@google.com \
--cc=tariqt@nvidia.com \
--cc=toke@redhat.com \
--cc=vbabka@suse.cz \
--cc=vishal.moola@gmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox