From: Yunsheng Lin <linyunsheng@huawei.com>
To: Jakub Kicinski <kuba@kernel.org>, Mina Almasry <almasrymina@google.com>
Cc: davem@davemloft.net, pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	"Willem de Bruijn" <willemb@google.com>,
	"Kaiyuan Zhang" <kaiyuanz@google.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Eric Dumazet" <edumazet@google.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH RFC 3/8] memory-provider: dmabuf devmem memory provider
Date: Tue, 14 Nov 2023 16:23:29 +0800	[thread overview]
Message-ID: <0c39bd57-5d67-3255-9da2-3f3194ee5a66@huawei.com> (raw)
In-Reply-To: <20231113180554.1d1c6b1a@kernel.org>

+cc Christian, Jason and Willy

On 2023/11/14 7:05, Jakub Kicinski wrote:
> On Mon, 13 Nov 2023 05:42:16 -0800 Mina Almasry wrote:
>> You're doing exactly what I think you're doing, and what was nacked in RFC v1.
>>
>> You've converted 'struct page_pool_iov' to essentially become a
>> duplicate of 'struct page'. Then, you're casting page_pool_iov* into
>> struct page* in mp_dmabuf_devmem_alloc_pages(), then, you're calling
>> mm APIs like page_ref_*() on the page_pool_iov* because you've fooled
>> the mm stack into thinking dma-buf memory is a struct page.

Yes, something like the above, but I am not sure about the 'fooled the mm
stack into thinking dma-buf memory is a struct page' part, because:
1. We never let the 'struct page' for devmem leak out of the net stack,
   through the 'not kmap()able and not readable' checking in your patchset.
2. We initialize page->_refcount for devmem to one and it stays at one; we
   never call page_ref_inc()/page_ref_dec()/get_page()/put_page().
   Instead, we use the page pool's pp_frag_count to do reference counting
   for the devmem page in patch 6 (see the sketch below).
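
To make point 2 concrete, here is a minimal sketch of the idea;
mp_dmabuf_devmem_init_page() is a made-up name, but set_page_count() and
page_pool_fragment_page() are the existing helpers I have in mind:

static void mp_dmabuf_devmem_init_page(struct page_pool *pool,
				       struct page *page)
{
	/* _refcount is set to one exactly once and never changes;
	 * get_page()/put_page() must never see a devmem page
	 */
	set_page_count(page, 1);

	/* all further referencing goes through the page pool's
	 * fragment count rather than the mm refcount
	 */
	page_pool_fragment_page(page, 1);
}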

>>
>> RFC v1 was almost exactly the same, except instead of creating a
>> duplicate definition of struct page, it just allocated 'struct page'
>> instead of allocating another struct that is identical to struct page
>> and casting it into struct page.

Perhaps it is more accurate to say this is something between RFC v1 and
RFC v3: it decouples the 'struct page' for devmem from the mm subsystem,
while still having mostly unified handling for both normal memory and
devmem in the page pool and net stack.

The main difference between this patchset and RFC v1:
1. The mm subsystem is not supposed to see the 'struct page' for devmem
   in this patchset; I guess we could say it is decoupled from the mm
   subsystem, even though we still call PageTail()/page_ref_count()/
   page_is_pfmemalloc() on the 'struct page' for devmem.

The main difference between this patchset and RFC v3:
1. It reuses 'struct page' to have more unified handling between normal
   pages and devmem pages in the net stack (an illustrative mirror layout
   is sketched below).
2. It relies on page->pp_frag_count to do reference counting.
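
For reference, a purely illustrative mirror layout (not the actual
definition from the patches) could look like the below, with the page
pool related members placed so that they line up with the corresponding
fields of 'struct page':

struct page_pool_iov {
	unsigned long flags;		/* mirrors page->flags */
	unsigned long pp_magic;		/* overlaps compound_head/lru.next,
					 * keeping PageTail() and
					 * page_is_pfmemalloc() sane
					 */
	struct page_pool *pp;
	unsigned long _pp_mapping_pad;
	unsigned long dma_addr;
	atomic_long_t pp_frag_count;	/* all refcounting goes through here */
	/* ... remaining words mirrored from 'struct page' ... */
	atomic_t _refcount;		/* pinned at one for devmem */
};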

>>
>> I don't think what you're doing here reverses the nacks I got in RFC
>> v1. You also did not CC any dma-buf or mm people on this proposal that
>> would bring up these concerns again.
> 
> Right, but the mirror struct has some appeal to a non-mm person like
> myself. The problem IIUC is that this patch is the wrong way around, we
> should be converting everyone who can deal with non-host mem to struct
> page_pool_iov. Using page_address() on ppiov which hns3 seems to do in
> this series does not compute for me.

The hackish use of ppiov in hns3 is only there for some prototype
testing, so please ignore it.

> 
> Then we can turn the existing non-iov helpers to be a thin wrapper with
> just a cast from struct page to struct page_pool_iov, and a call of the
> iov helper. Again - never cast the other way around.

I agree that a cast from struct page to struct page_pool_iov is allowed,
but a cast from struct page_pool_iov to struct page is not, if I am
understanding you correctly.
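
So an existing non-iov helper would become a thin wrapper along these
lines (page_pool_iov_get_dma_addr() is a hypothetical iov helper):

static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
{
	/* the only cast direction allowed: page -> iov */
	return page_pool_iov_get_dma_addr((struct page_pool_iov *)page);
}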

Until we can also completely decouple the 'struct page' allocated
directly from the buddy allocator from the mm subsystem in the net stack,
below is what I have in mind in order to support different memory
providers.

                                +--------------+
                                |   Netstack   |
                                |'struct page' |
                                +--------------+
                                        ^
                                        |
                                        |
                                        v
                              +---------------------+
+----------------------+      |                     |      +---------------+
|      devmem MP       |<---->|     Page pool       |----->|    **** MP    |
|'struct page_pool_iov'|      |   'struct page'     |      |'struct **_iov'|
+----------------------+      |                     |      +---------------+
                              +---------------------+
                                        ^
                                        |
                                        |
                                        v
                                +---------------+
                                |    Driver     |
                                | 'struct page' |
                                +---------------+

I would expect the net stack, page pool, and driver to still see 'struct
page'; only the memory provider sees the specific struct for itself. In
the above, the devmem memory provider sees 'struct page_pool_iov'.

The reason I still expect the driver to see 'struct page' is that the
driver will still need to support normal memory besides devmem.
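
In code, the provider boundary in the diagram might look roughly like the
below; devmem_alloc_iov() is made up for illustration, the point is only
that the cast is confined to the provider:

static struct page *mp_dmabuf_devmem_alloc_pages(struct page_pool *pool,
						 gfp_t gfp)
{
	struct page_pool_iov *ppiov;

	ppiov = devmem_alloc_iov(pool);	/* made-up provider internals */
	if (!ppiov)
		return NULL;

	/* only the provider ever sees 'struct page_pool_iov'; the pool,
	 * net stack and driver keep operating on 'struct page'
	 */
	return (struct page *)ppiov;
}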

> 
> Also I think this conversion can be done completely separately from the
> mem provider changes. Just add struct page_pool_iov and start using it.

I am not sure yet what "Just add struct page_pool_iov and start using it"
means.

> 
> Does that make more sense?
> 