linux-mm.kvack.org archive mirror
From: "D. Wythe" <alibuda@linux.alibaba.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: "D. Wythe" <alibuda@linux.alibaba.com>,
	"David S. Miller" <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dust Li <dust.li@linux.alibaba.com>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Sidraya Jayagond <sidraya@linux.ibm.com>,
	Wenjia Zhang <wenjia@linux.ibm.com>,
	Mahanta Jambigi <mjambigi@linux.ibm.com>,
	Simon Horman <horms@kernel.org>,
	Tony Lu <tonylu@linux.alibaba.com>,
	Wen Gu <guwen@linux.alibaba.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org,
	netdev@vger.kernel.org, oliver.yang@linux.alibaba.com
Subject: Re: [PATCH net-next 2/3] mm: vmalloc: export find_vm_area()
Date: Sat, 24 Jan 2026 22:57:54 +0800
Message-ID: <20260124145754.GA57116@j66a10360.sqa.eu95>
In-Reply-To: <aXSjm1DXm6yP62tD@pc636>

On Sat, Jan 24, 2026 at 11:48:59AM +0100, Uladzislau Rezki wrote:
> Hello, D. Wythe!
> 
> > On Fri, Jan 23, 2026 at 07:55:17PM +0100, Uladzislau Rezki wrote:
> > > On Fri, Jan 23, 2026 at 04:23:48PM +0800, D. Wythe wrote:
> > > > find_vm_area() provides a way to find the vm_struct associated with a
> > > > virtual address. Export this symbol to modules so that modularized
> > > > subsystems can perform lookups on vmalloc addresses.
> > > > 
> > > > Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> > > > ---
> > > >  mm/vmalloc.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index ecbac900c35f..3eb9fe761c34 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -3292,6 +3292,7 @@ struct vm_struct *find_vm_area(const void *addr)
> > > >  
> > > >  	return va->vm;
> > > >  }
> > > > +EXPORT_SYMBOL_GPL(find_vm_area);
> > > >  
> > > This is internal. We can not just export it.
> > > 
> > > --
> > > Uladzislau Rezki
> > 
> > Hi Uladzislau,
> > 
> > Thank you for the feedback. I agree that we should avoid exposing
> > internal implementation details like struct vm_struct to external
> > subsystems.
> > 
> > Following Christoph's suggestion, I'm planning to encapsulate the page
> > order lookup into a minimal helper instead:
> > 
> > unsigned int vmalloc_page_order(const void *addr)
> > {
> > 	struct vm_struct *vm;
> > 
> > 	vm = find_vm_area(addr);
> > 	return vm ? vm->page_order : 0;
> > }
> > EXPORT_SYMBOL_GPL(vmalloc_page_order);
> > 
> > Does this approach look reasonable to you? It would keep the vm_struct
> > layout private while satisfying the optimization needs of SMC.
> > 
> Could you please clarify why you need info about page_order? I have not
> looked at your second patch.
> 
> Thanks!
> 
> --
> Uladzislau Rezki

Hi Uladzislau,

This stems from optimizing memory registration in SMC-R. To provide the
RDMA hardware with direct access to memory buffers, we must register
them with the NIC. During this process, the hardware generates one MTT
(Memory Translation Table) entry for each physically contiguous block.
Since these hardware entries
are a finite and scarce resource, and SMC currently defaults to a 4KB
registration granularity, a single 2MB buffer consumes 512 entries. In
high-concurrency scenarios, this inefficiency quickly exhausts NIC
resources and becomes a major bottleneck for system scalability.

To address this, we intend to use vmalloc_huge(). When it successfully
allocates high-order pages, the vmalloc area is backed by a sequence of
physically contiguous chunks (e.g., 2MB each). If we know this
page_order, we can register these larger physical blocks instead of
individual 4KB pages, reducing MTT consumption from 512 entries down to
1 for every 2MB of memory (with page_order == 9).
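
As a rough illustration of the arithmetic above, here is a userspace
sketch (not SMC or driver code). It assumes 4KB base pages, and the names
mtt_entries_needed() and SKETCH_PAGE_SHIFT are hypothetical, introduced
only for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 4KB base pages (shift of 12). */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: number of translation entries needed to cover
 * buf_size bytes when each entry maps one physically contiguous block
 * of (1 << page_order) base pages. A partial trailing block still
 * costs one entry, so the division rounds up.
 */
static size_t mtt_entries_needed(size_t buf_size, unsigned int page_order)
{
	size_t block_size;

	/* Keep the shift well inside the width of size_t. */
	assert(page_order < 20);

	block_size = (size_t)1 << (SKETCH_PAGE_SHIFT + page_order);
	return (buf_size + block_size - 1) / block_size;
}
```

With a 2MB buffer, mtt_entries_needed(2 << 20, 0) gives 512 and
mtt_entries_needed(2 << 20, 9) gives 1, matching the numbers above.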

However, the result of vmalloc_huge() is currently opaque to the caller.
We cannot determine whether it successfully allocated huge pages or fell
back to 4KB pages based solely on the returned pointer. Therefore, we
need a helper function to query the actual page order, enabling SMC-R to
adapt its registration logic to the underlying physical layout.
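
To make "adapt its registration logic" a bit more concrete, here is a
minimal userspace sketch of how a buffer could be cut into registration
chunks once the page order is known. It is only an illustration under
assumptions: struct sketch_chunk, sketch_split_by_order() and
SKETCH_PAGE_SHIFT are hypothetical names, 4KB base pages are assumed, and
page_order stands for the value such a query helper might return (0 when
vmalloc_huge() fell back to 4KB pages).

```c
#include <stddef.h>

/* Assumption for this sketch: 4KB base pages (shift of 12). */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical descriptor for one physically contiguous chunk. */
struct sketch_chunk {
	size_t offset;	/* byte offset from the start of the buffer */
	size_t len;	/* length of the chunk in bytes */
};

/*
 * Split a buffer of buf_size bytes into chunks of (1 << page_order)
 * base pages each. With page_order == 0 this degenerates to one chunk
 * per 4KB page, i.e. the fallback case. Writes at most max_out chunks
 * and returns how many were written.
 */
static size_t sketch_split_by_order(size_t buf_size, unsigned int page_order,
				    struct sketch_chunk *out, size_t max_out)
{
	size_t step = (size_t)1 << (SKETCH_PAGE_SHIFT + page_order);
	size_t off, n = 0;

	for (off = 0; off < buf_size && n < max_out; off += step, n++) {
		size_t left = buf_size - off;

		out[n].offset = off;
		out[n].len = left < step ? left : step;
	}
	return n;
}
```

The same loop serves both outcomes of the allocation, so the caller only
needs the order, not any knowledge of the vm_struct layout.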

I hope this clarifies our design motivation!

Best regards,
D. Wythe





