* [PATCH 0/2] optimization of dma-buf system_heap allocation
@ 2025-10-14 8:32 zhaoyang.huang
2025-10-14 8:32 ` [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful zhaoyang.huang
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: zhaoyang.huang @ 2025-10-14 8:32 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman,
Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey,
John Stultz, T . J . Mercier, Christian König, linux-media,
dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang,
steve.kang
From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
This series uses alloc_pages_bulk_list() for the order-0 allocations in the
dma-buf system heap, which first requires bringing that API back.
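A minimal sketch of the intended usage, assuming the three-argument
alloc_pages_bulk_list() macro added in patch 1 (illustration only, not code
from the series):

	/* Illustrative sketch: allocate nr_pages order-0 pages onto a list
	 * in one call; the allocator chains them through page->lru. */
	LIST_HEAD(pages);
	struct page *page, *tmp;
	unsigned long allocated;

	allocated = alloc_pages_bulk_list(GFP_KERNEL, nr_pages, &pages);
	list_for_each_entry_safe(page, tmp, &pages, lru) {
		/* hand each page over to the heap's sg_table bookkeeping */
	}
	if (allocated < nr_pages) {
		/* fall back to single-page allocation for the remainder */
	}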
Zhaoyang Huang (2):
mm: call back alloc_pages_bulk_list since it is useful
driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation
drivers/dma-buf/heaps/system_heap.c | 33 +++++++++++++++---------
include/linux/gfp.h | 9 +++++--
mm/mempolicy.c | 14 +++++------
mm/page_alloc.c | 39 ++++++++++++++++++++---------
4 files changed, 62 insertions(+), 33 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 19+ messages in thread* [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful 2025-10-14 8:32 [PATCH 0/2] optimization of dma-buf system_heap allocation zhaoyang.huang @ 2025-10-14 8:32 ` zhaoyang.huang 2025-10-14 9:41 ` Petr Tesarik 2025-10-15 12:16 ` David Hildenbrand 2025-10-14 8:32 ` [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation zhaoyang.huang 2025-10-14 14:06 ` [PATCH 0/2] optimization of dma-buf system_heap allocation Matthew Wilcox 2 siblings, 2 replies; 19+ messages in thread From: zhaoyang.huang @ 2025-10-14 8:32 UTC (permalink / raw) To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang, steve.kang From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> commit c8b979530f27 ("mm: alloc_pages_bulk_noprof: drop page_list argument") drops alloc_pages_bulk_list. This commit would like to call back it since it is proved to be helpful to the drivers which allocate a bulk of pages(see patch of 2 in this series ). I do notice that Matthew's comment of the time cost of iterating a list. However, I also observed in our test that the extra page_array's allocation could be more expensive than cpu iteration when direct reclaiming happens when ram is low[1]. IMHO, could we leave the API here to have the users choose between the array or list according to their scenarios. [1] android.hardwar-728 [002] ..... 334.573875: system_heap_do_allocate: Execution time: order 0 1 us android.hardwar-728 [002] ..... 334.573879: system_heap_do_allocate: Execution time: order 0 2 us android.hardwar-728 [002] ..... 334.574239: system_heap_do_allocate: Execution time: order 0 354 us android.hardwar-728 [002] ..... 334.574247: system_heap_do_allocate: Execution time: order 0 4 us android.hardwar-728 [002] ..... 334.574250: system_heap_do_allocate: Execution time: order 0 2 us Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> --- include/linux/gfp.h | 9 +++++++-- mm/mempolicy.c | 14 +++++++------- mm/page_alloc.c | 39 +++++++++++++++++++++++++++------------ 3 files changed, 41 insertions(+), 21 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 5ebf26fcdcfa..f1540c9fcd87 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -231,6 +231,7 @@ struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, nodemask_t *nodemask, int nr_pages, + struct list_head *page_list, struct page **page_array); #define __alloc_pages_bulk(...) 
alloc_hooks(alloc_pages_bulk_noprof(__VA_ARGS__)) @@ -242,7 +243,11 @@ unsigned long alloc_pages_bulk_mempolicy_noprof(gfp_t gfp, /* Bulk allocate order-0 pages */ #define alloc_pages_bulk(_gfp, _nr_pages, _page_array) \ - __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, _page_array) + __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, NULL, _page_array) + +#define alloc_pages_bulk_list(_gfp, _nr_pages, _list) \ + __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, _list, NULL) + static inline unsigned long alloc_pages_bulk_node_noprof(gfp_t gfp, int nid, unsigned long nr_pages, @@ -251,7 +256,7 @@ alloc_pages_bulk_node_noprof(gfp_t gfp, int nid, unsigned long nr_pages, if (nid == NUMA_NO_NODE) nid = numa_mem_id(); - return alloc_pages_bulk_noprof(gfp, nid, NULL, nr_pages, page_array); + return alloc_pages_bulk_noprof(gfp, nid, NULL, nr_pages, NULL, page_array); } #define alloc_pages_bulk_node(...) \ diff --git a/mm/mempolicy.c b/mm/mempolicy.c index eb83cff7db8c..26274302ee01 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -2537,13 +2537,13 @@ static unsigned long alloc_pages_bulk_interleave(gfp_t gfp, if (delta) { nr_allocated = alloc_pages_bulk_noprof(gfp, interleave_nodes(pol), NULL, - nr_pages_per_node + 1, + nr_pages_per_node + 1, NULL, page_array); delta--; } else { nr_allocated = alloc_pages_bulk_noprof(gfp, interleave_nodes(pol), NULL, - nr_pages_per_node, page_array); + nr_pages_per_node, NULL, page_array); } page_array += nr_allocated; @@ -2593,7 +2593,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp, if (weight && node_isset(node, nodes)) { node_pages = min(rem_pages, weight); nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages, - page_array); + NULL, page_array); page_array += nr_allocated; total_allocated += nr_allocated; /* if that's all the pages, no need to interleave */ @@ -2658,7 +2658,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp, if (!node_pages) break; nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages, - page_array); + NULL, page_array); page_array += nr_allocated; total_allocated += nr_allocated; if (total_allocated == nr_pages) @@ -2682,11 +2682,11 @@ static unsigned long alloc_pages_bulk_preferred_many(gfp_t gfp, int nid, preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL); nr_allocated = alloc_pages_bulk_noprof(preferred_gfp, nid, &pol->nodes, - nr_pages, page_array); + nr_pages, NULL, page_array); if (nr_allocated < nr_pages) nr_allocated += alloc_pages_bulk_noprof(gfp, numa_node_id(), NULL, - nr_pages - nr_allocated, + nr_pages - nr_allocated, NULL, page_array + nr_allocated); return nr_allocated; } @@ -2722,7 +2722,7 @@ unsigned long alloc_pages_bulk_mempolicy_noprof(gfp_t gfp, nid = numa_node_id(); nodemask = policy_nodemask(gfp, pol, NO_INTERLEAVE_INDEX, &nid); return alloc_pages_bulk_noprof(gfp, nid, nodemask, - nr_pages, page_array); + nr_pages, NULL, page_array); } int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d1d037f97c5f..a95bdd8cbf5b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4940,23 +4940,28 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order, } /* - * __alloc_pages_bulk - Allocate a number of order-0 pages to an array + * __alloc_pages_bulk - Allocate a number of order-0 pages to a list or array * @gfp: GFP flags for the allocation * @preferred_nid: The preferred NUMA node ID to allocate from * @nodemask: Set of nodes to allocate from, 
may be NULL - * @nr_pages: The number of pages desired in the array - * @page_array: Array to store the pages + * @nr_pages: The number of pages desired on the list or array + * @page_list: Optional list to store the allocated pages + * @page_array: Optional array to store the pages * * This is a batched version of the page allocator that attempts to - * allocate nr_pages quickly. Pages are added to the page_array. + * allocate nr_pages quickly. Pages are added to page_list if page_list + * is not NULL, otherwise it is assumed that the page_array is valid. * - * Note that only NULL elements are populated with pages and nr_pages + * For lists, nr_pages is the number of pages that should be allocated. + * + * For arrays, only NULL elements are populated with pages and nr_pages * is the maximum number of pages that will be stored in the array. * - * Returns the number of pages in the array. + * Returns the number of pages on the list or array. */ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, nodemask_t *nodemask, int nr_pages, + struct list_head *page_list, struct page **page_array) { struct page *page; @@ -4974,7 +4979,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, * Skip populated array elements to determine if any pages need * to be allocated before disabling IRQs. */ - while (nr_populated < nr_pages && page_array[nr_populated]) + while (page_array && nr_populated < nr_pages && page_array[nr_populated]) nr_populated++; /* No pages requested? */ @@ -4982,7 +4987,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, goto out; /* Already populated array? */ - if (unlikely(nr_pages - nr_populated == 0)) + if (unlikely(page_array && nr_pages - nr_populated == 0)) goto out; /* Bulk allocator does not support memcg accounting. */ @@ -5064,7 +5069,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, while (nr_populated < nr_pages) { /* Skip existing pages */ - if (page_array[nr_populated]) { + if (page_array && page_array[nr_populated]) { nr_populated++; continue; } @@ -5083,7 +5088,11 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, prep_new_page(page, 0, gfp, 0); set_page_refcounted(page); - page_array[nr_populated++] = page; + if (page_list) + list_add(&page->lru, page_list); + else + page_array[nr_populated] = page; + nr_populated++; } pcp_spin_unlock(pcp); @@ -5100,8 +5109,14 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, failed: page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask); - if (page) - page_array[nr_populated++] = page; + if (page) { + if (page_list) + list_add(&page->lru, page_list); + else + page_array[nr_populated] = page; + nr_populated++; + } + goto out; } EXPORT_SYMBOL_GPL(alloc_pages_bulk_noprof); -- 2.25.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
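To make the cost argument in the commit message concrete: with an array-only
API, a caller that sizes the array for the whole request has to allocate that
array first, and for multi-megabyte buffers this allocation can itself enter
reclaim. A rough caller-side sketch (an assumption about how a caller would
use the array variant, not code from this series):

	/* Illustrative sketch: the pointer array for N pages is N * 8 bytes
	 * on 64-bit, e.g. ~36 KiB for an 18 MB buffer, and kvmalloc_array()
	 * may reclaim before a single page has been handed out. The list
	 * variant needs no such side allocation. */
	struct page **pages;
	unsigned long nr;

	pages = kvmalloc_array(nr_pages, sizeof(*pages),
			       GFP_KERNEL | __GFP_ZERO);
	if (!pages)
		return -ENOMEM;
	nr = alloc_pages_bulk(GFP_KERNEL, nr_pages, pages);
	/* use pages[0..nr-1], then release the array */
	kvfree(pages);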
* Re: [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful 2025-10-14 8:32 ` [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful zhaoyang.huang @ 2025-10-14 9:41 ` Petr Tesarik 2025-10-14 12:46 ` Zhaoyang Huang 2025-10-15 12:16 ` David Hildenbrand 1 sibling, 1 reply; 19+ messages in thread From: Petr Tesarik @ 2025-10-14 9:41 UTC (permalink / raw) To: zhaoyang.huang Cc: Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang, steve.kang On Tue, 14 Oct 2025 16:32:29 +0800 "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > commit c8b979530f27 ("mm: alloc_pages_bulk_noprof: drop page_list > argument") drops alloc_pages_bulk_list. This commit would like to call back > it since it is proved to be helpful to the drivers which allocate a bulk of > pages(see patch of 2 in this series ). > I do notice that Matthew's comment of the time cost of iterating a list. > However, I also observed in our test that the extra page_array's allocation > could be more expensive than cpu iteration when direct reclaiming happens > when ram is low[1]. IMHO, could we leave the API here to have the users > choose between the array or list according to their scenarios. OK, so this is more or less a revert of commit c8b979530f27 ("mm: alloc_pages_bulk_noprof: drop page_list argument")... I cannot comment on the performance gains, but I dislike the fact that the patch re-introduces alloc_pages_bulk_noprof() as a function with two signatures (either page_list is used, or page_array is used). If we can agree that allocations onto a linked list are useful, then I suggest to split the existing function so that the common bits end up in helper functions, called by both variants (one function using a list, one using an array). Petr T > [1] > android.hardwar-728 [002] ..... 334.573875: system_heap_do_allocate: Execution time: order 0 1 us > android.hardwar-728 [002] ..... 334.573879: system_heap_do_allocate: Execution time: order 0 2 us > android.hardwar-728 [002] ..... 334.574239: system_heap_do_allocate: Execution time: order 0 354 us > android.hardwar-728 [002] ..... 334.574247: system_heap_do_allocate: Execution time: order 0 4 us > android.hardwar-728 [002] ..... 334.574250: system_heap_do_allocate: Execution time: order 0 2 us > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > --- > include/linux/gfp.h | 9 +++++++-- > mm/mempolicy.c | 14 +++++++------- > mm/page_alloc.c | 39 +++++++++++++++++++++++++++------------ > 3 files changed, 41 insertions(+), 21 deletions(-) > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > index 5ebf26fcdcfa..f1540c9fcd87 100644 > --- a/include/linux/gfp.h > +++ b/include/linux/gfp.h > @@ -231,6 +231,7 @@ struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_ > > unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > nodemask_t *nodemask, int nr_pages, > + struct list_head *page_list, > struct page **page_array); > #define __alloc_pages_bulk(...) 
alloc_hooks(alloc_pages_bulk_noprof(__VA_ARGS__)) > > @@ -242,7 +243,11 @@ unsigned long alloc_pages_bulk_mempolicy_noprof(gfp_t gfp, > > /* Bulk allocate order-0 pages */ > #define alloc_pages_bulk(_gfp, _nr_pages, _page_array) \ > - __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, _page_array) > + __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, NULL, _page_array) > + > +#define alloc_pages_bulk_list(_gfp, _nr_pages, _list) \ > + __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, _list, NULL) > + > > static inline unsigned long > alloc_pages_bulk_node_noprof(gfp_t gfp, int nid, unsigned long nr_pages, > @@ -251,7 +256,7 @@ alloc_pages_bulk_node_noprof(gfp_t gfp, int nid, unsigned long nr_pages, > if (nid == NUMA_NO_NODE) > nid = numa_mem_id(); > > - return alloc_pages_bulk_noprof(gfp, nid, NULL, nr_pages, page_array); > + return alloc_pages_bulk_noprof(gfp, nid, NULL, nr_pages, NULL, page_array); > } > > #define alloc_pages_bulk_node(...) \ > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > index eb83cff7db8c..26274302ee01 100644 > --- a/mm/mempolicy.c > +++ b/mm/mempolicy.c > @@ -2537,13 +2537,13 @@ static unsigned long alloc_pages_bulk_interleave(gfp_t gfp, > if (delta) { > nr_allocated = alloc_pages_bulk_noprof(gfp, > interleave_nodes(pol), NULL, > - nr_pages_per_node + 1, > + nr_pages_per_node + 1, NULL, > page_array); > delta--; > } else { > nr_allocated = alloc_pages_bulk_noprof(gfp, > interleave_nodes(pol), NULL, > - nr_pages_per_node, page_array); > + nr_pages_per_node, NULL, page_array); > } > > page_array += nr_allocated; > @@ -2593,7 +2593,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp, > if (weight && node_isset(node, nodes)) { > node_pages = min(rem_pages, weight); > nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages, > - page_array); > + NULL, page_array); > page_array += nr_allocated; > total_allocated += nr_allocated; > /* if that's all the pages, no need to interleave */ > @@ -2658,7 +2658,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp, > if (!node_pages) > break; > nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages, > - page_array); > + NULL, page_array); > page_array += nr_allocated; > total_allocated += nr_allocated; > if (total_allocated == nr_pages) > @@ -2682,11 +2682,11 @@ static unsigned long alloc_pages_bulk_preferred_many(gfp_t gfp, int nid, > preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL); > > nr_allocated = alloc_pages_bulk_noprof(preferred_gfp, nid, &pol->nodes, > - nr_pages, page_array); > + nr_pages, NULL, page_array); > > if (nr_allocated < nr_pages) > nr_allocated += alloc_pages_bulk_noprof(gfp, numa_node_id(), NULL, > - nr_pages - nr_allocated, > + nr_pages - nr_allocated, NULL, > page_array + nr_allocated); > return nr_allocated; > } > @@ -2722,7 +2722,7 @@ unsigned long alloc_pages_bulk_mempolicy_noprof(gfp_t gfp, > nid = numa_node_id(); > nodemask = policy_nodemask(gfp, pol, NO_INTERLEAVE_INDEX, &nid); > return alloc_pages_bulk_noprof(gfp, nid, nodemask, > - nr_pages, page_array); > + nr_pages, NULL, page_array); > } > > int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst) > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index d1d037f97c5f..a95bdd8cbf5b 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -4940,23 +4940,28 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order, > } > > /* > - * __alloc_pages_bulk - Allocate a number of order-0 pages to an array > + * __alloc_pages_bulk - Allocate 
a number of order-0 pages to a list or array > * @gfp: GFP flags for the allocation > * @preferred_nid: The preferred NUMA node ID to allocate from > * @nodemask: Set of nodes to allocate from, may be NULL > - * @nr_pages: The number of pages desired in the array > - * @page_array: Array to store the pages > + * @nr_pages: The number of pages desired on the list or array > + * @page_list: Optional list to store the allocated pages > + * @page_array: Optional array to store the pages > * > * This is a batched version of the page allocator that attempts to > - * allocate nr_pages quickly. Pages are added to the page_array. > + * allocate nr_pages quickly. Pages are added to page_list if page_list > + * is not NULL, otherwise it is assumed that the page_array is valid. > * > - * Note that only NULL elements are populated with pages and nr_pages > + * For lists, nr_pages is the number of pages that should be allocated. > + * > + * For arrays, only NULL elements are populated with pages and nr_pages > * is the maximum number of pages that will be stored in the array. > * > - * Returns the number of pages in the array. > + * Returns the number of pages on the list or array. > */ > unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > nodemask_t *nodemask, int nr_pages, > + struct list_head *page_list, > struct page **page_array) > { > struct page *page; > @@ -4974,7 +4979,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > * Skip populated array elements to determine if any pages need > * to be allocated before disabling IRQs. > */ > - while (nr_populated < nr_pages && page_array[nr_populated]) > + while (page_array && nr_populated < nr_pages && page_array[nr_populated]) > nr_populated++; > > /* No pages requested? */ > @@ -4982,7 +4987,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > goto out; > > /* Already populated array? */ > - if (unlikely(nr_pages - nr_populated == 0)) > + if (unlikely(page_array && nr_pages - nr_populated == 0)) > goto out; > > /* Bulk allocator does not support memcg accounting. */ > @@ -5064,7 +5069,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > while (nr_populated < nr_pages) { > > /* Skip existing pages */ > - if (page_array[nr_populated]) { > + if (page_array && page_array[nr_populated]) { > nr_populated++; > continue; > } > @@ -5083,7 +5088,11 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > prep_new_page(page, 0, gfp, 0); > set_page_refcounted(page); > - page_array[nr_populated++] = page; > + if (page_list) > + list_add(&page->lru, page_list); > + else > + page_array[nr_populated] = page; > + nr_populated++; > } > > pcp_spin_unlock(pcp); > @@ -5100,8 +5109,14 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > failed: > page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask); > - if (page) > - page_array[nr_populated++] = page; > + if (page) { > + if (page_list) > + list_add(&page->lru, page_list); > + else > + page_array[nr_populated] = page; > + nr_populated++; > + } > + > goto out; > } > EXPORT_SYMBOL_GPL(alloc_pages_bulk_noprof); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful 2025-10-14 9:41 ` Petr Tesarik @ 2025-10-14 12:46 ` Zhaoyang Huang 0 siblings, 0 replies; 19+ messages in thread From: Zhaoyang Huang @ 2025-10-14 12:46 UTC (permalink / raw) To: Petr Tesarik Cc: zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Tue, Oct 14, 2025 at 5:41 PM Petr Tesarik <ptesarik@suse.com> wrote: > > On Tue, 14 Oct 2025 16:32:29 +0800 > "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote: > > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > > commit c8b979530f27 ("mm: alloc_pages_bulk_noprof: drop page_list > > argument") drops alloc_pages_bulk_list. This commit would like to call back > > it since it is proved to be helpful to the drivers which allocate a bulk of > > pages(see patch of 2 in this series ). > > I do notice that Matthew's comment of the time cost of iterating a list. > > However, I also observed in our test that the extra page_array's allocation > > could be more expensive than cpu iteration when direct reclaiming happens > > when ram is low[1]. IMHO, could we leave the API here to have the users > > choose between the array or list according to their scenarios. > > OK, so this is more or less a revert of commit c8b979530f27 ("mm: > alloc_pages_bulk_noprof: drop page_list argument")... > > I cannot comment on the performance gains, but I dislike the fact that > the patch re-introduces alloc_pages_bulk_noprof() as a function with two > signatures (either page_list is used, or page_array is used). > > If we can agree that allocations onto a linked list are useful, then I > suggest to split the existing function so that the common bits end up > in helper functions, called by both variants (one function using a > list, one using an array). Yes. That is also what I wanted to do in the beginning. I will implement if dma-buf would like to take the change > > Petr T > > > [1] > > android.hardwar-728 [002] ..... 334.573875: system_heap_do_allocate: Execution time: order 0 1 us > > android.hardwar-728 [002] ..... 334.573879: system_heap_do_allocate: Execution time: order 0 2 us > > android.hardwar-728 [002] ..... 334.574239: system_heap_do_allocate: Execution time: order 0 354 us > > android.hardwar-728 [002] ..... 334.574247: system_heap_do_allocate: Execution time: order 0 4 us > > android.hardwar-728 [002] ..... 334.574250: system_heap_do_allocate: Execution time: order 0 2 us > > > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > --- > > include/linux/gfp.h | 9 +++++++-- > > mm/mempolicy.c | 14 +++++++------- > > mm/page_alloc.c | 39 +++++++++++++++++++++++++++------------ > > 3 files changed, 41 insertions(+), 21 deletions(-) > > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > > index 5ebf26fcdcfa..f1540c9fcd87 100644 > > --- a/include/linux/gfp.h > > +++ b/include/linux/gfp.h > > @@ -231,6 +231,7 @@ struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_ > > > > unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > nodemask_t *nodemask, int nr_pages, > > + struct list_head *page_list, > > struct page **page_array); > > #define __alloc_pages_bulk(...) 
alloc_hooks(alloc_pages_bulk_noprof(__VA_ARGS__)) > > > > @@ -242,7 +243,11 @@ unsigned long alloc_pages_bulk_mempolicy_noprof(gfp_t gfp, > > > > /* Bulk allocate order-0 pages */ > > #define alloc_pages_bulk(_gfp, _nr_pages, _page_array) \ > > - __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, _page_array) > > + __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, NULL, _page_array) > > + > > +#define alloc_pages_bulk_list(_gfp, _nr_pages, _list) \ > > + __alloc_pages_bulk(_gfp, numa_mem_id(), NULL, _nr_pages, _list, NULL) > > + > > > > static inline unsigned long > > alloc_pages_bulk_node_noprof(gfp_t gfp, int nid, unsigned long nr_pages, > > @@ -251,7 +256,7 @@ alloc_pages_bulk_node_noprof(gfp_t gfp, int nid, unsigned long nr_pages, > > if (nid == NUMA_NO_NODE) > > nid = numa_mem_id(); > > > > - return alloc_pages_bulk_noprof(gfp, nid, NULL, nr_pages, page_array); > > + return alloc_pages_bulk_noprof(gfp, nid, NULL, nr_pages, NULL, page_array); > > } > > > > #define alloc_pages_bulk_node(...) \ > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > > index eb83cff7db8c..26274302ee01 100644 > > --- a/mm/mempolicy.c > > +++ b/mm/mempolicy.c > > @@ -2537,13 +2537,13 @@ static unsigned long alloc_pages_bulk_interleave(gfp_t gfp, > > if (delta) { > > nr_allocated = alloc_pages_bulk_noprof(gfp, > > interleave_nodes(pol), NULL, > > - nr_pages_per_node + 1, > > + nr_pages_per_node + 1, NULL, > > page_array); > > delta--; > > } else { > > nr_allocated = alloc_pages_bulk_noprof(gfp, > > interleave_nodes(pol), NULL, > > - nr_pages_per_node, page_array); > > + nr_pages_per_node, NULL, page_array); > > } > > > > page_array += nr_allocated; > > @@ -2593,7 +2593,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp, > > if (weight && node_isset(node, nodes)) { > > node_pages = min(rem_pages, weight); > > nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages, > > - page_array); > > + NULL, page_array); > > page_array += nr_allocated; > > total_allocated += nr_allocated; > > /* if that's all the pages, no need to interleave */ > > @@ -2658,7 +2658,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp, > > if (!node_pages) > > break; > > nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages, > > - page_array); > > + NULL, page_array); > > page_array += nr_allocated; > > total_allocated += nr_allocated; > > if (total_allocated == nr_pages) > > @@ -2682,11 +2682,11 @@ static unsigned long alloc_pages_bulk_preferred_many(gfp_t gfp, int nid, > > preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL); > > > > nr_allocated = alloc_pages_bulk_noprof(preferred_gfp, nid, &pol->nodes, > > - nr_pages, page_array); > > + nr_pages, NULL, page_array); > > > > if (nr_allocated < nr_pages) > > nr_allocated += alloc_pages_bulk_noprof(gfp, numa_node_id(), NULL, > > - nr_pages - nr_allocated, > > + nr_pages - nr_allocated, NULL, > > page_array + nr_allocated); > > return nr_allocated; > > } > > @@ -2722,7 +2722,7 @@ unsigned long alloc_pages_bulk_mempolicy_noprof(gfp_t gfp, > > nid = numa_node_id(); > > nodemask = policy_nodemask(gfp, pol, NO_INTERLEAVE_INDEX, &nid); > > return alloc_pages_bulk_noprof(gfp, nid, nodemask, > > - nr_pages, page_array); > > + nr_pages, NULL, page_array); > > } > > > > int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index d1d037f97c5f..a95bdd8cbf5b 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -4940,23 +4940,28 @@ static inline bool 
prepare_alloc_pages(gfp_t gfp_mask, unsigned int order, > > } > > > > /* > > - * __alloc_pages_bulk - Allocate a number of order-0 pages to an array > > + * __alloc_pages_bulk - Allocate a number of order-0 pages to a list or array > > * @gfp: GFP flags for the allocation > > * @preferred_nid: The preferred NUMA node ID to allocate from > > * @nodemask: Set of nodes to allocate from, may be NULL > > - * @nr_pages: The number of pages desired in the array > > - * @page_array: Array to store the pages > > + * @nr_pages: The number of pages desired on the list or array > > + * @page_list: Optional list to store the allocated pages > > + * @page_array: Optional array to store the pages > > * > > * This is a batched version of the page allocator that attempts to > > - * allocate nr_pages quickly. Pages are added to the page_array. > > + * allocate nr_pages quickly. Pages are added to page_list if page_list > > + * is not NULL, otherwise it is assumed that the page_array is valid. > > * > > - * Note that only NULL elements are populated with pages and nr_pages > > + * For lists, nr_pages is the number of pages that should be allocated. > > + * > > + * For arrays, only NULL elements are populated with pages and nr_pages > > * is the maximum number of pages that will be stored in the array. > > * > > - * Returns the number of pages in the array. > > + * Returns the number of pages on the list or array. > > */ > > unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > nodemask_t *nodemask, int nr_pages, > > + struct list_head *page_list, > > struct page **page_array) > > { > > struct page *page; > > @@ -4974,7 +4979,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > * Skip populated array elements to determine if any pages need > > * to be allocated before disabling IRQs. > > */ > > - while (nr_populated < nr_pages && page_array[nr_populated]) > > + while (page_array && nr_populated < nr_pages && page_array[nr_populated]) > > nr_populated++; > > > > /* No pages requested? */ > > @@ -4982,7 +4987,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > goto out; > > > > /* Already populated array? */ > > - if (unlikely(nr_pages - nr_populated == 0)) > > + if (unlikely(page_array && nr_pages - nr_populated == 0)) > > goto out; > > > > /* Bulk allocator does not support memcg accounting. 
*/ > > @@ -5064,7 +5069,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > while (nr_populated < nr_pages) { > > > > /* Skip existing pages */ > > - if (page_array[nr_populated]) { > > + if (page_array && page_array[nr_populated]) { > > nr_populated++; > > continue; > > } > > @@ -5083,7 +5088,11 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > > > prep_new_page(page, 0, gfp, 0); > > set_page_refcounted(page); > > - page_array[nr_populated++] = page; > > + if (page_list) > > + list_add(&page->lru, page_list); > > + else > > + page_array[nr_populated] = page; > > + nr_populated++; > > } > > > > pcp_spin_unlock(pcp); > > @@ -5100,8 +5109,14 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid, > > > > failed: > > page = __alloc_pages_noprof(gfp, 0, preferred_nid, nodemask); > > - if (page) > > - page_array[nr_populated++] = page; > > + if (page) { > > + if (page_list) > > + list_add(&page->lru, page_list); > > + else > > + page_array[nr_populated] = page; > > + nr_populated++; > > + } > > + > > goto out; > > } > > EXPORT_SYMBOL_GPL(alloc_pages_bulk_noprof); > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful
  2025-10-14  8:32 ` [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful zhaoyang.huang
  2025-10-14  9:41   ` Petr Tesarik
@ 2025-10-15 12:16 ` David Hildenbrand
  2025-10-15 12:35   ` Zhaoyang Huang
  1 sibling, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-10-15 12:16 UTC (permalink / raw)
  To: zhaoyang.huang, Andrew Morton, Matthew Wilcox, Mel Gorman,
	Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey,
	John Stultz, T . J . Mercier, Christian König, linux-media,
	dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang,
	steve.kang

On 14.10.25 10:32, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

Probably the subject should be "mm: reintroduce alloc_pages_bulk_list()"

>
> commit c8b979530f27 ("mm: alloc_pages_bulk_noprof: drop page_list
> argument") drops alloc_pages_bulk_list. This commit would like to call back
> it since it is proved to be helpful to the drivers which allocate a bulk of
> pages(see patch of 2 in this series ).

"Let's reintroduce it so we can use it for bulk allocation in the context of
XXX next."

> I do notice that Matthew's comment of the time cost of iterating a list.
> However, I also observed in our test that the extra page_array's allocation
> could be more expensive than cpu iteration when direct reclaiming happens
> when ram is low[1]. IMHO, could we leave the API here to have the users
> choose between the array or list according to their scenarios.

I'd prefer if we avoid reintroducing this interface.

How many pages are you intending to allocate? Wouldn't a smaller array
on the stack be sufficient?

-- 
Cheers

David / dhildenb

^ permalink raw reply	[flat|nested] 19+ messages in thread
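For concreteness, one possible shape of the smaller on-stack array being
suggested here, as a minimal sketch (the batch size of 64 is arbitrary, about
512 bytes of stack; `pages`, `size_remaining` and LOW_ORDER_GFP are the
identifiers system_heap_allocate() already uses):

	/* Illustrative sketch: batch order-0 allocations through a small
	 * on-stack array, splicing the results onto the heap's page list. */
	struct page *batch[64];
	unsigned long want, got, i;

	while (size_remaining >= PAGE_SIZE) {
		/* the bulk API only fills NULL slots, so clear between rounds */
		memset(batch, 0, sizeof(batch));
		want = min_t(unsigned long, ARRAY_SIZE(batch),
			     size_remaining / PAGE_SIZE);
		got = alloc_pages_bulk(LOW_ORDER_GFP, want, batch);
		if (!got)
			break;	/* fall back to the existing single-page path */
		for (i = 0; i < got; i++)
			list_add_tail(&batch[i]->lru, &pages);
		size_remaining -= got * PAGE_SIZE;
	}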
* Re: [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful 2025-10-15 12:16 ` David Hildenbrand @ 2025-10-15 12:35 ` Zhaoyang Huang 0 siblings, 0 replies; 19+ messages in thread From: Zhaoyang Huang @ 2025-10-15 12:35 UTC (permalink / raw) To: David Hildenbrand Cc: zhaoyang.huang, Andrew Morton, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Wed, Oct 15, 2025 at 8:16 PM David Hildenbrand <david@redhat.com> wrote: > > On 14.10.25 10:32, zhaoyang.huang wrote: > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > Probably the subject should be "mm: reintroduce alloc_pages_bulk_list()" ok > > > > > commit c8b979530f27 ("mm: alloc_pages_bulk_noprof: drop page_list > > argument") drops alloc_pages_bulk_list. This commit would like to call back > > it since it is proved to be helpful to the drivers which allocate a bulk of > > pages(see patch of 2 in this series ). > > "Let's reintroduce it so we can us for bulk allocation in the context of > XXX next." > > > I do notice that Matthew's comment of the time cost of iterating a list. > > However, I also observed in our test that the extra page_array's allocation > > could be more expensive than cpu iteration when direct reclaiming happens > > when ram is low[1]. IMHO, could we leave the API here to have the users > > choose between the array or list according to their scenarios. > > I'd prefer if we avoid reintroducing this interface. > > How many pages are you intending to allocate? Wouldn't a smaller array > on the stack be sufficient? Actually, dma-buf is the main consumer in android which could occupy half of the system RAM(mainly by multimedia which passes the memory between GPU and display, camera driver, NPU driver etc). Dozens MB is quite common or maybe more. This commit is proved to be helpful in the scenario of camera APP cold start which allocate around 300MB memory in an 6GB RAM ANDROID system IMHO, page_list could be more efficient than page_array in memory perspective which is an uncertain factor than iterating the list > > > -- > Cheers > > David / dhildenb > ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 8:32 [PATCH 0/2] optimization of dma-buf system_heap allocation zhaoyang.huang 2025-10-14 8:32 ` [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful zhaoyang.huang @ 2025-10-14 8:32 ` zhaoyang.huang 2025-10-14 11:59 ` Christian König 2025-10-14 12:45 ` Petr Tesarik 2025-10-14 14:06 ` [PATCH 0/2] optimization of dma-buf system_heap allocation Matthew Wilcox 2 siblings, 2 replies; 19+ messages in thread From: zhaoyang.huang @ 2025-10-14 8:32 UTC (permalink / raw) To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang, steve.kang From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> The size of once dma-buf allocation could be dozens MB or much more which introduce a loop of allocating several thousands of order-0 pages. Furthermore, the concurrent allocation could have dma-buf allocation enter direct-reclaim during the loop. This commit would like to eliminate the above two affections by introducing alloc_pages_bulk_list in dma-buf's order-0 allocation. This patch is proved to be conditionally helpful in 18MB allocation as decreasing the time from 24604us to 6555us and no harm when bulk allocation can't be done(fallback to single page allocation) Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> --- drivers/dma-buf/heaps/system_heap.c | 36 +++++++++++++++++++---------- 1 file changed, 24 insertions(+), 12 deletions(-) diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index bbe7881f1360..71b028c63bd8 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -300,8 +300,8 @@ static const struct dma_buf_ops system_heap_buf_ops = { .release = system_heap_dma_buf_release, }; -static struct page *alloc_largest_available(unsigned long size, - unsigned int max_order) +static void alloc_largest_available(unsigned long size, + unsigned int max_order, unsigned int *num_pages, struct list_head *list) { struct page *page; int i; @@ -312,12 +312,19 @@ static struct page *alloc_largest_available(unsigned long size, if (max_order < orders[i]) continue; - page = alloc_pages(order_flags[i], orders[i]); - if (!page) + if (orders[i]) { + page = alloc_pages(order_flags[i], orders[i]); + if (page) { + list_add(&page->lru, list); + *num_pages = 1; + } + } else + *num_pages = alloc_pages_bulk_list(LOW_ORDER_GFP, size / PAGE_SIZE, list); + + if (list_empty(list)) continue; - return page; + return; } - return NULL; } static struct dma_buf *system_heap_allocate(struct dma_heap *heap, @@ -335,6 +342,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, struct list_head pages; struct page *page, *tmp_page; int i, ret = -ENOMEM; + unsigned int num_pages; + LIST_HEAD(head); buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); if (!buffer) @@ -348,6 +357,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, INIT_LIST_HEAD(&pages); i = 0; while (size_remaining > 0) { + num_pages = 0; + INIT_LIST_HEAD(&head); /* * Avoid trying to allocate memory if the process * has been killed by SIGKILL @@ -357,14 +368,15 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, goto free_buffer; } - page = alloc_largest_available(size_remaining, max_order); - if (!page) + alloc_largest_available(size_remaining, max_order, 
&num_pages, &head); + if (!num_pages) goto free_buffer; - list_add_tail(&page->lru, &pages); - size_remaining -= page_size(page); - max_order = compound_order(page); - i++; + list_splice_tail(&head, &pages); + max_order = folio_order(lru_to_folio(&head)); + size_remaining -= PAGE_SIZE * (num_pages << max_order); + i += num_pages; + } table = &buffer->sg_table; -- 2.25.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 8:32 ` [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation zhaoyang.huang @ 2025-10-14 11:59 ` Christian König 2025-10-14 12:44 ` Zhaoyang Huang 2025-10-14 12:45 ` Petr Tesarik 1 sibling, 1 reply; 19+ messages in thread From: Christian König @ 2025-10-14 11:59 UTC (permalink / raw) To: zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang, steve.kang On 14.10.25 10:32, zhaoyang.huang wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > The size of once dma-buf allocation could be dozens MB or much more > which introduce a loop of allocating several thousands of order-0 pages. > Furthermore, the concurrent allocation could have dma-buf allocation enter > direct-reclaim during the loop. This commit would like to eliminate the > above two affections by introducing alloc_pages_bulk_list in dma-buf's > order-0 allocation. This patch is proved to be conditionally helpful > in 18MB allocation as decreasing the time from 24604us to 6555us and no > harm when bulk allocation can't be done(fallback to single page > allocation) Well that sounds like an absolutely horrible idea. See the handling of allocating only from specific order is *exactly* there to avoid the behavior of bulk allocation. What you seem to do with this patch here is to add on top of the behavior to avoid allocating large chunks from the buddy the behavior to allocate large chunks from the buddy because that is faster. So this change here doesn't looks like it will fly very high. Please explain what you're actually trying to do, just optimize allocation time? Regards, Christian. 
> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > --- > drivers/dma-buf/heaps/system_heap.c | 36 +++++++++++++++++++---------- > 1 file changed, 24 insertions(+), 12 deletions(-) > > diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c > index bbe7881f1360..71b028c63bd8 100644 > --- a/drivers/dma-buf/heaps/system_heap.c > +++ b/drivers/dma-buf/heaps/system_heap.c > @@ -300,8 +300,8 @@ static const struct dma_buf_ops system_heap_buf_ops = { > .release = system_heap_dma_buf_release, > }; > > -static struct page *alloc_largest_available(unsigned long size, > - unsigned int max_order) > +static void alloc_largest_available(unsigned long size, > + unsigned int max_order, unsigned int *num_pages, struct list_head *list) > { > struct page *page; > int i; > @@ -312,12 +312,19 @@ static struct page *alloc_largest_available(unsigned long size, > if (max_order < orders[i]) > continue; > > - page = alloc_pages(order_flags[i], orders[i]); > - if (!page) > + if (orders[i]) { > + page = alloc_pages(order_flags[i], orders[i]); > + if (page) { > + list_add(&page->lru, list); > + *num_pages = 1; > + } > + } else > + *num_pages = alloc_pages_bulk_list(LOW_ORDER_GFP, size / PAGE_SIZE, list); > + > + if (list_empty(list)) > continue; > - return page; > + return; > } > - return NULL; > } > > static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > @@ -335,6 +342,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > struct list_head pages; > struct page *page, *tmp_page; > int i, ret = -ENOMEM; > + unsigned int num_pages; > + LIST_HEAD(head); > > buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); > if (!buffer) > @@ -348,6 +357,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > INIT_LIST_HEAD(&pages); > i = 0; > while (size_remaining > 0) { > + num_pages = 0; > + INIT_LIST_HEAD(&head); > /* > * Avoid trying to allocate memory if the process > * has been killed by SIGKILL > @@ -357,14 +368,15 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > goto free_buffer; > } > > - page = alloc_largest_available(size_remaining, max_order); > - if (!page) > + alloc_largest_available(size_remaining, max_order, &num_pages, &head); > + if (!num_pages) > goto free_buffer; > > - list_add_tail(&page->lru, &pages); > - size_remaining -= page_size(page); > - max_order = compound_order(page); > - i++; > + list_splice_tail(&head, &pages); > + max_order = folio_order(lru_to_folio(&head)); > + size_remaining -= PAGE_SIZE * (num_pages << max_order); > + i += num_pages; > + > } > > table = &buffer->sg_table; ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 11:59 ` Christian König @ 2025-10-14 12:44 ` Zhaoyang Huang 2025-10-14 13:04 ` Christian König 0 siblings, 1 reply; 19+ messages in thread From: Zhaoyang Huang @ 2025-10-14 12:44 UTC (permalink / raw) To: Christian König Cc: zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Tue, Oct 14, 2025 at 7:59 PM Christian König <christian.koenig@amd.com> wrote: > > On 14.10.25 10:32, zhaoyang.huang wrote: > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > > The size of once dma-buf allocation could be dozens MB or much more > > which introduce a loop of allocating several thousands of order-0 pages. > > Furthermore, the concurrent allocation could have dma-buf allocation enter > > direct-reclaim during the loop. This commit would like to eliminate the > > above two affections by introducing alloc_pages_bulk_list in dma-buf's > > order-0 allocation. This patch is proved to be conditionally helpful > > in 18MB allocation as decreasing the time from 24604us to 6555us and no > > harm when bulk allocation can't be done(fallback to single page > > allocation) > > Well that sounds like an absolutely horrible idea. > > See the handling of allocating only from specific order is *exactly* there to avoid the behavior of bulk allocation. > > What you seem to do with this patch here is to add on top of the behavior to avoid allocating large chunks from the buddy the behavior to allocate large chunks from the buddy because that is faster. emm, this patch doesn't change order-8 and order-4's allocation behaviour but just to replace the loop of order-0 allocations into once bulk allocation in the fallback way. What is your concern about this? > > So this change here doesn't looks like it will fly very high. Please explain what you're actually trying to do, just optimize allocation time? > > Regards, > Christian. 
> > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > --- > > drivers/dma-buf/heaps/system_heap.c | 36 +++++++++++++++++++---------- > > 1 file changed, 24 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c > > index bbe7881f1360..71b028c63bd8 100644 > > --- a/drivers/dma-buf/heaps/system_heap.c > > +++ b/drivers/dma-buf/heaps/system_heap.c > > @@ -300,8 +300,8 @@ static const struct dma_buf_ops system_heap_buf_ops = { > > .release = system_heap_dma_buf_release, > > }; > > > > -static struct page *alloc_largest_available(unsigned long size, > > - unsigned int max_order) > > +static void alloc_largest_available(unsigned long size, > > + unsigned int max_order, unsigned int *num_pages, struct list_head *list) > > { > > struct page *page; > > int i; > > @@ -312,12 +312,19 @@ static struct page *alloc_largest_available(unsigned long size, > > if (max_order < orders[i]) > > continue; > > > > - page = alloc_pages(order_flags[i], orders[i]); > > - if (!page) > > + if (orders[i]) { > > + page = alloc_pages(order_flags[i], orders[i]); > > + if (page) { > > + list_add(&page->lru, list); > > + *num_pages = 1; > > + } > > + } else > > + *num_pages = alloc_pages_bulk_list(LOW_ORDER_GFP, size / PAGE_SIZE, list); > > + > > + if (list_empty(list)) > > continue; > > - return page; > > + return; > > } > > - return NULL; > > } > > > > static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > > @@ -335,6 +342,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > > struct list_head pages; > > struct page *page, *tmp_page; > > int i, ret = -ENOMEM; > > + unsigned int num_pages; > > + LIST_HEAD(head); > > > > buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); > > if (!buffer) > > @@ -348,6 +357,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > > INIT_LIST_HEAD(&pages); > > i = 0; > > while (size_remaining > 0) { > > + num_pages = 0; > > + INIT_LIST_HEAD(&head); > > /* > > * Avoid trying to allocate memory if the process > > * has been killed by SIGKILL > > @@ -357,14 +368,15 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > > goto free_buffer; > > } > > > > - page = alloc_largest_available(size_remaining, max_order); > > - if (!page) > > + alloc_largest_available(size_remaining, max_order, &num_pages, &head); > > + if (!num_pages) > > goto free_buffer; > > > > - list_add_tail(&page->lru, &pages); > > - size_remaining -= page_size(page); > > - max_order = compound_order(page); > > - i++; > > + list_splice_tail(&head, &pages); > > + max_order = folio_order(lru_to_folio(&head)); > > + size_remaining -= PAGE_SIZE * (num_pages << max_order); > > + i += num_pages; > > + > > } > > > > table = &buffer->sg_table; > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 12:44 ` Zhaoyang Huang @ 2025-10-14 13:04 ` Christian König 2025-10-14 15:10 ` Petr Tesarik 0 siblings, 1 reply; 19+ messages in thread From: Christian König @ 2025-10-14 13:04 UTC (permalink / raw) To: Zhaoyang Huang Cc: zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On 14.10.25 14:44, Zhaoyang Huang wrote: > On Tue, Oct 14, 2025 at 7:59 PM Christian König > <christian.koenig@amd.com> wrote: >> >> On 14.10.25 10:32, zhaoyang.huang wrote: >>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> >>> >>> The size of once dma-buf allocation could be dozens MB or much more >>> which introduce a loop of allocating several thousands of order-0 pages. >>> Furthermore, the concurrent allocation could have dma-buf allocation enter >>> direct-reclaim during the loop. This commit would like to eliminate the >>> above two affections by introducing alloc_pages_bulk_list in dma-buf's >>> order-0 allocation. This patch is proved to be conditionally helpful >>> in 18MB allocation as decreasing the time from 24604us to 6555us and no >>> harm when bulk allocation can't be done(fallback to single page >>> allocation) >> >> Well that sounds like an absolutely horrible idea. >> >> See the handling of allocating only from specific order is *exactly* there to avoid the behavior of bulk allocation. >> >> What you seem to do with this patch here is to add on top of the behavior to avoid allocating large chunks from the buddy the behavior to allocate large chunks from the buddy because that is faster. > emm, this patch doesn't change order-8 and order-4's allocation > behaviour but just to replace the loop of order-0 allocations into > once bulk allocation in the fallback way. What is your concern about > this? As far as I know the bulk allocation favors splitting large pages into smaller ones instead of allocating smaller pages first. That's where the performance benefit comes from. But that is exactly what we try to avoid here by allocating only certain order of pages. Regards, Christian. >> >> So this change here doesn't looks like it will fly very high. Please explain what you're actually trying to do, just optimize allocation time? >> >> Regards, >> Christian. 
>> >>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> >>> --- >>> drivers/dma-buf/heaps/system_heap.c | 36 +++++++++++++++++++---------- >>> 1 file changed, 24 insertions(+), 12 deletions(-) >>> >>> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c >>> index bbe7881f1360..71b028c63bd8 100644 >>> --- a/drivers/dma-buf/heaps/system_heap.c >>> +++ b/drivers/dma-buf/heaps/system_heap.c >>> @@ -300,8 +300,8 @@ static const struct dma_buf_ops system_heap_buf_ops = { >>> .release = system_heap_dma_buf_release, >>> }; >>> >>> -static struct page *alloc_largest_available(unsigned long size, >>> - unsigned int max_order) >>> +static void alloc_largest_available(unsigned long size, >>> + unsigned int max_order, unsigned int *num_pages, struct list_head *list) >>> { >>> struct page *page; >>> int i; >>> @@ -312,12 +312,19 @@ static struct page *alloc_largest_available(unsigned long size, >>> if (max_order < orders[i]) >>> continue; >>> >>> - page = alloc_pages(order_flags[i], orders[i]); >>> - if (!page) >>> + if (orders[i]) { >>> + page = alloc_pages(order_flags[i], orders[i]); >>> + if (page) { >>> + list_add(&page->lru, list); >>> + *num_pages = 1; >>> + } >>> + } else >>> + *num_pages = alloc_pages_bulk_list(LOW_ORDER_GFP, size / PAGE_SIZE, list); >>> + >>> + if (list_empty(list)) >>> continue; >>> - return page; >>> + return; >>> } >>> - return NULL; >>> } >>> >>> static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >>> @@ -335,6 +342,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >>> struct list_head pages; >>> struct page *page, *tmp_page; >>> int i, ret = -ENOMEM; >>> + unsigned int num_pages; >>> + LIST_HEAD(head); >>> >>> buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); >>> if (!buffer) >>> @@ -348,6 +357,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >>> INIT_LIST_HEAD(&pages); >>> i = 0; >>> while (size_remaining > 0) { >>> + num_pages = 0; >>> + INIT_LIST_HEAD(&head); >>> /* >>> * Avoid trying to allocate memory if the process >>> * has been killed by SIGKILL >>> @@ -357,14 +368,15 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >>> goto free_buffer; >>> } >>> >>> - page = alloc_largest_available(size_remaining, max_order); >>> - if (!page) >>> + alloc_largest_available(size_remaining, max_order, &num_pages, &head); >>> + if (!num_pages) >>> goto free_buffer; >>> >>> - list_add_tail(&page->lru, &pages); >>> - size_remaining -= page_size(page); >>> - max_order = compound_order(page); >>> - i++; >>> + list_splice_tail(&head, &pages); >>> + max_order = folio_order(lru_to_folio(&head)); >>> + size_remaining -= PAGE_SIZE * (num_pages << max_order); >>> + i += num_pages; >>> + >>> } >>> >>> table = &buffer->sg_table; >> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 13:04 ` Christian König @ 2025-10-14 15:10 ` Petr Tesarik 2025-10-14 15:52 ` Christian König 0 siblings, 1 reply; 19+ messages in thread From: Petr Tesarik @ 2025-10-14 15:10 UTC (permalink / raw) To: Christian König Cc: Zhaoyang Huang, zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Tue, 14 Oct 2025 15:04:14 +0200 Christian König <christian.koenig@amd.com> wrote: > On 14.10.25 14:44, Zhaoyang Huang wrote: > > On Tue, Oct 14, 2025 at 7:59 PM Christian König > > <christian.koenig@amd.com> wrote: > >> > >> On 14.10.25 10:32, zhaoyang.huang wrote: > >>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > >>> > >>> The size of once dma-buf allocation could be dozens MB or much more > >>> which introduce a loop of allocating several thousands of order-0 pages. > >>> Furthermore, the concurrent allocation could have dma-buf allocation enter > >>> direct-reclaim during the loop. This commit would like to eliminate the > >>> above two affections by introducing alloc_pages_bulk_list in dma-buf's > >>> order-0 allocation. This patch is proved to be conditionally helpful > >>> in 18MB allocation as decreasing the time from 24604us to 6555us and no > >>> harm when bulk allocation can't be done(fallback to single page > >>> allocation) > >> > >> Well that sounds like an absolutely horrible idea. > >> > >> See the handling of allocating only from specific order is *exactly* there to avoid the behavior of bulk allocation. > >> > >> What you seem to do with this patch here is to add on top of the behavior to avoid allocating large chunks from the buddy the behavior to allocate large chunks from the buddy because that is faster. > > emm, this patch doesn't change order-8 and order-4's allocation > > behaviour but just to replace the loop of order-0 allocations into > > once bulk allocation in the fallback way. What is your concern about > > this? > > As far as I know the bulk allocation favors splitting large pages into smaller ones instead of allocating smaller pages first. That's where the performance benefit comes from. > > But that is exactly what we try to avoid here by allocating only certain order of pages. This is a good question, actually. Yes, bulk alloc will split large pages if there are insufficient pages on the pcp free list. But is dma-buf indeed trying to avoid it, or is it merely using an inefficient API? And does it need the extra speed? Even if it leads to increased fragmentation? Petr T ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 15:10 ` Petr Tesarik @ 2025-10-14 15:52 ` Christian König 2025-10-15 1:12 ` Zhaoyang Huang 0 siblings, 1 reply; 19+ messages in thread From: Christian König @ 2025-10-14 15:52 UTC (permalink / raw) To: Petr Tesarik Cc: Zhaoyang Huang, zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On 14.10.25 17:10, Petr Tesarik wrote: > On Tue, 14 Oct 2025 15:04:14 +0200 > Christian König <christian.koenig@amd.com> wrote: > >> On 14.10.25 14:44, Zhaoyang Huang wrote: >>> On Tue, Oct 14, 2025 at 7:59 PM Christian König >>> <christian.koenig@amd.com> wrote: >>>> >>>> On 14.10.25 10:32, zhaoyang.huang wrote: >>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> >>>>> >>>>> The size of once dma-buf allocation could be dozens MB or much more >>>>> which introduce a loop of allocating several thousands of order-0 pages. >>>>> Furthermore, the concurrent allocation could have dma-buf allocation enter >>>>> direct-reclaim during the loop. This commit would like to eliminate the >>>>> above two affections by introducing alloc_pages_bulk_list in dma-buf's >>>>> order-0 allocation. This patch is proved to be conditionally helpful >>>>> in 18MB allocation as decreasing the time from 24604us to 6555us and no >>>>> harm when bulk allocation can't be done(fallback to single page >>>>> allocation) >>>> >>>> Well that sounds like an absolutely horrible idea. >>>> >>>> See the handling of allocating only from specific order is *exactly* there to avoid the behavior of bulk allocation. >>>> >>>> What you seem to do with this patch here is to add on top of the behavior to avoid allocating large chunks from the buddy the behavior to allocate large chunks from the buddy because that is faster. >>> emm, this patch doesn't change order-8 and order-4's allocation >>> behaviour but just to replace the loop of order-0 allocations into >>> once bulk allocation in the fallback way. What is your concern about >>> this? >> >> As far as I know the bulk allocation favors splitting large pages into smaller ones instead of allocating smaller pages first. That's where the performance benefit comes from. >> >> But that is exactly what we try to avoid here by allocating only certain order of pages. > > This is a good question, actually. Yes, bulk alloc will split large > pages if there are insufficient pages on the pcp free list. But is > dma-buf indeed trying to avoid it, or is it merely using an inefficient > API? And does it need the extra speed? Even if it leads to increased > fragmentation? DMA-buf-heaps is completly intentionally trying rather hard to avoid splitting large pages. That's why you have the distinction between HIGH_ORDER_GFP and LOW_ORDER_GFP as well. Keep in mind that this is mostly used on embedded system with only small amounts of memory. Not entering direct reclaim and instead preferring to split large pages until they are used up is an absolutely no-go for most use cases as far as I can see. Could be that we need to make this behavior conditional, but somebody would need to come up with some really good arguments to justify the complexity. Regards, Christian. > > Petr T ^ permalink raw reply [flat|nested] 19+ messages in thread
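For readers without the file in front of them, the HIGH_ORDER_GFP /
LOW_ORDER_GFP distinction referred to above looks roughly like this in
drivers/dma-buf/heaps/system_heap.c (reproduced from memory as an
illustration; check the exact tree):

	/* Reproduced for illustration; may differ slightly per kernel tree. */
	#define LOW_ORDER_GFP	(GFP_HIGHUSER | __GFP_ZERO)
	#define HIGH_ORDER_GFP	(((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
					| __GFP_NORETRY) & ~__GFP_RECLAIM) \
					| __GFP_COMP)
	static gfp_t order_flags[] = {HIGH_ORDER_GFP, HIGH_ORDER_GFP, LOW_ORDER_GFP};
	/* Orders 8, 4 and 0 match mapping sizes often found in IOMMUs
	 * (1 MB, 64 KB, 4 KB); the high-order attempts are opportunistic
	 * and never enter direct reclaim. */
	static const unsigned int orders[] = {8, 4, 0};
	#define NUM_ORDERS ARRAY_SIZE(orders)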
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 15:52 ` Christian König @ 2025-10-15 1:12 ` Zhaoyang Huang 2025-10-15 3:21 ` Matthew Wilcox 0 siblings, 1 reply; 19+ messages in thread From: Zhaoyang Huang @ 2025-10-15 1:12 UTC (permalink / raw) To: Christian König, Suren Baghdasaryan Cc: Petr Tesarik, zhaoyang.huang, Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Tue, Oct 14, 2025 at 11:52 PM Christian König <christian.koenig@amd.com> wrote: > > On 14.10.25 17:10, Petr Tesarik wrote: > > On Tue, 14 Oct 2025 15:04:14 +0200 > > Christian König <christian.koenig@amd.com> wrote: > > > >> On 14.10.25 14:44, Zhaoyang Huang wrote: > >>> On Tue, Oct 14, 2025 at 7:59 PM Christian König > >>> <christian.koenig@amd.com> wrote: > >>>> > >>>> On 14.10.25 10:32, zhaoyang.huang wrote: > >>>>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > >>>>> > >>>>> The size of once dma-buf allocation could be dozens MB or much more > >>>>> which introduce a loop of allocating several thousands of order-0 pages. > >>>>> Furthermore, the concurrent allocation could have dma-buf allocation enter > >>>>> direct-reclaim during the loop. This commit would like to eliminate the > >>>>> above two affections by introducing alloc_pages_bulk_list in dma-buf's > >>>>> order-0 allocation. This patch is proved to be conditionally helpful > >>>>> in 18MB allocation as decreasing the time from 24604us to 6555us and no > >>>>> harm when bulk allocation can't be done(fallback to single page > >>>>> allocation) > >>>> > >>>> Well that sounds like an absolutely horrible idea. > >>>> > >>>> See the handling of allocating only from specific order is *exactly* there to avoid the behavior of bulk allocation. > >>>> > >>>> What you seem to do with this patch here is to add on top of the behavior to avoid allocating large chunks from the buddy the behavior to allocate large chunks from the buddy because that is faster. > >>> emm, this patch doesn't change order-8 and order-4's allocation > >>> behaviour but just to replace the loop of order-0 allocations into > >>> once bulk allocation in the fallback way. What is your concern about > >>> this? > >> > >> As far as I know the bulk allocation favors splitting large pages into smaller ones instead of allocating smaller pages first. That's where the performance benefit comes from. > >> > >> But that is exactly what we try to avoid here by allocating only certain order of pages. > > > > This is a good question, actually. Yes, bulk alloc will split large > > pages if there are insufficient pages on the pcp free list. But is > > dma-buf indeed trying to avoid it, or is it merely using an inefficient > > API? And does it need the extra speed? Even if it leads to increased > > fragmentation? > > DMA-buf-heaps is completely intentional in trying rather hard to avoid splitting large pages. That's why you have the distinction between HIGH_ORDER_GFP and LOW_ORDER_GFP as well. Could you please check alloc_pages_bulk_noprof in patch 1/2 of this series? According to my understanding, it first tries to get the order-0 pages from the pcp list and then falls back to normal alloc_pages(order-0), the same as what the current dma-buf code does. That is to say, no extra large-page splitting is introduced by this commit.
> > Keep in mind that this is mostly used on embedded systems with only small amounts of memory. Actually, dma-buf is the main consumer of driver memory allocations in the Android world (multimedia, camera, NPU, etc.), which can take up even half of the system RAM, with dozens of MB in a single allocation. > > Not entering direct reclaim and instead preferring to split large pages until they are used up is an absolute no-go for most use cases as far as I can see. Yes. This behaviour would NOT be changed by this commit. > > Could be that we need to make this behavior conditional, but somebody would need to come up with some really good arguments to justify the complexity. ok, should we use CONFIG_DMA_BUF_BULK_ALLOCATION or a variable controlled by sysfs interface? > > Regards, > Christian. > > > > Petr T ^ permalink raw reply [flat|nested] 19+ messages in thread
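A minimal usage sketch of the behaviour Zhaoyang describes, relying only on the alloc_pages_bulk_list() prototype that patch 1/2 reintroduces (the helper name and error handling here are hypothetical and not part of the series): the bulk call takes whatever the pcp list can supply and otherwise falls back internally to a single ordinary order-0 allocation, so the caller simply re-invokes it until the request is filled.

static int fill_order0_pages(struct list_head *pages, unsigned long nr,
                             gfp_t gfp)
{
        while (nr) {
                unsigned long got = alloc_pages_bulk_list(gfp, nr, pages);

                if (!got)
                        return -ENOMEM; /* not even a single page was available */
                nr -= got;
        }
        return 0;
}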
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-15 1:12 ` Zhaoyang Huang @ 2025-10-15 3:21 ` Matthew Wilcox 2025-10-15 5:52 ` Zhaoyang Huang 0 siblings, 1 reply; 19+ messages in thread From: Matthew Wilcox @ 2025-10-15 3:21 UTC (permalink / raw) To: Zhaoyang Huang Cc: Christian König, Suren Baghdasaryan, Petr Tesarik, zhaoyang.huang, Andrew Morton, David Hildenbrand, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Wed, Oct 15, 2025 at 09:12:07AM +0800, Zhaoyang Huang wrote: > > Could be that we need to make this behavior conditional, but somebody would need to come up with some really good arguments to justify the complexity. > ok, should we use CONFIG_DMA_BUF_BULK_ALLOCATION or a variable > controlled by sysfs interface? No. Explain what you're trying to solve, because you haven't yet. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-15 3:21 ` Matthew Wilcox @ 2025-10-15 5:52 ` Zhaoyang Huang 2025-10-15 8:40 ` Petr Tesarik 0 siblings, 1 reply; 19+ messages in thread From: Zhaoyang Huang @ 2025-10-15 5:52 UTC (permalink / raw) To: Matthew Wilcox Cc: Christian König, Suren Baghdasaryan, Petr Tesarik, zhaoyang.huang, Andrew Morton, David Hildenbrand, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Wed, Oct 15, 2025 at 11:21 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Wed, Oct 15, 2025 at 09:12:07AM +0800, Zhaoyang Huang wrote: > > > Could be that we need to make this behavior conditional, but somebody would need to come up with some really good arguments to justify the complexity. > > ok, should we use CONFIG_DMA_BUF_BULK_ALLOCATION or a variable > > controlled by sysfs interface? > > No. Explain what you're trying to solve, because you haven't yet. Dma-buf, which works as a memory allocation backend, can loop over alloc_pages thousands of times to allocate the order-0 pages needed to fulfill a demand of dozens of MB; this commit would like to replace that loop with a single alloc_pages_bulk call. However, alloc_pages_bulk_array may introduce an extra memory allocation, possibly entering direct reclaim, which could be more expensive than iterating a list, hence the request to bring back the API alloc_pages_bulk_list as well ^ permalink raw reply [flat|nested] 19+ messages in thread
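To put rough numbers on the extra allocation mentioned above (assuming 4 KiB pages and 64-bit pointers; the figures are illustrative, not measured):

  18 MiB / 4 KiB                 = 4608 order-0 pages per request
  4608 * sizeof(struct page *)   = 4608 * 8 B, roughly 36 KiB for the page_array alone

So the array variant first needs a scratch buffer of about 36 KiB, itself a multi-page allocation that can stall in direct reclaim under memory pressure, while the list variant links the pages through page->lru and needs no extra storage.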
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-15 5:52 ` Zhaoyang Huang @ 2025-10-15 8:40 ` Petr Tesarik 2025-10-15 9:14 ` Zhaoyang Huang 0 siblings, 1 reply; 19+ messages in thread From: Petr Tesarik @ 2025-10-15 8:40 UTC (permalink / raw) To: Zhaoyang Huang Cc: Matthew Wilcox, Christian König, Suren Baghdasaryan, zhaoyang.huang, Andrew Morton, David Hildenbrand, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Wed, 15 Oct 2025 13:52:57 +0800 Zhaoyang Huang <huangzhaoyang@gmail.com> wrote: > On Wed, Oct 15, 2025 at 11:21 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Wed, Oct 15, 2025 at 09:12:07AM +0800, Zhaoyang Huang wrote: > > > > Could be that we need to make this behavior conditional, but somebody would need to come up with some really good arguments to justify the complexity. > > > ok, should we use CONFIG_DMA_BUF_BULK_ALLOCATION or a variable > > > controlled by sysfs interface? > > > > No. Explain what you're trying to solve, because you haven't yet. > Dma-buf, which works as a memory allocation backend, can loop over > alloc_pages thousands of times to allocate the order-0 pages needed to > fulfill a demand of dozens of MB; this commit would like to replace > that loop with a single alloc_pages_bulk call. However, > alloc_pages_bulk_array may introduce an extra memory allocation, > possibly entering direct reclaim, which could be more expensive than > iterating a list, hence the request to bring back the API > alloc_pages_bulk_list as well This does not quite explain it. IIRC you mentioned allocating 18M as an example. The ideal outcome in that case is: - 16 order-8 compound pages - 32 order-4 compound pages -> total 48 calls to alloc_pages() But presumably, that's not what happens, because fragmentation makes (some of) those order-8 allocations fail. Since you talk about thousands of loop iterations, it looks like even order-4 allocations fail in your case. Then I agree there's not much value in trying to avoid further fragmentation, and after so many order-0 allocations, it's probably also pointless to do memory reclaim. OTOH I can see why the opposite approach is a bad idea in situations where fragmentation can be avoided. To make things even worse, alloc_pages_bulk() will rather split pages in the preferred zone than try allocating from the next best zone. To sum it up, Zhaoyang, can you please describe in more detail what happens in your scenario and what you believe should happen instead? Petr T ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-15 8:40 ` Petr Tesarik @ 2025-10-15 9:14 ` Zhaoyang Huang 0 siblings, 0 replies; 19+ messages in thread From: Zhaoyang Huang @ 2025-10-15 9:14 UTC (permalink / raw) To: Petr Tesarik Cc: Matthew Wilcox, Christian König, Suren Baghdasaryan, zhaoyang.huang, Andrew Morton, David Hildenbrand, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, steve.kang On Wed, Oct 15, 2025 at 4:40 PM Petr Tesarik <ptesarik@suse.com> wrote: > > On Wed, 15 Oct 2025 13:52:57 +0800 > Zhaoyang Huang <huangzhaoyang@gmail.com> wrote: > > > On Wed, Oct 15, 2025 at 11:21 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > On Wed, Oct 15, 2025 at 09:12:07AM +0800, Zhaoyang Huang wrote: > > > > > Could be that we need to make this behavior conditional, but somebody would need to come up with some really good arguments to justify the complexity. > > > > ok, should we use CONFIG_DMA_BUF_BULK_ALLOCATION or a variable > > > > controlled by sysfs interface? > > > > > > No. Explain what you're trying to solve, because you haven't yet. > > Dma-buf, which works as a memory allocation backend, can loop over > > alloc_pages thousands of times to allocate the order-0 pages needed to > > fulfill a demand of dozens of MB; this commit would like to replace > > that loop with a single alloc_pages_bulk call. However, > > alloc_pages_bulk_array may introduce an extra memory allocation, > > possibly entering direct reclaim, which could be more expensive than > > iterating a list, hence the request to bring back the API > > alloc_pages_bulk_list as well > > This does not quite explain it. IIRC you mentioned allocating 18M as an > example. The ideal outcome in that case is: > > - 16 order-8 compound pages > - 32 order-4 compound pages > -> total 48 calls to alloc_pages() > > But presumably, that's not what happens, because fragmentation makes > (some of) those order-8 allocations fail. Since you talk about > thousands of loop iterations, it looks like even order-4 allocations > fail in your case. Then I agree there's not much value in trying to > avoid further fragmentation, and after so many order-0 allocations, > it's probably also pointless to do memory reclaim. Thanks for the explanation > > OTOH I can see why the opposite approach is a bad idea in situations > where fragmentation can be avoided. To make things even worse, > alloc_pages_bulk() will rather split pages in the preferred zone than > try allocating from the next best zone. but the loop of alloc_pages(order-0) could also split high-order pages one by one in the preferred zone, right? > > To sum it up, Zhaoyang, can you please describe in more detail what > happens in your scenario and what you believe should happen instead? My goal is as simple as eliminating the loop of alloc_pages to get some performance gains. > > Petr T ^ permalink raw reply [flat|nested] 19+ messages in thread
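To make the loop in question concrete, this is roughly what the driver does today, paraphrased from the context lines of the patch quoted in the review below, with error paths trimmed; when every iteration falls through to order 0, an 18 MB request means on the order of 4600 trips around it:

        while (size_remaining > 0) {
                /* tries order 8, then order 4, then a single order-0 page */
                page = alloc_largest_available(size_remaining, max_order);
                if (!page)
                        goto free_buffer;

                list_add_tail(&page->lru, &pages);
                size_remaining -= page_size(page);
                max_order = compound_order(page);
        }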
* Re: [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation 2025-10-14 8:32 ` [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation zhaoyang.huang 2025-10-14 11:59 ` Christian König @ 2025-10-14 12:45 ` Petr Tesarik 1 sibling, 0 replies; 19+ messages in thread From: Petr Tesarik @ 2025-10-14 12:45 UTC (permalink / raw) To: zhaoyang.huang Cc: Andrew Morton, David Hildenbrand, Matthew Wilcox, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang, steve.kang On Tue, 14 Oct 2025 16:32:30 +0800 "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > The size of once dma-buf allocation could be dozens MB or much more > which introduce a loop of allocating several thousands of order-0 pages. > Furthermore, the concurrent allocation could have dma-buf allocation enter > direct-reclaim during the loop. This commit would like to eliminate the > above two affections by introducing alloc_pages_bulk_list in dma-buf's > order-0 allocation. This patch is proved to be conditionally helpful > in 18MB allocation as decreasing the time from 24604us to 6555us and no > harm when bulk allocation can't be done(fallback to single page > allocation) > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > --- > drivers/dma-buf/heaps/system_heap.c | 36 +++++++++++++++++++---------- > 1 file changed, 24 insertions(+), 12 deletions(-) > > diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c > index bbe7881f1360..71b028c63bd8 100644 > --- a/drivers/dma-buf/heaps/system_heap.c > +++ b/drivers/dma-buf/heaps/system_heap.c > @@ -300,8 +300,8 @@ static const struct dma_buf_ops system_heap_buf_ops = { > .release = system_heap_dma_buf_release, > }; > > -static struct page *alloc_largest_available(unsigned long size, > - unsigned int max_order) > +static void alloc_largest_available(unsigned long size, > + unsigned int max_order, unsigned int *num_pages, struct list_head *list) This interface feels weird. Maybe you could return the number of pages instead of making this a void function and passing a pointer to get that number? > { > struct page *page; > int i; > @@ -312,12 +312,19 @@ static struct page *alloc_largest_available(unsigned long size, > if (max_order < orders[i]) > continue; > > - page = alloc_pages(order_flags[i], orders[i]); > - if (!page) > + if (orders[i]) { > + page = alloc_pages(order_flags[i], orders[i]); nitpick: Since the lowest order is special-cased now, you can simply use HIGH_ORDER_GFP here and remove order_flags[] entirely. 
> + if (page) { > + list_add(&page->lru, list); > + *num_pages = 1; > + } > + } else > + *num_pages = alloc_pages_bulk_list(LOW_ORDER_GFP, size / PAGE_SIZE, list); > + > + if (list_empty(list)) > continue; > - return page; > + return; > } > - return NULL; > } > > static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > @@ -335,6 +342,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > struct list_head pages; > struct page *page, *tmp_page; > int i, ret = -ENOMEM; > + unsigned int num_pages; > + LIST_HEAD(head); > > buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); > if (!buffer) > @@ -348,6 +357,8 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > INIT_LIST_HEAD(&pages); > i = 0; > while (size_remaining > 0) { > + num_pages = 0; > + INIT_LIST_HEAD(&head); > /* > * Avoid trying to allocate memory if the process > * has been killed by SIGKILL > @@ -357,14 +368,15 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > goto free_buffer; > } > > - page = alloc_largest_available(size_remaining, max_order); > - if (!page) > + alloc_largest_available(size_remaining, max_order, &num_pages, &head); > + if (!num_pages) > goto free_buffer; > > - list_add_tail(&page->lru, &pages); > - size_remaining -= page_size(page); > - max_order = compound_order(page); > - i++; > + list_splice_tail(&head, &pages); > + max_order = folio_order(lru_to_folio(&head)); > + size_remaining -= PAGE_SIZE * (num_pages << max_order); This looks complicated. What about changing alloc_largest_available() to return the total number of pages and using PAGE_SIZE * num_page? Ah, you still have to look at the folio order to determine the new value of max_order, so no big win. Hm. You could pass a pointer to max_order down to alloc_largest_available(), but at that point I think it's a matter of taste (aka bikeshedding). Petr T > + i += num_pages; > + > } > > table = &buffer->sg_table; ^ permalink raw reply [flat|nested] 19+ messages in thread
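One possible reading of the two review comments above, combined into a single sketch (this paraphrases the reviewer's suggestion and is not code from the series; the names and flags are taken from the quoted driver code): return the number of pages added to the list instead of reporting it through a pointer, and use HIGH_ORDER_GFP directly for the non-zero orders.

static unsigned int alloc_largest_available(unsigned long size,
                                            unsigned int max_order,
                                            struct list_head *list)
{
        unsigned int i;

        for (i = 0; i < NUM_ORDERS; i++) {
                struct page *page;

                if (size < (PAGE_SIZE << orders[i]) || max_order < orders[i])
                        continue;

                if (!orders[i])
                        return alloc_pages_bulk_list(LOW_ORDER_GFP,
                                                     size / PAGE_SIZE, list);

                page = alloc_pages(HIGH_ORDER_GFP, orders[i]);
                if (!page)
                        continue;
                list_add(&page->lru, list);
                return 1;
        }
        return 0;
}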
* Re: [PATCH 0/2] optimization of dma-buf system_heap allocation 2025-10-14 8:32 [PATCH 0/2] optimization of dma-buf system_heap allocation zhaoyang.huang 2025-10-14 8:32 ` [PATCH 1/2] mm: call back alloc_pages_bulk_list since it is useful zhaoyang.huang 2025-10-14 8:32 ` [PATCH 2/2] driver: dma-buf: use alloc_pages_bulk_list for order-0 allocation zhaoyang.huang @ 2025-10-14 14:06 ` Matthew Wilcox 2 siblings, 0 replies; 19+ messages in thread From: Matthew Wilcox @ 2025-10-14 14:06 UTC (permalink / raw) To: zhaoyang.huang Cc: Andrew Morton, David Hildenbrand, Mel Gorman, Vlastimil Babka, Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier, Christian König, linux-media, dri-devel, linaro-mm-sig, linux-mm, linux-kernel, Zhaoyang Huang, steve.kang On Tue, Oct 14, 2025 at 04:32:28PM +0800, zhaoyang.huang wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > This series of patches would like to introduce alloc_pages_bulk_list in > dma-buf which need to call back the API for page allocation. Start with the problem you're trying to solve. ^ permalink raw reply [flat|nested] 19+ messages in thread