On 2024/8/27 23:30, Alexander Duyck wrote:
> On Tue, Aug 27, 2024 at 5:07 AM Yunsheng Lin wrote:
>>
>> On 2024/8/27 1:00, Alexander Duyck wrote:
>>> On Mon, Aug 26, 2024 at 5:46 AM Yunsheng Lin wrote:
>>>>
>>>> There seems to be about a 24-byte binary size increase for
>>>> __page_frag_cache_refill() after the refactoring on an arm64
>>>> system with 64K PAGE_SIZE. Judging by the gdb disassembly, we
>>>> can get a decrease of more than 100 bytes in binary size by
>>>> using __alloc_pages() to replace alloc_pages_node(), as there
>>>> seems to be some unnecessary checking for nid being
>>>> NUMA_NO_NODE, especially when page_frag is part of the mm
>>>> system.
>>>>
>>>> CC: Alexander Duyck
>>>> Signed-off-by: Yunsheng Lin
>>>> ---
>>>>  mm/page_frag_cache.c | 6 +++---
>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
>>>> index bba59c87d478..e0ad3de11249 100644
>>>> --- a/mm/page_frag_cache.c
>>>> +++ b/mm/page_frag_cache.c
>>>> @@ -28,11 +28,11 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>>>>  #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>>>>         gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) | __GFP_COMP |
>>>>                    __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
>>>> -       page = alloc_pages_node(NUMA_NO_NODE, gfp_mask,
>>>> -                               PAGE_FRAG_CACHE_MAX_ORDER);
>>>> +       page = __alloc_pages(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER,
>>>> +                            numa_mem_id(), NULL);
>>>>  #endif
>>>>         if (unlikely(!page)) {
>>>> -               page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
>>>> +               page = __alloc_pages(gfp, 0, numa_mem_id(), NULL);
>>>>                 if (unlikely(!page)) {
>>>>                         nc->encoded_page = 0;
>>>>                         return NULL;
>>>
>>> I still think this would be better served by fixing alloc_pages_node
>>> to drop the superfluous checks rather than changing the function. We
>>> would get more gain by just addressing the builtin constant and
>>> NUMA_NO_NODE case there.
>>
>> I am assuming that by 'just addressing the builtin constant and
>> NUMA_NO_NODE case' you mean the change below, from the previous
>> discussion:
>>
>> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
>> index 01a49be7c98d..009ffb50d8cd 100644
>> --- a/include/linux/gfp.h
>> +++ b/include/linux/gfp.h
>> @@ -290,6 +290,9 @@ struct folio *__folio_alloc_node_noprof(gfp_t gfp, unsigned int order, int nid)
>>  static inline struct page *alloc_pages_node_noprof(int nid, gfp_t gfp_mask,
>>                                                     unsigned int order)
>>  {
>> +       if (__builtin_constant_p(nid) && nid == NUMA_NO_NODE)
>> +               return __alloc_pages_noprof(gfp_mask, order, numa_mem_id(), NULL);
>> +
>>         if (nid == NUMA_NO_NODE)
>>                 nid = numa_mem_id();
>>
>> Actually, it does not seem to bring more gain, judging by the binary
>> size change below: vmlinux.org is the image after this patchset, and
>> vmlinux is the image after this patchset with this patch reverted and
>> the above change applied.
>>
>> [linyunsheng@localhost net-next]$ ./scripts/bloat-o-meter vmlinux.org vmlinux
>> add/remove: 0/2 grow/shrink: 16/12 up/down: 432/-340 (92)
>> Function                                     old     new   delta
>> new_slab                                     808    1124    +316
>> its_probe_one                               2860    2908     +48
>
> ...
>
>> alloc_slab_page                              284       -    -284
>> Total: Before=30534822, After=30534914, chg +0.00%
>
> Well, considering that alloc_slab_page was marked "inline" as per the
> qualifier applied to it, I would say the shrinking had an effect, but
> it was just enough to enable the "inline" qualifier to kick in. It
> could be argued that the change exposed another issue, in that the
> alloc_slab_page function is probably large enough that it should just
> be "static" and not "static inline". If you can provide your config I
> could probably look into this further, but I suspect just dropping the
> inline for that one function should result in a net saving.

The config is for the arm64 system; please see the attachment.
>
> The only other big change I see is in its_probe_one, which I am not
> sure why it would be impacted, since it is not passing a constant in
> the first place; it is passing its->numa_node. I'd be curious what the
> disassembly shows in terms of the change that caused it to increase in
> size.
>
> Otherwise, the rest of the size changes seem more like code shifts
> than anything else, likely due to the functions moving around slightly
> as a few of them drop in size.