From: Shivam Kalra <shivamkalra98@zohomail.in>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Alice Ryhl <aliceryhl@google.com>,
Danilo Krummrich <dakr@kernel.org>
Subject: Re: [PATCH v4 2/3] mm/vmalloc: free unused pages on vrealloc() shrink
Date: Tue, 17 Mar 2026 02:53:12 +0530
Message-ID: <19a1f1e5-76b4-4c0a-bebc-2eb048ad2fe2@zohomail.in>
In-Reply-To: <abg6BG0MT6sKy-FT@milan>

On 16/03/26 22:42, Uladzislau Rezki wrote:
> On Sat, Mar 14, 2026 at 02:34:14PM +0530, Shivam Kalra via B4 Relay wrote:
>> From: Shivam Kalra <shivamkalra98@zohomail.in>
>>
>> When vrealloc() shrinks an allocation and the new size crosses a page
>> boundary, unmap and free the tail pages that are no longer needed. This
>> reclaims physical memory that was previously wasted for the lifetime
>> of the allocation.
>>
>> The heuristic is simple: always free when at least one full page becomes
>> unused. Huge page allocations (page_order > 0) are skipped, as partial
>> freeing would require splitting.
>>
>> The virtual address reservation (vm->size / vmap_area) is intentionally
>> kept unchanged, preserving the address for potential future grow-in-place
>> support.
>>
>> Fix the grow-in-place check to compare against vm->nr_pages rather than
>> get_vm_area_size(), since the latter reflects the virtual reservation
>> which does not shrink. Without this fix, a grow after shrink would
>> access freed pages.
>>
>> Signed-off-by: Shivam Kalra <shivamkalra98@zohomail.in>
>> ---
>> mm/vmalloc.c | 19 ++++++++++++++-----
>> 1 file changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index b29bf58c0e3f..2c455f2038f6 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -4345,14 +4345,23 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
>>  			goto need_realloc;
>>  	}
>>
>> -	/*
>> -	 * TODO: Shrink the vm_area, i.e. unmap and free unused pages. What
>> -	 * would be a good heuristic for when to shrink the vm_area?
>> -	 */
>>  	if (size <= old_size) {
>> +		unsigned int new_nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +
>>  		/* Zero out "freed" memory, potentially for future realloc. */
>>  		if (want_init_on_free() || want_init_on_alloc(flags))
>>  			memset((void *)p + size, 0, old_size - size);
>> +
>> +		/* Free tail pages when shrink crosses a page boundary. */
>> +		if (new_nr_pages < vm->nr_pages && !vm_area_page_order(vm)) {
>> +			unsigned long addr = (unsigned long)p;
>> +
>> +			vunmap_range(addr + (new_nr_pages << PAGE_SHIFT),
>> +				     addr + (vm->nr_pages << PAGE_SHIFT));
>> +
>> +			vm_area_free_pages(vm, new_nr_pages, vm->nr_pages);
>> +			vm->nr_pages = new_nr_pages;
>> +		}
>>  		vm->requested_size = size;
>>  		kasan_vrealloc(p, old_size, size);
>>  		return (void *)p;
>> @@ -4361,7 +4370,7 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
>>  	/*
>>  	 * We already have the bytes available in the allocation; use them.
>>  	 */
>> -	if (size <= alloced_size) {
>> +	if (size <= (size_t)vm->nr_pages << PAGE_SHIFT) {
>>  		/*
>>  		 * No need to zero memory here, as unused memory will have
>>  		 * already been zeroed at initial allocation time or during
>>
>> --
>> 2.43.0
>>
>>
> Do we perform vm_reset_perms(vm) for tail pages? As i see you update the
> vm->nr_pages when shrinking. Then on vfree() we have:
>
> <snip>
> /*
>  * Flush the vm mapping and reset the direct map.
>  */
> static void vm_reset_perms(struct vm_struct *area)
> {
> 	unsigned long start = ULONG_MAX, end = 0;
> 	unsigned int page_order = vm_area_page_order(area);
> 	int flush_dmap = 0;
> 	int i;
>
> 	/*
> 	 * Find the start and end range of the direct mappings to make sure that
> 	 * the vm_unmap_aliases() flush includes the direct map.
> 	 */
> 	for (i = 0; i < area->nr_pages; i += 1U << page_order) {
> 	...
> <snip>
>
> i.e. tail pages go back to the page allocator without resetting permission.
>
> --
> Uladzislau Rezki

Hi Uladzislau,

Good catch, thank you for spotting this. You are right: the shrink path
currently returns the tail pages to the page allocator without resetting
their direct-map permissions when VM_FLUSH_RESET_PERMS is set. Because
vm->nr_pages is lowered at shrink time, the vm_reset_perms() walk on
vfree() never reaches those pages.

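To make the gap concrete, here is a small userspace sketch of the
page-count arithmetic used by the shrink path in the patch. It is only a
model: struct vm_model and model_shrink() are hypothetical names, the
4 KiB page size is an assumption, and nothing is actually unmapped or
freed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT depends on the architecture. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Hypothetical stand-in for the struct vm_struct fields the patch reads or writes. */
struct vm_model {
	unsigned int nr_pages;		/* pages currently backing the area */
	unsigned int page_order;	/* non-zero models a huge-page area */
	size_t requested_size;
};

/*
 * Model of the shrink branch: returns how many tail pages would leave the
 * area. After this call, any walk bounded by vm->nr_pages no longer visits
 * the pages in [new_nr_pages, old nr_pages).
 */
static unsigned int model_shrink(struct vm_model *vm, size_t size)
{
	unsigned int new_nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
	unsigned int freed = 0;

	/* A shrink never asks for more than the area currently backs. */
	assert(size <= (size_t)vm->nr_pages << PAGE_SHIFT);

	/* Same condition as the patch: page boundary crossed, no huge pages. */
	if (new_nr_pages < vm->nr_pages && !vm->page_order) {
		freed = vm->nr_pages - new_nr_pages;
		vm->nr_pages = new_nr_pages;
	}
	vm->requested_size = size;
	return freed;
}
```

With these assumptions, shrinking a 5-page area to 2 * PAGE_SIZE + 1
bytes keeps 3 pages and releases 2. Those 2 pages are the ones a later
nr_pages-bounded loop, such as the one quoted above, would skip.
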
While my specific use case doesn't use VM_FLUSH_RESET_PERMS, vrealloc()
is a generic API and needs to handle all vmalloc flags safely.

I will fix this in the next version (v5). I plan to add a helper
function that performs the permission reset for just the range of tail
pages being freed during the shrink.

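As a rough illustration only, and not the v5 code, the range computation
such a helper might need can be modeled in userspace C. The sketch
mirrors just the part of vm_reset_perms() visible in the quoted snippet
(finding the start and end of the direct-map range), restricted to the
tail-page indices. tail_dmap_range(), struct dmap_range and the
dmap_addr array are hypothetical; the kernel works with struct page
pointers, and treating an address of 0 as "no direct mapping" is an
assumption of this model.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT depends on the architecture. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Hypothetical result type: the direct-map span that would need flushing. */
struct dmap_range {
	unsigned long start;
	unsigned long end;
	bool flush;		/* true if at least one page had a mapping */
};

/*
 * Compute the smallest [start, end) span covering the direct-map addresses
 * of pages with index in [first, last). dmap_addr[i] models the direct-map
 * address of backing page i; 0 means "none" in this model.
 */
static struct dmap_range tail_dmap_range(const unsigned long *dmap_addr,
					 unsigned int first, unsigned int last)
{
	struct dmap_range r = { .start = ULONG_MAX, .end = 0, .flush = false };
	unsigned int i;

	assert(first <= last);

	for (i = first; i < last; i++) {
		unsigned long addr = dmap_addr[i];

		if (!addr)
			continue;
		if (addr < r.start)
			r.start = addr;
		if (addr + PAGE_SIZE > r.end)
			r.end = addr + PAGE_SIZE;
		r.flush = true;
	}
	return r;
}
```

In the shrink path, first and last would correspond to new_nr_pages and
the old vm->nr_pages, so the computation has to run before vm->nr_pages
is updated.
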
Thanks,
Shivam