linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Uladzislau Rezki <urezki@gmail.com>
Subject: Re: [PATCH] lib/test_vmalloc.c: avoid garbage in page array
Date: Fri, 26 May 2023 16:56:49 +0800	[thread overview]
Message-ID: <ZHB0UTEYUMZVa23V@MiWiFi-R3L-srv> (raw)
In-Reply-To: <b87ff2af-c89d-4ddd-8992-2ffb337fbe0c@lucifer.local>

On 05/26/23 at 08:13am, Lorenzo Stoakes wrote:
> On Fri, May 26, 2023 at 08:08:33AM +0800, Baoquan He wrote:
> > On 05/24/23 at 09:24am, Lorenzo Stoakes wrote:
> > > It turns out that alloc_pages_bulk_array() does not treat the page_array
> > > parameter as an output parameter, but rather reads the array and skips any
> > > entries that have already been allocated.
> >
> > I read __alloc_pages_bulk() carefully, it does store the allocated page
> > pointers into page_array[] and pass out, not just reads the array and
> > skips entry alreay allocated.
> 
> Umm, the function literally opens with:-
> 
> 	/*
> 	 * Skip populated array elements to determine if any pages need
> 	 * to be allocated before disabling IRQs.
> 	 */
> 	while (page_array && nr_populated < nr_pages && page_array[nr_populated])
> 		nr_populated++;

OK, suppose page_array[] alreasy has three pages populated, if not
initialized and there's garbage data in page_array[], it could have
nr_populated > 3 finally? This is really risky.

> 
> And then later:-
> 
> 		/* Skip existing pages */
> 		if (page_array && page_array[nr_populated]) {
> 			nr_populated++;
> 			continue;
> 		}

This is interesting, I thought this place of nr_populated checking and
updating is meaningless, in fact it's skipping the element with vlaue
in the middle of page_array. I realize this when I recheck the code when
replying to your mail. Not sure if we should restrict that, or it's
really existing reasonablly.

[x][x][x][][][][x][x][][]
x marks the element pointing to page.

> 
> This explicitly skips populated array entries and reads page_array to see
> if entries already exist, and literally documents this in the comments
> above each line, exactly as I describe.

OK, I misread your words in log. While page_array[] is still output
parameter, just not pure output parameter? Not sure if I understand
output parameter correctly.

> 
> >
> > For the issue this patch is trying to fix, you mean __alloc_pages_bulk()
> > doesn't initialize page_array intentionally if it doesn't successfully
> > allocate desired number of pages. we may need add one sentence to notice
> > user that page_array need be initialized explicitly.
> 
> It isn't 'trying' to fix it, it fixes it. I have this reproing locally.

Right, my wrong expression.

> 
> What you're stating about 'successfully allocate desired number of pages'
> is irrelevant, we literally check the number of allocated pages in the
> caller.
> 
> No sentences need to be added, I explicitly state that the issue is due to
> the array being uninitialised, the summary lines talks about reading
> garbage.

Well, I meant adding sentence above __alloc_pages_bulk() to tell:
page_array[] could have garbage data stored if you don't initialize
it explicitly before calling __alloc_pages_bulk();

This could happen in other place if they don't use kcalloc(),
kmalloc(GFP_ZERO) or something like this to allocate page_array[]?

> 
> >
> > By the way, could you please tell in which line the test was referencing
> > uninitialized data and causing the PFN to not be valid and trigger the
> > WANR_ON? Please forgive my dumb head.
> 
> Well, I showed you the lines above where __alloc_bulk_array() is accessing
> uninitialised data by reading page_array[].

I see now, thanks for these details.

> 
> But ultimately this is called from vm_map_ram_test() in lib/test_vmalloc.c:-
> 
> 	for (i = 0; i < test_loop_count; i++) {
> 		v_ptr = vm_map_ram(pages, map_nr_pages, NUMA_NO_NODE);
> ^--- triggers warning because we can't map the invalid PFN
> 		*v_ptr = 'a';
> ^--- NULL pointer deref
> 		vm_unmap_ram(v_ptr, map_nr_pages);
> 	}
> 
> The warning is triggered in:-
> 
> vm_map_ram()
> vmap_pages_range()
> vmap_pages_range_noflush()
> __vmap_pages_range_noflush()
> vmap_pages_p4d_range()
> vmap_pages_pud_range()
> vmap_pages_pmd_range()
> vmap_pages_pte_range()
> 
> In:-
> 
> 		if (WARN_ON(!pfn_valid(page_to_pfn(page))))
> 			return -EINVAL;
> 
> The PFN is invalid because I happen to have garbage in an array entry such
> that page_to_pfn(garbage) >= max_pfn.
> 
> > >
> > > This is somewhat unexpected and breaks this test, as we allocate the pages
> > > array uninitialised on the assumption it will be overwritten.
> > >
> > > As a result, the test was referencing uninitialised data and causing the
> > > PFN to not be valid and thus a WARN_ON() followed by a null pointer deref
> > > and panic.
> > >
> > > In addition, this is an array of pointers not of struct page objects, so we
> > > need only allocate an array with elements of pointer size.
> > >
> > > We solve both problems by simply using kcalloc() and referencing
> > > sizeof(struct page *) rather than sizeof(struct page).
> > >
> > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > > ---
> > >  lib/test_vmalloc.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
> > > index 9dd9745d365f..3718d9886407 100644
> > > --- a/lib/test_vmalloc.c
> > > +++ b/lib/test_vmalloc.c
> > > @@ -369,7 +369,7 @@ vm_map_ram_test(void)
> > >  	int i;
> > >
> > >  	map_nr_pages = nr_pages > 0 ? nr_pages:1;
> > > -	pages = kmalloc(map_nr_pages * sizeof(struct page), GFP_KERNEL);
> > > +	pages = kcalloc(map_nr_pages, sizeof(struct page *), GFP_KERNEL);
> > >  	if (!pages)
> > >  		return -1;
> > >
> > > --
> > > 2.40.1
> > >
> >
> 
> A broader problem we might want to think about is how little anybody is
> running this test in order that it wasn't picked up before now... obviously
> there's an element of luck as to whether the page_array happens to be
> zeroed or not, but you'd think it'd be garbage filled at least a reasonable
> amount of the time.

Hmm, that's why we may need notice people that there's risk in
__alloc_pages_bulk() if page_array[] is not initialized and the garbage
could be mistaken as a effective page pointer. My personal opinion.
People may argue it's caller's responsibility to do that.

Thanks
Baoquan



  reply	other threads:[~2023-05-26  8:57 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-24  8:24 Lorenzo Stoakes
2023-05-25 20:04 ` Uladzislau Rezki
2023-05-26  0:08 ` Baoquan He
2023-05-26  7:13   ` Lorenzo Stoakes
2023-05-26  8:56     ` Baoquan He [this message]
2023-05-26  9:10       ` Lorenzo Stoakes
2023-05-27 10:11         ` Baoquan He
2023-05-27 22:04           ` Lorenzo Stoakes
2023-05-27 10:13 ` Baoquan He

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZHB0UTEYUMZVa23V@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox