[Fwd: Page allocator doubt]


* [Fwd: Page allocator doubt]
@ 2004-11-11 14:37 Luciano A. Stertz
  2004-11-11 19:10 ` Dave Hansen
  0 siblings, 1 reply; 8+ messages in thread
From: Luciano A. Stertz @ 2004-11-11 14:37 UTC (permalink / raw)
  To: linux-mm

	Could someone help me? I tried kernelnewbies, but got no answers.

	TIA,
	Luciano Stertz

-------- Original Message --------
Subject: Page allocator doubt
Date: Wed, 10 Nov 2004 15:31:41 -0200
From: Luciano A. Stertz <luciano@tteng.com.br>
To: KERNEL <kernelnewbies@nl.linux.org>


	I have a question about page allocation. I need to allocate a number
of contiguous pages, then initialize them and add them to the page cache.
I'm doing something like this:

	struct address_space *x = &area->vm_file->f_dentry->d_inode->i_data;
	struct page *page = alloc_pages(mapping_gfp_mask(x)|__GFP_COLD, order);

	for (i = 0; i < (1 << order); i++) {
		struct page *pg = page + i;
		printk("Page count of page %i is %i\n", i, page_count(pg));
		/* page_address() returns void *, so cast it to match %lx */
		printk("Page address: 0x%lx\n", (unsigned long)page_address(pg));
	}

	To my surprise, for order = 4 I got the following output:

	Page count of page 0 is 1
	Page address: 0xe00000003c1e0000
	Page count of page 1 is 0
	Page address: 0xe00000003c1e4000
	Page count of page 2 is 0
	Page address: 0xe00000003c1e8000
	Page count of page 3 is 0
	Page address: 0xe00000003c1ec000

	Only the first page got its page counter incremented. Is this expected?
As far as I understand, if page_count is 0 the page is free.
	Looking at page_alloc, I found set_page_refs (below), and it really
does set the page count only for the first page on machines with an MMU.
	I'm confused... are these pages really allocated to me?

	I'm running kernel 2.6.8-rc3 on an IPF machine.

	Thanks in advance for any help!

	Luciano Stertz


static inline void set_page_refs(struct page *page, int order)
{
#ifdef CONFIG_MMU
     set_page_count(page, 1);
#else
     int i;

     /*
      * We need to reference all the pages for this order, otherwise if
      * anyone accesses one of the pages with (get/put) it will be freed.
      */
     for (i = 0; i < (1 << order); i++)
         set_page_count(page+i, 1);
#endif /* CONFIG_MMU */
}



-- 
Luciano A. Stertz
luciano@tteng.com.br
T&T Engenheiros Associados Ltda
http://www.tteng.com.br
Fone/Fax (51) 3224 8425

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

* Re: [Fwd: Page allocator doubt]
  2004-11-11 14:37 [Fwd: Page allocator doubt] Luciano A. Stertz
@ 2004-11-11 19:10 ` Dave Hansen
  2004-11-11 19:27   ` Luciano A. Stertz
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Hansen @ 2004-11-11 19:10 UTC (permalink / raw)
  To: Luciano A. Stertz; +Cc: linux-mm

On Thu, 2004-11-11 at 06:37, Luciano A. Stertz wrote:
> Only the first page got its page counter incremented. Is this expected?

Yes.

-- Dave

* Re: [Fwd: Page allocator doubt]
  2004-11-11 19:10 ` Dave Hansen
@ 2004-11-11 19:27   ` Luciano A. Stertz
  2004-11-11 19:36     ` Dave Hansen
  0 siblings, 1 reply; 8+ messages in thread
From: Luciano A. Stertz @ 2004-11-11 19:27 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-mm

Dave Hansen wrote:
> On Thu, 2004-11-11 at 06:37, Luciano A. Stertz wrote:
> 
>>Only the first page got its page counter incremented. Is this expected?
> 
> 
> Yes.
	But... are they allocated to me, even with page_count zeroed? Do I need
to do get_page on them? Sorry if this is a lame question, but I
still don't understand and have found no place to read about this.

	Luciano Stertz

* Re: [Fwd: Page allocator doubt]
  2004-11-11 19:27   ` Luciano A. Stertz
@ 2004-11-11 19:36     ` Dave Hansen
  2004-11-11 20:22       ` Luciano A. Stertz
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Hansen @ 2004-11-11 19:36 UTC (permalink / raw)
  To: Luciano A. Stertz; +Cc: linux-mm

On Thu, 2004-11-11 at 11:27, Luciano A. Stertz wrote:
> 	But... are they allocated to me, even with page_count zeroed? Do I need
> to do get_page on them? Sorry if this is a lame question, but I
> still don't understand and have found no place to read about this.

Do you see anywhere in the page allocator where it does a loop like
yours?

	for (i = 1; i < (1 << order); i++)
		get_page(page + i);

When you do a multi-order allocation, the first page represents the
whole group and they're treated as a whole.  As you've noticed, breaking
them up requires a little work.
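
The point above can be modelled in plain user-space C. This is only a
sketch, not kernel code: struct toy_page, toy_alloc_pages() and toy_split()
are hypothetical names invented for illustration, and the counter merely
stands in for what page_count() reports. It shows that a multi-page
allocation leaves a reference on the first page only, and that breaking the
group up means giving each remaining page a reference of its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the kernel's struct page: only a reference count. */
struct toy_page {
	int count;
};

/*
 * Model of a multi-page allocation on an MMU kernel: 2^order pages are
 * handed out as one group and only the first page carries a reference.
 */
static struct toy_page *toy_alloc_pages(unsigned int order)
{
	struct toy_page *pages = calloc((size_t)1 << order, sizeof(*pages));

	if (pages)
		pages[0].count = 1;
	return pages;
}

/*
 * Model of breaking the group up: every page after the first gets its own
 * reference so that each one can be tracked and released independently.
 */
static void toy_split(struct toy_page *pages, unsigned int order)
{
	unsigned int i;

	assert(pages != NULL);
	for (i = 1; i < (1u << order); i++)
		pages[i].count = 1;
}
```

With toy_alloc_pages(2) the counts are 1, 0, 0, 0, the same pattern as the
printk output earlier in the thread. After toy_split() every page holds a
count of 1.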

Why don't you post all of the code that you're using so that we can tell
what you're doing?  There might be a better way.  Drivers probably
shouldn't be putting stuff in the page cache all by themselves.  

-- Dave

* Re: [Fwd: Page allocator doubt]
  2004-11-11 19:36     ` Dave Hansen
@ 2004-11-11 20:22       ` Luciano A. Stertz
  2004-11-11 20:34         ` Dave Hansen
  2004-11-11 21:21         ` Marcelo Tosatti
  0 siblings, 2 replies; 8+ messages in thread
From: Luciano A. Stertz @ 2004-11-11 20:22 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-mm

Dave Hansen wrote:
> On Thu, 2004-11-11 at 11:27, Luciano A. Stertz wrote:
> 
>>	But... are they allocated to me, even with page_count zeroed? Do I need
>>to do get_page on them? Sorry if this is a lame question, but I
>>still don't understand and have found no place to read about this.
> 
> 
> Do you see anywhere in the page allocator where it does a loop like
> yours?
> 
> 	for (i = 1; i < (1 << order); i++)
> 		get_page(page + i);
	Actually this loop isn't mine. It's part of the page allocator, but
it's only executed on systems without an MMU.

> When you do a multi-order allocation, the first page represents the
> whole group and they're treated as a whole.  As you've noticed, breaking
> them up requires a little work.
> 
> Why don't you post all of the code that you're using so that we can tell
> what you're doing?  There might be a better way.  Drivers probably
> shouldn't be putting stuff in the page cache all by themselves.  
	Unfortunately I can't post any code yet, but I'll try to give an idea of
what we're trying to do.
	It's not a driver. We're working on an implementation that allows the
kernel to execute compressed files, decompressing pages on demand.
	These files will usually be compressed in small blocks, typically 4 KB.
But if they are compressed in blocks bigger than a page (say 8 KB blocks
on a 4 KB page system), the kernel will have more than one decompressed
page each time a block has to be decompressed, and I'd like to add all of
them to the page cache.
	So it seems I would have to break up multi-order allocated pages. Is this
possible / viable? If not, maybe I'll have to work only with small
blocks, but I'd rather not.

* Re: [Fwd: Page allocator doubt]
  2004-11-11 20:22       ` Luciano A. Stertz
@ 2004-11-11 20:34         ` Dave Hansen
  2004-11-11 21:21         ` Marcelo Tosatti
  1 sibling, 0 replies; 8+ messages in thread
From: Dave Hansen @ 2004-11-11 20:34 UTC (permalink / raw)
  To: Luciano A. Stertz; +Cc: linux-mm

On Thu, 2004-11-11 at 12:22, Luciano A. Stertz wrote:
> Dave Hansen wrote:
> > On Thu, 2004-11-11 at 11:27, Luciano A. Stertz wrote:
> > 
> >>	But... are they allocated to me, even with page_count zeroed? Do I need
> >>to do get_page on them? Sorry if this is a lame question, but I
> >>still don't understand and have found no place to read about this.
> > 
> > 
> > Do you see anywhere in the page allocator where it does a loop like
> > yours?
> > 
> > 	for (i = 1; i < (1 << order); i++)
> > 		get_page(page + i);
> 	Actually this loop isn't mine. It's part of the page allocator, but
> it's only executed on systems without an MMU.

Well, what does that tell you?  How can page_count(page + i) be non-zero
unless someone goes and sets it like that for pages other than the first
(0th) one?

> 	Unfortunately I can't post any code yet, but I'll try to give an idea of
> what we're trying to do.
> 	It's not a driver. We're working on an implementation that allows the
> kernel to execute compressed files, decompressing pages on demand.
> 	These files will usually be compressed in small blocks, typically 4 KB.
> But if they are compressed in blocks bigger than a page (say 8 KB blocks
> on a 4 KB page system), the kernel will have more than one decompressed
> page each time a block has to be decompressed, and I'd like to add all of
> them to the page cache.

Why do 2 *uncompressed* blocks have to be physically contiguous?  If
you're decompressing and you need more than one page, just allocate
another one.  I understand that your algorithms may not be optimized for
this right now, but that's what you get for doing it in the kernel. :)

> 	So it seems I would have to break up multi-order allocated pages. Is this
> possible / viable? If not, maybe I'll have to work only with small
> blocks, but I'd rather not.

It's possible, but you shouldn't do it.  Multi-order pages are a very
valuable commodity and should be reserved for things that *ABSOLUTELY*
need them, like DMA buffers.  If you ever get a system up for a while
and under a lot of memory pressure, those non-order-zero allocations are
going to start failing all over the place.

Make your code handle all order-0 pages now.  You'll need to do it
eventually.  
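
As a rough illustration of working with single pages only, here is a
user-space C sketch, not kernel code. All names (toy_alloc_page_array(),
toy_scatter_block(), toy_free_page_array()) are hypothetical, the 4096-byte
TOY_PAGE_SIZE is an assumption, and malloc() merely stands in for whatever
order-0 allocation the real code would use. It shows one decompressed block
being spread over several independent page-sized buffers that need not be
adjacent in memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Allocate npages independent page-sized buffers, all or nothing.
 * Returns 0 on success, or -1 after releasing any partial allocation.
 */
static int toy_alloc_page_array(unsigned char **pages, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		pages[i] = malloc(TOY_PAGE_SIZE);
		if (!pages[i]) {
			while (i > 0)
				free(pages[--i]);
			return -1;
		}
	}
	return 0;
}

/*
 * Copy a decompressed block of len bytes into the separate pages, one
 * page-sized chunk at a time. The pages do not have to be contiguous.
 */
static void toy_scatter_block(unsigned char **pages, size_t npages,
			      const unsigned char *block, size_t len)
{
	size_t i, off = 0;

	assert(len <= npages * TOY_PAGE_SIZE);
	for (i = 0; off < len; i++) {
		size_t chunk = len - off < TOY_PAGE_SIZE ? len - off : TOY_PAGE_SIZE;

		memcpy(pages[i], block + off, chunk);
		off += chunk;
	}
}

/* Release every buffer obtained from toy_alloc_page_array(). */
static void toy_free_page_array(unsigned char **pages, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++)
		free(pages[i]);
}
```

For an 8 KB block on a 4 KB page system, npages would be 2: the first
buffer receives bytes 0..4095 and the second receives bytes 4096..8191.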

-- Dave


* Re: [Fwd: Page allocator doubt]
  2004-11-11 20:22       ` Luciano A. Stertz
  2004-11-11 20:34         ` Dave Hansen
@ 2004-11-11 21:21         ` Marcelo Tosatti
  2004-11-12 11:37           ` Luciano A. Stertz
  1 sibling, 1 reply; 8+ messages in thread
From: Marcelo Tosatti @ 2004-11-11 21:21 UTC (permalink / raw)
  To: Luciano A. Stertz; +Cc: Dave Hansen, linux-mm

On Thu, Nov 11, 2004 at 06:22:51PM -0200, Luciano A. Stertz wrote:
> Dave Hansen wrote:
> >On Thu, 2004-11-11 at 11:27, Luciano A. Stertz wrote:
> >
> >>	But... are they allocated to me, even with page_count zeroed? Do I
> >>need to do get_page on them? Sorry if this is a lame question, but I
> >>still don't understand and have found no place to read about this.
> >
> >
> >Do you see anywhere in the page allocator where it does a loop like
> >yours?
> >
> >	for (i = 1; i < (1 << order); i++)
> >		get_page(page + i);
> 	Actually this loop isn't mine. It's part of the page allocator, but
> it's only executed on systems without an MMU.
> 
> >When you do a multi-order allocation, the first page represents the
> >whole group and they're treated as a whole.  As you've noticed, breaking
> >them up requires a little work.
> >
> >Why don't you post all of the code that you're using so that we can tell
> >what you're doing?  There might be a better way.  Drivers probably
> >shouldn't be putting stuff in the page cache all by themselves.  
> 	Unfortunately I can't post any code yet, but I'll try to give an idea
> of what we're trying to do.
> 	It's not a driver. We're working on an implementation that allows the
> kernel to execute compressed files, decompressing pages on demand.
> 	These files will usually be compressed in small blocks, typically
> 4 KB. But if they are compressed in blocks bigger than a page (say 8 KB
> blocks on a 4 KB page system), the kernel will have more than one
> decompressed page each time a block has to be decompressed, and I'd like
> to add all of them to the page cache.
> 	So it seems I would have to break up multi-order allocated pages. Is
> this possible / viable? If not, maybe I'll have to work only with small
> blocks, but I'd rather not.

Why do you need the pages to be physically contiguous?

I don't see any reason for that requirement - you can use discontiguous physical
pages which are virtually contiguous (so your decompression code won't need to
care about non-adjacent pieces of memory).

* Re: [Fwd: Page allocator doubt]
  2004-11-11 21:21         ` Marcelo Tosatti
@ 2004-11-12 11:37           ` Luciano A. Stertz
  0 siblings, 0 replies; 8+ messages in thread
From: Luciano A. Stertz @ 2004-11-12 11:37 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Dave Hansen, linux-mm

Marcelo Tosatti wrote:
> On Thu, Nov 11, 2004 at 06:22:51PM -0200, Luciano A. Stertz wrote:
> 
>>Dave Hansen wrote:
>>
>>>On Thu, 2004-11-11 at 11:27, Luciano A. Stertz wrote:
>>>
>>>
>>>>	But... are they allocated to me, even with page_count zeroed? Do I
>>>>need to do get_page on them? Sorry if this is a lame question, but I
>>>>still don't understand and have found no place to read about this.
>>>
>>>
>>>Do you see anywhere in the page allocator where it does a loop like
>>>yours?
>>>
>>>	for (i = 1; i < (1 << order); i++)
>>>		get_page(page + i);
>>
>>	Actually this loop isn't mine. It's part of the page allocator, but
>>it's only executed on systems without an MMU.
>>
>>
>>>When you do a multi-order allocation, the first page represents the
>>>whole group and they're treated as a whole.  As you've noticed, breaking
>>>them up requires a little work.
>>>
>>>Why don't you post all of the code that you're using so that we can tell
>>>what you're doing?  There might be a better way.  Drivers probably
>>>shouldn't be putting stuff in the page cache all by themselves.  
>>
>>	Unfortunately I can't post any code yet, but I'll try to give an idea
>>of what we're trying to do.
>>	It's not a driver. We're working on an implementation that allows the
>>kernel to execute compressed files, decompressing pages on demand.
>>	These files will usually be compressed in small blocks, typically
>>4 KB. But if they are compressed in blocks bigger than a page (say 8 KB
>>blocks on a 4 KB page system), the kernel will have more than one
>>decompressed page each time a block has to be decompressed, and I'd like
>>to add all of them to the page cache.
>>	So it seems I would have to break up multi-order allocated pages. Is
>>this possible / viable? If not, maybe I'll have to work only with small
>>blocks, but I'd rather not.
> 
> 
> Why do you need the pages to be physically contiguous?
> 
> I don't see any reason for that requirement - you can use discontiguous physical
> pages which are virtually contiguous (so your decompression code won't need to
> care about non-adjacent pieces of memory).
> 

	Thanks Dave and Marcelo. You're obviously right. I'll do it.

	Luciano

-- 
Luciano A. Stertz
luciano@tteng.com.br
T&T Engenheiros Associados Ltda
http://www.tteng.com.br
Fone/Fax (51) 3224 8425