linux-mm.kvack.org archive mirror
* Re: kmap_kiobuf()
@ 2000-06-28 15:54 lord
  2000-06-28 16:06 ` kmap_kiobuf() David Woodhouse
  2000-06-28 17:46 ` kmap_kiobuf() Stephen C. Tweedie
  0 siblings, 2 replies; 18+ messages in thread
From: lord @ 2000-06-28 15:54 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-kernel, linux-mm, sct, riel

> I think it would be useful to provide a function which can be used to 
> obtain a virtually-contiguous VM mapping of the pages of an iobuf.
> 
> Currently, to access the pages of an iobuf, you have to kmap() each page
> individually. For various purposes, it would be useful to be able to kmap the
> whole iobuf contiguously, so that you can guarantee that:
> 
> 	page_address(iobuf->maplist[n]) + PAGE_SIZE 
> 		== page_address(iobuf->maplist[n+1])
> 
>     (for n such that n + 1 < iobuf->nr_pages, obviously. Don't be so pedantic.)
> 
> Rather than taking a kiobuf as an argument, the new function might as well 
> be more generic:
> 
> unsigned long kremap_pages(struct page **maplist, int nr_pages);
> void kunmap_pages(struct page **maplist, int nr_pages);
> 
> I had a quick look at the code for kmap() and vmalloc() and decided that 
> even if I attempted to do it myself, I'd probably bugger it up and a MM 
> hacker would have to fix it anyway. So I'm not going to bother.
> 
> 'Twould be useful if someone else could find the time to do so, though.
> 
> 
> --
> dwmw2
> 
> 


The XFS port currently has exactly this beast: an extension that lets us
pass an existing set of pages into the vmalloc_area_pages function, which
then uses those pages instead of allocating new ones. We needed something
to let us map groups of pages into a single byte array.


I always knew it would go down like a ton of bricks, because of the TLB
flushing costs. As soon as you have a multi-cpu box this operation gets
expensive. The code could be changed to do lazy TLB flushes on unmapping
the pages, but you still have the cost every time you set a mapping up.
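
To make the flush-batching idea concrete, here is a toy user-space model
in C. It is only a sketch under stated assumptions, not kernel code: the
names (map_window, window_map, window_unmap, count_flushes) and the
8-slot window are invented for illustration, and a "flush" is just a
counter increment. It models only the unmap side. The cost of setting a
mapping up is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8   /* size of the toy mapping window (assumption) */

enum slot_state { SLOT_FREE, SLOT_MAPPED, SLOT_STALE };

/* Hypothetical model of a small window of virtual mapping slots. */
struct map_window {
    enum slot_state slot[NSLOTS];
    unsigned long flushes;      /* simulated global TLB flushes so far */
};

/* Simulated global flush: every stale slot becomes reusable again. */
static void flush_stale(struct map_window *w)
{
    size_t i;

    for (i = 0; i < NSLOTS; i++)
        if (w->slot[i] == SLOT_STALE)
            w->slot[i] = SLOT_FREE;
    w->flushes++;
}

/* Map one page; returns the slot index, or -1 if every slot is in use. */
static int window_map(struct map_window *w)
{
    int flushed = 0;

    for (;;) {
        size_t i;

        for (i = 0; i < NSLOTS; i++) {
            if (w->slot[i] == SLOT_FREE) {
                w->slot[i] = SLOT_MAPPED;
                return (int)i;
            }
        }
        if (flushed)
            return -1;
        flush_stale(w);         /* no clean slot left: flush once, retry */
        flushed = 1;
    }
}

/* Unmap a slot; flush immediately (eager) or leave it stale (lazy). */
static void window_unmap(struct map_window *w, int idx, int lazy)
{
    assert(idx >= 0 && idx < NSLOTS && w->slot[idx] == SLOT_MAPPED);
    w->slot[idx] = SLOT_STALE;
    if (!lazy)
        flush_stale(w);
}

/* Count simulated flushes for `rounds` back-to-back map/unmap pairs. */
static unsigned long count_flushes(unsigned rounds, int lazy)
{
    struct map_window w = { { SLOT_FREE }, 0 };
    unsigned r;

    for (r = 0; r < rounds; r++) {
        int idx = window_map(&w);

        assert(idx >= 0);
        window_unmap(&w, idx, lazy);
    }
    return w.flushes;
}
```

In this model, 64 map/unmap pairs cost 64 flushes in eager mode and 7 in
lazy mode, because a flush is only needed when the window runs out of
clean slots.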

Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

* Re: kmap_kiobuf()
@ 2000-06-28 20:16 lord
  2000-06-28 21:22 ` kmap_kiobuf() Benjamin C.R. LaHaise
  2000-06-29  9:34 ` kmap_kiobuf() Stephen C. Tweedie
  0 siblings, 2 replies; 18+ messages in thread
From: lord @ 2000-06-28 20:16 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: lord, David Woodhouse, linux-mm, riel

> Hi,
> 
> On Wed, Jun 28, 2000 at 10:54:40AM -0500, lord@sgi.com wrote:
> 
> > I always knew it would go down like a ton of bricks, because of the TLB
> > flushing costs. As soon as you have a multi-cpu box this operation gets
> > expensive. The code could be changed to do lazy TLB flushes on unmapping
> > the pages, but you still have the cost every time you set a mapping up.
> 
> That's exactly what kmap() is for --- it does all the lazy tlb
> flushing for you.  Of course, the kmap area can get fragmented so it's
> not a magic solution if you really need contiguous virtual mappings.
> 
> However, kmap caches the virtual mappings for you automatically, so it
> may well be fast enough for you that you can avoid the whole
> contiguous map thing and just kmap pages as you need them.  Is that
> impossible for your code?
> 
> Cheers,
>  Stephen

Hmm, not sure how much kmap helps - it appears to be for mapping a single
page from highmem. The issue with XFS is that we have variable sized
chunks of metadata (could be up to 64 Kbytes depending on how the filesystem
was built).

The code was originally written to treat this like a byte array. Some of the
structures are laid out so that we could rework the code to not treat it
as a byte array, since they are basically arrays of smaller records. Some are
run length encoded type structures (directory leaf blocks being one) where
reworking the code would be a pain, to say the least.
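
As an illustration of what giving up the byte-array view costs, here is
a minimal user-space sketch in C. It assumes the buffer is only
reachable as an array of separately addressed pages, so every access
that may straddle a page boundary has to be split by hand. The helper
name copy_from_pages and the constant SKETCH_PAGE_SIZE are hypothetical
and do not come from XFS or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Copy len bytes, starting at byte offset off, out of a buffer that is
 * only available as nr_pages separately addressed pages.
 * Returns 0 on success, -1 if the range runs past the last page.
 */
static int copy_from_pages(unsigned char *const *pages, size_t nr_pages,
                           size_t off, void *dst, size_t len)
{
    size_t total = nr_pages * SKETCH_PAGE_SIZE;
    unsigned char *out = dst;

    assert(pages != NULL && dst != NULL);
    if (off > total || len > total - off)
        return -1;

    while (len > 0) {
        size_t idx = off / SKETCH_PAGE_SIZE;      /* which page */
        size_t in_page = off % SKETCH_PAGE_SIZE;  /* offset inside it */
        size_t chunk = SKETCH_PAGE_SIZE - in_page;

        if (chunk > len)
            chunk = len;
        memcpy(out, pages[idx] + in_page, chunk);
        out += chunk;
        off += chunk;
        len -= chunk;
    }
    return 0;
}
```

With a contiguous mapping the same read is a single memcpy from
base + off. With separate pages, every variable-length record needs a
loop like this one (or a bounce buffer), which is the kind of rework
described above.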

So we are currently using memory managed as an address space to do the
caching of metadata. Everything is built up out of single pages, and when we
need something bigger we glue it together into a larger chunk of address
space. This has the nice property that for cached metadata which does
not have special properties at the moment, we can just leave the pages
in the address space. The rest of the vm system is then free to reuse
them out from under us when there is demand for more memory.

Clearly it also has the nasty property of wanting to mess with the address
space map on a regular basis. [ Note that the mapping together of
pages like this is only done when the caller requests it; we can
still use pagebufs without it. ]

So if we do not use pages then we could use other memory from the slab
allocator, and work really hard to ensure it always works. If we go this
route then we now have chunks of memory which we need to manage as our own cache,
otherwise we end up continually re-reading from disk. We introduce another
caching mechanism into the kernel - yet another beast to fight over memory.

If we do not allow the remapping of the pages then we get into rewriting
lots of XFS, and almost certainly breaking it in the process.

Ben mentioned large page support as another way to get around this
problem. Where is that in the grand scheme of things?

Steve

P.S. Wouldn't the remapping of pages be a way to let modules etc. get larger
arrays of memory after boot time? Doing it a few times is not going to
kill the system.

* Re: kmap_kiobuf()
@ 2000-06-28 16:52 lord
  2000-06-28 18:06 ` kmap_kiobuf() Stephen C. Tweedie
  0 siblings, 1 reply; 18+ messages in thread
From: lord @ 2000-06-28 16:52 UTC (permalink / raw)
  To: Benjamin C.R. LaHaise; +Cc: David Woodhouse, lord, linux-kernel, linux-mm

> On Wed, 28 Jun 2000, David Woodhouse wrote:
> 
> > MM is not exactly my field - I just know I want to be able to lock down a 
> > user's buffer and treat it as if it were in kernel-space, passing its 
> > address to functions which expect kernel buffers.
> 
> Then pass in a kiovec (we're planning on adding a rw_kiovec file op!) and
> use kmap/kmap_atomic on individual pages as required.  As to providing
> larger kmaps, I have yet to be convinced that providing primitives for
> dealing with objects larger than PAGE_SIZE is a Good Idea. 
> 
> 		-ben

I agree with trying to minimize things which require TLB flushes. We just
have 112 thousand lines of existing code (OK, lots of comments in that)
which wants to use things bigger than a page, and use them in ways which
are sometimes not going to be amenable to rewriting to use an array of pages.
Not to mention that rewriting would destabilize the code base.

I am not a VM guy either. Ben, is the cost of the TLB flush mostly in
the synchronization between CPUs, or is it just expensive any way you
look at it?


Steve

* kmap_kiobuf()
@ 2000-06-28 15:41 David Woodhouse
  2000-06-28 17:44 ` kmap_kiobuf() Stephen C. Tweedie
  2000-06-29 10:52 ` kmap_kiobuf() Stephen C. Tweedie
  0 siblings, 2 replies; 18+ messages in thread
From: David Woodhouse @ 2000-06-28 15:41 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: sct, riel

I think it would be useful to provide a function which can be used to 
obtain a virtually-contiguous VM mapping of the pages of an iobuf.

Currently, to access the pages of an iobuf, you have to kmap() each page
individually. For various purposes, it would be useful to be able to kmap the
whole iobuf contiguously, so that you can guarantee that:

	page_address(iobuf->maplist[n]) + PAGE_SIZE 
		== page_address(iobuf->maplist[n+1])

    (for n such that n + 1 < iobuf->nr_pages, obviously. Don't be so pedantic.)

Rather than taking a kiobuf as an argument, the new function might as well 
be more generic:

unsigned long kremap_pages(struct page **maplist, int nr_pages);
void kunmap_pages(struct page **maplist, int nr_pages);
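
A user-space analogy of the requested behaviour, as a hedged sketch in
C: here the "pages" are page-sized blocks of a temporary file, named by
page index, and the contiguous view is assembled with mmap() and
MAP_FIXED. This is not the proposed kernel interface and not how a
kernel implementation would work. The names map_pages_contig,
unmap_pages_contig and demo_contig are hypothetical. The sketch only
shows the effect being asked for: scattered pages that read as one
linear buffer.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the file pages listed in page_idx[] back to back; NULL on failure. */
static void *map_pages_contig(int fd, const size_t *page_idx, size_t nr_pages)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *base;
    size_t i;

    assert(page_idx != NULL);
    /* Reserve one contiguous range of virtual addresses. */
    base = mmap(NULL, nr_pages * psz, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Replace each reserved page with the requested file page. */
    for (i = 0; i < nr_pages; i++) {
        void *p = mmap(base + i * psz, psz, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd,
                       (off_t)(page_idx[i] * psz));
        if (p == MAP_FAILED) {
            munmap(base, nr_pages * psz);
            return NULL;
        }
    }
    return base;
}

static void unmap_pages_contig(void *addr, size_t nr_pages)
{
    munmap(addr, nr_pages * (size_t)sysconf(_SC_PAGESIZE));
}

/* Usage: a record split across file pages 3 and 0 reads as one run. */
static int demo_contig(void)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    size_t order[2] = { 3, 0 };  /* file pages 3 and 0, in that order */
    FILE *f = tmpfile();
    unsigned char *buf;
    int fd, ok;

    if (f == NULL)
        return -1;
    fd = fileno(f);
    if (ftruncate(fd, (off_t)(4 * psz)) != 0 ||
        pwrite(fd, "tail", 4, (off_t)(4 * psz - 4)) != 4 ||
        pwrite(fd, "head", 4, 0) != 4) {
        fclose(f);
        return -1;
    }
    buf = map_pages_contig(fd, order, 2);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }
    ok = memcmp(buf + psz - 4, "tailhead", 8) == 0;
    unmap_pages_contig(buf, 2);
    fclose(f);
    return ok ? 0 : -1;
}
```

demo_contig() returns 0 when the bytes at the end of file page 3 and the
start of file page 0 can be compared as a single 8-byte run through the
contiguous view. Every call changes the process's address space map,
which is the kind of operation whose cost is at issue in this thread.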

I had a quick look at the code for kmap() and vmalloc() and decided that 
even if I attempted to do it myself, I'd probably bugger it up and a MM 
hacker would have to fix it anyway. So I'm not going to bother.

'Twould be useful if someone else could find the time to do so, though.


--
dwmw2



end of thread, other threads:[~2000-06-29 13:45 UTC | newest]

Thread overview: 18+ messages
2000-06-28 15:54 kmap_kiobuf() lord
2000-06-28 16:06 ` kmap_kiobuf() David Woodhouse
2000-06-28 16:24   ` kmap_kiobuf() Benjamin C.R. LaHaise
2000-06-28 18:07   ` kmap_kiobuf() Stephen C. Tweedie
2000-06-28 18:45     ` kmap_kiobuf() David Woodhouse
2000-06-29  9:09       ` kmap_kiobuf() Stephen C. Tweedie
2000-06-28 17:46 ` kmap_kiobuf() Stephen C. Tweedie
  -- strict thread matches above, loose matches on Subject: below --
2000-06-28 20:16 kmap_kiobuf() lord
2000-06-28 21:22 ` kmap_kiobuf() Benjamin C.R. LaHaise
2000-06-29  9:34 ` kmap_kiobuf() Stephen C. Tweedie
2000-06-29 13:45   ` kmap_kiobuf() Steve Lord
2000-06-28 16:52 kmap_kiobuf() lord
2000-06-28 18:06 ` kmap_kiobuf() Stephen C. Tweedie
2000-06-28 19:06   ` kmap_kiobuf() Manfred Spraul
2000-06-28 21:05   ` kmap_kiobuf() Andi Kleen
2000-06-28 15:41 kmap_kiobuf() David Woodhouse
2000-06-28 17:44 ` kmap_kiobuf() Stephen C. Tweedie
2000-06-29 10:52 ` kmap_kiobuf() Stephen C. Tweedie
