* faulting kmalloced buffers into userspace through mmap()
@ 2008-06-01 14:40 Daniel Drake
2008-06-02 5:38 ` Johannes Weiner
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Drake @ 2008-06-01 14:40 UTC
To: linux-mm
Hi,
I am developing a driver for an under-development PCI frame grabber,
which will be released as GPL once the hardware is complete.
The character device driver basically operates as follows:
- userspace uses an ioctl to request a certain number of buffers
- driver allocates the buffers
- userspace calls mmap() to gain direct access to those buffers
- driver pushes physical addresses of those buffers to the device,
which DMAs data into them and generates interrupts accordingly
- userspace uses ioctls to monitor buffer status (i.e. check when
frame data has arrived) and then reads the data out.
The buffers are allocated with kmalloc(.., GFP_KERNEL). I use a .fault
vm operation to implement mmap. The memory space presented by mmap is as
if all the individual buffers were laid out contiguously in memory.
Fault handler is pretty much as follows:
static int vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	struct page *page;

	/* find kernel-virtual address of requested data page */
	unsigned char *addr = find_address(foo);

	/* some locking and sanity/safety checks omitted */

	page = virt_to_page(addr);
	get_page(page);
	vmf->page = page;
	return 0;
}
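To illustrate the kind of lookup find_address() has to perform, here is a minimal userspace sketch in C. It assumes all buffers have the same size and that this size is a multiple of the page size. The names buf_set and lookup_page are hypothetical and not taken from the driver. Given the page offset of the faulting address within the mapping (vmf->pgoff in the kernel), it selects the buffer and the byte offset inside it.

```c
#include <stddef.h>

#define PAGE_SHIFT 12   /* assumed: 4 KiB pages */

/* Hypothetical bookkeeping: nbufs separately allocated buffers of equal size. */
struct buf_set {
	unsigned char **bufs;   /* start address of each buffer */
	size_t nbufs;           /* number of buffers */
	size_t buf_size;        /* bytes per buffer, a multiple of the page size */
};

/*
 * Translate a page offset within the mapping into the address of the
 * backing page. Returns NULL if the offset lies beyond the last buffer.
 */
static unsigned char *lookup_page(const struct buf_set *bs, unsigned long pgoff)
{
	size_t pages_per_buf = bs->buf_size >> PAGE_SHIFT;
	size_t idx, off;

	if (pages_per_buf == 0)
		return NULL;

	idx = pgoff / pages_per_buf;                  /* which buffer */
	off = (pgoff % pages_per_buf) << PAGE_SHIFT;  /* byte offset inside it */

	if (idx >= bs->nbufs)
		return NULL;

	return bs->bufs[idx] + off;
}
```

In a real handler, a NULL result would correspond to refusing the fault (for example by returning VM_FAULT_SIGBUS) rather than handing out a page.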
The mapping seems to work fine, data is accessible as you'd expect.
However, during the munmap() operation, hundreds of bad page state
messages are generated:
Bad page state in process 'lt-capture_fram'
page:ffffe20005254300 flags:0x0148300000000084 mapping:0000000000000000
mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 5603, comm: lt-capture_fram Tainted: P B 2.6.25.4 #5
Call Trace:
[<ffffffff803e8813>] vt_console_print+0x223/0x310
[<ffffffff80266b5d>] bad_page+0x6d/0xb0
[<ffffffff802677c8>] free_hot_cold_page+0x178/0x190
[<ffffffff8026780a>] __pagevec_free+0x2a/0x40
[<ffffffff8026acb1>] release_pages+0x171/0x1b0
[<ffffffff8027c66d>] free_pages_and_swap_cache+0x8d/0xb0
[<ffffffff80271628>] unmap_vmas+0x578/0x800
[<ffffffff8027584a>] unmap_region+0xca/0x160
[<ffffffff802767e3>] do_munmap+0x223/0x2d0
[<ffffffff80519ca3>] __down_write_nested+0xa3/0xc0
[<ffffffff802768d8>] sys_munmap+0x48/0x80
[<ffffffff8020c03b>] system_call_after_swapgs+0x7b/0x80
The bad_page() call comes from the inline function free_pages_check().
It triggers bad_page() because the PG_slab bit is set on the page.
Presumably this is set by the __SetPageSlab() call inside slab's
kmem_getpages() function, but I haven't traced it. What does this flag
indicate?
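For reference, the low bits of the flags word in the report above can be decoded with a small userspace sketch. It assumes the bit numbering of 2.6.25, in which PG_slab is bit 7; that numbering is an assumption here and should be checked against include/linux/page-flags.h in the tree being used. The helper page_flag_set is hypothetical. The upper bits of the word are not page state flags and are ignored.

```c
#include <stdint.h>

/* Bit numbers assumed from 2.6.25's include/linux/page-flags.h. */
enum {
	PG_locked = 0, PG_error = 1, PG_referenced = 2, PG_uptodate = 3,
	PG_dirty = 4, PG_lru = 5, PG_active = 6, PG_slab = 7
};

/* Return 1 if the given flag bit is set in a page->flags value, else 0. */
static int page_flag_set(uint64_t flags, int bit)
{
	return (int)((flags >> bit) & 1);
}
```

Under that assumption, the value 0x0148300000000084 has PG_slab and PG_referenced set in its low byte (0x84), which is consistent with the PG_slab observation above.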
I also did an experiment where I kmalloced some memory and then
immediately used virt_to_page() to get the struct page pointer for that
memory. It already had the PG_slab bit set at that stage, so it does not
appear to be later-occurring corruption causing this flag to be set at
munmap() time.
So, am I right in saying that it is not legal to use a page fault
handler to remap kmalloced memory in this way? I guess I need to use
alloc_pages() or something instead?
If I switched to remap_pfn_range(), would it be OK to use kmalloced
memory in this way? I chose to use fault because the mapping I am
presenting to userspace is actually composed of a number of separate
kmalloced buffers, whereas remap_pfn_range() looks best suited for where
you just have one buffer you want to map.
I'll document any resultant findings on the linux-mm wiki.
Thanks,
--
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: faulting kmalloced buffers into userspace through mmap()
2008-06-01 14:40 faulting kmalloced buffers into userspace through mmap() Daniel Drake
@ 2008-06-02 5:38 ` Johannes Weiner
2008-06-04 9:39 ` Daniel Drake
0 siblings, 1 reply; 5+ messages in thread
From: Johannes Weiner @ 2008-06-02 5:38 UTC
To: Daniel Drake; +Cc: linux-mm
Hi,
Daniel Drake <ddrake@brontes3d.com> writes:
> Hi,
>
> I am developing a driver for an under-development PCI frame grabber,
> which will be released as GPL once the hardware is complete.
>
> The character device driver basically operates as follows:
> - userspace uses an ioctl to request a certain number of buffers
> - driver allocates the buffers
> - userspace calls mmap() to gain direct access to those buffers
> - driver pushes physical addresses of those buffers to the device,
> which DMAs data into them and generates interrupts accordingly
> - userspace uses ioctls to monitor buffer status (i.e. check when
> frame data has arrived) and then reads the data out.
Why the first ioctl? The mmap() handler can set up the buffers. You
can also implement a poll handler that sleeps until the interrupt
handler wakes it up.
> The buffers are allocated with kmalloc(.., GFP_KERNEL). I use a .fault
> vm operation to implement mmap. The memory space presented by mmap is
> as if all the individual buffers were laid out contiguously in memory.
>
> Fault handler is pretty much as follows:
>
> static int vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> {
> 	struct page *page;
>
> 	/* find kernel-virtual address of requested data page */
> 	unsigned char *addr = find_address(foo);
>
> 	/* some locking and sanity/safety checks omitted */
>
> 	page = virt_to_page(addr);
> 	get_page(page);
> 	vmf->page = page;
> 	return 0;
> }
>
> The mapping seems to work fine, data is accessible as you'd
> expect. However, during the munmap() operation, hundreds of bad page
> state messages are generated:
>
> Bad page state in process 'lt-capture_fram'
> page:ffffe20005254300 flags:0x0148300000000084
> mapping:0000000000000000 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Pid: 5603, comm: lt-capture_fram Tainted: P B 2.6.25.4 #5
>
> Call Trace:
> [<ffffffff803e8813>] vt_console_print+0x223/0x310
> [<ffffffff80266b5d>] bad_page+0x6d/0xb0
> [<ffffffff802677c8>] free_hot_cold_page+0x178/0x190
> [<ffffffff8026780a>] __pagevec_free+0x2a/0x40
> [<ffffffff8026acb1>] release_pages+0x171/0x1b0
> [<ffffffff8027c66d>] free_pages_and_swap_cache+0x8d/0xb0
> [<ffffffff80271628>] unmap_vmas+0x578/0x800
> [<ffffffff8027584a>] unmap_region+0xca/0x160
> [<ffffffff802767e3>] do_munmap+0x223/0x2d0
> [<ffffffff80519ca3>] __down_write_nested+0xa3/0xc0
> [<ffffffff802768d8>] sys_munmap+0x48/0x80
> [<ffffffff8020c03b>] system_call_after_swapgs+0x7b/0x80
>
> The bad_page() call comes from the inline function
> free_pages_check(). It triggers bad_page() because the PG_slab bit is
> set on the page.
> Presumably this is set by the __SetPageSlab() call inside slab's
> kmem_getpages() function, but I haven't traced it. What does this flag
> indicate?
>
> I also did an experiment where I kmalloced some memory and then
> immediately used virt_to_page() to get the struct page pointer for
> that memory. It already had the PG_slab bit set at that stage, so it
> does not appear to be later-occurring corruption causing this flag to
> be set at munmap() time.
You broke the abstraction here. There are no pages from kmalloc(), it
gives you other memory objects. And on munmapping the region, the
kmalloc objects are passed back to the buddy allocator which then blows
the whistle with bad_page() on it.
> So, am I right in saying that it is not legal to use a page fault
> handler to remap kmalloced memory in this way? I guess I need to use
> alloc_pages() or something instead?
Yes.
Hannes
* Re: faulting kmalloced buffers into userspace through mmap()
2008-06-02 5:38 ` Johannes Weiner
@ 2008-06-04 9:39 ` Daniel Drake
2008-06-04 11:00 ` Nick Piggin
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Drake @ 2008-06-04 9:39 UTC
To: Johannes Weiner; +Cc: linux-mm
Hi Johannes,
Johannes Weiner wrote:
> You broke the abstraction here. There are no pages from kmalloc(), it
> gives you other memory objects. And on munmapping the region, the
> kmalloc objects are passed back to the buddy allocator which then blows
> the whistle with bad_page() on it.
Thanks for the explanation, I attempted to document this here:
http://linux-mm.org/DeviceDriverMmap
Comments/edits are welcome!
One more quick question: if pages that were mapped are "passed back to
the buddy allocator" during munmap() does that mean that the pages get
freed too?
i.e. if I allocate some pages with alloc_pages(), remap them into
userspace in my VM .fault handler, and then userspace munmaps them, is
it still legal for my driver to use those pages internally after the
munmap? Do I still need to call __free_pages() on them when done?
Also, it is possible to get the physical address of a kmalloc region
with virt_to_phys(). Is it also illegal to pass this physical address to
remap_pfn_range() to implement mmap in that fashion? Can't find any
in-kernel code that does this, but google brings up a few hits such as
http://www.opentech.at/papers/embedded_resources/node21.html
Thanks!
--
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource/
* Re: faulting kmalloced buffers into userspace through mmap()
2008-06-04 9:39 ` Daniel Drake
@ 2008-06-04 11:00 ` Nick Piggin
2008-06-06 21:29 ` Daniel Drake
0 siblings, 1 reply; 5+ messages in thread
From: Nick Piggin @ 2008-06-04 11:00 UTC
To: Daniel Drake; +Cc: Johannes Weiner, linux-mm
On Wednesday 04 June 2008 19:39, Daniel Drake wrote:
> Hi Johannes,
>
> Johannes Weiner wrote:
> > You broke the abstraction here. There are no pages from kmalloc(), it
> > gives you other memory objects. And on munmapping the region, the
> > kmalloc objects are passed back to the buddy allocator which then blows
> > the whistle with bad_page() on it.
>
> Thanks for the explanation, I attempted to document this here:
> http://linux-mm.org/DeviceDriverMmap
> Comments/edits are welcome!
You can map it with a pfn mapping / vm_insert_pfn / remap_pfn_range etc.
which does not touch the underlying struct pages. You must then ensure
you deallocate the memory yourself after it is finished with.
> One more quick question: if pages that were mapped are "passed back to
> the buddy allocator" during munmap() does that mean that the pages get
> freed too?
They get their refcount decremented if they were inserted with
vm_insert_page or ->fault page fault handler.
> i.e. if I allocate some pages with alloc_pages(), remap them into
> userspace in my VM .fault handler, and then userspace munmaps them, is
> it still legal for my driver to use those pages internally after the
> munmap? Do I still need to call __free_pages() on them when done?
Provided you increment the refcount on the pages in your fault
handler, munmap will not free them, and it is still legal for
your driver to touch them (and it must free them itself).
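The reference counting described here can be modelled with a toy userspace sketch. This is not kernel code: toy_page and its helpers are hypothetical stand-ins for struct page, alloc_pages(), get_page() and the final put done at munmap or by __free_pages(). It only shows why a page survives munmap when the fault handler took its own reference.

```c
/* Toy stand-in for struct page: only the reference count is modelled. */
struct toy_page {
	int count;   /* plays the role of the page reference count */
	int freed;   /* set once the count drops to zero */
};

/* Models alloc_pages(): the allocator hands out the page with one reference. */
static void toy_alloc(struct toy_page *p)
{
	p->count = 1;
	p->freed = 0;
}

/* Models get_page() in the fault handler: the mapping takes a reference. */
static void toy_get(struct toy_page *p)
{
	p->count++;
}

/* Models a reference being dropped, by munmap or by the driver's final free. */
static void toy_put(struct toy_page *p)
{
	if (--p->count == 0)
		p->freed = 1;
}
```

With the sequence toy_alloc(), toy_get(), toy_put(), the count is back at one and the page is still owned by the driver. Only the driver's own final toy_put() releases it.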
> Also, it is possible to get the physical address of a kmalloc region
> with virt_to_phys(). Is it also illegal to pass this physical address to
> remap_pfn_range() to implement mmap in that fashion? Can't find any
> in-kernel code that does this, but google brings up a few hits such as
> http://www.opentech.at/papers/embedded_resources/node21.html
I think (__pa(address) >> PAGE_SHIFT) should get you the pfn.
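As a small worked example of the pfn calculation, here is a userspace sketch assuming 4 KiB pages. The helper name phys_to_pfn is hypothetical. The shift count is the page shift (12 for 4 KiB pages), not the page size in bytes: shifting by 4096 would be undefined for a 64-bit value.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed: 4 KiB pages, as on x86 */

/* Page frame number of a physical address: drop the in-page offset bits. */
static uint64_t phys_to_pfn(uint64_t phys)
{
	return phys >> PAGE_SHIFT;
}
```

In a driver, a pfn obtained this way is what would be handed to remap_pfn_range() or vm_insert_pfn().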
* Re: faulting kmalloced buffers into userspace through mmap()
2008-06-04 11:00 ` Nick Piggin
@ 2008-06-06 21:29 ` Daniel Drake
0 siblings, 0 replies; 5+ messages in thread
From: Daniel Drake @ 2008-06-06 21:29 UTC
To: Nick Piggin; +Cc: Johannes Weiner, linux-mm
Nick Piggin wrote:
> You can map it with a pfn mapping / vm_insert_pfn / remap_pfn_range etc.
> which does not touch the underlying struct pages. You must then ensure
> you deallocate the memory yourself after it is finished with.
Ah, excellent, I wasn't aware of pfn mappings or vm_insert_pfn. I should
have read further than LDD3 :)
I have brushed up the section I wrote earlier:
http://linux-mm.org/DeviceDriverMmap
Hopefully someone else will find it useful.
Since I'm working with 2.6.25 I've implemented a nopfn handler which
works perfectly using vm_insert_pfn(). Thanks for all your great work in
this area!
--
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource/