linux-mm.kvack.org archive mirror
* IO mappings; verify_area() on SMP
@ 1999-11-08 11:43 Arkadi E. Shishlov
  1999-11-08 19:25 ` Kanoj Sarcar
  1999-11-08 20:50 ` Andi Kleen
  0 siblings, 2 replies; 6+ messages in thread
From: Arkadi E. Shishlov @ 1999-11-08 11:43 UTC (permalink / raw)
  To: linux-mm


  Hi.
  If this is not the right place to ask, please direct me to the right one.

  Some time ago I wrote a driver for the 2.0 kernel series. It was
  primitive: no memory-based IO, no concern for the SMP case, and so on.
  But it gave me an understanding of the basics of character drivers.

  Now I'm trying to write a driver for a hardware device that heavily uses
  main memory for data exchange. The architecture is i386 and the device is
  on ISA. In the future I want this driver to work on other architectures
  too, and the device will become a PCI card to overcome the 16Mb ISA barrier.

  For IO, the device uses many memory chunks that are linked together in a
  classical structure, a singly linked list: ptr->data, ptr->next.
  I'm stuck on how the driver can supply a pointer to a data structure
  to the device. I will try to explain with examples.

  As a first step, I don't use kmalloc(); I simply boot my 128Mb box
  with the mem=120M parameter. Then I created a test module:

int init_module(void)
{

	uint base;
	uint base2;
	uint base3;


	printk("----------\n");
	base = (uint)ioremap_nocache(0xB0000000, 1024*1024);
	printk("%08X, %08X\n", base, (int)virt_to_phys((void*)base));

	base2 = (uint)ioremap_nocache(0xD0000000, 1024*1024);
	printk("%08X, %08X\n", base2, (int)virt_to_phys((void*)base2));

	base3 = (uint)ioremap_nocache(0x07900000, 1024*1024);
	printk("%08X, %08X\n", base3, (int)virt_to_phys((void*)base3));

	if (base) iounmap((void*)base);
	if (base2) iounmap((void*)base2);
	if (base3) iounmap((void*)base3);


	return(0);
}

  I know that virt_to_phys() is equivalent to virt_to_bus() on i386.
  Output:

C806D000, 0806D000
C816E000, 0816E000
C826F000, 0826F000

  In the last case, the bus address according to virt_to_bus() is 0x0826F000,
  but the device will see this region of memory at 0x07900000. Definitely
  not what I want. I read Documentation/IO-mapping.txt. Very strange.
  I have probably misunderstood something. At the moment I think of it
  this way:

  hwbase = 0x07900000;
  base = ioremap_nocache(hwbase, 1024*1024);
  data_ptr = base + 100;
  data = base + 104;
  data_addr_for_controller = hwbase + (data - base);
  *(uint*)data_ptr = data_addr_for_controller;

  Instead of using virt_to_bus() and memcpy_toio(), there is
  pointer arithmetic every time. Is that right or not?
  OK. Maybe I'm wrong. I mix ioremap() and main memory access. Not very
  clever. But please read further.

  Next step: memory is allocated by kmalloc(). Now the driver doesn't know
  hwbase... How should it work? How can this magic ptr = kmalloc() be
  translated to a raw bus address that the driver can give to the controller?
  Will virt_to_bus() work?

  Also some miscellaneous questions:
  Will memory allocated with a single call to kmalloc() always be
  physically contiguous (in the future too)?
  What about the Intel 64Gb PAE extension: how should device drivers
  deal with it?

  The second question is about verify_area() safety. Many drivers contain
  the following sequence:

  if ((ret = verify_area(VERIFY_WRITE, buffer, count)))
	    return ret;
  ...
  copy_to_user(buffer, driver_data_buf, count);

  Even if this is protected by cli()/sti() pairs, why can't a multithreaded
  program on an SMP machine unmap the verified buffer between the calls to
  verify_area() and copy_to_user()? Surely that can't happen, but
  maybe somebody can write two or three words about what prevents
  this situation.


arkadi.
-- 
Just arms curvature radius.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://humbolt.geo.uu.nl/Linux-MM/


* Re: IO mappings; verify_area() on SMP
  1999-11-08 11:43 IO mappings; verify_area() on SMP Arkadi E. Shishlov
@ 1999-11-08 19:25 ` Kanoj Sarcar
  1999-11-09  9:26   ` Arkadi E. Shishlov
  1999-11-08 20:50 ` Andi Kleen
  1 sibling, 1 reply; 6+ messages in thread
From: Kanoj Sarcar @ 1999-11-08 19:25 UTC (permalink / raw)
  To: Arkadi E. Shishlov; +Cc: linux-mm

> 
> 
>   Hi.
>   If it is not good place to ask, direct me to the right place, please.
> 
>   Some times ago I wrote driver for 2.0 series of kernel. It was
>   primitive - no memory based IO, don't concerned about SMP case,
>   and so on. But it give me understanding of character drivers basics.
> 
>   Now I'm trying to write driver for hardware device that heavily use
>   main memory for data exchange. Architecture is i386 and device is
>   on ISA. But in future, I want this driver to work on other architectures
>   too and device will become PCI card to overcome 16Mb ISA barrier.
> 
>   For IO, device use many memory chunks that are linked together using
>   classical structure - one-way linked list - ptr->data, ptr->next.
>   I'm in stuck about how driver can supply a pointer on data structure
>   to the device. I will try to explain on examples.
> 
>   For first step, I don't use kmalloc() - I simply boot my 128Mb box
>   with mem=120M parameter. And then I created test module:
> 
> int init_module(void)
> {
> 
> 	uint base;
> 	uint base2;
> 	uint base3;
> 
> 
> 	printk("----------\n");
> 	base = (uint)ioremap_nocache(0xB0000000, 1024*1024);
> 	printk("%08X, %08X\n", base, (int)virt_to_phys((void*)base));
> 
> 	base2 = (uint)ioremap_nocache(0xD0000000, 1024*1024);
> 	printk("%08X, %08X\n", base2, (int)virt_to_phys((void*)base2));
> 
> 	base3 = (uint)ioremap_nocache(0x07900000, 1024*1024);
> 	printk("%08X, %08X\n", base3, (int)virt_to_phys((void*)base3));
> 
> 	if (base) iounmap((void*)base);
> 	if (base2) iounmap((void*)base2);
> 	if (base3) iounmap((void*)base3);
> 
> 
> 	return(0);
> }
> 
>   I know, that virt_to_phys() is equivalent to virt_to_bus() on i386.
>   Output:
> 
> C806D000, 0806D000
> C816E000, 0816E000
> C826F000, 0826F000

I don't think you can do a virt_to_phys on an address returned from
ioremap_nocache. You can do that only on a direct mapped kernel address.
And why would you want to do it anyway? You already know the
physical address, in these cases 0xB0000000, 0xD0000000, 0x07900000.
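
A minimal userspace sketch of why the test module printed those values,
assuming the default i386 3GB/1GB split (direct mapping at 0xC0000000).
The names model_virt_to_phys() and model_ioremap_to_phys() are made up for
illustration and are not kernel functions. The first models the fixed
subtraction that virt_to_phys() performs on a direct-mapped address; the
second is the hwbase-plus-offset arithmetic that is correct for an
ioremap()ed window.

```c
#include <stdint.h>

/* Assumption: i386 with the default split, where the direct mapping
 * places physical address P at virtual address P + 0xC0000000. */
#define MODEL_PAGE_OFFSET 0xC0000000u

/* Model of virt_to_phys() on such a system: a fixed subtraction that
 * never consults the page tables, so it is only meaningful for
 * direct-mapped addresses. */
static inline uint32_t model_virt_to_phys(uint32_t vaddr)
{
	return vaddr - MODEL_PAGE_OFFSET;
}

/* Translation for an ioremap()ed window: the driver already knows the
 * physical base it asked for, so it adds the offset into the window. */
static inline uint32_t model_ioremap_to_phys(uint32_t hwbase,
					     uint32_t mapped_base,
					     uint32_t vaddr)
{
	return hwbase + (vaddr - mapped_base);
}
```

Feeding the third mapping from the output above (0xC826F000) into
model_virt_to_phys() reproduces the bogus 0x0826F000, while
model_ioremap_to_phys() with hwbase 0x07900000 gives the address the
device actually sees.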

> 
>   In last case, bus address according to virt_to_bus() is 0x0826F000,
>   but device will see this region of memory at 0x07900000. Definitely
>   not what I want. I read Documentation/IO-mapping.txt. Very strange.
>   Likely I misunderstand something. At this point of time I think
>   this way:
> 
>   hwbase = 0x07900000;
>   base = ioremap_nocache(hwbase, 1024*1024);
>   data_ptr = base + 100;
>   data = base + 104;
>   data_addr_for_controller = hwbase + (data - base);
>   *(uint*)data_ptr = data_addr_for_controller;
> 
>   Instead of playing with virt_to_bus() and memcpy_to_io(), there is
>   pointer arithmetics every time. Is it right or not?
>   OK. Maybe I'm wrong. I mix ioremap() and main memory access. Not very
>   clever. But, read further, please.
> 
>   Next step - memory are allocated by kmalloc(). Now driver don't know
>   hwbase... How it should work? How this magic ptr = kmalloc() can
>   be translated to raw bus address, that driver can give to controller?
>   Will virt_to_bus() work?

With kmalloc'ed memory, you can indeed do a virt_to_phys/virt_to_bus ...
kmalloc always returns direct mapped memory.
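
As a rough illustration of the linked-list case from the question, here is
a userspace model, not kernel code. The descriptor layout (struct hw_desc),
the struct dma_region stand-in for a kmalloc'ed direct-mapped buffer, and
the helpers region_virt_to_bus() and desc_link() are all hypothetical. The
point it shows is that every pointer stored for the device is first
translated to a bus address, the way virt_to_bus() would do it in a driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor as the device would read it from memory:
 * both fields are bus addresses, never CPU virtual addresses. */
struct hw_desc {
	uint32_t data;	/* bus address of the payload */
	uint32_t next;	/* bus address of the next descriptor, 0 = end */
};

/* Stand-in for a direct-mapped buffer: cpu_base is where the CPU sees
 * it, bus_base is where the device sees the same bytes. */
struct dma_region {
	unsigned char *cpu_base;
	uint32_t bus_base;
	size_t size;
};

/* Model of virt_to_bus() for pointers that lie inside the region. */
static inline uint32_t region_virt_to_bus(const struct dma_region *r,
					  const void *p)
{
	size_t off = (size_t)((const unsigned char *)p - r->cpu_base);

	assert(off < r->size);	/* only valid inside the region */
	return r->bus_base + (uint32_t)off;
}

/* Point descriptor d at its payload and at the following descriptor. */
static inline void desc_link(const struct dma_region *r, struct hw_desc *d,
			     const void *payload, const struct hw_desc *next)
{
	d->data = region_virt_to_bus(r, payload);
	d->next = next ? region_virt_to_bus(r, next) : 0;
}
```

The model ignores byte order and any alignment rules a real device may
impose on its descriptors.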


> 
>   Also some miscellaneous questions:
>   Does memory allocated with one call to kmalloc(), will be always
>   physically contiguous (in future)?

Yes, I think Linux will stick to that.

>   What is about Intel 64Gb PAE extension - how device drivers should
>   deal with it?

That issue is being dealt with right now in 2.3. People are working
on PCI64; for PCI32, you need bounce buffers (i.e. temporary copy
buffers) if you want to do DMA to addresses >4Gb.
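
The bounce-buffer idea can be sketched in plain C as follows. This is a
userspace model under stated assumptions, not the 2.3 implementation:
needs_bounce() and dma_out_addr() are hypothetical names, and the low
bounce buffer is assumed to have been allocated below 4Gb beforehand.

```c
#include <stdint.h>
#include <string.h>

/* Highest address a 32-bit PCI bus master can reach. */
#define DMA32_LIMIT 0xFFFFFFFFull

/* True if any byte of [phys, phys + len) lies above the 4Gb boundary. */
static inline int needs_bounce(uint64_t phys, uint32_t len)
{
	return len != 0 && phys + len - 1 > DMA32_LIMIT;
}

/* Hypothetical helper for an outbound (memory to device) transfer:
 * returns the physical address to hand to the device. If the source is
 * out of the device's reach, the data is first copied into the
 * preallocated low buffer and that buffer's address is returned. */
static inline uint64_t dma_out_addr(uint64_t src_phys, const void *src_cpu,
				    uint32_t len, uint64_t bounce_phys,
				    void *bounce_cpu)
{
	if (!needs_bounce(src_phys, len))
		return src_phys;
	memcpy(bounce_cpu, src_cpu, len);
	return bounce_phys;
}
```

For an inbound transfer the copy goes the other way, out of the bounce
buffer, once the device has finished writing.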

> 
>   Second question is about verify_area() safety. Many drivers contain
>   following sequence:
> 
>   if ((ret = verify_area(VERIFY_WRITE, buffer, count)))
> 	    return ret;
>   ...
>   copy_to_user(buffer, driver_data_buf, count);
> 
>   Even protected by cli()/sti() pairs, why multithreaded program on
>   SMP machine can't unmap this verified buffer between calls to
>   verify_area() and copy_to_user()? Of course it can't be true, but
>   maybe somebody can write two-three words about reason that prevent
>   this situation.

In most cases, the address space's mmap_sem is held, which prevents
unmaps from happening until the caller of verify_area/copy_to_user
releases it. This is if copy_to_user takes a page fault. If there
is no page fault, the caller probably holds the kernel_lock
monitor, which excludes anyone else from doing a lot of things
inside the kernel, including unmaps.

Kanoj


* Re: IO mappings; verify_area() on SMP
  1999-11-08 11:43 IO mappings; verify_area() on SMP Arkadi E. Shishlov
  1999-11-08 19:25 ` Kanoj Sarcar
@ 1999-11-08 20:50 ` Andi Kleen
  1999-11-09  9:27   ` Arkadi E. Shishlov
  1 sibling, 1 reply; 6+ messages in thread
From: Andi Kleen @ 1999-11-08 20:50 UTC (permalink / raw)
  To: Arkadi E. Shishlov; +Cc: linux-mm

On Mon, Nov 08, 1999 at 12:43:25PM +0100, Arkadi E. Shishlov wrote:
> 
>   Second question is about verify_area() safety. Many drivers contain
>   following sequence:
> 
>   if ((ret = verify_area(VERIFY_WRITE, buffer, count)))
> 	    return ret;
>   ...
>   copy_to_user(buffer, driver_data_buf, count);
> 
>   Even protected by cli()/sti() pairs, why multithreaded program on
>   SMP machine can't unmap this verified buffer between calls to
>   verify_area() and copy_to_user()? Of course it can't be true, but
>   maybe somebody can write two-three words about reason that prevent
>   this situation.

The verify_area is unnecessary in 2.2. The correct way to do it is:

	if (copy_to_user(buffer, driver_data_buf, count))
		return -EFAULT;

The above sequence exists because a lot of drivers were incorrectly converted
from the 2.0 verify_area/memcpy_tofs method to the 2.2 method. copy_to_user
avoids the race you're describing (see Documentation/exceptions.txt).

verify_area() is a backwards compatibility wrapper around access_ok(),
which only does a security check for kernel mode addresses; that check is
done by copy_*_user too.  The real mapping check is done by the MMU by
handling the exception.

Some early 386s don't check properly for page write protection when the CPU
is in supervisor mode. In this case verify_area does a full walk of the
page tables to avoid security problems. Unfortunately there is still a race
with programs that use clone() (it does not even need SMP), because when the
user access sleeps in a page fault, another thread can unmap the mapping
in between and cause a kernel crash. Fortunately this only applies to some
very early 386 steppings; later CPUs don't have this problem (and AFAIK
neither does any non-x86 port, except possibly uclinux).

Hope this helps,

-Andi
-- 
This is like TV. I don't like TV.

* Re: IO mappings; verify_area() on SMP
  1999-11-08 19:25 ` Kanoj Sarcar
@ 1999-11-09  9:26   ` Arkadi E. Shishlov
  0 siblings, 0 replies; 6+ messages in thread
From: Arkadi E. Shishlov @ 1999-11-09  9:26 UTC (permalink / raw)
  To: Kanoj Sarcar; +Cc: linux-mm

  Thank you for the fast response. Now I know how to deal with memory IO.

On Mon, Nov 08, 1999 at 11:25:11AM -0800, Kanoj Sarcar wrote:
> > 
> >   Second question is about verify_area() safety. Many drivers contain
> >   following sequence:
> > 
> >   if ((ret = verify_area(VERIFY_WRITE, buffer, count)))
> > 	    return ret;
> >   ...
> >   copy_to_user(buffer, driver_data_buf, count);
> > 
> >   Even protected by cli()/sti() pairs, why multithreaded program on
> >   SMP machine can't unmap this verified buffer between calls to
> >   verify_area() and copy_to_user()? Of course it can't be true, but
> >   maybe somebody can write two-three words about reason that prevent
> >   this situation.
> 
> In most cases, the address spaces' mmap_sem is held, which prevents
> unmap's from happening until the caller of verify_area/copy_to_user
> releases it. This is if copy_to_user takes a page fault. If there
> is no page fault, the caller probably holds the kernel_lock 
> monitor, which excludes anyone else from doing a lot of things 
> inside the kernel, including unmaps.

  Hmm... Your explanation is somewhat different from what Andi Kleen wrote.
  I don't see mmap_sem used in drivers (only in char/mem).
  If I'm mistaken, sorry; I will dig into the kernel and investigate this.


arkadi.
-- 
Just arms curvature radius.

* Re: IO mappings; verify_area() on SMP
  1999-11-08 20:50 ` Andi Kleen
@ 1999-11-09  9:27   ` Arkadi E. Shishlov
  1999-11-09 10:40     ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Arkadi E. Shishlov @ 1999-11-09  9:27 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm

On Mon, Nov 08, 1999 at 09:50:35PM +0100, Andi Kleen wrote:
> On Mon, Nov 08, 1999 at 12:43:25PM +0100, Arkadi E. Shishlov wrote:
> > 
> >   Second question is about verify_area() safety. Many drivers contain
> >   following sequence:
> > 
> >   if ((ret = verify_area(VERIFY_WRITE, buffer, count)))
> > 	    return ret;
> >   ...
> >   copy_to_user(buffer, driver_data_buf, count);
> > 
> >   Even protected by cli()/sti() pairs, why multithreaded program on
> >   SMP machine can't unmap this verified buffer between calls to
> >   verify_area() and copy_to_user()? Of course it can't be true, but
> >   maybe somebody can write two-three words about reason that prevent
> >   this situation.
> 
> The verify_area is unnecessary in 2.2. The correct way to do it is:
> 
> 	if (copy_to_user(buffer, driver_data_buf, count))
> 		return -EFAULT;
> 
> The above sequence is because a lot of drivers were incorrectly converted
> from the 2.0 verify_area/memcpy_to_fs method to the 2.2 method. copy_from_user
> avoids the race you're describing (see Documentation/exception.txt). 

  Yes, I already read it. But... there are cases where verify_area() is
  essential. To do copy_to_user() the driver needs the actual data to give to
  the user. To get this data, the driver walks through its internal structures
  and copies data to a buffer, then calls copy_to_user(). With verify_area()
  it was easy to clean up the internal structures (packet is read, forget
  about it) while filling this buffer. With copy_to_user() there are
  two walk-throughs: first fill the buffer, then, if copy_to_user() succeeds,
  alter the driver structures.
  I can even imagine a situation where the driver becomes over-complicated only
  because it gets data from hardware and copy_to_user() fails, so the driver
  needs to maintain an additional buffer to hold this data. But that is a rare
  case. I understand this decision and agree. I will rewrite my driver slightly.


  I looked at the verify_area() function. On the i386 architecture it reduces to:

#define __range_ok(addr,size) ({ \
	unsigned long flag,sum; \
	asm("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0" \
		:"=&r" (flag), "=r" (sum) \
		:"1" (addr),"g" (size),"g" (current->addr_limit.seg)); \
	flag; })

  I don't understand this magic code, but it looks somewhat different from
  copy_to_user() with all its .fixup sections. Why not create a function named
  memset_to_user()? It would do the work of verify_area() and would be quite
  cheap.
  I found the clear_user() function in arch/i386/lib/usercopy.c:

unsigned long
clear_user(void *to, unsigned long n)
{
	if (access_ok(VERIFY_WRITE, to, n))
		__do_clear_user(to, n);
	return n;
}

  Why is it not a macro, and why does it call access_ok()?

> verify_area() is a backwards compatibility wrapper around access_ok()
> which only does a security check for kernel mode addresses, it is done
> by copy_*_user too.  The real mapping check is done by the MMU by
> handling the exception.
> 
> Some early 386 don't check properly for page write protection when the CPU
> is in supervisor mode. In this case verify_area does a full walk of the
> page tables to avoid security problems. Unfortunately there is still a race
> with programs that use clone() (does not even need SMP), because when the
> user access sleeps in a page fault another thread can unmap the mapping
> inbetween and cause a kernel crash. Fortunately this only applies to some
> very early 386 steppings, later CPUs don't have this problem (and AFAIK
> no non x86 port except possibly uclinux)
> 
> Hope this helps,

  Yes. Thank you.


arkadi.
-- 
Just arms curvature radius.

* Re: IO mappings; verify_area() on SMP
  1999-11-09  9:27   ` Arkadi E. Shishlov
@ 1999-11-09 10:40     ` Andi Kleen
  0 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 1999-11-09 10:40 UTC (permalink / raw)
  To: Arkadi E. Shishlov; +Cc: Andi Kleen, linux-mm

On Tue, Nov 09, 1999 at 10:27:32AM +0100, Arkadi E. Shishlov wrote:
> On Mon, Nov 08, 1999 at 09:50:35PM +0100, Andi Kleen wrote:
> > On Mon, Nov 08, 1999 at 12:43:25PM +0100, Arkadi E. Shishlov wrote:
> > > 
> > >   Second question is about verify_area() safety. Many drivers contain
> > >   following sequence:
> > > 
> > >   if ((ret = verify_area(VERIFY_WRITE, buffer, count)))
> > > 	    return ret;
> > >   ...
> > >   copy_to_user(buffer, driver_data_buf, count);
> > > 
> > >   Even protected by cli()/sti() pairs, why multithreaded program on
> > >   SMP machine can't unmap this verified buffer between calls to
> > >   verify_area() and copy_to_user()? Of course it can't be true, but
> > >   maybe somebody can write two-three words about reason that prevent
> > >   this situation.
> > 
> > The verify_area is unnecessary in 2.2. The correct way to do it is:
> > 
> > 	if (copy_to_user(buffer, driver_data_buf, count))
> > 		return -EFAULT;
> > 
> > The above sequence is because a lot of drivers were incorrectly converted
> > from the 2.0 verify_area/memcpy_to_fs method to the 2.2 method. copy_from_user
> > avoids the race you're describing (see Documentation/exception.txt). 
> 
>   Yes. I already read it. But... There is cases where verify_area() is
>   essential. To do copy_to_user() driver need actual data to put to user.
>   To get this data, driver walk through it internal structures and copy
>   data to buffer, then call copy_to_user(). In case of verify_area()
>   it was easy to do internal structures clean-up (packet is read - forget
>   about it) while filling this buffer. In case of copy_to_user() there is
>   two walk-through - first fill buffer, second - if copy_to_user() succeeds,
>   alter driver structures.
>   I can even imagine situation, when driver will be over-complicated, only
>   because it get data from hardware and copy_to_user() fails - driver need
>   to maintain additional buffer to hold this data. But it is rare case.
>   I understand this decision and agree. Will rewrite my driver slightly.

There is no alternative. *_user can sleep, and another thread can unmap
while it is sleeping. So the check has to be done in *_user by the MMU.
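
One way to keep the driver simple under this rule is to copy first and
only consume the packet once the copy has succeeded. The sketch below is a
userspace model of that ordering. model_copy_to_user(), struct pkt_queue
and pkt_read() are hypothetical, and a NULL destination stands in for an
unmapped user buffer; like the real copy_to_user(), the stand-in returns
the number of bytes it could not copy.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for copy_to_user(): returns the number of bytes
 * that could NOT be copied (0 on success). A NULL destination plays
 * the role of a faulting user address. */
static inline unsigned long model_copy_to_user(void *to, const void *from,
					       unsigned long n)
{
	if (to == NULL)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Hypothetical single-slot packet queue. */
struct pkt_queue {
	char data[64];
	size_t len;	/* 0 means empty */
};

/* Read one packet: copy first, and drop the packet from the queue only
 * after the copy is known to have succeeded. On a fault the packet
 * stays queued and the caller gets -EFAULT. */
static inline long pkt_read(struct pkt_queue *q, void *ubuf, size_t count)
{
	size_t n = q->len < count ? q->len : count;

	if (model_copy_to_user(ubuf, q->data, n))
		return -EFAULT;
	q->len = 0;	/* commit: the packet has been delivered */
	return (long)n;
}
```

With this ordering a failed copy costs nothing extra: the data is still
in the driver's own structure, so no additional holding buffer is needed.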
 
> 
> 
>   I look at verify_area() function. On i386 architecture it reduces to:
> 
> #define __range_ok(addr,size) ({ \
> 	unsigned long flag,sum; \
> 	asm("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0" \
> 		:"=&r" (flag), "=r" (sum) \
> 		:"1" (addr),"g" (size),"g" (current->addr_limit.seg)); \
> 	flag; })
> 
>   I don't understand this magic code, but it looks somewhat different from
>   copy_to_user() with all it .fixup's. Why not to create function named
>   memset_to_user() - it will do the work of verify_area() and will be quite
>   cheap.

I don't understand. What __range_ok basically does is check whether the
address is part of the address space reserved for the user. If it didn't
do that, the user could specify a kernel address and access internal
kernel structures, leading to a security leak. This is a bit complicated
because the kernel sometimes wants to do IO to/from internal buffers (e.g.
for NFS), so the idea of kernel and user memory can be switched (with
set_fs(KERNEL_DS), which sets current->addr_limit). The assembly magic
above is just a fancy jumpless way to implement this check. It has nothing
to do with memset.
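
For readers who find the inline assembly opaque, here is a plain C
rendering of what the four instructions compute, written as a userspace
sketch. model_range_flag() and model_access_ok() are made-up names, and
the two limit constants are assumptions about the usual i386 values
(0xC0000000 for user space, 0xFFFFFFFF after set_fs(KERNEL_DS)). As in the
macro, a flag of 0 means the range is acceptable.

```c
#include <stdint.h>

/* Assumed i386 address limits. */
#define MODEL_USER_DS   0xC0000000u
#define MODEL_KERNEL_DS 0xFFFFFFFFu

/* Same arithmetic as the asm in __range_ok:
 *   addl  size, sum    sum = addr + size, carry set on wraparound
 *   sbbl  flag, flag   flag = -carry (0 or 0xFFFFFFFF)
 *   cmpl  sum, limit   carry set if limit < sum
 *   sbbl  $0, flag     flag -= carry
 * The result is 0 only if the sum did not wrap and sum <= limit. */
static inline uint32_t model_range_flag(uint32_t addr, uint32_t size,
					uint32_t limit)
{
	uint32_t sum = addr + size;
	uint32_t flag = (sum < addr) ? 0xFFFFFFFFu : 0u;

	flag -= (uint32_t)(limit < sum);
	return flag;
}

/* Convenience wrapper: nonzero if the whole range lies below the limit. */
static inline int model_access_ok(uint32_t addr, uint32_t size, uint32_t limit)
{
	return model_range_flag(addr, size, limit) == 0;
}
```

Under these assumptions a kernel address such as 0xC0000000 is rejected
with the user limit but accepted once the limit has been raised to
MODEL_KERNEL_DS, which is the switch described above.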


>   I found clear_user() function in arch/i386/lib/usercopy.c:
> 
> unsigned long
> clear_user(void *to, unsigned long n)
> {
> 	if (access_ok(VERIFY_WRITE, to, n))
> 		__do_clear_user(to, n);
> 	return n;
> }
> 
>   Why it is not macro and why it call access_ok()?

See above.

-Andi
