linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Obtaining the kernel's PTEs
@ 2002-09-13 21:06 Scott Kaplan
  2002-09-13 22:10 ` William Lee Irwin III
  0 siblings, 1 reply; 4+ messages in thread
From: Scott Kaplan @ 2002-09-13 21:06 UTC (permalink / raw)
  To: linux-mm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Yet another question...

Assume that I'm not concerned with ZONE_HIGHMEM, and I have a struct page*
.  How would I obtain a pointer to the PTE that maps the corresponding 
virtual page in the kernel's address space to this given page?

In case you're wondering, ``Why does he want that?'':  I want to remove 
access permissions for pages, and I want to include the kernel in that 
denial of permission.  An example of where this matters is when you have a 
page cache page that was allocated by the VFS for read()/write() 
operations on a regular (non-mmaped) file.  Only the kernel has a mapping 
to that page, and I a trap to occur when the kernel tries to use that page.

Must I get the PGD, PMD, and then PTE?  Is there a function that will do 
this nicely for me so that I don't write redundant (and potentially buggy)
  code for this little task?

Scott
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (Darwin)
Comment: For info see http://www.gnupg.org

iD8DBQE9glN28eFdWQtoOmgRAof5AJ4tBOxrX6g74RiFezCQfrsooJjwLQCgq0V4
sH16r3mkat6WMtbqx9JcBbk=
=HSwE
-----END PGP SIGNATURE-----

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Obtaining the kernel's PTEs
  2002-09-13 21:06 Obtaining the kernel's PTEs Scott Kaplan
@ 2002-09-13 22:10 ` William Lee Irwin III
  2002-09-15 21:02   ` Scott Kaplan
  0 siblings, 1 reply; 4+ messages in thread
From: William Lee Irwin III @ 2002-09-13 22:10 UTC (permalink / raw)
  To: Scott Kaplan; +Cc: linux-mm

On Fri, Sep 13, 2002 at 05:06:59PM -0400, Scott Kaplan wrote:
> Yet another question...
> Assume that I'm not concerned with ZONE_HIGHMEM, and I have a struct page*
> .  How would I obtain a pointer to the PTE that maps the corresponding 
> virtual page in the kernel's address space to this given page?
> In case you're wondering, ``Why does he want that?'':  I want to remove 
> access permissions for pages, and I want to include the kernel in that 
> denial of permission.  An example of where this matters is when you have a 
> page cache page that was allocated by the VFS for read()/write() 
> operations on a regular (non-mmaped) file.  Only the kernel has a mapping 
> to that page, and I a trap to occur when the kernel tries to use that page.
> Must I get the PGD, PMD, and then PTE?  Is there a function that will do 
> this nicely for me so that I don't write redundant (and potentially buggy)
>  code for this little task?

Well, there are a couple of issues here.

(1) Pagetables are only meaningful to a couple of machines, most notably
	i386 and m68k. The rest is pretty much software TLB or inverted.
	So there's zero accounting of the direct-mapping within the kernel
	for some machines, not sure which since I've not gone about the
	task of hunting for the answer to "What does everyone do when
	they've taken a TLB miss on kernelspace?" My suspicion is TLB
	entries are generated on the fly for what is not bolted.

(2) The kernel is often mapped out using various tidbits of TLB magic not
	handled by user PTE manipulation routines. e.g. the G and PS bits
	on i386. i386 is even worse, as the PAT bit in a PTE has the same
	position as the PS bit in a PMD so a priori knowledge of mapping
	size is required. At the moment, hardware pagetable large pages
	such as i386's PTE's it keeps in PMD's are not understood by
	the core kernel to begin with...
	Also, since the kernel translations at least on i386 use the G
	bit which is basically "invalidate the TLB entry only when a
	specific page is targeted." This is also not a particularly
	friendly feature...

There is pain involved here.


Cheers,
Bill
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Obtaining the kernel's PTEs
  2002-09-13 22:10 ` William Lee Irwin III
@ 2002-09-15 21:02   ` Scott Kaplan
  2002-09-15 22:44     ` William Lee Irwin III
  0 siblings, 1 reply; 4+ messages in thread
From: Scott Kaplan @ 2002-09-15 21:02 UTC (permalink / raw)
  To: linux-mm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thanks for the responses...Here we go...

On Friday, September 13, 2002, at 05:19 PM, Martin J. Bligh wrote:

> On an ia32 machine, I believe there is no PTE - they're large pages for 
> ZONE_NORMAL.
> I don't know of any function to walk one, but you could look at 
> pagetable_init for something close to what you're doing?

A-ha.  Not surprising, and it does look like it wouldn't be hard to force 
the kernel to map its space using small pages.  Of course, that begs the 
question, ``How much overhead will be introduced by using small pages?''  
Increased space use for page table consumption and increased TLB misses 
_could_ be significant, but it depends on the reference patterns of the 
applications and the kernel itself.

On Friday, September 13, 2002, at 06:10 PM, William Lee Irwin III wrote:

> (1) Pagetables are only meaningful to a couple of machines, most notably
> 	i386 and m68k. The rest is pretty much software TLB or inverted.
> 	So there's zero accounting of the direct-mapping within the kernel
> 	for some machines, not sure which since I've not gone about the
> 	task of hunting for the answer to "What does everyone do when
> 	they've taken a TLB miss on kernelspace?" My suspicion is TLB
> 	entries are generated on the fly for what is not bolted.

First, the essentials (for me):  I just want to implement some 
kernel-level changes for experimental purposes.  It needs to run only on 
one platform.  So, if it just works on i386, that's fine for me.

Second, my curiosity:  I confess that I don't understand how a software 
TLB or inverted page table obviates the need for a virtual->physical 
mapping for the kernel.  Those are simply different mechanisms for 
supporting the mapping task.  While the mapping information may be stored 
and handled differently for other architectures, the kernel must have its 
address space mapped onto the physical address space.  Or am I completely 
misunderstanding you?

> (2) The kernel is often mapped out using various tidbits of TLB magic not
> 	handled by user PTE manipulation routines. e.g. the G and PS bits
> 	on i386. i386 is even worse, as the PAT bit in a PTE has the same
> 	position as the PS bit in a PMD so a priori knowledge of mapping
> 	size is required.

What is the ``PAT bit''?  Wait, doesn't the PS bit on the PGD entry tell 
you whether it points to a 4 MB page or whether the levels of indirection 
to 4 KB pages continues?  Again, this issue seems to be more one of 
curiosity for me, and not something essential to what I'm trying to do -- 
but I would like to know to what you're referring, because I'm having 
trouble understanding it.

> 	Also, since the kernel translations at least on i386 use the G
> 	bit which is basically "invalidate the TLB entry only when a
> 	specific page is targeted." This is also not a particularly
> 	friendly feature...

I don't think this feature is a problem for what I want to do.  I'm aiming 
to change the protection on individual pages, so changing the PTE and then 
invalidating that specific mapping in the TLB is exactly what I want.

Thanks to those who responded, as the information is quite helpful!
Scott
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (Darwin)
Comment: For info see http://www.gnupg.org

iD8DBQE9hPVM8eFdWQtoOmgRAu4HAJ42Pg40Ld+tXWizw2oHpzAyF0h5+ACglehx
1Yt4BCjdN5WC6qPKSjMj0Ys=
=OiP7
-----END PGP SIGNATURE-----

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Obtaining the kernel's PTEs
  2002-09-15 21:02   ` Scott Kaplan
@ 2002-09-15 22:44     ` William Lee Irwin III
  0 siblings, 0 replies; 4+ messages in thread
From: William Lee Irwin III @ 2002-09-15 22:44 UTC (permalink / raw)
  To: Scott Kaplan; +Cc: linux-mm

On Sun, Sep 15, 2002 at 05:02:01PM -0400, Scott Kaplan wrote:
> A-ha.  Not surprising, and it does look like it wouldn't be hard to force 
> the kernel to map its space using small pages.  Of course, that begs the 
> question, ``How much overhead will be introduced by using small pages?''  
> Increased space use for page table consumption and increased TLB misses 
> _could_ be significant, but it depends on the reference patterns of the 
> applications and the kernel itself.

Well, the worst case is losing around 8MB of ZONE_NORMAL to kernel PTE's
and some TLB thrashing overhead.


On Friday, September 13, 2002, at 06:10 PM, William Lee Irwin III wrote:
>> (1) Pagetables are only meaningful to a couple of machines, most notably
>>	i386 and m68k. The rest is pretty much software TLB or inverted.
>>	So there's zero accounting of the direct-mapping within the kernel
>>	for some machines, not sure which since I've not gone about the
>>	task of hunting for the answer to "What does everyone do when
>>	they've taken a TLB miss on kernelspace?" My suspicion is TLB
>>	entries are generated on the fly for what is not bolted.

On Sun, Sep 15, 2002 at 05:02:01PM -0400, Scott Kaplan wrote:
> First, the essentials (for me):  I just want to implement some 
> kernel-level changes for experimental purposes.  It needs to run only on 
> one platform.  So, if it just works on i386, that's fine for me.

Well, for that case one need only bear in mind the pagetable structures
are interpreted by hardware on i386, so utilizing PTE's to store
information may result in the installation of garbage translations or
confusing the swap handling code (which stores block addresses in PTE's).


On Sun, Sep 15, 2002 at 05:02:01PM -0400, Scott Kaplan wrote:
> Second, my curiosity:  I confess that I don't understand how a software 
> TLB or inverted page table obviates the need for a virtual->physical 
> mapping for the kernel.  Those are simply different mechanisms for 
> supporting the mapping task.  While the mapping information may be stored 
> and handled differently for other architectures, the kernel must have its 
> address space mapped onto the physical address space.  Or am I completely 
> misunderstanding you?

Linux' strategy with ZONE_NORMAL etc. uses a direct mapping, so virtual
to physical translation may be simply calculated via addition and
subtraction for kernel accesses to that region. The Linux pagetable
structures are relevant only for bookkeeping process translations and
dynamically configured kernel translations on such machines. I'm still
unaware if that's actually done by any of the software TLB architectures.


On Friday, September 13, 2002, at 06:10 PM, William Lee Irwin III wrote:
>> (2) The kernel is often mapped out using various tidbits of TLB magic not
>>	handled by user PTE manipulation routines. e.g. the G and PS bits
>>	on i386. i386 is even worse, as the PAT bit in a PTE has the same
>>	position as the PS bit in a PMD so a priori knowledge of mapping
>>	size is required.

On Sun, Sep 15, 2002 at 05:02:01PM -0400, Scott Kaplan wrote:
> What is the ``PAT bit''?  Wait, doesn't the PS bit on the PGD entry tell 
> you whether it points to a 4 MB page or whether the levels of indirection 
> to 4 KB pages continues?  Again, this issue seems to be more one of 
> curiosity for me, and not something essential to what I'm trying to do -- 
> but I would like to know to what you're referring, because I'm having 
> trouble understanding it.

The PAT bit selects translation attributes that I believe have to do
with cacheing. The PS bit does exactly what you suspect.


On Friday, September 13, 2002, at 06:10 PM, William Lee Irwin III wrote:
>>	Also, since the kernel translations at least on i386 use the G
>>	bit which is basically "invalidate the TLB entry only when a
>>	specific page is targeted." This is also not a particularly
>>	friendly feature...

On Sun, Sep 15, 2002 at 05:02:01PM -0400, Scott Kaplan wrote:
> I don't think this feature is a problem for what I want to do.  I'm aiming 
> to change the protection on individual pages, so changing the PTE and then 
> invalidating that specific mapping in the TLB is exactly what I want.

You could probably get away with trapping in-kernel accesses so long as
do_page_fault() reinstates the translation after the access is trapped
and allocations aren't required to do so, and maybe doublecheck to be
sure page_fault() (the assembly fault handler installed in the IDT that
jumps to do_page_fault() etc.) won't shoot you for faulting on it. Also
beware of taking faults prior to the set_intr_gate() for page_fault()
in trap_init(), done right after parse_options() in start_kernel().


Bill
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2002-09-15 22:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-09-13 21:06 Obtaining the kernel's PTEs Scott Kaplan
2002-09-13 22:10 ` William Lee Irwin III
2002-09-15 21:02   ` Scott Kaplan
2002-09-15 22:44     ` William Lee Irwin III

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox