* Meaning of the dirty bit
@ 2002-10-10 7:46 Martin Maletinsky
2002-10-10 8:49 ` Dharmender Rai
2002-10-10 11:40 ` Hugh Dickins
0 siblings, 2 replies; 9+ messages in thread
From: Martin Maletinsky @ 2002-10-10 7:46 UTC (permalink / raw)
To: kernelnewbies, linux-mm
Hi,
While studying the follow_page() function (the version in place since 2.4.4, i.e. the one with the write argument), I noticed that for an address that
is to be written to (i.e. write != 0), the function checks not only the writable flag (with pte_write()) but also the dirty flag (with pte_dirty()) of the page
containing this address.
From what I understand of general paging theory, the dirty flag of a page is set when its content in physical memory differs from its backing copy on permanent
storage (file or swap space). Given that, I do not understand why the dirty flag has to be checked in order to ensure that a page is writable
- what am I missing here?
Thanks in advance for any answers
with best regards
Martin Maletinsky
P.S. Pls. put me on cc: in your reply, since I am not on the mailing list.
--
Supercomputing System AG email: maletinsky@scs.ch
Martin Maletinsky phone: +41 (0)1 445 16 05
Technoparkstrasse 1 fax: +41 (0)1 445 16 10
CH-8005 Zurich
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
* Re: Meaning of the dirty bit
2002-10-10 7:46 Meaning of the dirty bit Martin Maletinsky
@ 2002-10-10 8:49 ` Dharmender Rai
2002-10-10 8:57 ` Martin Maletinsky
2002-10-10 11:40 ` Hugh Dickins
1 sibling, 1 reply; 9+ messages in thread
From: Dharmender Rai @ 2002-10-10 8:49 UTC (permalink / raw)
To: Martin Maletinsky, kernelnewbies, linux-mm
Hi,
The purpose is to achieve need-based disk I/O.
A set dirty flag means you have to write the contents of
that page to disk before paging out or invalidating that page.
If the dirty flag is not set, there is no need for the I/O part.
Regards
Dharmender Rai
* Re: Meaning of the dirty bit
2002-10-10 8:49 ` Dharmender Rai
@ 2002-10-10 8:57 ` Martin Maletinsky
2002-10-10 9:46 ` Dharmender Rai
0 siblings, 1 reply; 9+ messages in thread
From: Martin Maletinsky @ 2002-10-10 8:57 UTC (permalink / raw)
To: dharmenderr; +Cc: kernelnewbies, linux-mm
Hello,
Thanks for your reply. What is the reason to check the dirty bit in follow_page()? Presumably the function should just parse the page tables,
verify write access (if the write argument is set) and return the page descriptor describing the page the address is in. From what I
understood, there is no I/O involved.
Is there any reason to deny write access when the dirty flag is not set?
Thanks again,
regards
Martin
* Re: Meaning of the dirty bit
2002-10-10 8:57 ` Martin Maletinsky
@ 2002-10-10 9:46 ` Dharmender Rai
0 siblings, 0 replies; 9+ messages in thread
From: Dharmender Rai @ 2002-10-10 9:46 UTC (permalink / raw)
To: Martin Maletinsky; +Cc: kernelnewbies, linux-mm
Hi,
Read the //// commented parts in the following code mentioned by you:

/*
 * Do a quick page-table lookup for a single page.
 */
static struct page * follow_page(unsigned long address, int write)
{
	pgd_t *pgd;
	pmd_t *pmd;
	pte_t *ptep, pte;

	pgd = pgd_offset(current->mm, address);
	//// page directory entry is uninitialized or invalid
	if (pgd_none(*pgd) || pgd_bad(*pgd))
		goto out;

	pmd = pmd_offset(pgd, address);
	//// page middle directory entry is uninitialized or invalid
	if (pmd_none(*pmd) || pmd_bad(*pmd))
		goto out;

	ptep = pte_offset(pmd, address);
	if (!ptep)
		goto out;

	pte = *ptep;
	//// if the page table entry is valid
	if (pte_present(pte)) {
		if (!write ||
		    //// page is write-able and dirty
		    (pte_write(pte) && pte_dirty(pte)))
			return pte_page(pte);
	}

out:
	return 0;
}

The logic here is very simple. This function is used to detect one page.
A writable and dirty page is the most suitable one, as this page's content
has to be written out to disk. If you went for a read-only page, you would
be interrupting the processes that might be reading from that page.
Regards,
Dharmender Rai
================================
Dharmender Rai,
Cybage Software Pvt. Ltd,
Kalyani Nagar, Pune -411006
Phone : 6686359 Extn : 261
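The final test in the quoted follow_page() can be tried out in user space. The following is only a minimal sketch under stated assumptions: the toy_* names, the flag values and the struct layout are hypothetical and are not the kernel's definitions. It shows just the rule under discussion, namely that a read lookup succeeds on any present entry while a write lookup also requires the writable and the dirty flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for a toy pte; real values are architecture-specific. */
enum {
    TOY_PTE_PRESENT = 1 << 0,
    TOY_PTE_WRITE   = 1 << 1,
    TOY_PTE_DIRTY   = 1 << 2
};

/* Stand-in for struct page; only its identity matters here. */
struct toy_page {
    int frame;
};

/* Stand-in for a page table entry: flag bits plus the page it maps. */
struct toy_pte {
    unsigned flags;
    struct toy_page *page;
};

/*
 * Mirrors the last test in the quoted follow_page(): a present entry
 * satisfies a read lookup unconditionally, but satisfies a write lookup
 * only if it is both writable and already dirty.
 */
static struct toy_page *toy_follow_page(const struct toy_pte *pte, bool write)
{
    assert(pte != NULL);
    if (!(pte->flags & TOY_PTE_PRESENT))
        return NULL;
    if (!write)
        return pte->page;
    if ((pte->flags & TOY_PTE_WRITE) && (pte->flags & TOY_PTE_DIRTY))
        return pte->page;
    return NULL;
}
```

Calling toy_follow_page() with write set on an entry that is present and writable but clean returns NULL, which is exactly the case the original question asks about.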
* Re: Meaning of the dirty bit
2002-10-10 7:46 Meaning of the dirty bit Martin Maletinsky
2002-10-10 8:49 ` Dharmender Rai
@ 2002-10-10 11:40 ` Hugh Dickins
2002-10-10 11:55 ` William Lee Irwin III
` (2 more replies)
1 sibling, 3 replies; 9+ messages in thread
From: Hugh Dickins @ 2002-10-10 11:40 UTC (permalink / raw)
To: Martin Maletinsky; +Cc: Stephen Tweedie, kernelnewbies, linux-mm
On Thu, 10 Oct 2002, Martin Maletinsky wrote:
>
> While studying the follow_page() function (the version of the function
> that is in place since 2.4.4, i.e. with the write argument), I noticed,
> that for an address that should be written to (i.e. write != 0), the
> function checks not only the writeable flag (with pte_write()), but also
> the dirty flag (with pte_dirty()) of the page containing this address.
> From what I thought to understand from general paging theory, the dirty
> flag of a page is set, when its content in physical memory differs from
> its backing on the permanent storage system (file or swap space). Based
> on this understanding I do not understand why it is necessary to check
> the dirty flag, in order to ensure that a page is writable
> - what am I missing here?

Good question (and I don't see the answer in Dharmender's replies).
I expect Stephen can give the definitive answer, but here's my guess.

follow_page() was introduced for kiobufs, so despite its general name,
it's doing what map_user_kiobuf() needed (or thought it needed).

Originally (pre-2.4.4), as you've noticed, there was no write argument
to follow_page, and map_user_kiobuf made one call to handle_mm_fault
per page. Experience with races under memory pressure will have shown
that to be inadequate, it needed to loop until it could hold down the
page, with the writable bit in the pte guaranteeing it good to write to.

But why dirty too, you ask? I think, because writing to page via kiobuf
happens directly, not via pte, so the pte dirty bit would not be set
that way; but if it's not set, then the modification to the page may
be lost later. Hence map_user_kiobuf used handle_mm_fault to set
that dirty bit too, and used follow_page to check that it is set.

Except that's racy too, and so mark_dirty_kiobuf() was added to
SetPageDirty on the pages after kio done, before unmapping the kiobuf.
mark_dirty_kiobuf appeared in the main kernel tree at the same time
as the pte_dirty test in follow_page, but I'm guessing the pte_dirty
test was an earlier failed attempt to solve the problems fixed by
mark_dirty_kiobuf, which got left in place (and also helped a bit
if kiobuf users weren't updated to call mark_dirty_kiobuf).

Apologies in advance if my guesses are wild.

Hugh
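The "modification may be lost later" point can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: the toy_* names are hypothetical, one boolean stands in for both the pte dirty bit and the page dirty state, and eviction is reduced to "write back only if dirty, then reload from backing store".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Toy model: one in-memory page, its backing store, and a single dirty flag. */
struct toy_mapping {
    char memory[TOY_PAGE_SIZE];   /* page contents in RAM */
    char backing[TOY_PAGE_SIZE];  /* copy on swap or in the file */
    bool dirty;                   /* stands in for the pte/page dirty state */
};

/* A store through the page tables: the dirty bit is set as a side effect. */
static void toy_write_via_pte(struct toy_mapping *m, const char *data)
{
    assert(strlen(data) < TOY_PAGE_SIZE);
    strcpy(m->memory, data);
    m->dirty = true;
}

/* A direct write to the physical page, bypassing the pte: no dirty bit is set. */
static void toy_write_direct(struct toy_mapping *m, const char *data)
{
    assert(strlen(data) < TOY_PAGE_SIZE);
    strcpy(m->memory, data);
}

/* Eviction followed by a later fault: only dirty pages are written back first. */
static void toy_evict_and_refault(struct toy_mapping *m)
{
    if (m->dirty)
        memcpy(m->backing, m->memory, TOY_PAGE_SIZE);
    m->dirty = false;
    memcpy(m->memory, m->backing, TOY_PAGE_SIZE);  /* reload from backing store */
}
```

In this model, data stored with toy_write_direct() disappears after toy_evict_and_refault() unless the dirty flag is set explicitly beforehand, which is the role the message assigns to mark_dirty_kiobuf().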
* Re: Meaning of the dirty bit
2002-10-10 11:40 ` Hugh Dickins
@ 2002-10-10 11:55 ` William Lee Irwin III
2002-10-10 13:40 ` Hugh Dickins
2002-10-10 12:11 ` Martin Maletinsky
2002-10-10 13:11 ` Dharmender Rai
2 siblings, 1 reply; 9+ messages in thread
From: William Lee Irwin III @ 2002-10-10 11:55 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Martin Maletinsky, Stephen Tweedie, kernelnewbies, linux-mm
On Thu, Oct 10, 2002 at 12:40:08PM +0100, Hugh Dickins wrote:
> Originally (pre-2.4.4), as you've noticed, there was no write argument
> to follow_page, and map_user_kiobuf made one call to handle_mm_fault
> per page. Experience with races under memory pressure will have shown
> that to be inadequate, it needed to loop until it could hold down the
> page, with the writable bit in the pte guaranteeing it good to write to.

Could you explain what race occurred?

On Thu, Oct 10, 2002 at 12:40:08PM +0100, Hugh Dickins wrote:
> But why dirty too, you ask? I think, because writing to page via kiobuf
> happens directly, not via pte, so the pte dirty bit would not be set
> that way; but if it's not set, then the modification to the page may
> be lost later. Hence map_user_kiobuf used handle_mm_fault to set
> that dirty bit too, and used follow_page to check that it is set.

Some of the mechanics of how the PTE dirty bit relate to the software
notion of a page being dirty are escaping me here. How does follow_page()
enter the equation? The PTE's of other processes cannot be resolved this
way so it does not seem clear to me at all that follow_page() taking an
extra argument can actually get something useful done here.

On Thu, Oct 10, 2002 at 12:40:08PM +0100, Hugh Dickins wrote:
> Except that's racy too, and so mark_dirty_kiobuf() was added to
> SetPageDirty on the pages after kio done, before unmapping the kiobuf.
> mark_dirty_kiobuf appeared in the main kernel tree at the same time
> as the pte_dirty test in follow_page, but I'm guessing the pte_dirty
> test was an earlier failed attempt to solve the problems fixed by
> mark_dirty_kiobuf, which got left in place (and also helped a bit
> if kiobuf users weren't updated to call mark_dirty_kiobuf).
> Apologies in advance if my guesses are wild.

Hrm, I'm going to have to dig up a tree with kiobuf stuff in it, I've
largely ignored that path for various reasons.

Bill
* Re: Meaning of the dirty bit
2002-10-10 11:55 ` William Lee Irwin III
@ 2002-10-10 13:40 ` Hugh Dickins
0 siblings, 0 replies; 9+ messages in thread
From: Hugh Dickins @ 2002-10-10 13:40 UTC (permalink / raw)
To: William Lee Irwin III
Cc: Martin Maletinsky, Stephen Tweedie, kernelnewbies, linux-mm
On Thu, 10 Oct 2002, William Lee Irwin III wrote:
> On Thu, Oct 10, 2002 at 12:40:08PM +0100, Hugh Dickins wrote:
> > Originally (pre-2.4.4), as you've noticed, there was no write argument
> > to follow_page, and map_user_kiobuf made one call to handle_mm_fault
> > per page. Experience with races under memory pressure will have shown
> > that to be inadequate, it needed to loop until it could hold down the
> > page, with the writable bit in the pte guaranteeing it good to write to.
>
> Could you explain what race occurred?

In the 2.4.3 version, handle_mm_fault would fault the page in, writable
and dirty, if not already; but try_to_swap_out might intervene, just
before map_user_kiobuf immediately after takes the page_table_lock and
does follow_page, clearing the page table entry just verified.

And there might even be a read fault coming in too (from another thread),
bringing back the page table entry but without its dirty bit. Er, no,
scrub that: we have down_write on mmap_sem, keeping out such a fault.

(But I wasn't involved, just noticed when the looping was added and was
unsurprised since it had looked unsafe to me before. Perhaps the race
which actually occurred was something else I've not thought of.)

> On Thu, Oct 10, 2002 at 12:40:08PM +0100, Hugh Dickins wrote:
> > But why dirty too, you ask? I think, because writing to page via kiobuf
> > happens directly, not via pte, so the pte dirty bit would not be set
> > that way; but if it's not set, then the modification to the page may
> > be lost later. Hence map_user_kiobuf used handle_mm_fault to set
> > that dirty bit too, and used follow_page to check that it is set.
>
> Some of the mechanics of how the PTE dirty bit relate to the software
> notion of a page being dirty are escaping me here. How does follow_page()
> enter the equation? The PTE's of other processes cannot be resolved this
> way so it does not seem clear to me at all that follow_page() taking an
> extra argument can actually get something useful done here.

I don't entirely understand you here. follow_page verifies the pte,
while holding page_table_lock, prior to bumping page reference count:
page_table_lock necessary to keep try_to_swap_out away, and of course
it cannot be held over call to handle_mm_fault.

The extra arg to follow_page does get something useful done, in the
2.4.4 tree where it's introduced along with the loop, since in that
loop the follow_page is done before the handle_mm_fault - so if the
writable dirty(?) pte already exists, no need to call handle_mm_fault
at all. get_user_pages still works this way.

> On Thu, Oct 10, 2002 at 12:40:08PM +0100, Hugh Dickins wrote:
> > Except that's racy too, and so mark_dirty_kiobuf() was added to
> > SetPageDirty on the pages after kio done, before unmapping the kiobuf.
> > mark_dirty_kiobuf appeared in the main kernel tree at the same time
> > as the pte_dirty test in follow_page, but I'm guessing the pte_dirty
> > test was an earlier failed attempt to solve the problems fixed by
> > mark_dirty_kiobuf, which got left in place (and also helped a bit
> > if kiobuf users weren't updated to call mark_dirty_kiobuf).
> > Apologies in advance if my guesses are wild.
>
> Hrm, I'm going to have to dig up a tree with kiobuf stuff in it, I've
> largely ignored that path for various reasons.

I believe akpm hopes to do away with kiobufs shortly; but I assume the
get_user_pages inheritor of this code will remain, and it is a different
kind of path which can easily catch us out.

Hugh
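The loop described in the message above (look up first, call the fault handler only when the lookup fails, then look up again) can be sketched in user space. This is a simplified model under stated assumptions: all toy_* names are hypothetical, the fault handler is assumed to leave the entry present, writable and dirty after a write fault, and locking and concurrent swap-out are left out entirely, so the loop never runs more than once here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page table entry holding only the three flags that matter here. */
struct toy_entry {
    bool present;
    bool writable;
    bool dirty;
};

/* Lookup only: a write request succeeds only on a writable and dirty entry. */
static bool toy_lookup(const struct toy_entry *e, bool write)
{
    if (!e->present)
        return false;
    return !write || (e->writable && e->dirty);
}

/* Stand-in for the fault handler; a write fault is assumed to set both flags. */
static void toy_fault(struct toy_entry *e, bool write)
{
    e->present = true;
    if (write) {
        e->writable = true;
        e->dirty = true;
    }
}

/*
 * Look up first and fault only when the lookup fails, repeating until the
 * lookup succeeds. Returns how many times the fault handler was called.
 */
static int toy_pin_page(struct toy_entry *e, bool write)
{
    int faults = 0;

    assert(e != NULL);
    while (!toy_lookup(e, write)) {
        toy_fault(e, write);
        faults++;
    }
    return faults;
}
```

For an entry that is already present, writable and dirty, toy_pin_page() returns 0, which corresponds to the case where no call to the fault handler is needed at all.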
* Re: Meaning of the dirty bit
2002-10-10 11:40 ` Hugh Dickins
2002-10-10 11:55 ` William Lee Irwin III
@ 2002-10-10 12:11 ` Martin Maletinsky
2002-10-10 13:11 ` Dharmender Rai
2 siblings, 0 replies; 9+ messages in thread
From: Martin Maletinsky @ 2002-10-10 12:11 UTC (permalink / raw)
Cc: Stephen Tweedie, kernelnewbies, linux-mm
Hi Hugh,
Thanks a lot for your answer.
Hugh Dickins wrote:
> But why dirty too, you ask? I think, because writing to page via kiobuf
> happens directly, not via pte, so the pte dirty bit would not be set
> that way; but if it's not set, then the modification to the page may
> be lost later. Hence map_user_kiobuf used handle_mm_fault to set
> that dirty bit too, and used follow_page to check that it is set.
>
> Apologies in advance if my guesses are wild.

Although you call it a 'wild guess', it sounds quite plausible to me. However, if the check of the dirty flag is basically there to ensure that
handle_mm_fault() did its job (to mark the pte dirty), wouldn't it make (more?) sense to have a pte_mkdirty() call in follow_page() that sets
the dirty bit (possibly/probably once again)?
thanks again
best regards
Martin Maletinsky
* Re: Meaning of the dirty bit
2002-10-10 11:40 ` Hugh Dickins
2002-10-10 11:55 ` William Lee Irwin III
2002-10-10 12:11 ` Martin Maletinsky
@ 2002-10-10 13:11 ` Dharmender Rai
2 siblings, 0 replies; 9+ messages in thread
From: Dharmender Rai @ 2002-10-10 13:11 UTC (permalink / raw)
To: Hugh Dickins, Martin Maletinsky; +Cc: Stephen Tweedie, kernelnewbies, linux-mm
Hugh,
Here is the link to know more about follow_page().
I had replied after reading it.
http://lwn.net/Articles/11483/
Regards
Dharmender Rai