linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] dirty bit clearing on s390.
@ 2003-05-26  6:24 Martin Schwidefsky
  0 siblings, 0 replies; 8+ messages in thread
From: Martin Schwidefsky @ 2003-05-26  6:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, phillips

> Having thought long and hard about this, yes, I don't really see anything
> saner than just hooking into SetPageUptodate as you have done.
>
> Just to be sure that I understand the issues here I'll cook up a new
> changelog for this and run it by you, then submit it.

Good, so we haven't been too far off with our approach. Thanks for the
review.

blue skies,
   Martin


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dirty bit clearing on s390.
  2003-05-22 11:20 Martin Schwidefsky
  2003-05-22 11:40 ` Christoph Hellwig
  2003-05-22 11:42 ` Arjan van de Ven
@ 2003-05-23 22:52 ` Andrew Morton
  2 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2003-05-23 22:52 UTC (permalink / raw)
  To: Martin Schwidefsky; +Cc: linux-mm, phillips

Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
>
> I'd like to propose a small change in the common memory management that
> would enable s390 to get its dirty bits finally right. The change is a
> architecture hook in SetPageUptodate.

Having thought long and hard about this, yes, I don't really see anything
saner than just hooking into SetPageUptodate as you have done.

Just to be sure that I understand the issues here I'll cook up a new
changelog for this and run it by you, then submit it.

Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dirty bit clearing on s390.
  2003-05-22 14:35   ` Rik van Riel
@ 2003-05-22 15:18     ` Daniel Phillips
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel Phillips @ 2003-05-22 15:18 UTC (permalink / raw)
  To: Rik van Riel, Arjan van de Ven; +Cc: Martin Schwidefsky, linux-mm, akpm

On Thursday 22 May 2003 16:35, Rik van Riel wrote:
> On 22 May 2003, Arjan van de Ven wrote:
> > On Thu, 2003-05-22 at 13:20, Martin Schwidefsky wrote:
> > > Our solution is to move the clearing of the storage key (dirty bit)
> > > from set_pte to SetPageUptodate. A patch that implements this is
> > > attached. What do you think ?
> >
> > Is there anything that prevents a thread mmaping the page to redirty it
> > before the kernel marks it uptodate ?
>
> Nobody will mmap a page before PG_uptodate has been set.

Thanks, Rik.  This needs to go in as a comment.

Regards,

Daniel
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dirty bit clearing on s390.
  2003-05-22 11:42 ` Arjan van de Ven
  2003-05-22 14:21   ` Daniel Phillips
@ 2003-05-22 14:35   ` Rik van Riel
  2003-05-22 15:18     ` Daniel Phillips
  1 sibling, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2003-05-22 14:35 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Martin Schwidefsky, linux-mm, akpm, phillips

On 22 May 2003, Arjan van de Ven wrote:
> On Thu, 2003-05-22 at 13:20, Martin Schwidefsky wrote:
> 
> > Our solution is to move the clearing of the storage key (dirty bit)
> > from set_pte to SetPageUptodate. A patch that implements this is
> > attached. What do you think ?
> 
> Is there anything that prevents a thread mmaping the page to redirty it
> before the kernel marks it uptodate ? 

Nobody will mmap a page before PG_uptodate has been set.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dirty bit clearing on s390.
  2003-05-22 11:42 ` Arjan van de Ven
@ 2003-05-22 14:21   ` Daniel Phillips
  2003-05-22 14:35   ` Rik van Riel
  1 sibling, 0 replies; 8+ messages in thread
From: Daniel Phillips @ 2003-05-22 14:21 UTC (permalink / raw)
  To: arjanv, Martin Schwidefsky; +Cc: linux-mm, akpm

On Thursday 22 May 2003 13:42, Arjan van de Ven wrote:
> On Thu, 2003-05-22 at 13:20, Martin Schwidefsky wrote:
> > Our solution is to move the clearing of the storage key (dirty bit)
> > from set_pte to SetPageUptodate. A patch that implements this is
> > attached. What do you think ?
>
> Is there anything that prevents a thread mmaping the page to redirty it
> before the kernel marks it uptodate ?

The storage key is only supposed to be cleared the first time a page is 
entered into any process page table.  The theory is that the s390 hook in 
SetPageUptodate can figure that out reliably (this theory needs to be 
examined closely).  If it can know that, then it also knows no other page 
table is mapping the page, so no dirty events can get lost.

Regards,

Daniel
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dirty bit clearing on s390.
  2003-05-22 11:20 Martin Schwidefsky
  2003-05-22 11:40 ` Christoph Hellwig
@ 2003-05-22 11:42 ` Arjan van de Ven
  2003-05-22 14:21   ` Daniel Phillips
  2003-05-22 14:35   ` Rik van Riel
  2003-05-23 22:52 ` Andrew Morton
  2 siblings, 2 replies; 8+ messages in thread
From: Arjan van de Ven @ 2003-05-22 11:42 UTC (permalink / raw)
  To: Martin Schwidefsky; +Cc: linux-mm, akpm, phillips

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

On Thu, 2003-05-22 at 13:20, Martin Schwidefsky wrote:


> Our solution is to move the clearing of the storage key (dirty bit)
> from set_pte to SetPageUptodate. A patch that implements this is
> attached. What do you think ?

Is there anything that prevents a thread mmaping the page to redirty it
before the kernel marks it uptodate ? 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dirty bit clearing on s390.
  2003-05-22 11:20 Martin Schwidefsky
@ 2003-05-22 11:40 ` Christoph Hellwig
  2003-05-22 11:42 ` Arjan van de Ven
  2003-05-23 22:52 ` Andrew Morton
  2 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2003-05-22 11:40 UTC (permalink / raw)
  To: Martin Schwidefsky; +Cc: linux-mm, akpm, phillips

On Thu, May 22, 2003 at 01:20:00PM +0200, Martin Schwidefsky wrote:
> +#ifndef arch_set_page_uptodate
> +#define arch_set_page_uptodate(page)
> +#endif
> +
>  #define PageUptodate(page)	test_bit(PG_uptodate, &(page)->flags)
> -#define SetPageUptodate(page)	set_bit(PG_uptodate, &(page)->flags)
> +#define SetPageUptodate(page) \
> +	do {								\
> +		arch_set_page_uptodate(page);				\
> +		set_bit(PG_uptodate, &(page)->flags);			\
> +	} while (0)
>  #define ClearPageUptodate(page)	clear_bit(PG_uptodate, &(page)->flags)

I guess it would be nicer if the arch could just overrid SetPageUptodate
completly e.g.

#ifndef SetPageUptodate
#define SetPageUptodate(page)	set_bit(PG_uptodate, &(page)->flags)
#endif

with a big comment explaining why s390 needs it.  Else it looks fine to me.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] dirty bit clearing on s390.
@ 2003-05-22 11:20 Martin Schwidefsky
  2003-05-22 11:40 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Martin Schwidefsky @ 2003-05-22 11:20 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, phillips

Hi,
I'd like to propose a small change in the common memory management that
would enable s390 to get its dirty bits finally right. The change is a
architecture hook in SetPageUptodate.

The problem I want to solve: s390 does not keept its dirty bits in the
page table entries like other architectures do but in the "storage key".
The storage key is a 8 bit value associated with each physical(!) page.
It is accessed with some special instructions (iske, ivsk, sske, rrbe).
The storage key contains the access-control bits, the fetch-protection
bit, the referenced bit and the change bit (=dirty bit). Linux only
uses the referenced and the change/dirty bit.
This means that each physical page always has a dirty bit. On i386 a
page is implicitly clean if no page table entry points to it (and
PG_dirty is 0). This is not true on s390. A new page without any pte
pointing to it usually starts off dirty because nobody has reset the
dirty bit in the storage key. Worse, a write access due to i/o will
set the dirty bit! We need to clear the dirty bit somewhere or we'll
end up writing every page back to the disk before it becomes clean.
Up to now s390 uses a special bit in the pte that is set in mk_pte
for the first user of a page and makes set_pte to clear the storage
key. The problem is that this is a race condition if two processes
want to access the same page simultaneously. Then the page count is
already > 1 in mk_pte and nobody will clear the storage key. It
doesn't lead to any data loss because what happens is that a clean
page is considered dirty and is written back to the disk. The worst
scenario is a read only disk where this results in i/o errors (but no
data loss). 
Our solution is to move the clearing of the storage key (dirty bit)
from set_pte to SetPageUptodate. A patch that implements this is
attached. What do you think ?

blue skies,
  Martin.
----

Move clearing of the dirty bit from mk_pte/set_pte to SetPageUptodate.

diffstat:
 include/asm-s390/pgtable.h |   33 ++++++++-------------------------
 include/linux/page-flags.h |   11 ++++++++++-
 2 files changed, 18 insertions(+), 26 deletions(-)

diff -urN linux-2.5.69/include/asm-s390/pgtable.h linux-2.5.69-s390/include/asm-s390/pgtable.h
--- linux-2.5.69/include/asm-s390/pgtable.h	Thu May 22 10:42:25 2003
+++ linux-2.5.69-s390/include/asm-s390/pgtable.h	Thu May 22 10:42:32 2003
@@ -212,8 +212,7 @@
 #define _PAGE_INVALID   0x400          /* HW invalid                       */
 
 /* Software bits in the page table entry */
-#define _PAGE_MKCLEAN   0x002
-#define _PAGE_ISCLEAN   0x004
+#define _PAGE_ISCLEAN   0x002
 
 /* Mask and four different kinds of invalid pages. */
 #define _PAGE_INVALID_MASK	0x601
@@ -320,15 +319,6 @@
  */
 extern inline void set_pte(pte_t *pteptr, pte_t pteval)
 {
-	if ((pte_val(pteval) & (_PAGE_MKCLEAN|_PAGE_INVALID))
-	    == _PAGE_MKCLEAN) 
-	{
-		pte_val(pteval) &= ~_PAGE_MKCLEAN;
-               
-		asm volatile ("sske %0,%1" 
-				: : "d" (0), "a" (pte_val(pteval)));
-	}
-
 	*pteptr = pteval;
 }
 
@@ -501,7 +491,7 @@
 	 * sske instruction is slow. It is faster to let the
 	 * next instruction set the dirty bit.
 	 */
-	pte_val(pte) &= ~(_PAGE_MKCLEAN | _PAGE_ISCLEAN);
+	pte_val(pte) &= ~ _PAGE_ISCLEAN;
 	return pte;
 }
 
@@ -582,30 +572,23 @@
 	pgprot_t __pgprot = (pgprot);					  \
 	unsigned long __physpage = __pa((__page-mem_map) << PAGE_SHIFT);  \
 	pte_t __pte = mk_pte_phys(__physpage, __pgprot);                  \
-	                                                                  \
-	if (!(pgprot_val(__pgprot) & _PAGE_ISCLEAN)) {			  \
-		int __users = !!PagePrivate(__page) + !!__page->mapping;  \
-		if (__users + page_count(__page) == 1)                    \
-			pte_val(__pte) |= _PAGE_MKCLEAN;                  \
-	}								  \
 	__pte;                                                            \
 })
 
 #define pfn_pte(pfn, pgprot)                                              \
 ({                                                                        \
-	struct page *__page = mem_map+(pfn);                              \
 	pgprot_t __pgprot = (pgprot);					  \
 	unsigned long __physpage = __pa((pfn) << PAGE_SHIFT);             \
 	pte_t __pte = mk_pte_phys(__physpage, __pgprot);                  \
-	                                                                  \
-	if (!(pgprot_val(__pgprot) & _PAGE_ISCLEAN)) {			  \
-		int __users = !!PagePrivate(__page) + !!__page->mapping;  \
-		if (__users + page_count(__page) == 1)                    \
-			pte_val(__pte) |= _PAGE_MKCLEAN;                  \
-	}								  \
 	__pte;                                                            \
 })
 
+#define arch_set_page_uptodate(__page)					  \
+	do {								  \
+		asm volatile ("sske %0,%1" : : "d" (0),			  \
+			      "a" (__pa((__page-mem_map) << PAGE_SHIFT)));\
+	} while (0)
+
 #ifdef __s390x__
 
 #define pfn_pmd(pfn, pgprot)                                              \
diff -urN linux-2.5.69/include/linux/page-flags.h linux-2.5.69-s390/include/linux/page-flags.h
--- linux-2.5.69/include/linux/page-flags.h	Mon May  5 01:53:35 2003
+++ linux-2.5.69-s390/include/linux/page-flags.h	Thu May 22 10:42:32 2003
@@ -7,6 +7,7 @@
 
 #include <linux/percpu.h>
 #include <linux/cache.h>
+#include <asm/pgtable.h>
 
 /*
  * Various page->flags bits:
@@ -158,8 +159,16 @@
 #define ClearPageReferenced(page)	clear_bit(PG_referenced, &(page)->flags)
 #define TestClearPageReferenced(page) test_and_clear_bit(PG_referenced, &(page)->flags)
 
+#ifndef arch_set_page_uptodate
+#define arch_set_page_uptodate(page)
+#endif
+
 #define PageUptodate(page)	test_bit(PG_uptodate, &(page)->flags)
-#define SetPageUptodate(page)	set_bit(PG_uptodate, &(page)->flags)
+#define SetPageUptodate(page) \
+	do {								\
+		arch_set_page_uptodate(page);				\
+		set_bit(PG_uptodate, &(page)->flags);			\
+	} while (0)
 #define ClearPageUptodate(page)	clear_bit(PG_uptodate, &(page)->flags)
 
 #define PageDirty(page)		test_bit(PG_dirty, &(page)->flags)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2003-05-26  6:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-05-26  6:24 [PATCH] dirty bit clearing on s390 Martin Schwidefsky
  -- strict thread matches above, loose matches on Subject: below --
2003-05-22 11:20 Martin Schwidefsky
2003-05-22 11:40 ` Christoph Hellwig
2003-05-22 11:42 ` Arjan van de Ven
2003-05-22 14:21   ` Daniel Phillips
2003-05-22 14:35   ` Rik van Riel
2003-05-22 15:18     ` Daniel Phillips
2003-05-23 22:52 ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox