linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: Andrew Morton <akpm@osdl.org>
Cc: hugh@veritas.com, vrajesh@umich.edu,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC][PATCH 1/3] radix priority search tree - objrmap complexity fix
Date: Fri, 2 Apr 2004 04:00:22 +0200	[thread overview]
Message-ID: <20040402020022.GN18585@dualathlon.random> (raw)
In-Reply-To: <20040401173649.22f734cd.akpm@osdl.org>

On Thu, Apr 01, 2004 at 05:36:49PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea@suse.de> wrote:
> >
> > > it isn't doing anything useful for rw_swap_page_sync, just getting you
> > > into memory allocation difficulties.  No need for add_to_page_cache or
> > > add_to_swap_cache there at all.  As I say, I haven't tested this path,
> > 
> > I wouldn't need to call add_to_page_cache either, it's just Andrew
> > prefers it.
> 
> Well all of this is to avoid a fairly arbitrary BUG_ON in the radix-tree
> code.  If I hadn't added that, we'd all be happy.
> 
> The code is well-tested and has been thrashed to death in the userspace
> radix-tree test harness.
> (http://www.zip.com.au/~akpm/linux/patches/stuff/rtth.tar.gz).  Let's
> remove the BUG_ON.

the point of the BUG_ON is not to debug the radix tree, it's to debug
the callers.

I don't like to remove the BUG_ON... it's very cool to get a BUG when
somebody tag or delete entries that are never been present in the
radix-tree. That's been useful to me so far and I like to retain that
feature since I believe it's not hurting performance at all.

I now fixed up the whole compound thing, it made no sense to keep
compound off with HUGETLBSF=N, that's a generic setup for all order > 0
not just for hugetlbfs, so it has to be enabled always or never, or it's
just asking for troubles. I can very well imagine drivers touching
page[1]->mapping and crashing with HUGETLBFS=Y and working fine with
HUGETLBFS=N, for something that has nothing to do with hugetlbfs in the
first place. At the same time the CONFIG_MMU assuming HUGETLBFS=N and
setting the count for all the secondary pages as well looks wrong, the
MMU isn't related to the page[1-N].count setting at all, the layout we
return pages via alloc_pages must be the same w/ or w/o MMU.

So I cleaned up the code and also allowed drivers to alloc multipages
not compounded, so that they can do stuff like rw_swap_page_sync on
them (doing the refcount on MMU=y too). The thing works fine here and I
can finally suspend fine. The bug is definitely fixed now. I'm checking
this in CVS so the swapsuspend people will finally be able to suspend,
and and unless anybody has objections I'll merge it in next -aa.

--- x/include/linux/gfp.h.~1~	2003-08-31 02:38:26.000000000 +0200
+++ x/include/linux/gfp.h	2004-04-02 02:49:21.241968968 +0200
@@ -32,6 +32,7 @@
 #define __GFP_NOFAIL	0x800	/* Retry for ever.  Cannot fail */
 #define __GFP_NORETRY	0x1000	/* Do not retry.  Might fail */
 #define __GFP_NO_GROW	0x2000	/* Slab internal usage */
+#define __GFP_NO_COMP	0x4000	/* Return non compound pages if order > 0 */
 
 #define __GFP_BITS_SHIFT 16	/* Room for 16 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((1 << __GFP_BITS_SHIFT) - 1)
--- x/include/linux/mm.h.~1~	2004-04-01 18:32:55.000000000 +0200
+++ x/include/linux/mm.h	2004-04-02 03:39:40.884913464 +0200
@@ -445,8 +445,6 @@ struct page {
 
 extern void FASTCALL(__page_cache_release(struct page *));
 
-#ifdef CONFIG_HUGETLB_PAGE
-
 static inline int page_count(struct page *p)
 {
 	if (PageCompound(p))
@@ -478,23 +476,6 @@ static inline void put_page(struct page 
 		__page_cache_release(page);
 }
 
-#else		/* CONFIG_HUGETLB_PAGE */
-
-#define page_count(p)		atomic_read(&(p)->count)
-
-static inline void get_page(struct page *page)
-{
-	atomic_inc(&page->count);
-}
-
-static inline void put_page(struct page *page)
-{
-	if (!PageReserved(page) && put_page_testzero(page))
-		__page_cache_release(page);
-}
-
-#endif		/* CONFIG_HUGETLB_PAGE */
-
 /*
  * Multiple processes may "see" the same page. E.g. for untouched
  * mappings of /dev/null, all processes see the same page full of
--- x/kernel/power/pmdisk.c.~1~	2004-03-11 08:27:47.000000000 +0100
+++ x/kernel/power/pmdisk.c	2004-04-02 02:51:09.000000000 +0200
@@ -531,7 +531,7 @@ static void calc_order(void)
 static int alloc_pagedir(void)
 {
 	calc_order();
-	pagedir_save = (suspend_pagedir_t *)__get_free_pages(GFP_ATOMIC | __GFP_COLD, 
+	pagedir_save = (suspend_pagedir_t *)__get_free_pages(GFP_ATOMIC | __GFP_COLD | __GFP_NO_COMP, 
 							     pagedir_order);
 	if(!pagedir_save)
 		return -ENOMEM;
--- x/kernel/power/swsusp.c.~1~	2004-03-11 08:27:47.000000000 +0100
+++ x/kernel/power/swsusp.c	2004-04-02 03:03:03.327992896 +0200
@@ -442,7 +442,7 @@ static suspend_pagedir_t *create_suspend
 
 	pagedir_order = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages));
 
-	p = pagedir = (suspend_pagedir_t *)__get_free_pages(GFP_ATOMIC | __GFP_COLD, pagedir_order);
+	p = pagedir = (suspend_pagedir_t *)__get_free_pages(GFP_ATOMIC | __GFP_COLD | __GFP_NO_COMP, pagedir_order);
 	if(!pagedir)
 		return NULL;
 
--- x/mm/page_alloc.c.~1~	2004-04-01 18:32:54.000000000 +0200
+++ x/mm/page_alloc.c	2004-04-02 03:53:33.897276336 +0200
@@ -93,10 +93,6 @@ static void bad_page(const char *functio
 	page->mapcount = 0;
 }
 
-#ifndef CONFIG_HUGETLB_PAGE
-#define prep_compound_page(page, order) do { } while (0)
-#define destroy_compound_page(page, order) do { } while (0)
-#else
 /*
  * Higher-order pages are called "compound pages".  They are structured thusly:
  *
@@ -147,7 +143,6 @@ static void destroy_compound_page(struct
 		ClearPageCompound(p);
 	}
 }
-#endif		/* CONFIG_HUGETLB_PAGE */
 
 /*
  * Freeing function for a buddy system allocator.
@@ -178,7 +173,7 @@ static inline void __free_pages_bulk (st
 {
 	unsigned long page_idx, index;
 
-	if (order)
+	if (PageCompound(page))
 		destroy_compound_page(page, order);
 	page_idx = page - base;
 	if (page_idx & ~mask)
@@ -306,47 +301,37 @@ expand(struct zone *zone, struct page *p
 	return page;
 }
 
-static inline void set_page_refs(struct page *page, int order)
-{
-#ifdef CONFIG_MMU
-	set_page_count(page, 1);
-#else
-	int i;
-
-	/*
-	 * We need to reference all the pages for this order, otherwise if
-	 * anyone accesses one of the pages with (get/put) it will be freed.
-	 */
-	for (i = 0; i < (1 << order); i++)
-		set_page_count(page+i, 1);
-#endif /* CONFIG_MMU */
-}
-
 /*
  * This page is about to be returned from the page allocator
  */
-static void prep_new_page(struct page *page, int order)
+static void prep_new_page(struct page * _page, int order)
 {
-	if (page->mapping ||
-	    page->mapcount ||
-	    (page->flags & (
-			1 << PG_private	|
-			1 << PG_locked	|
-			1 << PG_lru	|
-			1 << PG_active	|
-			1 << PG_dirty	|
-			1 << PG_reclaim	|
-			1 << PG_anon	|
-			1 << PG_maplock	|
-			1 << PG_swapcache	|
-			1 << PG_writeback )))
-		bad_page(__FUNCTION__, page);
+	int i;
+
+	for (i = 0; i < (1 << order); i++) {
+		struct page * page = _page + i;
 
-	page->flags &= ~(1 << PG_uptodate | 1 << PG_error |
-			1 << PG_referenced | 1 << PG_arch_1 |
-			1 << PG_checked | 1 << PG_mappedtodisk);
-	page->private = 0;
-	set_page_refs(page, order);
+		if (page->mapping ||
+		    page->mapcount ||
+		    (page->flags & (
+				    1 << PG_private	|
+				    1 << PG_locked	|
+				    1 << PG_lru	|
+				    1 << PG_active	|
+				    1 << PG_dirty	|
+				    1 << PG_reclaim	|
+				    1 << PG_anon	|
+				    1 << PG_maplock	|
+				    1 << PG_swapcache	|
+				    1 << PG_writeback )))
+			bad_page(__FUNCTION__, page);
+
+		page->flags &= ~(1 << PG_uptodate | 1 << PG_error |
+				 1 << PG_referenced | 1 << PG_arch_1 |
+				 1 << PG_checked | 1 << PG_mappedtodisk);
+		page->private = 0;
+		set_page_count(page, 1);
+	}
 }
 
 /* 
@@ -498,10 +483,11 @@ void fastcall free_cold_page(struct page
  * or two.
  */
 
-static struct page *buffered_rmqueue(struct zone *zone, int order, int cold)
+static struct page *buffered_rmqueue(struct zone *zone, int order, int cold_compound)
 {
 	unsigned long flags;
 	struct page *page = NULL;
+	int cold = !!(cold_compound & __GFP_COLD);
 
 	if (order == 0) {
 		struct per_cpu_pages *pcp;
@@ -530,7 +516,7 @@ static struct page *buffered_rmqueue(str
 		BUG_ON(bad_range(zone, page));
 		mod_page_state_zone(zone, pgalloc, 1 << order);
 		prep_new_page(page, order);
-		if (order)
+		if (unlikely(order) && !(cold_compound & __GFP_NO_COMP))
 			prep_compound_page(page, order);
 	}
 	return page;
@@ -570,7 +556,9 @@ __alloc_pages(unsigned int gfp_mask, uns
 
 	cold = 0;
 	if (gfp_mask & __GFP_COLD)
-		cold = 1;
+		cold = __GFP_COLD;
+	if (gfp_mask & __GFP_NO_COMP)
+		cold |= __GFP_NO_COMP;
 
 	zones = zonelist->zones;  /* the list of zones suitable for gfp_mask */
 	if (zones[0] == NULL)     /* no zones in the zonelist */
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

  reply	other threads:[~2004-04-02  2:00 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <Pine.LNX.4.44.0403150527400.28579-100000@localhost.localdomain>
2004-03-21 22:10 ` Rajesh Venkatasubramanian
2004-03-21 22:12   ` [RFC][PATCH 2/3] Dave & Hugh's objrmap patch Rajesh Venkatasubramanian
2004-03-21 22:13   ` [RFC][PATCH 3/3] Covert objrmap to use prio_tree Rajesh Venkatasubramanian
2004-03-21 22:26   ` URL typo Rajesh Venkatasubramanian
2004-03-22  0:46   ` [RFC][PATCH 1/3] radix priority search tree - objrmap complexity fix Andrea Arcangeli
2004-03-22  2:32     ` Rik van Riel
2004-03-22  3:49     ` Rajesh Venkatasubramanian
2004-03-22  4:02       ` Rik van Riel
2004-03-25 22:59   ` Andrea Arcangeli
2004-03-26  4:06     ` Rajesh Venkatasubramanian
2004-03-26  7:53       ` Andrea Arcangeli
2004-03-26 15:43         ` Rajesh Venkatasubramanian
2004-03-26 17:58           ` Andrea Arcangeli
2004-03-27 19:51             ` Rajesh Venkatasubramanian
2004-03-29 17:22               ` Andrea Arcangeli
2004-03-29 17:50                 ` Rajesh Venkatasubramanian
2004-03-29 18:01                   ` Andrea Arcangeli
2004-03-29 20:40                     ` Andrew Morton
2004-03-29 22:24                       ` Hugh Dickins
2004-03-29 22:54                         ` Andrea Arcangeli
2004-03-29 23:08                         ` William Lee Irwin III
2004-03-29 22:39                       ` Andrea Arcangeli
2004-03-29 22:42                         ` Andrew Morton
2004-03-31 15:07                           ` Andrea Arcangeli
2004-03-31 15:26                             ` Andrea Arcangeli
2004-03-31 16:45                             ` Hugh Dickins
2004-03-31 17:28                               ` Andrea Arcangeli
2004-04-01  0:45                                 ` Andrea Arcangeli
2004-04-01  1:22                                   ` Andrew Morton
2004-04-01  1:26                                     ` Andrea Arcangeli
2004-04-01  1:51                                       ` Andrew Morton
2004-04-01  2:01                                         ` Andrea Arcangeli
2004-04-01  5:05                                           ` Hugh Dickins
2004-04-01 13:35                                             ` Andrea Arcangeli
2004-04-01 15:09                                               ` Andrea Arcangeli
2004-04-01 15:15                                                 ` Andrea Arcangeli
2004-04-02  0:15                                                   ` Andrea Arcangeli
2004-04-02  0:52                                                     ` Andrew Morton
2004-04-02  1:06                                                       ` Andrea Arcangeli
2004-04-02  1:03                                                     ` Hugh Dickins
2004-04-02  1:16                                                       ` Andrea Arcangeli
2004-04-02  1:36                                                         ` Andrew Morton
2004-04-02  2:00                                                           ` Andrea Arcangeli [this message]
2004-04-02  2:08                                                             ` Andrew Morton
2004-04-02  2:22                                                               ` Andrea Arcangeli
2004-04-02  6:05                                                                 ` Christoph Hellwig
2004-04-02  7:07                                                                   ` Paul Mackerras
2004-04-02  7:11                                                                     ` Christoph Hellwig
2004-04-02 15:28                                                                     ` Andrea Arcangeli
2004-04-02 15:22                                                                   ` Andrea Arcangeli
2004-04-02 15:27                                                                     ` Christoph Hellwig
2004-04-02 15:38                                                                       ` Andrea Arcangeli
2004-04-02 15:45                                                                         ` Andrea Arcangeli
2004-04-02  9:43                                                             ` Christoph Hellwig
2004-04-02 10:21                                                               ` Marc-Christian Petersen
2004-04-02 10:55                                                                 ` Hugh Dickins
2004-04-02 16:46                                                               ` Andrea Arcangeli
2004-04-02 18:59                                                                 ` Christoph Hellwig
2004-04-02 19:29                                                                   ` Andrea Arcangeli
2004-04-02 19:54                                                                     ` Christoph Hellwig
2004-04-02 20:35                                                                       ` Andrea Arcangeli
2004-04-03  8:40                                                                         ` Christoph Hellwig
2004-04-03 15:20                                                                           ` Andrea Arcangeli
2004-04-03 15:59                                                                             ` Andrea Arcangeli
2004-04-03 17:02                                                                               ` Andrea Arcangeli
2004-04-05  9:59                                                                                 ` Christoph Hellwig
2004-04-05 12:11                                                                                   ` Christoph Hellwig
2004-04-05 16:08                                                                                     ` Andrea Arcangeli
2004-04-06  4:22                                                                                     ` Andrea Arcangeli
2004-04-06  4:43                                                                                       ` Andrew Morton
2004-04-06  5:14                                                                                         ` Christoph Hellwig
2004-04-06 21:54                                                                                         ` Andrea Arcangeli
2004-04-07  1:39                                                                                           ` Nathan Scott
2004-04-06  5:16                                                                                       ` Christoph Hellwig
2004-04-06 16:01                                                                                         ` Andrea Arcangeli
2004-04-07  1:33                                                                                           ` Nathan Scott
2004-04-03 17:40                                                                 ` Andrea Arcangeli
2004-04-03 20:02                                                                   ` Andrew Morton
2004-04-03 23:27                                                                     ` Andrea Arcangeli
2004-04-03 23:46                                                                       ` Andrew Morton
2004-04-04  0:40                                                                         ` Andrea Arcangeli
2004-04-02 20:13                                           ` Pavel Machek
2004-04-02 21:42                                             ` Andrea Arcangeli
2004-04-02 21:45                                               ` Pavel Machek
2004-04-02 21:49                                                 ` Andrea Arcangeli
2004-03-29 18:12                 ` Hugh Dickins
2004-03-29 18:20                   ` Andrea Arcangeli
2004-03-29 18:38                     ` Christoph Hellwig
2004-04-05  3:14       ` Rajesh Venkatasubramanian
2004-04-05  4:42         ` Andrea Arcangeli
2004-03-26 12:26     ` William Lee Irwin III
2004-03-26 19:18       ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040402020022.GN18585@dualathlon.random \
    --to=andrea@suse.de \
    --cc=akpm@osdl.org \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=vrajesh@umich.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox