* [RFC/PATCH] free_area[] bitmap elimination [3/3]
@ 2004-08-24 12:41 Hiroyuki KAMEZAWA
2004-08-24 17:05 ` Dave Hansen
0 siblings, 1 reply; 3+ messages in thread
From: Hiroyuki KAMEZAWA @ 2004-08-24 12:41 UTC (permalink / raw)
To: linux-mm, LHMS
Cc: William Lee Irwin III, Dave Hansen, Hirokazu Takahashi, ncunningham
[-- Attachment #1: Type: text/plain, Size: 291 bytes --]
This is the last(4th) part.
a patch for free_pages()
there are many changes but an algorithm itself is unchanged.
If this patch is complex, please see an example I added.
--Kame
--
--the clue is these footmarks leading to the door.--
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
[-- Attachment #2: eliminate-bitmap-free.patch --]
[-- Type: text/x-patch, Size: 7606 bytes --]
This patch removes bitmap operation from free_pages()
Instead of using bitmap, this patch records page's order in
page struct itself, page->private field.
As a side effect of removing a bitmap, we must test whether a buddy of a page
is existing or not. This is done by zone->aligned_order and bad_range(page,zone).
zone->aligned_order guarantees "a page of an order below zone->aligned_order has
its buddy in the zone".This is calculated in the zone initialization time.
If order > zone->aligned_order, we must call bad_range(zone, page) and
this is enough when zone doesn't contain memory-hole.
But....
In IA64 case, additionally, there can be memory holes in a zone and size of a hole
is smaller than 1 << MAX_ORDER. So pfn_valid() is inserted.
But pfn_valid() looks a bit heavy in ia64 and replacing this with faster one
is desirable.
I think some of memory-hotplug codes will be good for fix this in the future.
There will be light-weight pfn_valid() for detecting memory-hole.
example) a series of free_pages()
consider these 8 pages
page 0 1 2 3 4 5 6 7
[A][F][F][-][F][A][A2][-]
page[0] is allocated , order=0
page[1] is free, order=0
page[2-3] is free, order=1
page[4] is free, order=0
page[5] is allocated , order=0
page[6-7] is allocated, order=1
0) status
free_area[3] ->
free_area[2] ->
free_area[1] -> page[2-3]
free_area[0] -> page[1], page[4]
allocated -> page[0], page[5], page[6-7]
1) free_pages (page[6],order=1)
1-1) loop 1st
free_area[3] ->
free_area[2] ->
free_area[1] -> page[2-3], page[6-7]
free_area[0] -> page[1], page[4]
allocated -> page[0], page[5]
buddy of page 6 in order 1 is page[4].
page[4] is free,but its order is 0
stop here.
2) free_pages(page[5], order=0)
2-1) loop 1st
free_area[3] ->
free_area[2] ->
free_area[1] -> page[2-3], page[6-7]
free_area[0] -> page[1], page[4], page[5]
allocated -> page[0]
buddy of page[5] in order 0 is page[4].
page[4] is free and its order is 0.
do coalesce page[4] & page[5].
2-2) loop 2nd
free_area[3] ->
free_area[2] ->
free_area[1] -> page[2-3], page[6-7],page[4-5]
free_area[0] -> page[1]
allocated -> page[0]
buddy of page[4] in order 1 is page[6]
page[6] is free and its order is 1
coalesce page[4-5] and page[6-7]
2-3) loop 3rd
free_area[3] ->
free_area[2] -> page[4-7]
free_area[1] -> page[2-3],
free_area[0] -> page[1]
allocated -> page[0]
buddy of page[4] in order 2 is page[0]
page[0] is not free.
stop here.
3) free_pages(page[0],order=0)
3-1) 1st loop
free_area[3] ->
free_area[2] -> page[4-7]
free_area[1] -> page[2-3],
free_area[0] -> page[1],page[0] -> coalesce
allocated ->
3-2) 2nd loop
free_area[3] ->
free_area[2] -> page[4-7]
free_area[1] -> page[0-1],page[2-3] -> coalesce
free_area[0] ->
allocated ->
3-3) 3rd
free_area[3] ->
free_area[2] -> page[4-7] , page[0-3] -> coalesce
free_area[1] ->
free_area[0] ->
allocated ->
3-4) 4th
free_area[3] -> page[0-7]
free_area[2] ->
free_area[1] ->
free_area[0] ->
allocated ->
---
linux-2.6.8.1-mm4-kame-kamezawa/mm/page_alloc.c | 82 ++++++++++++++++++------
1 files changed, 63 insertions(+), 19 deletions(-)
diff -puN mm/page_alloc.c~eliminate-bitmap-free mm/page_alloc.c
--- linux-2.6.8.1-mm4-kame/mm/page_alloc.c~eliminate-bitmap-free 2004-08-24 20:03:48.000000000 +0900
+++ linux-2.6.8.1-mm4-kame-kamezawa/mm/page_alloc.c 2004-08-24 20:03:48.000000000 +0900
@@ -157,6 +157,37 @@ static void destroy_compound_page(struct
#endif /* CONFIG_HUGETLB_PAGE */
/*
+ * This function checks whether a page is free && is the buddy
+ * we can do coalesce if
+ * (a) the buddy is free and
+ * (b) the buddy is on the buddy system
+ * (c) the buddy has the same order.
+ * for recording page's order, we use private field and PG_private.
+ */
+static inline int page_is_buddy(struct page *page, int order)
+{
+ if (page_count(page) == 0 &&
+ PagePrivate(page) &&
+ !PageReserved(page) &&
+ page_order(page) == order) {
+ /* check, check... see free_pages_check() */
+ if (page_mapped(page) ||
+ page->mapping != NULL ||
+ (page->flags & (
+ 1 << PG_lru |
+ 1 << PG_locked |
+ 1 << PG_active |
+ 1 << PG_reclaim |
+ 1 << PG_slab |
+ 1 << PG_swapcache |
+ 1 << PG_writeback )))
+ bad_page(__FUNCTION__, page);
+ return 1;
+ }
+ return 0;
+}
+
+/*
* Freeing function for a buddy system allocator.
*
* The concept of a buddy system is to maintain direct-mapped table
@@ -168,9 +199,12 @@ static void destroy_compound_page(struct
* at the bottom level available, and propagating the changes upward
* as necessary, plus some accounting needed to play nicely with other
* parts of the VM system.
- * At each level, we keep one bit for each pair of blocks, which
- * is set to 1 iff only one of the pair is allocated. So when we
- * are allocating or freeing one, we can derive the state of the
+ *
+ * At each level, we keep a list of pages, which are head of chunk of
+ * pages at the level. A page, which is a head of chunks, has its order
+ * in page structure itself and PG_private flag is set. we can get an
+ * order of a page by calling page_order().
+ * So we are allocating or freeing one, we can derive the state of the
* other. That is, if we allocate a small block, and both were
* free, the remainder of the region must be split into blocks.
* If a block is freed, and its buddy is also free, then this
@@ -178,43 +212,53 @@ static void destroy_compound_page(struct
*
* -- wli
*/
-
static inline void __free_pages_bulk (struct page *page, struct page *base,
struct zone *zone, struct free_area *area, unsigned int order)
{
- unsigned long page_idx, index, mask;
-
+ unsigned long page_idx, mask;
if (order)
destroy_compound_page(page, order);
mask = (~0UL) << order;
page_idx = page - base;
if (page_idx & ~mask)
BUG();
- index = page_idx >> (1 + order);
-
zone->free_pages += 1 << order;
- while (order < MAX_ORDER-1) {
- struct page *buddy1, *buddy2;
+ BUG_ON(bad_range(zone,page));
+ while (order < MAX_ORDER-1) {
+ struct page *buddy1;
BUG_ON(area >= zone->free_area + MAX_ORDER);
- if (!__test_and_change_bit(index, area->map))
+ buddy1 = base + (page_idx ^ (1 << order));
+ if (order >= zone->aligned_order) {
+ /* we need range check */
+#ifdef CONFIG_VIRTUAL_MEM_MAP
+ /* This check is necessary when
+ 1. there may be holes in zone.
+ 2. a hole is not aligned in this order.
+ currently, VIRTUAL_MEM_MAP case, is only case.
+ Is there better call than pfn_valid ?
+ */
+ if (!pfn_valid(zone->zone_start_pfn + (page_idx ^ (1 << order))))
+ break;
+#endif
+ /* this order in this zone is not aligned. */
+ if (bad_range(zone, buddy1))
+ break;
+ }
+ if (!page_is_buddy(buddy1, order))
/*
- * the buddy page is still allocated.
+ * the buddy page is still allocated.
*/
break;
-
- /* Move the buddy up one level. */
- buddy1 = base + (page_idx ^ (1 << order));
- buddy2 = base + page_idx;
- BUG_ON(bad_range(zone, buddy1));
- BUG_ON(bad_range(zone, buddy2));
list_del(&buddy1->lru);
+ invalidate_page_order(buddy1);
mask <<= 1;
order++;
area++;
- index >>= 1;
page_idx &= mask;
}
+ /* record the final order of the page */
+ set_page_order((base + page_idx), order);
list_add(&(base + page_idx)->lru, &area->free_list);
}
_
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [RFC/PATCH] free_area[] bitmap elimination [3/3]
2004-08-24 12:41 [RFC/PATCH] free_area[] bitmap elimination [3/3] Hiroyuki KAMEZAWA
@ 2004-08-24 17:05 ` Dave Hansen
2004-08-25 0:17 ` [Lhms-devel] " Hiroyuki KAMEZAWA
0 siblings, 1 reply; 3+ messages in thread
From: Dave Hansen @ 2004-08-24 17:05 UTC (permalink / raw)
To: Hiroyuki KAMEZAWA
Cc: linux-mm, lhms, William Lee Irwin III, Hirokazu Takahashi, ncunningham
On Tue, 2004-08-24 at 05:41, Hiroyuki KAMEZAWA wrote:
> +static inline int page_is_buddy(struct page *page, int order)
> +{
> + if (page_count(page) == 0 &&
> + PagePrivate(page) &&
> + !PageReserved(page) &&
> + page_order(page) == order) {
> + /* check, check... see free_pages_check() */
> + if (page_mapped(page) ||
> + page->mapping != NULL ||
> + (page->flags & (
> + 1 << PG_lru |
> + 1 << PG_locked |
> + 1 << PG_active |
> + 1 << PG_reclaim |
> + 1 << PG_slab |
> + 1 << PG_swapcache |
> + 1 << PG_writeback )))
> + bad_page(__FUNCTION__, page);
> + return 1;
> + }
> + return 0;
> +}
Please share some code with the free_pages_check() that you stole this
from. It's nasty enough to have one copy of it around. :)
> +#ifdef CONFIG_VIRTUAL_MEM_MAP
> + /* This check is necessary when
> + 1. there may be holes in zone.
> + 2. a hole is not aligned in this order.
> + currently, VIRTUAL_MEM_MAP case, is only case.
> + Is there better call than pfn_valid ?
> + */
> + if (!pfn_valid(zone->zone_start_pfn + (page_idx ^ (1 << order))))
> + break;
> +#endif
This should be hidden in a header somewhere. We don't want to have to
see ia64-specific ifdefs in generic code.
-- Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [Lhms-devel] Re: [RFC/PATCH] free_area[] bitmap elimination [3/3]
2004-08-24 17:05 ` Dave Hansen
@ 2004-08-25 0:17 ` Hiroyuki KAMEZAWA
0 siblings, 0 replies; 3+ messages in thread
From: Hiroyuki KAMEZAWA @ 2004-08-25 0:17 UTC (permalink / raw)
To: Dave Hansen
Cc: linux-mm, lhms, William Lee Irwin III, Hirokazu Takahashi, ncunningham
Dave Hansen wrote:
> On Tue, 2004-08-24 at 05:41, Hiroyuki KAMEZAWA wrote:
>
>>+static inline int page_is_buddy(struct page *page, int order)
>>+{
>>+ if (page_count(page) == 0 &&
>>+ PagePrivate(page) &&
>>+ !PageReserved(page) &&
>>+ page_order(page) == order) {
>>+ /* check, check... see free_pages_check() */
>>+ if (page_mapped(page) ||
>>+ page->mapping != NULL ||
>>+ (page->flags & (
>>+ 1 << PG_lru |
>>+ 1 << PG_locked |
>>+ 1 << PG_active |
>>+ 1 << PG_reclaim |
>>+ 1 << PG_slab |
>>+ 1 << PG_swapcache |
>>+ 1 << PG_writeback )))
>>+ bad_page(__FUNCTION__, page);
>>+ return 1;
>>+ }
>>+ return 0;
>>+}
>
>
> Please share some code with the free_pages_check() that you stole this
> from. It's nasty enough to have one copy of it around. :)
Hmm... this part is different from free_pages_check() even if I stoled it from.
Becasuse PG_private bit check is not done here. Sharing some code with
frees_page_check() would make free_pages_check() complex to read.
And this is only a bug checking code and bad_page( __FUNCTION__ , page) is useful
to test this buddy system.
>>+#ifdef CONFIG_VIRTUAL_MEM_MAP
>>+ /* This check is necessary when
>>+ 1. there may be holes in zone.
>>+ 2. a hole is not aligned in this order.
>>+ currently, VIRTUAL_MEM_MAP case, is only case.
>>+ Is there better call than pfn_valid ?
>>+ */
>>+ if (!pfn_valid(zone->zone_start_pfn + (page_idx ^ (1 << order))))
>>+ break;
>>+#endif
>
>
> This should be hidden in a header somewhere. We don't want to have to
> see ia64-specific ifdefs in generic code.
>
Hmm, I understand what you say. I'll consider better another way.
But why #ifdef is inserted here is that this is RFC and
I want to make it clear this is IA64 specific.
Thank you for your all comments.
-- Kame
--
--the clue is these footmarks leading to the door.--
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2004-08-25 0:12 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-08-24 12:41 [RFC/PATCH] free_area[] bitmap elimination [3/3] Hiroyuki KAMEZAWA
2004-08-24 17:05 ` Dave Hansen
2004-08-25 0:17 ` [Lhms-devel] " Hiroyuki KAMEZAWA
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox