* [RFC][PATCH 2/3] find a contiguous range.
2010-10-13 3:15 [RFC][PATCH 1/3] contigous big page allocator KAMEZAWA Hiroyuki
@ 2010-10-13 3:17 ` KAMEZAWA Hiroyuki
2010-10-17 3:18 ` Minchan Kim
2010-10-13 3:18 ` [RFC][PATCH 3/3] alloc contig pages with migration KAMEZAWA Hiroyuki
` (2 subsequent siblings)
3 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-13 3:17 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel, minchan.kim
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Unlike memory hotplug, at an allocation of contigous memory range, address
may not be a problem. IOW, if a requester of memory wants to allocate 100M of
of contigous memory, placement of allocated memory may not be a problem.
So, "finding a range of memory which seems to be MOVABLE" is required.
This patch adds a functon to isolate a length of memory within [start, end).
This function returns a pfn which is 1st page of isolated contigous chunk
of given length within [start, end).
If no_search=true is passed as argument, start address is always same to
the specified "base" addresss.
After isolation, free memory within this area will never be allocated.
But some pages will remain as "Used/LRU" pages. They should be dropped by
page reclaim or migration.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/page_isolation.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 130 insertions(+)
Index: mmotm-1008/mm/page_isolation.c
===================================================================
--- mmotm-1008.orig/mm/page_isolation.c
+++ mmotm-1008/mm/page_isolation.c
@@ -9,6 +9,7 @@
#include <linux/pageblock-flags.h>
#include <linux/memcontrol.h>
#include <linux/migrate.h>
+#include <linux/memory_hotplug.h>
#include <linux/mm_inline.h>
#include "internal.h"
@@ -254,3 +255,132 @@ out:
return ret;
}
+/*
+ * Functions for getting contiguous MOVABLE pages in a zone.
+ */
+struct page_range {
+ unsigned long base; /* Base address of searching contigouous block */
+ unsigned long end;
+ unsigned long pages;/* Length of contiguous block */
+};
+
+static inline unsigned long MAX_ORDER_ALIGN(unsigned long x)
+{
+ return ALIGN(x, MAX_ORDER_NR_PAGES);
+}
+
+static inline unsigned long MAX_ORDER_BASE(unsigned long x)
+{
+ return x & ~(MAX_ORDER_NR_PAGES - 1);
+}
+
+int __get_contig_block(unsigned long pfn, unsigned long nr_pages, void *arg)
+{
+ struct page_range *blockinfo = arg;
+ unsigned long end;
+
+ end = pfn + nr_pages;
+ pfn = MAX_ORDER_ALIGN(pfn);
+ end = MAX_ORDER_BASE(end);
+
+ if (end < pfn)
+ return 0;
+ if (end - pfn >= blockinfo->pages) {
+ blockinfo->base = pfn;
+ blockinfo->end = end;
+ return 1;
+ }
+ return 0;
+}
+
+static void __trim_zone(struct page_range *range)
+{
+ struct zone *zone;
+ unsigned long pfn;
+ /*
+ * In most case, each zone's [start_pfn, end_pfn) has no
+ * overlap between each other. But some arch allows it and
+ * we need to check it here.
+ */
+ for (pfn = range->base, zone = page_zone(pfn_to_page(pfn));
+ pfn < range->end;
+ pfn += MAX_ORDER_NR_PAGES) {
+
+ if (zone != page_zone(pfn_to_page(pfn)))
+ break;
+ }
+ range->end = min(pfn, range->end);
+ return;
+}
+
+/*
+ * This function is for finding a contiguous memory block which has length
+ * of pages and MOVABLE. If it finds, make the range of pages as ISOLATED
+ * and return the first page's pfn.
+ * If no_search==true, this function doesn't scan the range but tries to
+ * isolate the range of memory.
+ */
+
+static unsigned long find_contig_block(unsigned long base,
+ unsigned long end, unsigned long pages, bool no_search)
+{
+ unsigned long pfn, pos;
+ struct page_range blockinfo;
+ int ret;
+
+ pages = MAX_ORDER_ALIGN(pages);
+retry:
+ blockinfo.base = base;
+ blockinfo.end = end;
+ blockinfo.pages = pages;
+ /*
+ * At first, check physical page layout and skip memory holes.
+ */
+ ret = walk_system_ram_range(base, end - base, &blockinfo,
+ __get_contig_block);
+ if (!ret)
+ return 0;
+ /* check contiguous pages in a zone */
+ __trim_zone(&blockinfo);
+
+
+ /* Ok, we found contiguous memory chunk of size. Isolate it.*/
+ for (pfn = blockinfo.base; pfn + pages < blockinfo.end;
+ pfn += MAX_ORDER_NR_PAGES) {
+ /* If no_search==true, base addess should be same to 'base' */
+ if (no_search && pfn != base)
+ break;
+ /* Better code is necessary here.. */
+ for (pos = pfn; pos < pfn + pages; pos++) {
+ struct page *p;
+
+ if (!pfn_valid_within(pos))
+ break;
+ p = pfn_to_page(pos);
+ if (PageReserved(p))
+ break;
+ /* This may hit a page on per-cpu queue. */
+ if (page_count(p) && !PageLRU(p))
+ break;
+ /* Need to skip order of pages */
+ }
+ if (pos != pfn + pages) {
+ pfn = MAX_ORDER_BASE(pos);
+ continue;
+ }
+ /*
+ * Now, we know [base,end) of a contiguous chunk.
+ * Don't need to take care of memory holes.
+ */
+ if (!start_isolate_page_range(pfn, pfn + pages))
+ return pfn;
+ }
+
+ /* failed */
+ if (!no_search && blockinfo.end + pages < end) {
+ /* Move base address and find the next block of RAM. */
+ base = blockinfo.end;
+ goto retry;
+ }
+ return 0;
+}
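
For readers following the alignment arithmetic, here is a minimal userspace sketch of the check that __get_contig_block() performs on each System RAM range: round the start up and the end down to a MAX_ORDER boundary, then see whether the trimmed range still holds the requested number of pages. The sketch_* names and the value 1024 for MAX_ORDER_NR_PAGES are assumptions made for illustration only; the real value depends on the kernel configuration.

```c
#include <stdbool.h>

/* Assumption: 1024 pages per MAX_ORDER block; config-dependent in a real kernel. */
#define SKETCH_MAX_ORDER_NR_PAGES 1024UL

/* Hypothetical stand-in for struct page_range; all fields are pfns or page counts. */
struct sketch_range {
	unsigned long base;	/* first pfn of the usable range */
	unsigned long end;	/* one past the last pfn of the usable range */
	unsigned long pages;	/* requested length in pages */
};

/* Round a pfn up to the next MAX_ORDER boundary (mirrors MAX_ORDER_ALIGN). */
static unsigned long sketch_align_up(unsigned long x)
{
	return (x + SKETCH_MAX_ORDER_NR_PAGES - 1) &
		~(SKETCH_MAX_ORDER_NR_PAGES - 1);
}

/* Round a pfn down to a MAX_ORDER boundary (mirrors MAX_ORDER_BASE). */
static unsigned long sketch_align_down(unsigned long x)
{
	return x & ~(SKETCH_MAX_ORDER_NR_PAGES - 1);
}

/*
 * Trim [pfn, pfn + nr_pages) to MAX_ORDER boundaries and report whether
 * at least r->pages pages remain. On success, r->base and r->end are set.
 */
static bool sketch_get_contig_block(unsigned long pfn, unsigned long nr_pages,
				    struct sketch_range *r)
{
	unsigned long end = sketch_align_down(pfn + nr_pages);

	pfn = sketch_align_up(pfn);
	if (end < pfn)
		return false;	/* range is smaller than one aligned block */
	if (end - pfn >= r->pages) {
		r->base = pfn;
		r->end = end;
		return true;
	}
	return false;
}
```

With these assumed values, a RAM range starting at pfn 1000 with 4000 pages is trimmed to [1024, 4096), which is large enough for a 2048-page request. Note that a trimmed range whose length exactly equals the request is accepted here, while the isolation loop in the patch is bounded by "pfn + pages < blockinfo.end", so as far as the code shown goes, an exact-fit range may never be tried.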
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH 2/3] find a contiguous range.
2010-10-13 3:17 ` [RFC][PATCH 2/3] find a contiguous range KAMEZAWA Hiroyuki
@ 2010-10-17 3:18 ` Minchan Kim
2010-10-18 0:29 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 26+ messages in thread
From: Minchan Kim @ 2010-10-17 3:18 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel
Hi Kame,
Sorry for the late review.
On Wed, Oct 13, 2010 at 12:17 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Unlike memory hotplug, at an allocation of contigous memory range, address
> may not be a problem. IOW, if a requester of memory wants to allocate 100M of
> of contigous memory, placement of allocated memory may not be a problem.
> So, "finding a range of memory which seems to be MOVABLE" is required.
>
> This patch adds a functon to isolate a length of memory within [start, end).
Typo
function
> This function returns a pfn which is 1st page of isolated contigous chunk
Typo
contiguous
> of given length within [start, end).
>
> If no_search=true is passed as argument, start address is always same to
I don't like the no_search argument name. It would be better for the
name to convey the context rather than the implementation.
How about "bool strict" or "ALLOC_FIXED"?
> the specified "base" addresss.
Typo
address,
Let's add following description.
"Some devices want to bind memory to some memory bank. In this case,
no_search and base address fix
can be helpful."
>
> After isolation, free memory within this area will never be allocated.
> But some pages will remain as "Used/LRU" pages. They should be dropped by
> page reclaim or migration.
When I first read the above description, I got confused. How about this?
After the range is isolated, some of its pages are free but others
may still be in use by processes.
The next patch [3/3] tries to move or reclaim the used pages via page
migration/reclaim to obtain a big contiguous range.
>
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> mm/page_isolation.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 130 insertions(+)
>
> Index: mmotm-1008/mm/page_isolation.c
> ===================================================================
> --- mmotm-1008.orig/mm/page_isolation.c
> +++ mmotm-1008/mm/page_isolation.c
> @@ -9,6 +9,7 @@
> #include <linux/pageblock-flags.h>
> #include <linux/memcontrol.h>
> #include <linux/migrate.h>
> +#include <linux/memory_hotplug.h>
> #include <linux/mm_inline.h>
> #include "internal.h"
>
> @@ -254,3 +255,132 @@ out:
> return ret;
> }
>
> +/*
> + * Functions for getting contiguous MOVABLE pages in a zone.
> + */
> +struct page_range {
> + unsigned long base; /* Base address of searching contigouous block */
Typo contiguous.
Please, specify that it's a pfn number.
> + unsigned long end;
> + unsigned long pages;/* Length of contiguous block */
> +};
> +
> +static inline unsigned long MAX_ORDER_ALIGN(unsigned long x)
> +{
> + return ALIGN(x, MAX_ORDER_NR_PAGES);
> +}
> +
> +static inline unsigned long MAX_ORDER_BASE(unsigned long x)
> +{
> + return x & ~(MAX_ORDER_NR_PAGES - 1);
> +}
> +
> +int __get_contig_block(unsigned long pfn, unsigned long nr_pages, void *arg)
> +{
> + struct page_range *blockinfo = arg;
> + unsigned long end;
> +
> + end = pfn + nr_pages;
> + pfn = MAX_ORDER_ALIGN(pfn);
> + end = MAX_ORDER_BASE(end);
> +
> + if (end < pfn)
> + return 0;
> + if (end - pfn >= blockinfo->pages) {
> + blockinfo->base = pfn;
> + blockinfo->end = end;
> + return 1;
> + }
> + return 0;
> +}
> +
> +static void __trim_zone(struct page_range *range)
Hmm..
I think this function name can't present enough meaning.
Let's move description in body of function to the head.
/*
* In most case, each zone's [start_pfn, end_pfn) has no
* overlap between each other. But some arch allows it and
* we need to check it here. If it happens, range end is changed
* to only include pfns in a zone.
*/
> +{
> + struct zone *zone;
> + unsigned long pfn;
> + /*
> + * In most case, each zone's [start_pfn, end_pfn) has no
> + * overlap between each other. But some arch allows it and
> + * we need to check it here.
> + */
> + for (pfn = range->base, zone = page_zone(pfn_to_page(pfn));
> + pfn < range->end;
> + pfn += MAX_ORDER_NR_PAGES) {
> +
> + if (zone != page_zone(pfn_to_page(pfn)))
> + break;
> + }
> + range->end = min(pfn, range->end);
> + return;
Unnecessary return.
> +}
> +
> +/*
> + * This function is for finding a contiguous memory block which has length
> + * of pages and MOVABLE. If it finds, make the range of pages as ISOLATED
> + * and return the first page's pfn.
> + * If no_search==true, this function doesn't scan the range but tries to
> + * isolate the range of memory.
> + */
> +
> +static unsigned long find_contig_block(unsigned long base,
> + unsigned long end, unsigned long pages, bool no_search)
> +{
> + unsigned long pfn, pos;
> + struct page_range blockinfo;
> + int ret;
> +
> + pages = MAX_ORDER_ALIGN(pages);
> +retry:
> + blockinfo.base = base;
> + blockinfo.end = end;
> + blockinfo.pages = pages;
> + /*
> + * At first, check physical page layout and skip memory holes.
> + */
> + ret = walk_system_ram_range(base, end - base, &blockinfo,
> + __get_contig_block);
> + if (!ret)
> + return 0;
> + /* check contiguous pages in a zone */
> + __trim_zone(&blockinfo);
> +
> +
> + /* Ok, we found contiguous memory chunk of size. Isolate it.*/
> + for (pfn = blockinfo.base; pfn + pages < blockinfo.end;
> + pfn += MAX_ORDER_NR_PAGES) {
> + /* If no_search==true, base addess should be same to 'base' */
> + if (no_search && pfn != base)
> + break;
> + /* Better code is necessary here.. */
> + for (pos = pfn; pos < pfn + pages; pos++) {
> + struct page *p;
> +
> + if (!pfn_valid_within(pos))
> + break;
> + p = pfn_to_page(pos);
> + if (PageReserved(p))
> + break;
> + /* This may hit a page on per-cpu queue. */
Couldn't we drain per-cpu queue before this function?
> + if (page_count(p) && !PageLRU(p))
> + break;
> + /* Need to skip order of pages */
> + }
> + if (pos != pfn + pages) {
> + pfn = MAX_ORDER_BASE(pos);
> + continue;
> + }
> + /*
> + * Now, we know [base,end) of a contiguous chunk.
> + * Don't need to take care of memory holes.
> + */
> + if (!start_isolate_page_range(pfn, pfn + pages))
> + return pfn;
> + }
> +
> + /* failed */
> + if (!no_search && blockinfo.end + pages < end) {
> + /* Move base address and find the next block of RAM. */
> + base = blockinfo.end;
> + goto retry;
> + }
> + return 0;
> +}
>
>
--
Kind regards,
Minchan Kim
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH 2/3] find a contiguous range.
2010-10-17 3:18 ` Minchan Kim
@ 2010-10-18 0:29 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-18 0:29 UTC (permalink / raw)
To: Minchan Kim; +Cc: linux-mm, linux-kernel
On Sun, 17 Oct 2010 12:18:48 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:
> Hi Kame,
> Sorry for the late review.
>
> On Wed, Oct 13, 2010 at 12:17 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Unlike memory hotplug, at an allocation of contigous memory range, address
> > may not be a problem. IOW, if a requester of memory wants to allocate 100M of
> > of contigous memory, placement of allocated memory may not be a problem.
> > So, "finding a range of memory which seems to be MOVABLE" is required.
> >
> > This patch adds a functon to isolate a length of memory within [start, end).
>
> Typo
> function
>
> > This function returns a pfn which is 1st page of isolated contigous chunk
>
> Typo
> contiguous
>
I'll use aspell...
> > of given length within [start, end).
> >
> > If no_search=true is passed as argument, start address is always same to
>
> I don't like the no_search argument name. It would be better for the
> name to convey the context rather than the implementation.
> How about "bool strict" or "ALLOC_FIXED"?
Hmm, ok.
> > the specified "base" addresss.
> Typo
> address,
> Let's add following description.
> "Some devices want to bind memory to some memory bank. In this case,
> no_search and base address fix
> can be helpful."
Then, do you need "end" address for search ?
>
> >
> > After isolation, free memory within this area will never be allocated.
> > But some pages will remain as "Used/LRU" pages. They should be dropped by
> > page reclaim or migration.
>
> When I first read the above description, I got confused. How about this?
> After the range is isolated, some of its pages are free but others
> may still be in use by processes.
> The next patch [3/3] tries to move or reclaim the used pages via page
> migration/reclaim to obtain a big contiguous range.
>
will consider some.
> >
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> > mm/page_isolation.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 130 insertions(+)
> >
> > Index: mmotm-1008/mm/page_isolation.c
> > ===================================================================
> > --- mmotm-1008.orig/mm/page_isolation.c
> > +++ mmotm-1008/mm/page_isolation.c
> > @@ -9,6 +9,7 @@
> > #include <linux/pageblock-flags.h>
> > #include <linux/memcontrol.h>
> > #include <linux/migrate.h>
> > +#include <linux/memory_hotplug.h>
> > #include <linux/mm_inline.h>
> > #include "internal.h"
> >
> > @@ -254,3 +255,132 @@ out:
> > return ret;
> > }
> >
> > +/*
> > + * Functions for getting contiguous MOVABLE pages in a zone.
> > + */
> > +struct page_range {
> > + unsigned long base; /* Base address of searching contigouous block */
>
> Typo contiguous.
> Please, specify that it's a pfn number.
>
ok.
> > + unsigned long end;
> > + unsigned long pages;/* Length of contiguous block */
> > +};
> > +
> > +static inline unsigned long MAX_ORDER_ALIGN(unsigned long x)
> > +{
> > + return ALIGN(x, MAX_ORDER_NR_PAGES);
> > +}
> > +
> > +static inline unsigned long MAX_ORDER_BASE(unsigned long x)
> > +{
> > + return x & ~(MAX_ORDER_NR_PAGES - 1);
> > +}
> > +
> > +int __get_contig_block(unsigned long pfn, unsigned long nr_pages, void *arg)
> > +{
> > + struct page_range *blockinfo = arg;
> > + unsigned long end;
> > +
> > + end = pfn + nr_pages;
> > + pfn = MAX_ORDER_ALIGN(pfn);
> > + end = MAX_ORDER_BASE(end);
> > +
> > + if (end < pfn)
> > + return 0;
> > + if (end - pfn >= blockinfo->pages) {
> > + blockinfo->base = pfn;
> > + blockinfo->end = end;
> > + return 1;
> > + }
> > + return 0;
> > +}
> > +
> > +static void __trim_zone(struct page_range *range)
>
> Hmm..
> I think this function name can't present enough meaning.
> Let's move description in body of function to the head.
>
> /*
> * In most case, each zone's [start_pfn, end_pfn) has no
> * overlap between each other. But some arch allows it and
> * we need to check it here. If it happens, range end is changed
> * to only include pfns in a zone.
> */
ok.
>
> > +{
> > + struct zone *zone;
> > + unsigned long pfn;
> > + /*
> > + * In most case, each zone's [start_pfn, end_pfn) has no
> > + * overlap between each other. But some arch allows it and
> > + * we need to check it here.
> > + */
> > + for (pfn = range->base, zone = page_zone(pfn_to_page(pfn));
> > + pfn < range->end;
> > + pfn += MAX_ORDER_NR_PAGES) {
> > +
> > + if (zone != page_zone(pfn_to_page(pfn)))
> > + break;
> > + }
> > + range->end = min(pfn, range->end);
> > + return;
>
> Unnecessary return.
>
will remove.
> > +}
> > +
> > +/*
> > + * This function is for finding a contiguous memory block which has length
> > + * of pages and MOVABLE. If it finds, make the range of pages as ISOLATED
> > + * and return the first page's pfn.
> > + * If no_search==true, this function doesn't scan the range but tries to
> > + * isolate the range of memory.
> > + */
> > +
> > +static unsigned long find_contig_block(unsigned long base,
> > + unsigned long end, unsigned long pages, bool no_search)
> > +{
> > + unsigned long pfn, pos;
> > + struct page_range blockinfo;
> > + int ret;
> > +
> > + pages = MAX_ORDER_ALIGN(pages);
> > +retry:
> > + blockinfo.base = base;
> > + blockinfo.end = end;
> > + blockinfo.pages = pages;
> > + /*
> > + * At first, check physical page layout and skip memory holes.
> > + */
> > + ret = walk_system_ram_range(base, end - base, &blockinfo,
> > + __get_contig_block);
> > + if (!ret)
> > + return 0;
> > + /* check contiguous pages in a zone */
> > + __trim_zone(&blockinfo);
> > +
> > +
> > + /* Ok, we found contiguous memory chunk of size. Isolate it.*/
> > + for (pfn = blockinfo.base; pfn + pages < blockinfo.end;
> > + pfn += MAX_ORDER_NR_PAGES) {
> > + /* If no_search==true, base addess should be same to 'base' */
> > + if (no_search && pfn != base)
> > + break;
> > + /* Better code is necessary here.. */
> > + for (pos = pfn; pos < pfn + pages; pos++) {
> > + struct page *p;
> > +
> > + if (!pfn_valid_within(pos))
> > + break;
> > + p = pfn_to_page(pos);
> > + if (PageReserved(p))
> > + break;
> > + /* This may hit a page on per-cpu queue. */
>
> Couldn't we drain per-cpu queue before this function?
>
We can't guarantee it on SMP systems because we don't ISOLATE the range
at this point.
Thanks,
-Kame
^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-13 3:15 [RFC][PATCH 1/3] contigous big page allocator KAMEZAWA Hiroyuki
2010-10-13 3:17 ` [RFC][PATCH 2/3] find a contiguous range KAMEZAWA Hiroyuki
@ 2010-10-13 3:18 ` KAMEZAWA Hiroyuki
2010-10-17 4:05 ` Minchan Kim
2010-10-13 5:05 ` [RFC][PATCH 1/3] contigous big page allocator KOSAKI Motohiro
2010-10-13 7:01 ` Andi Kleen
3 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-13 3:18 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel, minchan.kim
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Add an function to allocate contigous memory larger than MAX_ORDER.
The main difference between usual page allocater is that this uses
memory offline techiqueue (Isoalte pages and migrate remaining pages.).
I think this is not 100% solution because we can't avoid fragmentation,
but we have kernelcore= boot option and can create MOVABLE zone. That
helps us to allow allocate a contigous range on demand.
Maybe drivers can alloc contig pages by bootmem or hiding some memory
from the kernel at boot. But if contig pages are necessary only in some
situation, kernelcore= boot option and using page migration is a choice.
Anyway, to allocate a contiguous chunk larger than MAX_ORDER, we need to
add an overlay allocator on buddy allocator. This can be a 1st step.
Note:
This function is heavy if there are tons of memory requesters. So, maybe
not good for 1GB pages for x86's usual use. It will requires some other
tricks than migration.
TODO:
- allows the caller to specify the migration target pages.
- reduce the number of lru_add_drain_all()..etc...system wide heavy calls.
- Pass gfp_t for some purpose...
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/page-isolation.h | 8 ++
mm/page_alloc.c | 29 ++++++++
mm/page_isolation.c | 136 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 173 insertions(+)
Index: mmotm-1008/mm/page_isolation.c
===================================================================
--- mmotm-1008.orig/mm/page_isolation.c
+++ mmotm-1008/mm/page_isolation.c
@@ -7,6 +7,7 @@
#include <linux/mm.h>
#include <linux/page-isolation.h>
#include <linux/pageblock-flags.h>
+#include <linux/swap.h>
#include <linux/memcontrol.h>
#include <linux/migrate.h>
#include <linux/memory_hotplug.h>
@@ -384,3 +385,138 @@ retry:
}
return 0;
}
+
+/**
+ * alloc_contig_pages - allocate a contigous physical pages
+ * @hint: the base address of searching free space(in pfn)
+ * @size: size of requested area (in # of pages)
+ * @node: the node from which memory is allocated. "-1" means anywhere.
+ * @no_search: if true, "hint" is not a hint, requirement.
+ *
+ * Search an area of @size in the physical memory map and checks wheter
+ * we can create a contigous free space. If it seems possible, try to
+ * create contigous space with page migration. If no_search==true, we just try
+ * to allocate [hint, hint+size) range of pages as contigous block.
+ *
+ * Returns a page of the beginning of contiguous block. At failure, NULL
+ * is returned. Each page in the area is set to page_count() = 1. Because
+ * this function does page migration, this function is very heavy and
+ * sleeps some time. Caller must be aware that "NULL returned" is not a
+ * special case.
+ *
+ * Now, returned range is aligned to MAX_ORDER. (So "hint" must be aligned
+ * if no_search==true.)
+ */
+
+#define MIGRATION_RETRY (5)
+struct page *alloc_contig_pages(unsigned long hint, unsigned long size,
+ int node, bool no_search)
+{
+ unsigned long base, found, end, pages, start;
+ struct page *ret = NULL;
+ int migration_failed;
+ struct zone *zone;
+
+ hint = MAX_ORDER_ALIGN(hint);
+ /*
+ * request size should be aligned to pageblock_order..but use
+ * MAX_ORDER here for avoiding messy checks.
+ */
+ pages = MAX_ORDER_ALIGN(size);
+ found = 0;
+retry:
+ for_each_populated_zone(zone) {
+ unsigned long zone_end_pfn;
+
+ if (node >= 0 && node != zone_to_nid(zone))
+ continue;
+ if (zone->present_pages < pages)
+ continue;
+ base = MAX_ORDER_ALIGN(zone->zone_start_pfn);
+ base = max(base, hint);
+ zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+ if (base + pages > zone_end_pfn)
+ continue;
+ found = find_contig_block(base, zone_end_pfn, pages, no_search);
+ /* Next try will see the next block. */
+ hint = base + MAX_ORDER_NR_PAGES;
+ if (found)
+ break;
+ }
+
+ if (!found)
+ return NULL;
+
+ if (no_search && found != hint)
+ return NULL;
+
+ /*
+ * Ok, here, we have contiguous pageblock marked as "isolated"
+ * try migration.
+ *
+ * FIXME: permanent migration_failure detection logic is required.
+ */
+ lru_add_drain_all();
+ flush_scheduled_work();
+ drain_all_pages();
+
+ end = found + pages;
+ /*
+ * scan_lru_pages() finds the next PG_lru page in the range
+ * scan_lru_pages() returns 0 when it reaches the end.
+ */
+ for (start = scan_lru_pages(found, end), migration_failed = 0;
+ start && start < end;
+ start = scan_lru_pages(start, end)) {
+ if (do_migrate_range(start, end)) {
+ /* it's better to try another block ? */
+ if (++migration_failed >= MIGRATION_RETRY)
+ break;
+ /* take a rest and synchronize LRU etc. */
+ lru_add_drain_all();
+ flush_scheduled_work();
+ cond_resched();
+ drain_all_pages();
+ } else /* reset migration_failure counter */
+ migration_failed = 0;
+ }
+
+ lru_add_drain_all();
+ flush_scheduled_work();
+ drain_all_pages();
+ /* Check all pages are isolated */
+ if (test_pages_isolated(found, end)) {
+ undo_isolate_page_range(found, pages);
+ /*
+ * We failed at [start...end) migration.
+ * FIXME: there may be better restaring point.
+ */
+ hint = MAX_ORDER_ALIGN(end + 1);
+ goto retry; /* goto next chunk */
+ }
+ /*
+ * Ok, here, [found...found+pages) memory are isolated.
+ * All pages in the range will be moved into the list with
+ * page_count(page)=1.
+ */
+ ret = pfn_to_page(found);
+ alloc_contig_freed_pages(found, found + pages);
+ /* unset ISOLATE */
+ undo_isolate_page_range(found, pages);
+ /* Free unnecessary pages in tail */
+ for (start = found + size; start < found + pages; start++)
+ __free_page(pfn_to_page(start));
+ return ret;
+
+}
+
+
+void free_contig_pages(struct page *page, int nr_pages)
+{
+ int i;
+ for (i = 0; i < nr_pages; i++)
+ __free_page(page + i);
+}
+
+EXPORT_SYMBOL_GPL(alloc_contig_pages);
+EXPORT_SYMBOL_GPL(free_contig_pages);
Index: mmotm-1008/include/linux/page-isolation.h
===================================================================
--- mmotm-1008.orig/include/linux/page-isolation.h
+++ mmotm-1008/include/linux/page-isolation.h
@@ -32,6 +32,7 @@ test_pages_isolated(unsigned long start_
*/
extern int set_migratetype_isolate(struct page *page);
extern void unset_migratetype_isolate(struct page *page);
+extern void alloc_contig_freed_pages(unsigned long pfn, unsigned long pages);
/*
* For migration.
@@ -41,4 +42,11 @@ int test_pages_in_a_zone(unsigned long s
int scan_lru_pages(unsigned long start, unsigned long end);
int do_migrate_range(unsigned long start_pfn, unsigned long end_pfn);
+/*
+ * For large alloc.
+ */
+struct page *alloc_contig_pages(unsigned long hint, unsigned long size,
+ int node, bool no_search);
+void free_contig_pages(struct page *page, int nr_pages);
+
#endif
Index: mmotm-1008/mm/page_alloc.c
===================================================================
--- mmotm-1008.orig/mm/page_alloc.c
+++ mmotm-1008/mm/page_alloc.c
@@ -5430,6 +5430,35 @@ out:
spin_unlock_irqrestore(&zone->lock, flags);
}
+
+void alloc_contig_freed_pages(unsigned long pfn, unsigned long end)
+{
+ struct page *page;
+ struct zone *zone;
+ int order;
+ unsigned long start = pfn;
+
+ zone = page_zone(pfn_to_page(pfn));
+ spin_lock_irq(&zone->lock);
+ while (pfn < end) {
+ VM_BUG_ON(!pfn_valid(pfn));
+ page = pfn_to_page(pfn);
+ VM_BUG_ON(page_count(page));
+ VM_BUG_ON(!PageBuddy(page));
+ list_del(&page->lru);
+ order = page_order(page);
+ zone->free_area[order].nr_free--;
+ rmv_page_order(page);
+ __mod_zone_page_state(zone, NR_FREE_PAGES, - (1UL << order));
+ pfn += 1 << order;
+ }
+ spin_unlock_irq(&zone->lock);
+
+ /*After this, pages in the range can be freed one be one */
+ for (pfn = start; pfn < end; pfn++)
+ prep_new_page(pfn_to_page(pfn), 0, 0);
+}
+
#ifdef CONFIG_MEMORY_HOTREMOVE
/*
* All pages in the range must be isolated before calling this.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-13 3:18 ` [RFC][PATCH 3/3] alloc contig pages with migration KAMEZAWA Hiroyuki
@ 2010-10-17 4:05 ` Minchan Kim
2010-10-18 0:35 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 26+ messages in thread
From: Minchan Kim @ 2010-10-17 4:05 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel
On Wed, Oct 13, 2010 at 12:18 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Add an function to allocate contigous memory larger than MAX_ORDER.
> The main difference between usual page allocater is that this uses
> memory offline techiqueue (Isoalte pages and migrate remaining pages.).
>
> I think this is not 100% solution because we can't avoid fragmentation,
> but we have kernelcore= boot option and can create MOVABLE zone. That
> helps us to allow allocate a contigous range on demand.
>
> Maybe drivers can alloc contig pages by bootmem or hiding some memory
> from the kernel at boot. But if contig pages are necessary only in some
> situation, kernelcore= boot option and using page migration is a choice
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-17 4:05 ` Minchan Kim
@ 2010-10-18 0:35 ` KAMEZAWA Hiroyuki
2010-10-18 5:18 ` Minchan Kim
0 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-18 0:35 UTC (permalink / raw)
To: Minchan Kim; +Cc: linux-mm, linux-kernel
On Sun, 17 Oct 2010 13:05:22 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:
> On Wed, Oct 13, 2010 at 12:18 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > Add an function to allocate contigous memory larger than MAX_ORDER.
> > The main difference between usual page allocater is that this uses
> > memory offline techiqueue (Isoalte pages and migrate remaining pages.).
> >
> > I think this is not 100% solution because we can't avoid fragmentation,
> > but we have kernelcore= boot option and can create MOVABLE zone. That
> > helps us to allow allocate a contigous range on demand.
> >
> > Maybe drivers can alloc contig pages by bootmem or hiding some memory
> > from the kernel at boot. But if contig pages are necessary only in some
> > situation, kernelcore= boot option and using page migration is a choice.
> >
> > Anyway, to allocate a contiguous chunk larger than MAX_ORDER, we need to
> > add an overlay allocator on buddy allocator. This can be a 1st step.
> >
> > Note:
> > This function is heavy if there are tons of memory requesters. So, maybe
> > not good for 1GB pages for x86's usual use. It will requires some other
> > tricks than migration.
>
> I found many typos, but I won't point them out one by one. :)
> Please correct the typos in the next version.
>
Sorry.
> >
> > TODO:
> > - allows the caller to specify the migration target pages.
> > - reduce the number of lru_add_drain_all()..etc...system wide heavy calls.
> > - Pass gfp_t for some purpose...
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> > include/linux/page-isolation.h | 8 ++
> > mm/page_alloc.c | 29 ++++++++
> > mm/page_isolation.c | 136 +++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 173 insertions(+)
> >
> > Index: mmotm-1008/mm/page_isolation.c
> > ===================================================================
> > --- mmotm-1008.orig/mm/page_isolation.c
> > +++ mmotm-1008/mm/page_isolation.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/mm.h>
> >  #include <linux/page-isolation.h>
> >  #include <linux/pageblock-flags.h>
> > +#include <linux/swap.h>
> >  #include <linux/memcontrol.h>
> >  #include <linux/migrate.h>
> >  #include <linux/memory_hotplug.h>
> > @@ -384,3 +385,138 @@ retry:
> >         }
> >         return 0;
> >  }
> > +
> > +/**
> > + * alloc_contig_pages - allocate a contigous physical pages
> > + * @hint:      the base address of searching free space(in pfn)
> > + * @size:      size of requested area (in # of pages)
>
> Could you add a user-specified _range_ to your TODO list?
> Some embedded systems may need to allocate contiguous pages within a
> particular bank. Today a caller would have to try one base address and,
> if that fails, retry at the next offset in the same bank, which is very
> annoying. So let's add a feature that lets the user specify the _range_
> in which to allocate.
I'll add [start, end) to the arguments.
>
> > + * @node:      the node from which memory is allocated. "-1" means anywhere.
> > + * @no_search: if true, "hint" is not a hint, requirement.
>
> As I said previously, how about "strict" or "ALLOC_FIXED" like MAP_FIXED?
>
If "range" is an argument, ALLOC_FIXED is not necessary. I'll add "range".
> > + *
> > + * Search an area of @size in the physical memory map and checks wheter
>
> Typo
> whether
>
> > + * we can create a contigous free space. If it seems possible, try to
> > + * create contigous space with page migration. If no_search==true, we just try
> > + * to allocate [hint, hint+size) range of pages as contigous block.
> > + *
> > + * Returns a page of the beginning of contiguous block. At failure, NULL
> > + * is returned. Each page in the area is set to page_count() = 1. Because
>
> Why do you mention page_count() = 1?
> Do users of this function have to know it?
A user can free any page within the range for his purpose.
> > + * this function does page migration, this function is very heavy and
>
> Nitpick.
> Page migration is an implementation detail, too. Do we need to mention it here?
> We might add page reclaim or some new feature in the future, or page migration
> might become a very light operation, although that is not easy. :)
> Let's not expose the implementation to users.
>
ok.
> > + * sleeps some time. Caller must be aware that "NULL returned" is not a
> > + * special case.
>
> I think this information is enough for users.
>
> > + *
> > + * Now, returned range is aligned to MAX_ORDER. (So "hint" must be aligned
> > + * if no_search==true.)
>
> Couldn't we handle this exception?
> if (hint != MAX_ORDER_ALIGN(hint) && no_search == true)
>       return 0 or WARN_ON?
>
I'll add an "alignment" argument (for 1G hugepages)
and add the check.
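A user-space sketch of the kind of check being discussed, for illustration only. The helper names (sketch_align_up, sketch_hint_is_valid) and the value 1024 for MAX_ORDER_NR_PAGES are assumptions made here, not code from this series; the real check would live inside alloc_contig_pages() and might WARN instead of failing quietly.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration: 1 << (MAX_ORDER - 1) with MAX_ORDER = 11. */
#define SKETCH_MAX_ORDER_NR_PAGES 1024UL

/* Round pfn up to the next multiple of align (align must be a power of two). */
static unsigned long sketch_align_up(unsigned long pfn, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (pfn + align - 1) & ~(align - 1);
}

/*
 * Hypothetical pre-check: reject a request that demands an exact
 * placement (no_search) while passing an unaligned hint, instead of
 * silently rounding the hint up to a different address.
 */
static bool sketch_hint_is_valid(unsigned long hint, bool no_search)
{
	if (no_search && hint != sketch_align_up(hint, SKETCH_MAX_ORDER_NR_PAGES))
		return false;
	return true;
}
```

With such a check, an unaligned hint is only rounded up when the caller allows searching; a fixed-address request with an unaligned hint fails up front.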
> > + */
> > +
> > +#define MIGRATION_RETRY        (5)
> > +struct page *alloc_contig_pages(unsigned long hint, unsigned long size,
> > +                               int node, bool no_search)
> > +{
> > +       unsigned long base, found, end, pages, start;
> > +       struct page *ret = NULL;
> > +       int migration_failed;
> > +       struct zone *zone;
> > +
> > +       hint = MAX_ORDER_ALIGN(hint);
> > +       /*
> > +        * request size should be aligned to pageblock_order..but use
> > +        * MAX_ORDER here for avoiding messy checks.
> > +        */
> > +       pages = MAX_ORDER_ALIGN(size);
> > +       found = 0;
> > +retry:
> > +       for_each_populated_zone(zone) {
> > +               unsigned long zone_end_pfn;
> > +
> > +               if (node >= 0 && node != zone_to_nid(zone))
> > +                       continue;
> > +               if (zone->present_pages < pages)
> > +                       continue;
> > +               base = MAX_ORDER_ALIGN(zone->zone_start_pfn);
> > +               base = max(base, hint);
> > +               zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
> > +               if (base + pages > zone_end_pfn)
> > +                       continue;
> > +               found = find_contig_block(base, zone_end_pfn, pages, no_search);
> > +               /* Next try will see the next block. */
> > +               hint = base + MAX_ORDER_NR_PAGES;
> > +               if (found)
> > +                       break;
> > +       }
> > +
> > +       if (!found)
> > +               return NULL;
> > +
> > +       if (no_search && found != hint)
>
> You increased hint before "break".
> So if no_search is true, this condition (found != hint) is always true.
>
Ah...yes.
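To make the problem concrete, here is a small user-space model of the flow, offered only as a sketch. The function names and the value of SKETCH_MAX_ORDER_NR_PAGES are assumptions for illustration; sketch_find_fixed() stands in for find_contig_block() with no_search set, which can only return the base it was given or 0.

```c
#include <stdbool.h>

/* Assumed value for illustration only. */
#define SKETCH_MAX_ORDER_NR_PAGES 1024UL

/*
 * Stand-in for find_contig_block() with no_search set: it can only
 * return the base it was given (success) or 0 (failure). In this
 * model it always succeeds.
 */
static unsigned long sketch_find_fixed(unsigned long base)
{
	return base;
}

/* Mirrors the posted flow: hint is advanced before the comparison. */
static bool sketch_fixed_alloc_posted(unsigned long hint)
{
	unsigned long base = hint;
	unsigned long found = sketch_find_fixed(base);

	hint = base + SKETCH_MAX_ORDER_NR_PAGES; /* "next try" update */
	if (found != hint)	/* always taken: hint has already moved on */
		return false;
	return true;
}

/* One possible fix: compare against the base that was actually searched. */
static bool sketch_fixed_alloc_compare_base(unsigned long hint)
{
	unsigned long base = hint;
	unsigned long found = sketch_find_fixed(base);

	hint = base + SKETCH_MAX_ORDER_NR_PAGES;
	(void)hint;		/* only needed for a later retry */
	return found == base;
}
```

In this model the posted ordering rejects every fixed-address request even when the search succeeded, while comparing against the searched base accepts it. Passing a [start, end) range instead, as proposed in this thread, would remove the comparison altogether.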
>
> > +               return NULL;
> > +
> > +       /*
> > +        * Ok, here, we have contiguous pageblock marked as "isolated"
> > +        * try migration.
> > +        *
> > +        * FIXME: permanent migration_failure detection logic is required.
> > +        */
> > +       lru_add_drain_all();
> > +       flush_scheduled_work();
> > +       drain_all_pages();
> > +
> > +       end = found + pages;
> > +       /*
> > +        * scan_lru_pages() finds the next PG_lru page in the range
> > +        * scan_lru_pages() returns 0 when it reaches the end.
> > +        */
> > +       for (start = scan_lru_pages(found, end), migration_failed = 0;
> > +            start && start < end;
> > +            start = scan_lru_pages(start, end)) {
> > +               if (do_migrate_range(start, end)) {
> > +                       /* it's better to try another block ? */
> > +                       if (++migration_failed >= MIGRATION_RETRY)
> > +                               break;
> > +                       /* take a rest and synchronize LRU etc. */
> > +                       lru_add_drain_all();
> > +                       flush_scheduled_work();
> > +                       cond_resched();
> > +                       drain_all_pages();
> > +               } else /* reset migration_failure counter */
> > +                       migration_failed = 0;
> > +       }
> > +
> > +       lru_add_drain_all();
> > +       flush_scheduled_work();
> > +       drain_all_pages();
>
> Hmm.. as you mentioned, it would be better to remove many of the LRU/per-cpu flushes.
> But on embedded systems, it is unlikely to be a big overhead.
>
I'll drop flush_scheduled_work().
> > +       /* Check all pages are isolated */
> > +       if (test_pages_isolated(found, end)) {
> > +               undo_isolate_page_range(found, pages);
> > +               /*
> > +                * We failed at [start...end) migration.
> > +                * FIXME: there may be better restaring point.
> > +                */
> > +               hint = MAX_ORDER_ALIGN(end + 1);
> > +               goto retry; /* goto next chunk */
> > +       }
> > +       /*
> > +        * Ok, here, [found...found+pages) memory are isolated.
> > +        * All pages in the range will be moved into the list with
> > +        * page_count(page)=1.
> > +        */
> > +       ret = pfn_to_page(found);
> > +       alloc_contig_freed_pages(found, found + pages);
> > +       /* unset ISOLATE */
> > +       undo_isolate_page_range(found, pages);
> > +       /* Free unnecessary pages in tail */
> > +       for (start = found + size; start < found + pages; start++)
> > +               __free_page(pfn_to_page(start));
> > +       return ret;
> > +
> > +}
>
> Thanks for the good patches, Kame.
>
Thank you for the advice.
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-18 0:35 ` KAMEZAWA Hiroyuki
@ 2010-10-18 5:18 ` Minchan Kim
2010-10-18 5:31 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 26+ messages in thread
From: Minchan Kim @ 2010-10-18 5:18 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel
On Mon, Oct 18, 2010 at 9:35 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>> > + * @node: the node from which memory is allocated. "-1" means anywhere.
>> > + * @no_search: if true, "hint" is not a hint, requirement.
>>
>> As I said previously, how about "strict" or "ALLOC_FIXED" like MAP_FIXED?
>>
>
> If "range" is an argument, ALLOC_FIXED is not necessary. I'll add "range"
* Re: [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-18 5:18 ` Minchan Kim
@ 2010-10-18 5:31 ` KAMEZAWA Hiroyuki
2010-10-18 5:52 ` Minchan Kim
0 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-18 5:31 UTC (permalink / raw)
To: Minchan Kim; +Cc: linux-mm, linux-kernel
On Mon, 18 Oct 2010 14:18:52 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:
> >
> >> > + *
> >> > + * Search an area of @size in the physical memory map and checks wheter
> >>
> >> Typo
> >> whether
> >>
> >> > + * we can create a contigous free space. If it seems possible, try to
> >> > + * create contigous space with page migration. If no_search==true, we just try
> >> > + * to allocate [hint, hint+size) range of pages as contigous block.
> >> > + *
> >> > + * Returns a page of the beginning of contiguous block. At failure, NULL
> >> > + * is returned. Each page in the area is set to page_count() = 1. Because
> >>
> >> Why do you mention page_count() = 1?
> >> Do users of this function have to know it?
> >
> > A user can free any page within the range for his purpose.
>
> I don't think it's a good idea to allow page-by-page handling rather than
> handling of the page chunk the user requested.
> If a user frees pages by mistake, free_contig_pages could have trouble freeing them.
> Why do you support this feature? Do you have any motivation?
>
No big motivation.
Usual pages are set up by prep_compound_page(page, order), but that is for pages
smaller than MAX_ORDER. So I called prep_new_page() on each page, one by one.
And I don't think a new prep_xxxx_page() is required.
If you request one, ok, I'll add it.
Thanks,
-Kame
* Re: [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-18 5:31 ` KAMEZAWA Hiroyuki
@ 2010-10-18 5:52 ` Minchan Kim
2010-10-18 5:52 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 26+ messages in thread
From: Minchan Kim @ 2010-10-18 5:52 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel
On Mon, Oct 18, 2010 at 2:31 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Mon, 18 Oct 2010 14:18:52 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>> >
>> >> > + *
>> >> > + * Search an area of @size in the physical memory map and checks wheter
>> >>
>> >> Typo
>> >> whether
>> >>
>> >> > + * we can create a contigous free space. If it seems possible, try to
>> >> > + * create contigous space with page migration. If no_search==true, we just try
>> >> > + * to allocate [hint, hint+size) range of pages as contigous block
* Re: [RFC][PATCH 3/3] alloc contig pages with migration.
2010-10-18 5:52 ` Minchan Kim
@ 2010-10-18 5:52 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-18 5:52 UTC (permalink / raw)
To: Minchan Kim; +Cc: linux-mm, linux-kernel
On Mon, 18 Oct 2010 14:52:19 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:
> On Mon, Oct 18, 2010 at 2:31 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > On Mon, 18 Oct 2010 14:18:52 +0900
> > Minchan Kim <minchan.kim@gmail.com> wrote:
> >> >
> >> >> > + *
> >> >> > + * Search an area of @size in the physical memory map and checks wheter
> >> >>
> >> >> Typo
> >> >> whether
> >> >>
> >> >> > + * we can create a contigous free space. If it seems possible, try to
> >> >> > + * create contigous space with page migration. If no_search==true, we just try
> >> >> > + * to allocate [hint, hint+size) range of pages as contigous block.
> >> >> > + *
> >> >> > + * Returns a page of the beginning of contiguous block. At failure, NULL
> >> >> > + * is returned. Each page in the area is set to page_count() = 1. Because
> >> >>
> >> >> Why do you mention page_count() = 1?
> >> >> Do users of this function have to know it?
> >> >
> >> > A user can free any page within the range for his purpose.
> >>
> >> I don't think it's a good idea to allow page-by-page handling rather than
> >> handling of the page chunk the user requested.
> >> If a user frees pages by mistake, free_contig_pages could have trouble freeing them.
> >> Why do you support this feature? Do you have any motivation?
> >>
> > No big motivation.
> >
> > Usual pages are set up by prep_compound_page(page, order), but that is for pages
> > smaller than MAX_ORDER. So I called prep_new_page() on each page, one by one.
> > And I don't think a new prep_xxxx_page() is required.
> >
> > If you request one, ok, I'll add it.
>
> Maybe we are talking about different things.
>
> My question is why you mentioned "page_count() == 1" in the function description.
> Your answer was that it lets the user free some pages within the big contiguous chunk.
> My concern is this: if you didn't mention page_count() == 1 in the
> description, an ordinary user would just use alloc_contig_pages and
> free_contig_pages. That's enough for our current requirements. But
> since you mention page_count() == 1 and want users to be able to free
> some pages within the big contiguous chunk, a user who isn't an mm expert,
> or a careless one, can free pages _freely_. That could easily cause a BUG
> (free_contig_pages can free a page that was already freed by the user's
> put_page).
>
> So if there isn't a strong reason, I hope you remove the mention, to
> prevent careless API usage.
>
Ah, ok. I see. I'll update that part to say "use free_contig_page() to free a chunk".
Thanks,
-Kame
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 3:15 [RFC][PATCH 1/3] contigous big page allocator KAMEZAWA Hiroyuki
2010-10-13 3:17 ` [RFC][PATCH 2/3] find a contiguous range KAMEZAWA Hiroyuki
2010-10-13 3:18 ` [RFC][PATCH 3/3] alloc contig pages with migration KAMEZAWA Hiroyuki
@ 2010-10-13 5:05 ` KOSAKI Motohiro
2010-10-13 7:01 ` Andi Kleen
3 siblings, 0 replies; 26+ messages in thread
From: KOSAKI Motohiro @ 2010-10-13 5:05 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: kosaki.motohiro, linux-mm, linux-kernel, minchan.kim
>
> No big change since the previous version but divided into 3 patches.
> This patch is based onto mmotm-1008 and just works, IOW, not tested in
> very-bad-situation.
>
> What this wants to do:
> allocates a contiguous chunk of pages larger than MAX_ORDER.
> for device drivers (camera? etc..)
> My intention is not for allocating HUGEPAGE(> MAX_ORDER).
>
> What this does:
> allocates a contiguous chunk of page with page migration,
> based on memory hotplug codes. (memory unplug is for isolating
> a chunk of page from buddy allocator.)
>
> Consideration:
> Maybe more codes can be shared with other functions
> (memory hotplug, compaction..)
>
> Status:
> Maybe still needs more updates, works on small test.
> [1/3] ... move some codes from memory hotplug. (no functional changes)
> [2/3] ... a code for searching contiguous pages.
> [3/3] ... a code for allocating contig memory.
>
> Thanks,
> -Kame
> ==
>
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Memory hotplug is a logic for making pages unused in the specified range
> of pfn. So, some of core logics can be used for other purpose as
> allocating a very large contigous memory block.
>
> This patch moves some functions from mm/memory_hotplug.c to
> mm/page_isolation.c. This helps adding a function for large-alloc in
> page_isolation.c with memory-unplug technique.
>
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> include/linux/page-isolation.h | 7 ++
> mm/memory_hotplug.c | 109 ---------------------------------------
> mm/page_isolation.c | 114 +++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 121 insertions(+), 109 deletions(-)
>
> Index: mmotm-1008/include/linux/page-isolation.h
> ===================================================================
> --- mmotm-1008.orig/include/linux/page-isolation.h
> +++ mmotm-1008/include/linux/page-isolation.h
> @@ -33,5 +33,12 @@ test_pages_isolated(unsigned long start_
> extern int set_migratetype_isolate(struct page *page);
> extern void unset_migratetype_isolate(struct page *page);
>
> +/*
> + * For migration.
> + */
> +
> +int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn);
> +int scan_lru_pages(unsigned long start, unsigned long end);
offtopic: the return type of scan_lru_pages() should be unsigned long, since it
returns a pfn.
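As a sketch of the suggested signature, here is a toy user-space model of such a scan that returns the pfn as unsigned long. struct sketch_map and sketch_scan_lru_pages() are hypothetical names used only for this illustration; the real scan_lru_pages() walks struct page and tests PageLRU().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy memory map: is_lru[i] says whether the page at pfn
 * (base_pfn + i) is on an LRU list.
 */
struct sketch_map {
	unsigned long base_pfn;
	const bool *is_lru;
	size_t nr_pages;
};

/*
 * Return the first pfn in [start, end) whose page is on an LRU, or 0
 * if there is none. The return type is unsigned long so that any pfn
 * fits; an int could not represent large pfns.
 */
static unsigned long sketch_scan_lru_pages(const struct sketch_map *map,
					   unsigned long start,
					   unsigned long end)
{
	for (unsigned long pfn = start; pfn < end; pfn++) {
		/* Skip pfns that fall outside the toy map. */
		if (pfn < map->base_pfn || pfn - map->base_pfn >= map->nr_pages)
			continue;
		if (map->is_lru[pfn - map->base_pfn])
			return pfn;
	}
	return 0;
}
```

Returning 0 for "nothing found" matches the convention described in the patch comment ("scan_lru_pages() returns 0 when it reaches the end").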
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 3:15 [RFC][PATCH 1/3] contigous big page allocator KAMEZAWA Hiroyuki
` (2 preceding siblings ...)
2010-10-13 5:05 ` [RFC][PATCH 1/3] contigous big page allocator KOSAKI Motohiro
@ 2010-10-13 7:01 ` Andi Kleen
2010-10-13 7:12 ` KAMEZAWA Hiroyuki
2010-10-14 7:07 ` FUJITA Tomonori
3 siblings, 2 replies; 26+ messages in thread
From: Andi Kleen @ 2010-10-13 7:01 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel, minchan.kim
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
>
> What this wants to do:
> allocates a contiguous chunk of pages larger than MAX_ORDER.
> for device drivers (camera? etc..)
I think to really move forward you need a concrete use case
actually implemented in tree.
> My intention is not for allocating HUGEPAGE(> MAX_ORDER).
I still believe using this for 1GB pages would be one of the more
interesting use cases.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 7:01 ` Andi Kleen
@ 2010-10-13 7:12 ` KAMEZAWA Hiroyuki
2010-10-13 8:36 ` Andi Kleen
2010-10-14 7:07 ` FUJITA Tomonori
1 sibling, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-13 7:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-mm, linux-kernel, minchan.kim, fujita.tomonori
On Wed, 13 Oct 2010 09:01:43 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
> >
> > What this wants to do:
> > allocates a contiguous chunk of pages larger than MAX_ORDER.
> > for device drivers (camera? etc..)
>
> I think to really move forward you need a concrete use case
> actually implemented in tree.
>
yes. I heard there were users at LinuxCon Japan, so restarted.
I heard video-for-linux + ARM wants this.
I found this thread, now.
http://kerneltrap.org/mailarchive/linux-kernel/2010/10/10/4630166
Hmm.
> > My intention is not for allocating HUGEPAGE(> MAX_ORDER).
>
> I still believe using this for 1GB pages would be one of the more
> interesting use cases.
>
I'm successfully allocating 1GB of contiguous pages in testing. But I'm not sure
about the requirements and users. How quick should this allocation be?
For example, if prep_new_page() for a 1GB page is slow, what kind of chunk-of-page
construction is best?
Thanks,
-Kame
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 7:12 ` KAMEZAWA Hiroyuki
@ 2010-10-13 8:36 ` Andi Kleen
2010-10-13 8:39 ` KAMEZAWA Hiroyuki
2010-10-14 1:59 ` KOSAKI Motohiro
0 siblings, 2 replies; 26+ messages in thread
From: Andi Kleen @ 2010-10-13 8:36 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, linux-kernel, minchan.kim, fujita.tomonori
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
>> > My intention is not for allocating HUGEPAGE(> MAX_ORDER).
>>
>> I still believe using this for 1GB pages would be one of the more
>> interesting use cases.
>>
>
> I'm successfully allocating 1GB of contiguous pages in testing. But I'm not sure
> about the requirements and users. How quick should this allocation be?
This will always be slow. Huge pages are always preallocated
even today through a sysctl. The use case would be to have
echo XXX > /proc/sys/vm/nr_hugepages
work at runtime for 1GB pages too, instead of requiring a reboot
for this.
I think it's ok if that is somewhat slow, as long as it is not
incredibly slow. Ideally it shouldn't cause a swap storm either
(maybe we need some way to indicate how hard the freeing code should
try?)
I guess it would only really work well if you predefine
movable zones at boot time.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 8:36 ` Andi Kleen
@ 2010-10-13 8:39 ` KAMEZAWA Hiroyuki
2010-10-14 1:59 ` KOSAKI Motohiro
1 sibling, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-13 8:39 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-mm, linux-kernel, minchan.kim, fujita.tomonori
On Wed, 13 Oct 2010 10:36:53 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
>
> >> > My intention is not for allocating HUGEPAGE(> MAX_ORDER).
> >>
> >> I still believe using this for 1GB pages would be one of the more
> >> interesting use cases.
> >>
> >
> > I'm successfully allocating 1GB of contiguous pages in testing. But I'm not sure
> > about the requirements and users. How quick should this allocation be?
>
> This will always be slow. Huge pages are always preallocated
> even today through a sysctl. The use case would be to have
>
> echo XXX > /proc/sys/vm/nr_hugepages
>
> work at runtime for 1GB pages too, instead of requiring a reboot
> for this.
>
> I think it's ok if that is somewhat slow, as long as it is not
> incredibly slow. Ideally it shouldn't cause a swap storm either
>
> (maybe we need some way to indicate how hard the freeing code should
> try?)
>
yes. I think this patch should be updated to control memory pressure
precisely. It will improve memory hotplug's memory allocation, too.
> I guess it would only really work well if you predefine
> movable zones at boot time.
>
I think so, too. But it may be enough for embedded guys and very special systems
which need to use 1G pages.
Thanks,
-Kame
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 8:36 ` Andi Kleen
2010-10-13 8:39 ` KAMEZAWA Hiroyuki
@ 2010-10-14 1:59 ` KOSAKI Motohiro
1 sibling, 0 replies; 26+ messages in thread
From: KOSAKI Motohiro @ 2010-10-14 1:59 UTC (permalink / raw)
To: Andi Kleen
Cc: kosaki.motohiro, KAMEZAWA Hiroyuki, linux-mm, linux-kernel,
minchan.kim, fujita.tomonori
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
>
> >> > My intention is not for allocating HUGEPAGE(> MAX_ORDER).
> >>
> >> I still believe using this for 1GB pages would be one of the more
> >> interesting use cases.
> >>
> >
> > I'm successfully allocating 1GB of contiguous pages in testing. But I'm not sure
> > about the requirements and users. How quick should this allocation be?
>
> This will always be slow. Huge pages are always preallocated
> even today through a sysctl. The use case would be to have
>
> echo XXX > /proc/sys/vm/nr_hugepages
>
> work at runtime for 1GB pages too, instead of requiring a reboot
> for this.
>
> I think it's ok if that is somewhat slow, as long as it is not
> incredibly slow. Ideally it shouldn't cause a swap storm either
offtopic: When I tried to increase nr_hugepages on ia64,
which has a 256MB hugepage size, I sometimes needed to wait
more than 10 minutes if the system was under memory pressure. So slow
allocation is NOT only an issue of this contiguous allocator. We already
accept it, and we should. (I doubt it is avoidable.)
>
> (maybe we need some way to indicate how hard the freeing code should
> try?)
>
> I guess it would only really work well if you predefine
> movable zones at boot time.
>
> -Andi
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-13 7:01 ` Andi Kleen
2010-10-13 7:12 ` KAMEZAWA Hiroyuki
@ 2010-10-14 7:07 ` FUJITA Tomonori
2010-10-14 7:24 ` Andi Kleen
2010-10-14 12:09 ` Felipe Contreras
1 sibling, 2 replies; 26+ messages in thread
From: FUJITA Tomonori @ 2010-10-14 7:07 UTC (permalink / raw)
To: andi; +Cc: kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Wed, 13 Oct 2010 09:01:43 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
> >
> > What this wants to do:
> > allocates a contiguous chunk of pages larger than MAX_ORDER.
> > for device drivers (camera? etc..)
>
> I think to really move forward you need a concrete use case
> actually implemented in tree.
As already pointed out, some embedded drivers need physically
contiguous memory. Currently, they use hacky tricks (e.g. playing with
the boot memory allocators). There are several proposals for this, like
adding a new kernel memory allocator (from Samsung).
It would be ideal if the memory allocator could handle this, I think.
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 7:07 ` FUJITA Tomonori
@ 2010-10-14 7:24 ` Andi Kleen
2010-10-14 8:36 ` FUJITA Tomonori
2010-10-14 12:10 ` Felipe Contreras
2010-10-14 12:09 ` Felipe Contreras
1 sibling, 2 replies; 26+ messages in thread
From: Andi Kleen @ 2010-10-14 7:24 UTC (permalink / raw)
To: FUJITA Tomonori
Cc: andi, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Thu, Oct 14, 2010 at 04:07:12PM +0900, FUJITA Tomonori wrote:
> On Wed, 13 Oct 2010 09:01:43 +0200
> Andi Kleen <andi@firstfloor.org> wrote:
>
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
> > >
> > > What this wants to do:
> > > allocates a contiguous chunk of pages larger than MAX_ORDER.
> > > for device drivers (camera? etc..)
> >
> > I think to really move forward you need a concrete use case
> > actually implemented in tree.
>
> As already pointed out, some embedded drivers need physically
> contiguous memory. Currently, they use hacky tricks (e.g. playing with
> the boot memory allocators). There are several proposals for this, like
Are any of those in mainline?
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 7:24 ` Andi Kleen
@ 2010-10-14 8:36 ` FUJITA Tomonori
2010-10-14 12:55 ` Andi Kleen
2010-10-14 12:10 ` Felipe Contreras
1 sibling, 1 reply; 26+ messages in thread
From: FUJITA Tomonori @ 2010-10-14 8:36 UTC (permalink / raw)
To: andi
Cc: fujita.tomonori, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Thu, 14 Oct 2010 09:24:21 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> On Thu, Oct 14, 2010 at 04:07:12PM +0900, FUJITA Tomonori wrote:
> > On Wed, 13 Oct 2010 09:01:43 +0200
> > Andi Kleen <andi@firstfloor.org> wrote:
> >
> > > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
> > > >
> > > > What this wants to do:
> > > > allocates a contiguous chunk of pages larger than MAX_ORDER.
> > > > for device drivers (camera? etc..)
> > >
> > > I think to really move forward you need a concrete use case
> > > actually implemented in tree.
> >
> > As already pointed out, some embedded drivers need physically
> > contiguous memory. Currently, they use hacky tricks (e.g. playing with
> > the boot memory allocators). There are several proposals for this, like
>
> Are any of those in mainline?
The tricks or the proposals?
I think that at least one mainline driver in arm uses such a trick, but I
can't recall the name. Better to ask on the arm mailing list. Also, I
heard that there are some out-of-tree patches for this.
I think that no such proposal has been merged yet. If you are looking
for such examples, here's one:
http://lwn.net/Articles/401107/
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 8:36 ` FUJITA Tomonori
@ 2010-10-14 12:55 ` Andi Kleen
2010-10-14 15:09 ` FUJITA Tomonori
0 siblings, 1 reply; 26+ messages in thread
From: Andi Kleen @ 2010-10-14 12:55 UTC (permalink / raw)
To: FUJITA Tomonori
Cc: andi, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
> I think that at least one mainline driver in arm uses such a trick, but I
> can't recall the name. Better to ask on the arm mailing list. Also, I
> heard that there are some out-of-tree patches for this.
I'm sure there are out of tree patches for lots of things,
but at least in terms of merging mainline functionality
use cases merged in the mainline tree are required.
-Andi
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 12:55 ` Andi Kleen
@ 2010-10-14 15:09 ` FUJITA Tomonori
0 siblings, 0 replies; 26+ messages in thread
From: FUJITA Tomonori @ 2010-10-14 15:09 UTC (permalink / raw)
To: andi
Cc: fujita.tomonori, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Thu, 14 Oct 2010 14:55:19 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> > I think that at least one mainline driver in arm uses such a trick, but I
> > can't recall the name. Better to ask on the arm mailing list. Also, I
> > heard that there are some out-of-tree patches for this.
>
> I'm sure there are out of tree patches for lots of things,
> but at least in terms of merging mainline functionality
> use cases merged in the mainline tree are required.
I think that we already have drivers in mainline that need such a
feature. They rely on out-of-tree patches that reliably give contiguous
memory to these drivers.
Anyway, Felipe pointed out one user. I also think that
drivers/media/video/videobuf-dma-contig.c, which was already mentioned,
needs such a feature.
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 7:24 ` Andi Kleen
2010-10-14 8:36 ` FUJITA Tomonori
@ 2010-10-14 12:10 ` Felipe Contreras
1 sibling, 0 replies; 26+ messages in thread
From: Felipe Contreras @ 2010-10-14 12:10 UTC (permalink / raw)
To: Andi Kleen
Cc: FUJITA Tomonori, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Thu, Oct 14, 2010 at 10:24 AM, Andi Kleen <andi@firstfloor.org> wrote:
> On Thu, Oct 14, 2010 at 04:07:12PM +0900, FUJITA Tomonori wrote:
>> On Wed, 13 Oct 2010 09:01:43 +0200
>> Andi Kleen <andi@firstfloor.org> wrote:
>>
>> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
>> > >
>> > > What this wants to do:
>> > > allocates a contiguous chunk of pages larger than MAX_ORDER.
>> > > for device drivers (camera? etc..)
>> >
>> > I think to really move forward you need a concrete use case
>> > actually implemented in tree.
>>
>> As already pointed out, some embedded drivers need physically
>> contiguous memory. Currently, they use hacky tricks (e.g. playing with
>> the boot memory allocators). There are several proposals for this like
>
> Are any of those in mainline?
drivers/video/omap/
--
Felipe Contreras
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 7:07 ` FUJITA Tomonori
2010-10-14 7:24 ` Andi Kleen
@ 2010-10-14 12:09 ` Felipe Contreras
2010-10-14 15:24 ` FUJITA Tomonori
1 sibling, 1 reply; 26+ messages in thread
From: Felipe Contreras @ 2010-10-14 12:09 UTC (permalink / raw)
To: FUJITA Tomonori
Cc: andi, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Thu, Oct 14, 2010 at 10:07 AM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
> On Wed, 13 Oct 2010 09:01:43 +0200
> Andi Kleen <andi@firstfloor.org> wrote:
>
>> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
>> >
>> > What this wants to do:
>> > allocates a contiguous chunk of pages larger than MAX_ORDER.
>> > for device drivers (camera? etc..)
>>
>> I think to really move forward you need a concrete use case
>> actually implemented in tree.
>
> As already pointed out, some embedded drivers need physically
> contiguous memory. Currently, they use hacky tricks (e.g. playing with
> the boot memory allocators). There are several proposals for this like
> adding a new kernel memory allocator (from samsung).
>
> It's ideal if the memory allocator can handle this, I think.
Not only contiguous, but sometimes also coherent.
--
Felipe Contreras
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 12:09 ` Felipe Contreras
@ 2010-10-14 15:24 ` FUJITA Tomonori
2010-10-14 21:50 ` Felipe Contreras
0 siblings, 1 reply; 26+ messages in thread
From: FUJITA Tomonori @ 2010-10-14 15:24 UTC (permalink / raw)
To: felipe.contreras
Cc: fujita.tomonori, andi, kamezawa.hiroyu, linux-mm, linux-kernel,
minchan.kim
On Thu, 14 Oct 2010 15:09:13 +0300
Felipe Contreras <felipe.contreras@gmail.com> wrote:
> > As already pointed out, some embedded drivers need physically
> > contiguous memory. Currently, they use hacky tricks (e.g. playing with
> > the boot memory allocators). There are several proposals for this like
> > adding a new kernel memory allocator (from samsung).
> >
> > It's ideal if the memory allocator can handle this, I think.
>
> Not only contiguous, but sometimes also coherent.
Can you give the list of such drivers?
Anyway, in the general case, the page allocator needs to be able to
allocate large contiguous memory if we want dma_alloc_coherent to return
large contiguous coherent memory.
* Re: [RFC][PATCH 1/3] contigous big page allocator
2010-10-14 15:24 ` FUJITA Tomonori
@ 2010-10-14 21:50 ` Felipe Contreras
0 siblings, 0 replies; 26+ messages in thread
From: Felipe Contreras @ 2010-10-14 21:50 UTC (permalink / raw)
To: FUJITA Tomonori
Cc: andi, kamezawa.hiroyu, linux-mm, linux-kernel, minchan.kim
On Thu, Oct 14, 2010 at 6:24 PM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
> On Thu, 14 Oct 2010 15:09:13 +0300
> Felipe Contreras <felipe.contreras@gmail.com> wrote:
>
>> > As already pointed out, some embedded drivers need physically
>> > contiguous memory. Currently, they use hacky tricks (e.g. playing with
>> > the boot memory allocators). There are several proposals for this like
>> > adding a new kernel memory allocator (from samsung).
>> >
>> > It's ideal if the memory allocator can handle this, I think.
>>
>> Not only contiguous, but sometimes also coherent.
>
> Can you give the list of such drivers?
omapfb and tidspbridge. Perhaps tidspbridge can be modified to flush
the relevant memory, but for now it does need coherent memory. I'm not
sure about omapfb, but it is very likely that user-space would need to
be modified if flushes suddenly became required.
--
Felipe Contreras