linux-mm.kvack.org archive mirror
* RE: [PATCH 5/5] Light fragmentation avoidance without usemap: 005_drainpercpu
@ 2005-11-22 23:43 Seth, Rohit
  2005-11-23  0:17 ` Mel Gorman
  0 siblings, 1 reply; 5+ messages in thread
From: Seth, Rohit @ 2005-11-22 23:43 UTC (permalink / raw)
  To: Mel Gorman, linux-mm; +Cc: nickpiggin, ak, linux-kernel, lhms-devel, mingo

>Per-cpu pages can accidentally cause fragmentation because they are free, but
>pinned pages in an otherwise contiguous block.  When this patch is applied,
>the per-cpu caches are drained after direct reclaim if the

I don't think this is the right place to drain the pcp.  Since direct
reclaim has already been done, it is possible that the allocator can
service the request without draining the pcps.


>requested order is greater than 3. 

Why this order limit?  Most of the previous failures seen (because of my
earlier patches of bigger and more physically contiguous chunks for pcps)
were with order 1 allocations.

>It simply reuses the code used by suspend
>and hotplug, and is only triggered when anti-defragmentation is enabled.
>
That code has issues with a pre-emptible kernel.

I will shortly be sending a patch to free pages from the pcp when a
higher-order allocation cannot be serviced from the global list.
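
For illustration, a hypothetical sketch of that approach (this is not
the actual patch; __rmqueue(), __drain_pages() and zone->lock are the
page_alloc.c pieces of this era, and the wrapper name is made up):

	/*
	 * If a higher-order request cannot be satisfied from the buddy
	 * lists, spill this CPU's per-cpu pages back and retry once.
	 * __drain_pages() takes zone->lock itself, so the lock is
	 * dropped around it; irqs stay disabled so smp_processor_id()
	 * is stable.
	 */
	static struct page *rmqueue_drain_retry(struct zone *zone,
						unsigned int order)
	{
		struct page *page;
		unsigned long flags;

		local_irq_save(flags);
		spin_lock(&zone->lock);
		page = __rmqueue(zone, order);
		spin_unlock(&zone->lock);

		if (page == NULL && order > 0) {
			__drain_pages(smp_processor_id());
			spin_lock(&zone->lock);
			page = __rmqueue(zone, order);
			spin_unlock(&zone->lock);
		}
		local_irq_restore(flags);

		return page;
	}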

-rohit


* RE: [PATCH 5/5] Light fragmentation avoidance without usemap: 005_drainpercpu
  2005-11-22 23:43 [PATCH 5/5] Light fragmentation avoidance without usemap: 005_drainpercpu Seth, Rohit
@ 2005-11-23  0:17 ` Mel Gorman
  2005-11-23  1:22   ` Rohit Seth
  0 siblings, 1 reply; 5+ messages in thread
From: Mel Gorman @ 2005-11-23  0:17 UTC (permalink / raw)
  To: Seth, Rohit; +Cc: linux-mm, nickpiggin, ak, linux-kernel, lhms-devel, mingo

On Tue, 22 Nov 2005, Seth, Rohit wrote:

> From: Mel Gorman, Sent: Tuesday, November 22, 2005 11:18 AM
>
> >Per-cpu pages can accidentally cause fragmentation because they are free, but
> >pinned pages in an otherwise contiguous block.  When this patch is applied,
> >the per-cpu caches are drained after direct reclaim if the
>
> I don't think this is the right place to drain the pcp.  Since direct
> reclaim has already been done, it is possible that the allocator can
> service the request without draining the pcps.
>

ok, true. A check should be made to see if the allocation can now succeed
and, if not, then drain. A more appropriate place might be after this block;

                if (page)
                        goto got_pg;
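
As a rough sketch of that ordering (get_page_from_freelist() below is a
stand-in for the zone loop the real __alloc_pages() uses at this point,
so treat the helper and its arguments as assumptions):

	did_some_progress = try_to_free_pages(zonelist->zones, gfp_mask);

	if (likely(did_some_progress)) {
		/* retry the allocation first */
		page = get_page_from_freelist(gfp_mask, order, zonelist);
		if (page)
			goto got_pg;

		/* drain the per-cpu lists only if the retry failed */
		drain_all_local_pages();
		page = get_page_from_freelist(gfp_mask, order, zonelist);
		if (page)
			goto got_pg;
	}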

>
> >requested order is greater than 3.
>
> Why this order limit?  Most of the previous failures seen (because of my
> earlier patches of bigger and more physically contiguous chunks for pcps)
> were with order 1 allocations.
>

The order-3 limit is because of this block;

        if (!(gfp_mask & __GFP_NORETRY)) {
                if ((order <= 3) || (gfp_mask & __GFP_REPEAT))
                        do_retry = 1;
                if (gfp_mask & __GFP_NOFAIL)
                        do_retry = 1;
        }

If the order is 3 or less, we are retrying anyway and it's something we are
already doing. Since the retry was already felt to have a chance of working,
I felt that draining the per-cpu caches was unnecessary.

> >It simply reuses the code used by suspend
> >and hotplug, and is only triggered when anti-defragmentation is enabled.
> >
> That code has issues with a pre-emptible kernel.
>

ok... why? I thought that we could only be preempted when we were about to
take a spinlock, but I have an imperfect understanding of preempt and
things change quickly. The path that drain_all_local_pages() takes
disables the local IRQs before calling __drain_pages(), and when
smp_drain_local_pages() is called, the local IRQs are disabled again
before releasing pages. Where can we get preempted?

> I will shortly be sending a patch to free pages from the pcp when a
> higher-order allocation cannot be serviced from the global list.
>

If that works, this part of the patch can be dropped. The intention is to
"drain the per-cpu lists by some mechanism"; I am not too particular about
how it happens. Right now, at least on my 4-way machine, the per-cpu caches
make a massive difference to whether a large number of contiguous blocks
can be allocated or not.

-- 
Mel Gorman
Part-time PhD Student                          Java Applications Developer
University of Limerick                         IBM Dublin Software Lab


* RE: [PATCH 5/5] Light fragmentation avoidance without usemap: 005_drainpercpu
  2005-11-23  0:17 ` Mel Gorman
@ 2005-11-23  1:22   ` Rohit Seth
  2005-11-23  8:33     ` Mel Gorman
  0 siblings, 1 reply; 5+ messages in thread
From: Rohit Seth @ 2005-11-23  1:22 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, nickpiggin, ak, linux-kernel, lhms-devel, mingo

On Wed, 2005-11-23 at 00:17 +0000, Mel Gorman wrote:
> On Tue, 22 Nov 2005, Seth, Rohit wrote:
> 
> >
> >
> > >requested order is greater than 3.
> >
> > Why this order limit?  Most of the previous failures seen (because of my
> > earlier patches of bigger and more physically contiguous chunks for pcps)
> > were with order 1 allocations.
> >
> 
> The order-3 limit is because of this block;
> 
>         if (!(gfp_mask & __GFP_NORETRY)) {
>                 if ((order <= 3) || (gfp_mask & __GFP_REPEAT))
>                         do_retry = 1;
>                 if (gfp_mask & __GFP_NOFAIL)
>                         do_retry = 1;
>         }
> 
> If the order is 3 or less, we are retrying anyway and it's something we are

You are retrying (for 0<order<=3) but without draining the pcps (in your
patch).

> > That code has issues with a pre-emptible kernel.
> >
> 
> ok... why? I thought that we could only be preempted when we were about to
> take a spinlock, but I have an imperfect understanding of preempt and
> things change quickly. The path that drain_all_local_pages() takes
> disables the local IRQs before calling __drain_pages(), and when
> smp_drain_local_pages() is called, the local IRQs are disabled again
> before releasing pages. Where can we get preempted?
> 

Basically, get_cpu()/put_cpu() need to cover the whole scope of the
smp_processor_id() usage.  (When you enable CONFIG_DEBUG_PREEMPT, the
kernel will barf if preemption is enabled while smp_processor_id() is
called.)

If interrupts are disabled all the way through then you wouldn't be
preempted, though.  But get_cpu()/put_cpu() is the right mechanism to
ensure that smp_processor_id() and the values derived from it are used
on the same processor.
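
A minimal sketch of that pattern (the wrapper name is made up;
__drain_pages() is the helper from your patch):

	static void preempt_safe_drain(void)
	{
		unsigned long flags;
		int cpu = get_cpu();	/* disables preemption */

		local_irq_save(flags);
		__drain_pages(cpu);	/* cpu cannot change under us */
		local_irq_restore(flags);

		put_cpu();		/* re-enables preemption */
	}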

> > I will shortly be sending a patch to free pages from the pcp when a
> > higher-order allocation cannot be serviced from the global list.
> >
> 
> If that works, this part of the patch can be dropped. The intention is to
> "drain the per-cpu lists by some mechanism"; I am not too particular about
> how it happens. Right now, at least on my 4-way machine, the per-cpu caches
> make a massive difference to whether a large number of contiguous blocks
> can be allocated or not.
> 

Please let me know if you see any issues with the patch that I sent out
a bit earlier.

Thanks,
-rohit


* RE: [PATCH 5/5] Light fragmentation avoidance without usemap: 005_drainpercpu
  2005-11-23  1:22   ` Rohit Seth
@ 2005-11-23  8:33     ` Mel Gorman
  0 siblings, 0 replies; 5+ messages in thread
From: Mel Gorman @ 2005-11-23  8:33 UTC (permalink / raw)
  To: Rohit Seth; +Cc: linux-mm, nickpiggin, ak, linux-kernel, lhms-devel, mingo

On Tue, 22 Nov 2005, Rohit Seth wrote:

> On Wed, 2005-11-23 at 00:17 +0000, Mel Gorman wrote:
> > On Tue, 22 Nov 2005, Seth, Rohit wrote:
> >
> > >
> > >
> > > >requested order is greater than 3.
> > >
> > > Why this order limit?  Most of the previous failures seen (because of my
> > > earlier patches of bigger and more physically contiguous chunks for pcps)
> > > were with order 1 allocations.
> > >
> >
> > The order-3 limit is because of this block;
> >
> >         if (!(gfp_mask & __GFP_NORETRY)) {
> >                 if ((order <= 3) || (gfp_mask & __GFP_REPEAT))
> >                         do_retry = 1;
> >                 if (gfp_mask & __GFP_NOFAIL)
> >                         do_retry = 1;
> >         }
> >
> > If the order is 3 or less, we are retrying anyway and it's something we are
>
> You are retrying (for 0<order<=3) but without draining the pcps (in your
> patch).
>
> > > That code has issues with a pre-emptible kernel.
> > >
> >
> > ok... why? I thought that we could only be preempted when we were about to
> > take a spinlock, but I have an imperfect understanding of preempt and
> > things change quickly. The path that drain_all_local_pages() takes
> > disables the local IRQs before calling __drain_pages(), and when
> > smp_drain_local_pages() is called, the local IRQs are disabled again
> > before releasing pages. Where can we get preempted?
> >
>
> Basically, get_cpu()/put_cpu() need to cover the whole scope of the
> smp_processor_id() usage.  (When you enable CONFIG_DEBUG_PREEMPT, the
> kernel will barf if preemption is enabled while smp_processor_id() is
> called.)
>
> If interrupts are disabled all the way through then you wouldn't be
> preempted, though.  But get_cpu()/put_cpu() is the right mechanism to
> ensure that smp_processor_id() and the values derived from it are used
> on the same processor.
>

That can be fixed easily enough.

> > > I will shortly be sending a patch to free pages from the pcp when a
> > > higher-order allocation cannot be serviced from the global list.
> > >
> >
> > If that works, this part of the patch can be dropped. The intention is to
> > "drain the per-cpu lists by some mechanism"; I am not too particular about
> > how it happens. Right now, at least on my 4-way machine, the per-cpu caches
> > make a massive difference to whether a large number of contiguous blocks
> > can be allocated or not.
> >
>
> Please let me know if you see any issues with the patch that I sent out
> a bit earlier.
>

I don't have access to my test environment for the rest of the week so I
can't actually try them out.

However, reading through the patches, they appear to duplicate a
significant amount of the existing drain_local_pages() functionality, and
they only drain the pages on the currently running CPU. On a system with a
number of CPUs, you will only be improving your chances slightly.

I think you would get more of what you need with this patch if you;

1. Removed the compile-time dependency on CONFIG_PM || CONFIG_HOTPLUG_CPU
2. Rechecked the usage of smp_processor_id() (although I don't think it's
    wrong because it's only called with local IRQs disabled)
3. Drained the CPUs after direct reclaim if the allocation is still failing

This patch does everything you need, including the draining of remote
per-cpu lists.

-- 
Mel Gorman
Part-time PhD Student                          Java Applications Developer
University of Limerick                         IBM Dublin Software Lab


* [PATCH 5/5] Light fragmentation avoidance without usemap: 005_drainpercpu
  2005-11-22 19:17 [PATCH 0/5] Light fragmentation avoidance without usemap Mel Gorman
@ 2005-11-22 19:17 ` Mel Gorman
  0 siblings, 0 replies; 5+ messages in thread
From: Mel Gorman @ 2005-11-22 19:17 UTC (permalink / raw)
  To: linux-mm; +Cc: Mel Gorman, nickpiggin, ak, linux-kernel, lhms-devel, mingo

Per-cpu pages can accidentally cause fragmentation because they are free, but
pinned pages in an otherwise contiguous block.  When this patch is applied,
the per-cpu caches are drained after direct reclaim if the
requested order is greater than 3. It simply reuses the code used by suspend
and hotplug, and is only triggered when anti-defragmentation is enabled.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.15-rc1-mm2-004_configurable/mm/page_alloc.c linux-2.6.15-rc1-mm2-005_drainpercpu/mm/page_alloc.c
--- linux-2.6.15-rc1-mm2-004_configurable/mm/page_alloc.c	2005-11-22 16:53:03.000000000 +0000
+++ linux-2.6.15-rc1-mm2-005_drainpercpu/mm/page_alloc.c	2005-11-22 16:53:45.000000000 +0000
@@ -689,7 +689,9 @@ void drain_remote_pages(void)
 }
 #endif
 
-#if defined(CONFIG_PM) || defined(CONFIG_HOTPLUG_CPU)
+#if defined(CONFIG_PM) || \
+	defined(CONFIG_HOTPLUG_CPU) || \
+	defined(CONFIG_PAGEALLOC_ANTIDEFRAG)
 static void __drain_pages(unsigned int cpu)
 {
 	struct zone *zone;
@@ -716,10 +718,9 @@ static void __drain_pages(unsigned int c
 		}
 	}
 }
-#endif /* CONFIG_PM || CONFIG_HOTPLUG_CPU */
+#endif /* CONFIG_PM || CONFIG_HOTPLUG_CPU || CONFIG_PAGEALLOC_ANTIDEFRAG */
 
 #ifdef CONFIG_PM
-
 void mark_free_pages(struct zone *zone)
 {
 	unsigned long zone_pfn, flags;
@@ -746,7 +747,9 @@ void mark_free_pages(struct zone *zone)
 	}
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
+#endif /* CONFIG_PM */
 
+#if defined(CONFIG_PM) || defined(CONFIG_PAGEALLOC_ANTIDEFRAG)
 /*
  * Spill all of this CPU's per-cpu pages back into the buddy allocator.
  */
@@ -758,7 +761,28 @@ void drain_local_pages(void)
 	__drain_pages(smp_processor_id());
 	local_irq_restore(flags);	
 }
-#endif /* CONFIG_PM */
+
+void smp_drain_local_pages(void *arg)
+{
+	drain_local_pages();
+}
+
+/*
+ * Spill all the per-cpu pages from all CPUs back into the buddy allocator
+ */
+void drain_all_local_pages(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__drain_pages(smp_processor_id());
+	local_irq_restore(flags);
+
+	smp_call_function(smp_drain_local_pages, NULL, 0, 1);
+}
+#else
+void drain_all_local_pages(void) {}
+#endif /* CONFIG_PM || CONFIG_PAGEALLOC_ANTIDEFRAG */
 
 void zone_statistics(struct zonelist *zonelist, struct zone *z)
 {
@@ -1109,6 +1133,9 @@ rebalance:
 
 	did_some_progress = try_to_free_pages(zonelist->zones, gfp_mask);
 
+	if (order > 3)
+		drain_all_local_pages();
+
 	p->reclaim_state = NULL;
 	p->flags &= ~PF_MEMALLOC;
 

