* fallout of 16K stacks
From: Andi Kleen @ 2014-07-07 22:30 UTC
To: torvalds; +Cc: linux-mm, linux-kernel
Since the 16K stack change I noticed a number of problems with
my usual stress tests. They have a tendency to bomb out
because something cannot fork.
- AIM7 on a dual-socket system now cannot reliably run
>1000 parallel jobs.
- LTP stress + memhog stress in parallel to something else
usually doesn't survive the night.
Do we need to strengthen the memory allocator to try
harder for 16K?
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: fallout of 16K stacks
From: H. Peter Anvin @ 2014-07-07 22:49 UTC
To: Andi Kleen, torvalds; +Cc: linux-mm, linux-kernel
On 07/07/2014 03:30 PM, Andi Kleen wrote:
>
> Since the 16K stack change I noticed a number of problems with
> my usual stress tests. They have a tendency to bomb out
> because something cannot fork.
As in ENOMEM or does something worse happen?
> - AIM7 on a dual-socket system now cannot reliably run
> >1000 parallel jobs.
... with how much RAM?
> - LTP stress + memhog stress in parallel to something else
> usually doesn't survive the night.
>
> Do we need to strengthen the memory allocator to try
> harder for 16K?
Can we even? The probability of success goes down exponentially in the
order requested. Movable pages can help, of course, but still, there is
a very real cost to this :(
-hpa
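
To make the order arithmetic concrete: with 4K pages a 16K stack is an order-2 request, i.e. four physically contiguous pages. The following is a minimal Python sketch, assuming the usual /proc/buddyinfo layout (one column of free-block counts per order, starting at order 0). It counts how many free blocks in each zone are large enough for such a request. The helper names are hypothetical, and the counts are only a snapshot, not a prediction of allocation success.

```python
import math

PAGE_SIZE = 4096  # assumption: 4K pages, as on x86-64


def order_for(size_bytes, page_size=PAGE_SIZE):
    """Smallest buddy order whose block (page_size << order) holds size_bytes."""
    pages = max(1, math.ceil(size_bytes / page_size))
    return (pages - 1).bit_length()


def parse_buddyinfo(text):
    """Map (node, zone) -> list of free-block counts, where index = order."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: Node 0, zone Normal c0 c1 c2 ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(c) for c in fields[4:]]
    return zones


def blocks_at_or_above(counts, order):
    """Free blocks big enough for the order (larger blocks can be split)."""
    return sum(counts[order:])


if __name__ == "__main__":
    want = order_for(16 * 1024)  # 16K kernel stack
    print(f"16K stack = order {want}")
    try:
        with open("/proc/buddyinfo") as f:
            for key, counts in parse_buddyinfo(f.read()).items():
                print(key, blocks_at_or_above(counts, want))
    except OSError as exc:
        print("cannot read /proc/buddyinfo:", exc)
```

Sampling this periodically during a stress run would show whether order-2 blocks are actually becoming scarce.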
* Re: fallout of 16K stacks
From: Andi Kleen @ 2014-07-07 23:04 UTC
To: H. Peter Anvin; +Cc: Andi Kleen, torvalds, linux-mm, linux-kernel
On Mon, Jul 07, 2014 at 03:49:48PM -0700, H. Peter Anvin wrote:
> On 07/07/2014 03:30 PM, Andi Kleen wrote:
> >
> > Since the 16K stack change I noticed a number of problems with
> > my usual stress tests. They have a tendency to bomb out
> > because something cannot fork.
>
> As in ENOMEM or does something worse happen?
EAGAIN, then the workload stops. For an overnight stress
test that's pretty catastrophic. It may have killed some stuff
with the OOM killer too.
> > - AIM7 on a dual-socket system now cannot reliably run
> > >1000 parallel jobs.
>
> ... with how much RAM?
This system has 32G
> > - LTP stress + memhog stress in parallel to something else
> > usually doesn't survive the night.
> >
> > Do we need to strengthen the memory allocator to try
> > harder for 16K?
>
> Can we even? The probability of success goes down exponentially in the
> order requested. Movable pages can help, of course, but still, there is
> a very real cost to this :(
I hope so. In the worst case just try longer.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
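
Since the errno value matters for the diagnosis, a stress harness can record it at the point of failure. The following is a minimal Python sketch, assuming a Unix system; the function names are hypothetical and the mapping from errno to cause is only the rough one documented for fork(), not a full analysis.

```python
import errno
import os
import resource


def explain_fork_errno(err):
    """Rough mapping of fork() errno values to likely causes."""
    if err == errno.EAGAIN:
        return "EAGAIN: a process/thread count limit (e.g. RLIMIT_NPROC) was hit"
    if err == errno.ENOMEM:
        return "ENOMEM: the kernel could not allocate memory for the new task"
    return f"{errno.errorcode.get(err, err)}: unexpected fork failure"


def try_fork():
    """Fork once; return None on success or an explanation string on failure."""
    try:
        pid = os.fork()
    except OSError as exc:
        return explain_fork_errno(exc.errno)
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers
    os.waitpid(pid, 0)  # parent: reap the child
    return None


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print(f"RLIMIT_NPROC soft={soft} hard={hard}")
    print(try_fork() or "fork succeeded")
```

Logging the RLIMIT_NPROC values alongside the failure makes it easier to tell a limit check from a genuine allocation failure.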
* Re: fallout of 16K stacks
From: Linus Torvalds @ 2014-07-07 23:52 UTC
To: Andi Kleen; +Cc: H. Peter Anvin, linux-mm, Linux Kernel Mailing List
On Mon, Jul 7, 2014 at 4:04 PM, Andi Kleen <andi@firstfloor.org> wrote:
>>
>> As in ENOMEM or does something worse happen?
>
> EAGAIN, then the workload stops. For an overnight stress
> test that's pretty catastrophic. It may have killed some stuff
> with the OOM killer too.
I don't think it's OOM.
We have long had the rule that order <= PAGE_ALLOC_COSTLY_ORDER (which
is 3) allocations imply __GFP_RETRY unless you explicitly ask it not
to.
And THREAD_SIZE_ORDER is still smaller than that.
Sure, if the system makes no progress at all, it will still oom for
allocations like that, but that's *not* going to happen for something
like a 32GB machine afaik.
And if it was the actual dup_task_struct() that failed (due to
alloc_thread_info_node() now failing), it should have returned ENOMEM
anyway.
So EAGAIN is due to something else.
The only cases for fork() returning EAGAIN I can find are the
RLIMIT_NPROC and max_threads checks.
And the thing is, the default value for RLIMIT_NPROC is actually
initialized based on THREAD_SIZE (which doubled), so maybe it's really
just that rlimit check that now triggers.
Hmm?
Linus
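
The arithmetic behind the last point can be sketched as follows. This is a minimal Python illustration, not kernel code. It assumes 4K pages, exactly 32 GiB of counted RAM, and the fork_init() formula of kernels of that era as recalled from memory: max_threads = mempages / (8 * THREAD_SIZE / PAGE_SIZE), with the default RLIMIT_NPROC set to max_threads / 2 and a floor of 20 threads. The function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4K pages
PAGE_ALLOC_COSTLY_ORDER = 3  # value stated in the message above


def thread_size_order(thread_size, page_size=PAGE_SIZE):
    """Buddy order of one kernel stack: log2(thread_size / page_size)."""
    return (thread_size // page_size).bit_length() - 1


def default_limits(ram_bytes, thread_size, page_size=PAGE_SIZE):
    """Return (max_threads, default RLIMIT_NPROC) under the assumed formula."""
    mempages = ram_bytes // page_size
    max_threads = mempages // (8 * thread_size // page_size)
    max_threads = max(max_threads, 20)  # assumed boot-time floor
    return max_threads, max_threads // 2


if __name__ == "__main__":
    ram = 32 * 2**30  # assumption: exactly 32 GiB counted by the kernel
    for stack in (8 * 1024, 16 * 1024):
        order = thread_size_order(stack)
        max_threads, nproc = default_limits(ram, stack)
        print(f"{stack // 1024}K stack: order {order} "
              f"(costly above {PAGE_ALLOC_COSTLY_ORDER}), "
              f"max_threads={max_threads}, RLIMIT_NPROC={nproc}")
```

Under these assumptions the default limit drops from 262144 to 131072 when the stack grows from 8K to 16K, while the stack order (2) stays below PAGE_ALLOC_COSTLY_ORDER. Whether the stress tests actually reach that limit is not established in the thread.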