* hugepage test failures
@ 2007-07-23 19:04 Randy Dunlap
2007-07-23 20:18 ` Nish Aravamudan
2007-07-24 0:02 ` Ken Chen
0 siblings, 2 replies; 7+ messages in thread
From: Randy Dunlap @ 2007-07-23 19:04 UTC (permalink / raw)
To: linux-mm
Hi,
I'm a few hundred linux-mm emails behind, so maybe this has been
addressed already. I hope so.
I run hugepage-mmap and hugepage-shm tests (from Doc/vm/hugetlbpage.txt)
on a regular basis. Lately they have been failing, usually with -ENOMEM,
but sometimes the mmap() succeeds and hugepage-mmap gets a SIGBUS:
open("/mnt/hugetlbfs/hugepagefile", O_RDWR|O_CREAT, 0755) = 3
mmap(NULL, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2af31d2c3000
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2af32d2c3000
write(1, "Returned address is 0x2af31d2c30"..., 35) = 35
--- SIGBUS (Bus error) @ 0 (0) ---
+++ killed by SIGBUS +++
and:
# ./hugepage-shm
shmget: Cannot allocate memory
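For reference, the failing hugepage-mmap test has roughly this shape. This is a hedged user-space sketch, not the exact Doc/vm/hugetlbpage.txt code: it maps a regular temporary file via tmpfile() and uses a 4 KB LENGTH so it runs anywhere, where the real test opens /mnt/hugetlbfs/hugepagefile with a 256 MB LENGTH; the comments mark where the -ENOMEM and SIGBUS above appear.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define LENGTH 4096UL  /* the documented test uses 256UL*1024*1024 */

int run_mmap_test(void)
{
	/* real test: open("/mnt/hugetlbfs/hugepagefile", O_RDWR|O_CREAT, 0755) */
	FILE *f = tmpfile();
	if (!f)
		return -1;
	int fd = fileno(f);
	if (ftruncate(fd, (off_t)LENGTH) < 0) {
		fclose(f);
		return -1;
	}
	char *addr = mmap(NULL, LENGTH, PROT_READ | PROT_WRITE,
			  MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {  /* the usual -ENOMEM failures surface here */
		fclose(f);
		return -1;
	}
	printf("Returned address is %p\n", (void *)addr);
	/* Write then read back, as the documented test does; with too few
	 * reserved hugepages, the first touch is where the SIGBUS in the
	 * strace arrives instead. */
	int ok = 1;
	for (unsigned long i = 0; i < LENGTH; i++)
		addr[i] = (char)(i & 0x7f);
	for (unsigned long i = 0; i < LENGTH; i++)
		if (addr[i] != (char)(i & 0x7f))
			ok = 0;
	munmap(addr, LENGTH);
	fclose(f);
	return ok ? 0 : -1;
}
```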
I added printk()s in many mm/mmap.c and mm/hugetlb.c error return
locations and got this:
hugetlb_reserve_pages: -ENOMEM
which comes from mm/hugetlb.c::hugetlb_reserve_pages():
	if (chg > cpuset_mems_nr(free_huge_pages_node)) {
		printk(KERN_DEBUG "%s: -ENOMEM\n", __func__);
		return -ENOMEM;
	}
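To make the failing check concrete: cpuset_mems_nr() sums a per-node counter over the nodes in the task's cpuset, and the reservation is refused when the pages to charge (chg) exceed that sum. A hedged user-space sketch, with illustrative names and a fixed-size node array rather than the kernel's actual types:

```c
#include <stddef.h>

#define MAX_NODES 4

/* Sketch of cpuset_mems_nr(): sum a per-node array over the nodes
 * allowed by the cpuset (allowed[] stands in for the mems mask). */
static unsigned long cpuset_mems_nr_sketch(const unsigned long per_node[],
					   const int allowed[], size_t n)
{
	unsigned long sum = 0;
	for (size_t i = 0; i < n; i++)
		if (allowed[i])
			sum += per_node[i];
	return sum;
}

/* Sketch of the hugetlb_reserve_pages() check that fired: the mapping
 * is rejected (-ENOMEM) when chg exceeds the free hugepages visible
 * to the cpuset. */
static int reserve_ok(unsigned long chg,
		      const unsigned long free_huge_pages_node[],
		      const int allowed[], size_t n)
{
	return chg <= cpuset_mems_nr_sketch(free_huge_pages_node,
					    allowed, n);
}
```

This is why a cpuset restricted to nodes with no free hugepages fails even when other nodes have plenty, though disabling CONFIG_CPUSETS should have removed that restriction.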
I had CONFIG_CPUSETS=y so I disabled it, but the same error
still happens.
Suggestions? Fixes?
Thanks.
---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
* Re: hugepage test failures
2007-07-23 19:04 hugepage test failures Randy Dunlap
@ 2007-07-23 20:18 ` Nish Aravamudan
2007-07-23 20:30 ` Randy Dunlap
2007-07-24 0:02 ` Ken Chen
1 sibling, 1 reply; 7+ messages in thread
From: Nish Aravamudan @ 2007-07-23 20:18 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-mm
On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> Hi,
>
> I'm a few hundred linux-mm emails behind, so maybe this has been
> addressed already. I hope so.
>
> I run hugepage-mmap and hugepage-shm tests (from Doc/vm/hugetlbpage.txt)
> on a regular basis. Lately they have been failing, usually with -ENOMEM,
> but sometimes the mmap() succeeds and hugepage-mmap gets a SIGBUS:
Would it be possible for you instead to run the libhugetlbfs tests?
They are kept up to date, at least.
> open("/mnt/hugetlbfs/hugepagefile", O_RDWR|O_CREAT, 0755) = 3
> mmap(NULL, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2af31d2c3000
> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2af32d2c3000
> write(1, "Returned address is 0x2af31d2c30"..., 35) = 35
> --- SIGBUS (Bus error) @ 0 (0) ---
> +++ killed by SIGBUS +++
>
>
> and:
>
> # ./hugepage-shm
> shmget: Cannot allocate memory
>
>
> I added printk()s in many mm/mmap.c and mm/hugetlb.c error return
> locations and got this:
>
> hugetlb_reserve_pages: -ENOMEM
>
> which comes from mm/hugetlb.c::hugetlb_reserve_pages():
>
> 	if (chg > cpuset_mems_nr(free_huge_pages_node)) {
> 		printk(KERN_DEBUG "%s: -ENOMEM\n", __func__);
> 		return -ENOMEM;
> 	}
>
> I had CONFIG_CPUSETS=y so I disabled it, but the same error
> still happens.
As in the same cpuset_mems_nr() check fails?
> Suggestions? Fixes?
Which kernel is this?
Thanks,
Nish
* Re: hugepage test failures
2007-07-23 20:30 ` Randy Dunlap
@ 2007-07-23 20:29 ` Nish Aravamudan
2007-07-23 20:45 ` Randy Dunlap
2007-07-24 0:23 ` Ken Chen
1 sibling, 1 reply; 7+ messages in thread
From: Nish Aravamudan @ 2007-07-23 20:29 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-mm
On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> Nish Aravamudan wrote:
> > On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> >> Hi,
> >>
> >> I'm a few hundred linux-mm emails behind, so maybe this has been
> >> addressed already. I hope so.
> >>
> >> I run hugepage-mmap and hugepage-shm tests (from Doc/vm/hugetlbpage.txt)
> >> on a regular basis. Lately they have been failing, usually with -ENOMEM,
> >> but sometimes the mmap() succeeds and hugepage-mmap gets a SIGBUS:
> >
> > Would it be possible for you instead to run the libhugetlbfs tests?
>
> OK, I'm downloading that now.
Great, thanks. I believe `make func` runs the same tests as those
intended by Doc/vm/hugetlbpage.txt.
> > They are kept up to date, at least.
>
> You mean that the Doc/ tree is not kept up to date? ;(
Well, I think we all know that is true. But I wasn't aware there was a
testcase in the Documentation directory. I'll see what I can do about
making sure that is up to date.
> But this represents an R*word (regression).
> These tests ran successfully until recently (I can't say when).
Ok. I'm not sure a lot of hugetlb.c stuff has gone in very recently.
Any chance you can narrow down the window?
> >> open("/mnt/hugetlbfs/hugepagefile", O_RDWR|O_CREAT, 0755) = 3
> >> mmap(NULL, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) =
> >> 0x2af31d2c3000
> >> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
> >> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> >> 0) = 0x2af32d2c3000
> >> write(1, "Returned address is 0x2af31d2c30"..., 35) = 35
> >> --- SIGBUS (Bus error) @ 0 (0) ---
> >> +++ killed by SIGBUS +++
> >>
> >>
> >> and:
> >>
> >> # ./hugepage-shm
> >> shmget: Cannot allocate memory
> >>
> >>
> >> I added printk()s in many mm/mmap.c and mm/hugetlb.c error return
> >> locations and got this:
> >>
> >> hugetlb_reserve_pages: -ENOMEM
> >>
> >> which comes from mm/hugetlb.c::hugetlb_reserve_pages():
> >>
> >> 	if (chg > cpuset_mems_nr(free_huge_pages_node)) {
> >> 		printk(KERN_DEBUG "%s: -ENOMEM\n", __func__);
> >> 		return -ENOMEM;
> >> 	}
> >>
> >> I had CONFIG_CPUSETS=y so I disabled it, but the same error
> >> still happens.
> >
> > As in the same cpuset_mems_nr() check fails?
> >
> >> Suggestions? Fixes?
> >
> > Which kernel is this?
>
> Ah, sorry, 2.6.23-rc1.
Architecture? I'll try and reproduce here.
Thanks,
Nish
* Re: hugepage test failures
2007-07-23 20:18 ` Nish Aravamudan
@ 2007-07-23 20:30 ` Randy Dunlap
2007-07-23 20:29 ` Nish Aravamudan
2007-07-24 0:23 ` Ken Chen
0 siblings, 2 replies; 7+ messages in thread
From: Randy Dunlap @ 2007-07-23 20:30 UTC (permalink / raw)
To: Nish Aravamudan; +Cc: linux-mm
Nish Aravamudan wrote:
> On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>> Hi,
>>
>> I'm a few hundred linux-mm emails behind, so maybe this has been
>> addressed already. I hope so.
>>
>> I run hugepage-mmap and hugepage-shm tests (from Doc/vm/hugetlbpage.txt)
>> on a regular basis. Lately they have been failing, usually with -ENOMEM,
>> but sometimes the mmap() succeeds and hugepage-mmap gets a SIGBUS:
>
> Would it be possible for you instead to run the libhugetlbfs tests?
OK, I'm downloading that now.
> They are kept up to date, at least.
You mean that the Doc/ tree is not kept up to date? ;(
But this represents an R*word (regression).
These tests ran successfully until recently (I can't say when).
>> open("/mnt/hugetlbfs/hugepagefile", O_RDWR|O_CREAT, 0755) = 3
>> mmap(NULL, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) =
>> 0x2af31d2c3000
>> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
>> 0) = 0x2af32d2c3000
>> write(1, "Returned address is 0x2af31d2c30"..., 35) = 35
>> --- SIGBUS (Bus error) @ 0 (0) ---
>> +++ killed by SIGBUS +++
>>
>>
>> and:
>>
>> # ./hugepage-shm
>> shmget: Cannot allocate memory
>>
>>
>> I added printk()s in many mm/mmap.c and mm/hugetlb.c error return
>> locations and got this:
>>
>> hugetlb_reserve_pages: -ENOMEM
>>
>> which comes from mm/hugetlb.c::hugetlb_reserve_pages():
>>
>> 	if (chg > cpuset_mems_nr(free_huge_pages_node)) {
>> 		printk(KERN_DEBUG "%s: -ENOMEM\n", __func__);
>> 		return -ENOMEM;
>> 	}
>>
>> I had CONFIG_CPUSETS=y so I disabled it, but the same error
>> still happens.
>
> As in the same cpuset_mems_nr() check fails?
>
>> Suggestions? Fixes?
>
> Which kernel is this?
Ah, sorry, 2.6.23-rc1.
--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
* Re: hugepage test failures
2007-07-23 20:29 ` Nish Aravamudan
@ 2007-07-23 20:45 ` Randy Dunlap
0 siblings, 0 replies; 7+ messages in thread
From: Randy Dunlap @ 2007-07-23 20:45 UTC (permalink / raw)
To: Nish Aravamudan; +Cc: linux-mm
Nish Aravamudan wrote:
> On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>> Nish Aravamudan wrote:
>> > On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>> >> Hi,
>> >>
>> >> I'm a few hundred linux-mm emails behind, so maybe this has been
>> >> addressed already. I hope so.
>> >>
>> >> I run hugepage-mmap and hugepage-shm tests (from
>> Doc/vm/hugetlbpage.txt)
>> >> on a regular basis. Lately they have been failing, usually with
>> -ENOMEM,
>> >> but sometimes the mmap() succeeds and hugepage-mmap gets a SIGBUS:
>> >
>> > Would it be possible for you instead to run the libhugetlbfs tests?
>>
>> OK, I'm downloading that now.
>
> Great, thanks. I believe `make func` runs the same tests as those
> intended by Doc/vm/hugetlbpage.txt.
>
>> > They are kept up to date, at least.
>>
>> You mean that the Doc/ tree is not kept up to date? ;(
>
> Well, I think we all know that is true. But I wasn't aware there was a
> testcase in the Documentation directory. I'll see what I can do about
> making sure that is up to date.
You could begin with my (old) patch to make them standalone .c files
instead of being buried in a txt file. (All programs in Doc/ should
be like this IMO.)
>> But this represents an R*word (regression).
>> These tests ran successfully until recently (I can't say when).
>
> Ok. I'm not sure a lot of hugetlb.c stuff has gone in very recently.
> Any chance you can narrow down the window?
Maybe.
>> >> open("/mnt/hugetlbfs/hugepagefile", O_RDWR|O_CREAT, 0755) = 3
>> >> mmap(NULL, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) =
>> >> 0x2af31d2c3000
>> >> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
>> >> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
>> >> 0) = 0x2af32d2c3000
>> >> write(1, "Returned address is 0x2af31d2c30"..., 35) = 35
>> >> --- SIGBUS (Bus error) @ 0 (0) ---
>> >> +++ killed by SIGBUS +++
>> >>
>> >>
>> >> and:
>> >>
>> >> # ./hugepage-shm
>> >> shmget: Cannot allocate memory
>> >>
>> >>
>> >> I added printk()s in many mm/mmap.c and mm/hugetlb.c error return
>> >> locations and got this:
>> >>
>> >> hugetlb_reserve_pages: -ENOMEM
>> >>
>> >> which comes from mm/hugetlb.c::hugetlb_reserve_pages():
>> >>
>> >> 	if (chg > cpuset_mems_nr(free_huge_pages_node)) {
>> >> 		printk(KERN_DEBUG "%s: -ENOMEM\n", __func__);
>> >> 		return -ENOMEM;
>> >> 	}
>> >>
>> >> I had CONFIG_CPUSETS=y so I disabled it, but the same error
>> >> still happens.
>> >
>> > As in the same cpuset_mems_nr() check fails?
>> >
>> >> Suggestions? Fixes?
>> >
>> > Which kernel is this?
>>
>> Ah, sorry, 2.6.23-rc1.
>
> Architecture? I'll try and reproduce here.
x86_64.
--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
* Re: hugepage test failures
2007-07-23 19:04 hugepage test failures Randy Dunlap
2007-07-23 20:18 ` Nish Aravamudan
@ 2007-07-24 0:02 ` Ken Chen
1 sibling, 0 replies; 7+ messages in thread
From: Ken Chen @ 2007-07-24 0:02 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-mm
On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> I'm a few hundred linux-mm emails behind, so maybe this has been
> addressed already. I hope so.
>
> I run hugepage-mmap and hugepage-shm tests (from Doc/vm/hugetlbpage.txt)
> on a regular basis. Lately they have been failing, usually with -ENOMEM,
> but sometimes the mmap() succeeds and hugepage-mmap gets a SIGBUS:
man, what did people do to hugetlb?
In dequeue_huge_page(), it just loops around all the alloc'able
zones, even though this function is supposed to allocate *ONE*
hugetlb page. That is a serious memory leak. We need a break
statement in the inner if statement there.
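The missing break can be sketched in user space as follows. This is a hedged illustration of the loop shape Ken describes, not the 2.6.23 dequeue_huge_page() source; the names and list representation are invented for the sketch:

```c
#include <stddef.h>

struct zone_sketch {
	int nr_free;  /* free hugepages queued on this zone's node */
};

/* Walk the zonelist and dequeue ONE page.  Without the break, every
 * allowable zone with free pages would be decremented on a single
 * call -- the leak described above.  Returns pages dequeued. */
static int dequeue_one_huge_page(struct zone_sketch *zones, size_t nzones)
{
	int dequeued = 0;
	for (size_t i = 0; i < nzones; i++) {
		if (zones[i].nr_free > 0) {
			zones[i].nr_free--;
			dequeued++;
			break;  /* the missing statement: stop after one page */
		}
	}
	return dequeued;
}
```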
- Ken
* Re: hugepage test failures
2007-07-23 20:30 ` Randy Dunlap
2007-07-23 20:29 ` Nish Aravamudan
@ 2007-07-24 0:23 ` Ken Chen
1 sibling, 0 replies; 7+ messages in thread
From: Ken Chen @ 2007-07-24 0:23 UTC (permalink / raw)
To: Randy Dunlap; +Cc: Nish Aravamudan, linux-mm
On 7/23/07, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> > They are kept uptodate, at least.
>
> You mean that the Doc/ tree is not kept up to date? ;(
AFAICT, the sample code in Documentation/vm/hugetlbpage.txt is up to
date. I'm not aware of any bugs in the user space example code (except
maybe the memory segment LENGTH being too big at 256MB). If there are
bugs there, I would like to hear about them.
> But this represents an R*word (regression).
> These tests ran successfully until recently (I can't say when).
Yeah, it's a true regression.
Thread overview: 7+ messages
2007-07-23 19:04 hugepage test failures Randy Dunlap
2007-07-23 20:18 ` Nish Aravamudan
2007-07-23 20:30 ` Randy Dunlap
2007-07-23 20:29 ` Nish Aravamudan
2007-07-23 20:45 ` Randy Dunlap
2007-07-24 0:23 ` Ken Chen
2007-07-24 0:02 ` Ken Chen