* weird calloc problem
@ 1999-03-10 1:51 marco saraniti
1999-03-17 15:47 ` Stephen C. Tweedie
0 siblings, 1 reply; 4+ messages in thread
From: marco saraniti @ 1999-03-10 1:51 UTC
To: linux-kernel, linux-mm
Hi there,
I'm having a calloc problem that made me waste three weeks, at this point
I'm out of options, and I was wondering if this can be a kernel- or
MM-related problem. Furthermore, the system is a relatively big machine and
I'd like to share my experience with other people who are interested in
using Linux for number crunching.
The problem is trivial: calloc returns a NULL, even if there is a lot
of free memory. Yes, both arguments of calloc are always > 0.
The code is a big Monte Carlo simulation program, developed by myself
(mostly) and by several other people. The whole thing is more than 60000
lines of C. What I can say is:
1) no use of sbrk, just malloc,calloc,realloc,free. There's no heavy
use of the calloc/free couple or of realloc. The allocation occurs in
a cycle: data are computed in a statically allocated buffer, then the
right amount of memory is allocated, and the buffer is memcpyed into
the freshly allocated memory. Then the whole thing starts again for a
new set of data (using the same buffer). The dimension of the
allocated memory varies, from very small to several hundred KB. After
several thousand iterations (and five hours!) calloc gives a
NULL. The problem doesn't occur if the program is forced to start a
few iterations before the critical one.
2) the process size increases exactly as the memory allocation counter
implemented in the program. This is the last vmstat report *before* the
calloc failure:
   procs                      memory    swap        io    system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in   cs  us  sy  id
 2  0  0      0 914960 387020  78600   0   0     0     0  117   53  49   3  48
3) when used, dmalloc (which uses sbrk) reports no errors; it just exits,
complaining that it cannot increase the heap size.
4) the error is reproduced perfectly in subsequent runs, but it occurs at
different iterations if dmalloc is used, or if the compiler is changed, or
if the kernel is changed.
5) I tried kernel 2.2.1 and 2.2.3, compilers gcc2.7.2.3, pgcc1.1, egcs1.1.
6) I tried to swap the memory banks on the motherboard, no change.
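For readers following along, the allocation cycle described in item 1 can be sketched roughly as below. This is only an illustration, not the simulation's code: `store_block` and the `total` counter are hypothetical names, and the failure report simply prints the request size, the running total, and whatever `errno` says.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes from the static work buffer into a
 * freshly calloc'ed block and add n to the running allocation counter.
 * Returns the new block, or NULL after reporting the failure.
 */
void *store_block(const void *buf, size_t n, size_t *total)
{
    void *p;

    errno = 0;
    p = calloc(n, 1);
    if (p == NULL) {
        fprintf(stderr, "calloc(%lu) failed after %lu bytes: %s\n",
                (unsigned long) n, (unsigned long) *total,
                strerror(errno));
        return NULL;
    }
    memcpy(p, buf, n);
    *total += n;
    return p;
}
```

Logging the request size and the running total at the point of failure makes it easier to compare runs that fail at different iterations.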
The machine is a dual PII (400MHz) on a mboard supermicro P6DGE with
2GB of *non* ECC SDRAM (PC100). No IDE disks, everything is SCSI
(controller AHA 2940UW). The kernel has been compiled enabling SMP,
and after changing the value 0xC0000000 to 0x80000000 in
/usr/src/linux/arch/i386/vmlinux.lds and
/usr/src/linux/include/asm-i386/page.h. No "mem=" directive is used in
lilo.conf. I have four swap partitions, 128MB each. The netcard is a
SMC Etherpower II (10/100 PCI). There's also a awe64 soundcard. The
videocard is a matrox millennium G200 AGP. The Linux distribution is
a plain RH5.2, Xserver Accelerated X and CDE, both from X-inside.
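To make the effect of that kernel edit concrete: on i386 the constant being changed marks where the kernel's share of the 4GB virtual address space begins, so everything below it is what a single user process can address. The arithmetic below is a sketch with hypothetical helper names. The `split / 3` rule for where the kernel starts looking for mmap space is an assumption about the 2.2 i386 headers and should be checked against the source tree in use.

```c
#include <stdio.h>

/* Megabytes of virtual address space below the kernel split. */
unsigned long long user_space_mb(unsigned long long split)
{
    return split >> 20;
}

/*
 * ASSUMPTION: 2.2-era i386 kernels start searching for mmap space
 * (shared libraries, large mappings) at one third of the user range.
 */
unsigned long long mmap_search_base(unsigned long long split)
{
    return split / 3;
}

/* Print the layout implied by one split value. */
void print_split(unsigned long long split)
{
    printf("split 0x%08llx: user space %llu MB, mmap search base 0x%08llx\n",
           split, user_space_mb(split), mmap_search_base(split));
}
```

Calling `print_split(0xC0000000ULL)` and `print_split(0x80000000ULL)` compares the stock layout with the modified one: the change lets the kernel map 2GB of RAM, but it also shrinks what each process can address from 3GB to 2GB.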
The question is even more trivial than the problem: what's wrong? Why
does calloc refuse to allocate memory if there's a full GB of (apparently)
free RAM?
thanks a lot for your help, *any* suggestion will be appreciated.
marco
PS please reply also to my email address, I'm not a subscriber of the
mail list.
====================================================================
Marco Saraniti
Assistant Professor of Electrical and Computer Engineering
Department of Electrical and Computer Engineering #SH329
Illinois Institute of Technology - Main Campus
3301 South Dearborn
Chicago, IL 60616
Tel: (312) 567 8813
Fax: (312) 567 8976
email: saraniti@ece.iit.edu
www: www.ece.iit.edu/Faculty/marco.html
--
To unsubscribe, send a message with 'unsubscribe linux-mm my@address'
in the body to majordomo@kvack.org. For more info on Linux MM,
see: http://humbolt.geo.uu.nl/Linux-MM/
* Re: weird calloc problem
1999-03-10 1:51 weird calloc problem marco saraniti
@ 1999-03-17 15:47 ` Stephen C. Tweedie
1999-03-17 23:12 ` Richard B. Johnson
0 siblings, 1 reply; 4+ messages in thread
From: Stephen C. Tweedie @ 1999-03-17 15:47 UTC
To: saraniti; +Cc: linux-kernel, linux-mm
Hi,
On Tue, 9 Mar 1999 19:51:32 -0600 (EST), marco saraniti
<saraniti@neumann.ece.iit.edu> said:
> I'm having a calloc problem that made me waste three weeks, at this point
> I'm out of options, and I was wondering if this can be a kernel- or
> MM-related problem. Furthermore, the system is a relatively big machine and
> I'd like to share my experience with other people who are interested in
> using Linux for number crunching.
> The problem is trivial: calloc returns a NULL, even if there is a lot
> of free memory. Yes, both arguments of calloc are always > 0.
Do you have any evidence that this is a kernel problem as opposed to a
user-space problem? A "ps -m" listing of the process concerned when
the fault happens would be useful in pinning this down, as would a
"strace" output.
--Stephen
* Re: weird calloc problem
1999-03-17 15:47 ` Stephen C. Tweedie
@ 1999-03-17 23:12 ` Richard B. Johnson
1999-03-18 13:00 ` Manfred Spraul
0 siblings, 1 reply; 4+ messages in thread
From: Richard B. Johnson @ 1999-03-17 23:12 UTC
To: Stephen C. Tweedie; +Cc: saraniti, linux-kernel, linux-mm
On Wed, 17 Mar 1999, Stephen C. Tweedie wrote:
> Hi,
>
> On Tue, 9 Mar 1999 19:51:32 -0600 (EST), marco saraniti
> <saraniti@neumann.ece.iit.edu> said:
>
> > I'm having a calloc problem that made me waste three weeks, at this point
> > I'm out of options, and I was wondering if this can be a kernel- or
> > MM-related problem. Furthermore, the system is a relatively big machine and
> > I'd like to share my experience with other people who are interested in
> > using Linux for number crunching.
>
> > The problem is trivial: calloc returns a NULL, even if there is a lot
> > of free memory. Yes, both arguments of calloc are always > 0.
>
Here is a simple program and its output that, on my system, clearly
shows that if I want more array-space I just need to increase the size
of my swap file.
Script started on Wed Mar 17 18:05:00 1999
# show
Crunching 1048576 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 17907712 312451072 2613248 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 1572864 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 20004864 310353920 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 2359296 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 23150592 307208192 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 3538944 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 27869184 302489600 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 5308416 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 34947072 295411712 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 7962624 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 45563904 284794880 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 11943936 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 61489152 268869632 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 17915904 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 85381120 244977664 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 26873856 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 121245696 209113088 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 40310784 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 175046656 155312128 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 60466176 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 255746048 74612736 2621440 5787648 2535424
SwapTotal: 248996 kB
SwapFree: 248996 kB
Crunching 90699264 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 327876608 2482176 2293760 5787648 7868416
SwapTotal: 248996 kB
SwapFree: 196036 kB
Crunching 136048896 elements
total: used: free: shared: buffers: cached:
Mem: 330358784 328028160 2330624 569344 5787648 4739072
SwapTotal: 248996 kB
SwapFree: 21924 kB
calloc(816293376) failed
# exit
exit
Script done on Wed Mar 17 18:06:20 1999
Here is the test program.
#include <stdio.h>
#include <stdlib.h>

#define ARRAY   0x100000    /* initial number of elements */
#define BUF_LEN 0x100       /* line buffer for /proc/meminfo */

int main(void)
{
    char buf[BUF_LEN];
    size_t len, i;
    size_t *pf;
    FILE *file;
    int line;

    len = ARRAY;
    for(;;)
    {
        if((pf = calloc(len, sizeof(size_t))) == NULL)
        {
            fprintf(stderr, "calloc(%lu) failed\n",
                    (unsigned long) (len * sizeof(size_t)));
            exit(EXIT_FAILURE);
        }
        fprintf(stdout, " Crunching %lu elements\n", (unsigned long) len);
        /* Touch every element so the pages are really allocated. */
        for(i = 0; i < len; i++)
            pf[i] = i;
        /*
         * seek and rewind don't work on proc (files have no size),
         * so reopen the file on every pass.
         */
        if((file = fopen("/proc/meminfo", "r")) == NULL)
        {
            fprintf(stderr, "You need the /proc file system mounted\n");
            exit(EXIT_FAILURE);
        }
        /* Print lines 1-2 (header, Mem:) and 9-10 (SwapTotal, SwapFree). */
        for(line = 1; line <= 10 && fgets(buf, BUF_LEN, file) != NULL; line++)
            if(line <= 2 || line >= 9)
                fputs(buf, stdout);
        puts("");
        fclose(file);
        free(pf);
        len += (len / 2);
    }
}
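The program above makes one large allocation per pass and frees it again, whereas the original report describes many blocks of varying size that are kept. A variation closer to that pattern might look like the sketch below. The function name, the size sequence, and the byte cap are all invented for illustration; nothing here comes from the simulation itself.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Each kept block starts with a link so everything can be freed later. */
struct block {
    struct block *next;
};

/*
 * Keep allocating blocks of varying size (hypothetical sequence: 1 KB
 * doubling up to 512 KB, then starting over) until calloc fails or at
 * least `cap` bytes are held. Stores the number of blocks in *count and
 * returns the bytes held. All blocks are released before returning.
 */
size_t fill_until(size_t cap, size_t *count)
{
    struct block *head = NULL, *b;
    size_t total = 0, n = 0, size = 1024;

    while (total < cap) {
        b = calloc(1, size);
        if (b == NULL) {
            fprintf(stderr,
                    "calloc(%lu) failed after %lu bytes in %lu blocks\n",
                    (unsigned long) size, (unsigned long) total,
                    (unsigned long) n);
            break;
        }
        /* Touch the payload so the pages are really backed. */
        memset(b + 1, 0xff, size - sizeof *b);
        b->next = head;
        head = b;
        total += size;
        n++;
        size = (size >= 512 * 1024) ? 1024 : size * 2;
    }
    while (head != NULL) {
        b = head->next;
        free(head);
        head = b;
    }
    *count = n;
    return total;
}
```

Run with a cap above the expected failure point, the message on stderr shows how many bytes were held when calloc gave up, which can then be compared with what /proc/meminfo reports as free.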
Cheers,
Dick Johnson
***** FILE SYSTEM WAS MODIFIED *****
Penguin : Linux version 2.2.3 on an i686 machine (400.59 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
* Re: weird calloc problem
1999-03-17 23:12 ` Richard B. Johnson
@ 1999-03-18 13:00 ` Manfred Spraul
0 siblings, 0 replies; 4+ messages in thread
From: Manfred Spraul @ 1999-03-18 13:00 UTC
To: saraniti; +Cc: linux-kernel, linux-mm
On Tue, 9 Mar 1999 19:51:32 -0600 (EST), marco saraniti
<saraniti@neumann.ece.iit.edu> said:
> I'm having a calloc problem that made me waste three weeks, at this point
> I'm out of options, and I was wondering if this can be a kernel- or
> MM-related problem. Furthermore, the system is a relatively big machine and
> I'd like to share my experience with other people who are interested in
> using Linux for number crunching.
>
> The problem is trivial: calloc returns a NULL, even if there is a lot
> of free memory. Yes, both arguments of calloc are always > 0.
You wrote 'the system is a relatively big machine'.
Perhaps you have run out of virtual memory.
How much memory do you try to allocate? (more than 1 Gigabyte?)
How much physical memory do you have?
You could also pause the process as soon as calloc returns NULL
(e.g. if(ptr==NULL) while(1) { printf("error!!\n");} )
and look at the information in /proc/<pid>. The file formats are
described in 'man proc'.
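As a sketch of that suggestion: rather than spinning in a loop, the failing branch could copy the relevant /proc entries to stderr and then stop, so the process can still be inspected. `copy_lines` and `dump_proc_self` are hypothetical helpers; `/proc/self/maps` and `/proc/self/status` are the per-process entries described in 'man proc'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy everything from `in` to `out`; return the number of
 * newline-terminated lines copied.
 */
size_t copy_lines(FILE *in, FILE *out)
{
    char buf[512];
    size_t lines = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        fputs(buf, out);
        if (strchr(buf, '\n') != NULL)
            lines++;
    }
    return lines;
}

/*
 * Copy /proc/self/<name> (for example "maps" or "status") to `out`.
 * Returns the number of lines copied, or 0 if the entry cannot be opened.
 */
size_t dump_proc_self(const char *name, FILE *out)
{
    char path[64];
    FILE *in;
    size_t lines;

    snprintf(path, sizeof path, "/proc/self/%s", name);
    in = fopen(path, "r");
    if (in == NULL)
        return 0;
    fprintf(out, "--- %s ---\n", path);
    lines = copy_lines(in, out);
    fclose(in);
    return lines;
}
```

In the branch where calloc returns NULL, one might call `dump_proc_self("maps", stderr)` and `dump_proc_self("status", stderr)`, then `pause()` from `<unistd.h>` so the process stays around for a look at /proc/<pid>.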
Regards,
Manfred