linux-mm.kvack.org archive mirror
* Re: memory exhausted
       [not found] <5.1.0.14.2.20020424145006.00b17cb0@notes.tcindex.com>
@ 2002-04-25  2:19 ` Rik van Riel
  2002-04-25  2:57   ` William Lee Irwin III
  0 siblings, 1 reply; 6+ messages in thread
From: Rik van Riel @ 2002-04-25  2:19 UTC (permalink / raw)
  To: Vivian Wang; +Cc: linux-mm

[mailing list address corrected ... won't people ever learn to read ?]

On Wed, 24 Apr 2002, Vivian Wang wrote:

> I tried to sort my 11 GB file, but I got a "memory exhausted" message.
> I used the command like this:
> sort -u file1 -o file2
> Is this correct?

Yes, sort only has a maximum of 3 GB of virtual address space so
it will never be able to load the whole 11 GB file into memory.

> What should I do?

You could either write your own sort program that doesn't need
to have the whole file loaded or you could use a 64 bit machine
with at least 11 GB of available virtual memory, probably double
that...

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/


* Re: memory exhausted
  2002-04-25  2:19 ` memory exhausted Rik van Riel
@ 2002-04-25  2:57   ` William Lee Irwin III
  2002-04-25  4:21     ` msimons
  2002-04-27 19:30     ` H. Peter Anvin
  0 siblings, 2 replies; 6+ messages in thread
From: William Lee Irwin III @ 2002-04-25  2:57 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Vivian Wang, linux-mm

On Wed, Apr 24, 2002 at 11:19:50PM -0300, Rik van Riel wrote:
> [mailing list address corrected ... won't people ever learn to read ?]

On Wed, 24 Apr 2002, Vivian Wang wrote:
>> I tried to sort my 11 GB file, but I got a "memory exhausted" message.
>> I used the command like this:
>> sort -u file1 -o file2
>> Is this correct?

On Wed, Apr 24, 2002 at 11:19:50PM -0300, Rik van Riel wrote:
> Yes, sort only has a maximum of 3 GB of virtual address space so
> it will never be able to load the whole 11 GB file into memory.

This is larger than the virtual address space of i386 machines,
but not larger than the physical address space. In principle, an
executive taking advantage of 36-bit physical addressing extensions and
performing its own memory management on the bare metal could perform an
in-core sort on a 32-bit machine capable of 36-bit physical addressing,
e.g. i386-style PAE/highmem machines and some 32-bit MIPS machines. A
kernel module could also in principle take advantage of the kernel's
low-level memory management facilities to perform such an in-core sort.
While possible, this is absolutely not recommended.
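
For scale, the address-space figures in this exchange can be worked out
directly. This is only an arithmetic sketch: the helper name
addressable_gib is made up for illustration, the 3 GB per-process figure
is the one quoted from Rik above, and the file size is treated as 11 GiB.

```python
GIB = 2**30  # bytes in one GiB


def addressable_gib(address_bits):
    """Size in GiB of an address space with the given number of address bits."""
    return 2**address_bits // GIB


FILE_GIB = 11        # size of the file being sorted (approximated as GiB)
USER_SPACE_GIB = 3   # per-process virtual address space quoted in this thread

print(addressable_gib(32))              # 4: all 32-bit virtual addresses
print(addressable_gib(36))              # 64: 36-bit physical addresses
print(FILE_GIB <= USER_SPACE_GIB)       # False: cannot be mapped by one process
print(FILE_GIB <= addressable_gib(36))  # True: fits in the physical space
```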


On Wed, 24 Apr 2002, Vivian Wang wrote:
>> What should I do?

On Wed, Apr 24, 2002 at 11:19:50PM -0300, Rik van Riel wrote:
> You could either write your own sort program that doesn't need
> to have the whole file loaded or you could use a 64 bit machine
> with at least 11 GB of available virtual memory, probably double
> that...
> regards,
> Rik

It's doubtful the "solutions" I mentioned above are practical for
your purposes unless you are under the most extreme duress and have
access to uncommon hardware. I suggest polyphase merge sorting or any
of the various algorithms recommended in Donald E. Knuth's "The Art of
Computer Programming", specifically its chapter on external sorting,
which I'm willing to discuss and assist in implementations of off-list.
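
As a rough illustration of the external-sorting idea, here is a minimal
sketch. It is a plain two-pass k-way merge built on Python's
heapq.merge, not the polyphase merge described by Knuth and not the code
of any existing sort program. The function name external_sort_unique
and the chunk_lines parameter are hypothetical, chosen for this example.

```python
import heapq
import itertools
import os
import tempfile


def external_sort_unique(src, dst, chunk_lines=100_000, tmpdir=None):
    """Sort the lines of src into dst, dropping duplicate lines, while
    holding at most chunk_lines lines in memory at a time."""
    runs = []
    try:
        # Pass 1: read bounded chunks, sort each in memory, save it as a run.
        with open(src, "rb") as f:
            while True:
                chunk = list(itertools.islice(f, chunk_lines))
                if not chunk:
                    break
                # Give a final unterminated line a newline so it compares
                # equal to the same text seen earlier in the file.
                chunk = [l if l.endswith(b"\n") else l + b"\n" for l in chunk]
                chunk.sort()
                run = tempfile.NamedTemporaryFile(dir=tmpdir, delete=False)
                run.writelines(chunk)
                run.close()
                runs.append(run.name)

        # Pass 2: merge the sorted runs lazily, skipping repeated lines.
        files = [open(name, "rb") for name in runs]
        try:
            with open(dst, "wb") as out:
                previous = None
                for line in heapq.merge(*files):
                    if line != previous:
                        out.write(line)
                        previous = line
        finally:
            for f in files:
                f.close()
    finally:
        # Remove the temporary runs whether or not the sort succeeded.
        for name in runs:
            os.remove(name)
```

Lines are compared as raw bytes, so the ordering is byte-wise rather
than locale-aware. The sketch opens every run at once during the merge;
with a very large number of runs it would have to merge in batches to
stay under the open-file limit.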


Cheers,
Bill

* Re: memory exhausted
  2002-04-25  2:57   ` William Lee Irwin III
@ 2002-04-25  4:21     ` msimons
  2002-04-27 19:30     ` H. Peter Anvin
  1 sibling, 0 replies; 6+ messages in thread
From: msimons @ 2002-04-25  4:21 UTC (permalink / raw)
  To: Vivian Wang; +Cc: William Lee Irwin III, Rik van Riel, linux-mm

> On Wed, 24 Apr 2002, Vivian Wang wrote:
> >> I tried to sort my 11 GB file, but I got a "memory exhausted" message.
> >> sort -u file1 -o file2
> >> What should I do?

Vivian,

  If you are using the GNU version of sort, it does not even try to load
the input set into memory; it splits the input into chunks, sorts each
chunk, and then merges the sorted chunks in multiple passes.  The sort
operation will be very disk I/O bound...

  You must set -T (or TMPDIR) to point at a filesystem with enough disk 
space to store the temporary files if your /tmp doesn't have enough room.
You didn't paste the exact text of the error message so I didn't check 
to see what error is generated if /tmp fills up while working.
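
  A quick way to check the free space before starting, as a sketch.
The helper names sort_tmpdir and has_room are made up for this example,
and the idea that the temporary files need roughly as much room as the
input is an assumption, not something taken from the sort documentation.

```python
import os
import shutil


def sort_tmpdir():
    """Directory used for temporary files when -T is not given:
    TMPDIR if it is set, otherwise /tmp."""
    return os.environ.get("TMPDIR", "/tmp")


def has_room(directory, needed_bytes):
    """True if the filesystem holding directory has at least
    needed_bytes free."""
    return shutil.disk_usage(directory).free >= needed_bytes


# Example (not run here): compare free space with the input size.
# needed = os.path.getsize("file1")
# print(has_room(sort_tmpdir(), needed))
```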


  It appears there is a restriction that sort must be able to load a few
of the longest lines into memory.

- What is the longest line in your input file?
  (run "wc -L file1")

  If that number is larger than available memory, you may need to find
a way to chop your data into shorter lines.  In my own testing of very
big data files (with short line lengths) I had no problems.
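
  If your wc lacks -L, the longest line can be measured with a short
script.  This is a sketch only: longest_line_bytes is a made-up name,
and it counts bytes, which can differ from what "wc -L" prints for tabs
or multi-byte characters.  It reads fixed-size blocks, so it works even
when a single line is too long to hold in memory.

```python
def longest_line_bytes(path, block_size=1 << 20):
    """Length in bytes of the longest line in path, newline excluded."""
    longest = 0
    current = 0  # length of the line currently being read
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            pieces = block.split(b"\n")
            # The first piece continues the line already in progress.
            current += len(pieces[0])
            # Every further piece means a newline ended the previous line.
            for piece in pieces[1:]:
                longest = max(longest, current)
                current = len(piece)
    # Account for a final line that has no trailing newline.
    return max(longest, current)
```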


  Lastly, sort splits the input into 500 KiB chunks on the first
pass.  For an 11 GB file this will create about 22000 files in your
TMPDIR location, so if your filesystem limits the number of files per
directory you could be having a problem there.
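
  The file count is just the input size divided by the chunk size.  A
small worked check, taking the 500 KiB chunk size stated above as given
(chunk_count is a name made up for this example):

```python
def chunk_count(total_bytes, chunk_bytes):
    """Number of chunks needed to hold total_bytes, rounding up."""
    return (total_bytes + chunk_bytes - 1) // chunk_bytes


CHUNK = 500 * 1024  # 500 KiB

print(chunk_count(11 * 10**9, CHUNK))  # 21485 if "11 GB" means 11 * 10^9 bytes
print(chunk_count(11 * 2**30, CHUNK))  # 23069 if it means 11 GiB
```

  Either way the result is in the neighbourhood of 22000 files.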


  Check those things and if you are still having a problem, send the
error messages you see, and some more information (like longest line).

    Good Luck,
      Mike

ps:
  I think this question may be off topic, since it's a userspace problem,
not a kernel one... so if anyone complains you might want to take it off channel
or to some GNU textutils related mailing list.



* Re: memory exhausted
  2002-04-25  2:57   ` William Lee Irwin III
  2002-04-25  4:21     ` msimons
@ 2002-04-27 19:30     ` H. Peter Anvin
  2002-04-27 20:38       ` William Lee Irwin III
  1 sibling, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2002-04-27 19:30 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Rik van Riel, Vivian Wang, linux-mm

William Lee Irwin III wrote:
> 
> This is larger than the virtual address space of i386 machines,
> but not larger than the physical address space. In principle, an
> executive taking advantage of 36-bit physical addressing extensions and
> performing its own memory management on the bare metal could perform an
> in-core sort on a 36-bit physical addressing -capable 32-bit machine,
> e.g. i386-style PAE/highmem machines and some 32-bit MIPS machines. A
> kernel module could also in principle take advantage of the kernel's
> low-level memory management facilities to perform such an in-core sort.
> While possible, this is absolutely not recommended.
> 

Good God, I hope x86-64 catches on soon and kills off this PAE silliness...

	-hpa



* Re: memory exhausted
  2002-04-27 19:30     ` H. Peter Anvin
@ 2002-04-27 20:38       ` William Lee Irwin III
  2002-04-27 20:52         ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: William Lee Irwin III @ 2002-04-27 20:38 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Rik van Riel, Vivian Wang, linux-mm

William Lee Irwin III wrote:
>> This is larger than the virtual address space of i386 machines,
>> but not larger than the physical address space. In principle, an
>> executive taking advantage of 36-bit physical addressing extensions and
>> performing its own memory management on the bare metal could perform an
>> in-core sort on a 36-bit physical addressing -capable 32-bit machine,
>> e.g. i386-style PAE/highmem machines and some 32-bit MIPS machines. A
>> kernel module could also in principle take advantage of the kernel's
>> low-level memory management facilities to perform such an in-core sort.
>> While possible, this is absolutely not recommended.


On Sat, Apr 27, 2002 at 12:30:49PM -0700, H. Peter Anvin wrote:
> Good God, I hope x86-64 catches on soon and kills off this PAE silliness...
> 	-hpa


Taunting me, eh? =)

Well, I did say "absolutely not recommended". 64-bit hardware of
whatever kind is without question a more appropriate solution to these
kinds of issues than such shenanigans anyway, and at this point I'm
more or less sorry I brought that up. And I'll leave the discussion
of what specific lines of hardware are most suitable for other fora. =)


Cheers,
Bill

* Re: memory exhausted
  2002-04-27 20:38       ` William Lee Irwin III
@ 2002-04-27 20:52         ` H. Peter Anvin
  0 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2002-04-27 20:52 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Rik van Riel, Vivian Wang, linux-mm

William Lee Irwin III wrote:
> 
>>Good God, I hope x86-64 catches on soon and kills off this PAE silliness...
> 
> Well, I did say "absolutely not recommended". 64-bit hardware of
> whatever kind is without question a more appropriate solution to these
> kinds of issues than such shenanigans anyway, and at this point I'm
> more or less sorry I brought that up. And I'll leave the discussion
> of what specific lines of hardware are most suitable for other fora. =)
> 

Oh, no, I didn't mean to pick on you.  It was more of a general lament.
I hope x86-64 can make 64-bit computing widespread.

	-hpa


