* Re: mmap: is default non-populating behavior stable?
From: Peter Zijlstra @ 2008-11-03 22:41 UTC
To: Eugene V. Lyubimkin
Cc: linux-kernel, linux-mm, hugh

On Mon, 2008-11-03 at 23:57 +0200, Eugene V. Lyubimkin wrote:
> Hello kernel hackers!
>
> The current implementation of mmap() in kernel is very convenient.
> It allows to mmap(fd) very big amount of memory having small file as back-end.
> So one can mmap() 100 MiB on empty file, use first 10 KiB of memory, munmap() and have
> only 10 KiB of file at the end. And while working with memory, file will automatically be
> grown by read/write memory requests.
>
> Question is: can user-space application rely on this behavior (I failed to find any
> documentation about this)?
>
> TIA and please CC me in replies.

mmap() writes past the end of the file should not grow the file if I
understand things right, but produce a SIGBUS (after the first page size
alignment).

The exact interaction of mmap() and truncate() I'm not exactly clear on.

The safe way to do things is to first create your file of at least the
size you mmap, using truncate. This will create a sparse file, and will
on any sane filesystem not take more space than its meta data.

Thereafter you can fill it with writes to the mmap.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
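The "size the file first, then write through the mapping" sequence described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name sized_mmap_write() and its parameters are made up for this example, ftruncate() on an open descriptor stands in for "using truncate", and error handling is kept minimal.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create or reset `path`,
 * extend it to `len` bytes with ftruncate(), map it MAP_SHARED, and copy
 * `msg` to the start of the mapping.
 * Returns 0 on success, -1 on failure.
 */
int sized_mmap_write(const char *path, size_t len, const char *msg)
{
    size_t n = strlen(msg);
    if (len == 0 || n > len)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Size the file before mapping so every mapped page lies within EOF. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    memcpy(map, msg, n); /* writes only to the start of the mapping */

    int rc = msync(map, len, MS_SYNC); /* flush dirty pages to the file */
    if (munmap(map, len) != 0)
        rc = -1;
    return rc;
}
```

Calling sized_mmap_write("cache.bin", 100 * 1024 * 1024, "hello") would leave a file whose length is 100 MiB; how much disk space it actually occupies is up to the filesystem.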
* Re: mmap: is default non-populating behavior stable?
From: Rik van Riel @ 2008-11-03 22:49 UTC
To: Peter Zijlstra
Cc: Eugene V. Lyubimkin, linux-kernel, linux-mm, hugh

Peter Zijlstra wrote:
> On Mon, 2008-11-03 at 23:57 +0200, Eugene V. Lyubimkin wrote:
>> Hello kernel hackers!
>>
>> The current implementation of mmap() in kernel is very convenient.
>> It allows to mmap(fd) very big amount of memory having small file as back-end.
>> So one can mmap() 100 MiB on empty file, use first 10 KiB of memory, munmap() and have
>> only 10 KiB of file at the end. And while working with memory, file will automatically be
>> grown by read/write memory requests.
>>
>> Question is: can user-space application rely on this behavior (I failed to find any
>> documentation about this)?
>>
>> TIA and please CC me in replies.
>
> mmap() writes past the end of the file should not grow the file if I
> understand things right, but produce a sigbus (after the first page size
> alignment).

Indeed, faulting beyond the end of file returns a SIGBUS, see
these lines in mm/filemap.c:filemap_fault():

        size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
        if (vmf->pgoff >= size)
                return VM_FAULT_SIGBUS;

> The exact interaction of mmap() and truncate() I'm not exactly clear on.

Truncate will reduce the size of the mmaps on the file to
match the new file size, so processes accessing beyond the
end of file will get a segmentation fault (SIGSEGV).

> The safe way to do things is to first create your file of at least the
> size you mmap, using truncate. This will create a sparse file, and will
> on any sane filesystem not take more space than its meta data.
>
> Thereafter you can fill it with writes to the mmap.

Agreed.

--
All Rights Reversed
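The rounding in the filemap_fault() check quoted above can be tried out in user space. The sketch below only mirrors the arithmetic; pages_covering() and fault_hits_sigbus() are hypothetical names for this illustration, and the page shift is passed in explicitly rather than taken from the kernel's PAGE_CACHE_SHIFT.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of pages needed to cover `file_size` bytes, rounding up.
 * Mirrors (i_size + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT.
 */
uint64_t pages_covering(uint64_t file_size, unsigned page_shift)
{
    uint64_t page_size = (uint64_t)1 << page_shift;
    return (file_size + page_size - 1) >> page_shift;
}

/*
 * True if a fault at `byte_offset` into the mapping lands on a page that
 * lies wholly beyond the end of the file, which is the case the quoted
 * check turns into VM_FAULT_SIGBUS.
 */
bool fault_hits_sigbus(uint64_t file_size, uint64_t byte_offset,
                       unsigned page_shift)
{
    uint64_t pgoff = byte_offset >> page_shift;
    return pgoff >= pages_covering(file_size, page_shift);
}
```

With 4 KiB pages (shift 12), a 10 KiB file is covered by three pages, so offsets up to 12287 pass the check and offset 12288 is the first to fail it. That matches the "after the first page size alignment" remark: the tail of the last partial page is still accessible.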
* Re: mmap: is default non-populating behavior stable?
From: Chris Friesen @ 2008-11-04 15:56 UTC
To: Rik van Riel
Cc: Peter Zijlstra, Eugene V. Lyubimkin, linux-kernel, linux-mm, hugh

Rik van Riel wrote:
> Peter Zijlstra wrote:
>> The exact interaction of mmap() and truncate() I'm not exactly clear on.
>
> Truncate will reduce the size of the mmaps on the file to
> match the new file size, so processes accessing beyond the
> end of file will get a segmentation fault (SIGSEGV).

I suspect Peter was talking about using truncate() to set the initial
file size, effectively increasing rather than reducing it.

Chris
* Re: mmap: is default non-populating behavior stable?
From: Peter Zijlstra @ 2008-11-04 16:07 UTC
To: Chris Friesen
Cc: Rik van Riel, Eugene V. Lyubimkin, linux-kernel, linux-mm, hugh

On Tue, 2008-11-04 at 09:56 -0600, Chris Friesen wrote:
> Rik van Riel wrote:
> > Peter Zijlstra wrote:
> >
> >> The exact interaction of mmap() and truncate() I'm not exactly clear on.
> >
> > Truncate will reduce the size of the mmaps on the file to
> > match the new file size, so processes accessing beyond the
> > end of file will get a segmentation fault (SIGSEGV).
>
> I suspect Peter was talking about using truncate() to set the initial
> file size, effectively increasing rather than reducing it.

I was thinking of truncate() on an already mmap()'ed region, either
increasing or decreasing the size so that part of the mmap becomes
(in)valid.

I'm not sure how POSIX speaks of this.

I think Linux does the expected thing.
* Re: mmap: is default non-populating behavior stable?
From: Alan Cox @ 2008-11-04 16:28 UTC
To: Peter Zijlstra
Cc: Chris Friesen, Rik van Riel, Eugene V. Lyubimkin, linux-kernel, linux-mm, hugh

On Tue, 04 Nov 2008 17:07:00 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, 2008-11-04 at 09:56 -0600, Chris Friesen wrote:
> > Rik van Riel wrote:
> > > Peter Zijlstra wrote:
> > >
> > >> The exact interaction of mmap() and truncate() I'm not exactly clear on.
> > >
> > > Truncate will reduce the size of the mmaps on the file to
> > > match the new file size, so processes accessing beyond the
> > > end of file will get a segmentation fault (SIGSEGV).
> >
> > I suspect Peter was talking about using truncate() to set the initial
> > file size, effectively increasing rather than reducing it.
>
> I was thinking of truncate() on an already mmap()'ed region, either
> increasing or decreasing the size so that part of the mmap becomes
> (in)valid.
>
> I'm not sure how POSIX speaks of this.
>
> I think Linux does the expected thing.

I believe our behaviour is correct for mmap/munmap/truncate and it
certainly used to be and was tested.

At the point you do anything involving mremap (which is non posix) our
behaviour becomes rather bizarre.

Alan
* Re: mmap: is default non-populating behavior stable?
From: Eugene V. Lyubimkin @ 2008-11-04 16:51 UTC
To: Alan Cox
Cc: Peter Zijlstra, Chris Friesen, Rik van Riel, linux-kernel, linux-mm, hugh

Alan Cox wrote:
> On Tue, 04 Nov 2008 17:07:00 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
>> [snip]
>> I'm not sure how POSIX speaks of this.
>>
>> I think Linux does the expected thing.
>
> I believe our behaviour is correct for mmap/munmap/truncate and it
> certainly used to be and was tested.
>
> At the point you do anything involving mremap (which is non posix) our
> behaviour becomes rather bizarre.

Thanks to all for answers. I have made the conclusion that doing "open() new
file, truncate(<big size>), mmap(<the same big size>), write/read some memory
pages" should not populate other, untouched by write/read pages (until
MAP_POPULATE given), right?

--
Eugene V. Lyubimkin aka JackYF
* Re: mmap: is default non-populating behavior stable?
From: Hugh Dickins @ 2008-11-05 16:42 UTC
To: Eugene V. Lyubimkin
Cc: Alan Cox, Peter Zijlstra, Chris Friesen, Rik van Riel, linux-kernel, linux-mm

On Tue, 4 Nov 2008, Eugene V. Lyubimkin wrote:
> Alan Cox wrote:
> >
> > I believe our behaviour is correct for mmap/munmap/truncate and it
> > certainly used to be and was tested.

Agreed.

> >
> > At the point you do anything involving mremap (which is non posix) our
> > behaviour becomes rather bizarre.

Certainly mremap is non-POSIX, but I can't think of any way in which
it would interfere with Eugene's assumptions about population.

(Every year or so we do wonder whether to change an extending mremap
of a MAP_SHARED|MAP_ANONYMOUS object to extend the object itself instead
of just SIGBUSing on the extension: but I've so far remained conservative
about that, and Eugene appears to be thinking of more ordinary files.)

>
> Thanks to all for answers. I have made the conclusion that doing "open() new
> file, truncate(<big size>), mmap(<the same big size>), write/read some memory
> pages" should not populate other, untouched by write/read pages (until
> MAP_POPULATE given), right?

That is a reasonable description of how the kernel tries and will always
try to handle it, approximately; but I don't think you can rely upon it
absolutely.

For a start, it depends on the filesystem: I believe that vfat, for
example, does not support the concept of sparse files (files with holes
in), so its truncate(<big size>) will allocate the whole of that big
size initially.

I'm not sure what you mean by "populate": in mm, as in MAP_POPULATE,
we're thinking of prefaulting pages into the user address space; but
you're probably thinking of whether the blocks are allocated on disk?

Prefaulting hole pages into the user address space may imply allocating
blocks on disk, or it may not: likely to depend on filesystem again.

From time to time we toy with prefaulting adjacent pages when a fault
occurs (though IIRC tests have proved disappointing in the past): we'd
like to keep that option open, but it would go against your guidelines
above to some extent.

Hugh
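Since whether truncate(<big size>) allocates blocks depends on the filesystem, one way to see what a given filesystem did is to compare a file's apparent length with the space it reports as allocated. The sketch below is only an illustration: create_sized_file() and file_usage() are hypothetical helpers, and the 512-byte unit for st_blocks is the Linux convention, assumed here rather than guaranteed everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` and set its length with ftruncate(). */
int create_sized_file(const char *path, off_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int rc = ftruncate(fd, len);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical helper: report the apparent length (st_size) and the space
 * the filesystem says is allocated (st_blocks, assumed to be in 512-byte
 * units as on Linux). Returns 0 on success, -1 on failure.
 */
int file_usage(const char *path, long long *apparent, long long *allocated)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    *apparent = (long long)st.st_size;
    *allocated = (long long)st.st_blocks * 512;
    return 0;
}
```

On a filesystem with sparse-file support, a freshly sized file would typically report an allocated figure far below its apparent length; on one without (vfat is the example given above), the two would be expected to be close.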
* Re: mmap: is default non-populating behavior stable?
From: Alan Cox @ 2008-11-05 16:54 UTC
To: Hugh Dickins
Cc: Eugene V. Lyubimkin, Peter Zijlstra, Chris Friesen, Rik van Riel, linux-kernel, linux-mm

> (Every year or so we do wonder whether to change an extending mremap
> of a MAP_SHARED|MAP_ANONYMOUS object to extend the object itself instead
> of just SIGBUSing on the extension: but I've so far remained conservative
> about that, and Eugene appears to be thinking of more ordinary files.)

Try an mremap of a VM_GROWS* mapping and all the other things of this
nature. I would say our current behaviour is not what might be expected
by users. The extending an object case is just one example of weird
behaviour.

Alan
* Re: mmap: is default non-populating behavior stable?
From: Eugene V. Lyubimkin @ 2008-11-05 17:50 UTC
To: Hugh Dickins
Cc: Alan Cox, Peter Zijlstra, Chris Friesen, Rik van Riel, linux-kernel, linux-mm

Hugh Dickins wrote:
>> Thanks to all for answers. I have made the conclusion that doing "open() new
>> file, truncate(<big size>), mmap(<the same big size>), write/read some memory
>> pages" should not populate other, untouched by write/read pages (until
>> MAP_POPULATE given), right?
[snip]
> For a start, it depends on the filesystem: I believe that vfat, for
> example, does not support the concept of sparse files (files with holes
> in), so its truncate(<big size>) will allocate the whole of that big
> size initially.

For my case vfat is not an option fortunately.

> I'm not sure what you mean by "populate": in mm, as in MAP_POPULATE,
> we're thinking of prefaulting pages into the user address space; but
> you're probably thinking of whether the blocks are allocated on disk?

Yes.

> From time to time we toy with prefaulting adjacent pages when a fault
> occurs (though IIRC tests have proved disappointing in the past): we'd
> like to keep that option open, but it would go against your guidelines
> above to some extent.

It depends on how "adjacent" would be counted :) If several pages, probably not.
If millions or similar, that would be a problem. It's very convenient to use such
"open+truncate+mmap+write/read" behavior to make self-growing-on-demand cache
in memory with disk as back-end without remaps.

Thanks for descriptive answer.

--
Eugene V. Lyubimkin aka JackYF
* Re: mmap: is default non-populating behavior stable?
From: Hugh Dickins @ 2008-11-05 23:31 UTC
To: Eugene V. Lyubimkin
Cc: Alan Cox, Peter Zijlstra, Chris Friesen, Rik van Riel, linux-kernel, linux-mm

On Wed, 5 Nov 2008, Eugene V. Lyubimkin wrote:
> Hugh Dickins wrote:
>
> > From time to time we toy with prefaulting adjacent pages when a fault
> > occurs (though IIRC tests have proved disappointing in the past): we'd
> > like to keep that option open, but it would go against your guidelines
> > above to some extent.
> It depends on how "adjacent" would be counted :) If several pages, probably not.
> If millions or similar, that would be a problem.

That's fine, you'll be safe: you can be sure that it would never be in
the kernel's interest to prefault more than "several" extra pages.

Well, bearing in mind that famous "640K enough for all" remark, let's
not say "never"; but it won't prefault millions until memory is so
abundant and I/O so fast that you'd be happy with it prefaulting
millions yourself.

> It's very convenient to use such
> "open+truncate+mmap+write/read" behavior to make self-growing-on-demand cache
> in memory with disk as back-end without remaps.

Yes. Though one thing to beware of is running out of disk space:
whereas a write system call should be good at reporting -ENOSPC,
the filesystem may not be able to handle running out of disk space
when writing back dirty mmaped pages - it may quietly lose the data.

Hugh
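One possible way to sidestep the out-of-space-during-writeback problem, which nobody in the thread proposes and which is offered here purely as an assumption, is to reserve the backing store before mapping with posix_fallocate(). That trades away the sparse-file saving in exchange for an ENOSPC reported up front. Whether it fully removes the failure mode depends on the filesystem; copy-on-write filesystems, for instance, may still need fresh space at writeback time. The helper name create_preallocated_file() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` and ask the filesystem to allocate
 * `len` bytes of backing store immediately.
 * Returns 0 on success, or an errno-style value (for example ENOSPC).
 */
int create_preallocated_file(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number; it does not set errno. */
    int err = posix_fallocate(fd, 0, len);

    if (close(fd) != 0 && err == 0)
        err = errno;
    return err;
}
```

A caller would then open and mmap() the file as usual; a non-zero return from the helper is the early warning that a write() would have given.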
Thread overview: 10+ messages
[not found] <490F73CD.4010705@gmail.com>
2008-11-03 22:41 ` mmap: is default non-populating behavior stable? Peter Zijlstra
2008-11-03 22:49   ` Rik van Riel
2008-11-04 15:56     ` Chris Friesen
2008-11-04 16:07       ` Peter Zijlstra
2008-11-04 16:28         ` Alan Cox
2008-11-04 16:51           ` Eugene V. Lyubimkin
2008-11-05 16:42             ` Hugh Dickins
2008-11-05 16:54               ` Alan Cox
2008-11-05 17:50               ` Eugene V. Lyubimkin
2008-11-05 23:31                 ` Hugh Dickins