* Unexpected mremap + shared anon mapping behavior
@ 2013-03-08 8:27 Pavel Emelyanov
2013-03-08 8:53 ` Kirill A. Shutemov
2013-03-12 2:53 ` Hugh Dickins
0 siblings, 2 replies; 4+ messages in thread
From: Pavel Emelyanov @ 2013-03-08 8:27 UTC
To: Linux MM, Hugh Dickins
Hi!
I've recently noticed that the following user-space code

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define PAGE_SIZE (4096)

int main(void)
{
	char *mem = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
	mem = mremap(mem, PAGE_SIZE, 2 * PAGE_SIZE, MREMAP_MAYMOVE);
	mem[0] = 'a';
	mem[PAGE_SIZE] = 'b';
	return 0;
}

generates SIGBUS on the 2nd page access. But if we change MAP_SHARED into MAP_PRIVATE
in the mmap() call, it starts working OK.
This happens because, when creating a MAP_SHARED | MAP_ANON area, the kernel sets up a shmem
file for the mapping, but the subsequent mremap() doesn't grow it. Thus a page fault on
the 2nd page falls beyond this file's i_size, resulting in SIGBUS.
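
A minimal sketch of how the difference between the two mapping types could be observed without killing the process. It assumes a Linux system; `probe_write` and `second_page_writable` are hypothetical helper names, not taken from any library. The probe installs a temporary SIGBUS handler and reports whether a one-byte write went through.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;

/* SIGBUS handler: jump back into probe_write() instead of dying. */
static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(probe_env, 1);
}

/* Hypothetical helper: returns 1 if a one-byte write to addr succeeds,
 * 0 if it raises SIGBUS, -1 if the handler could not be installed. */
static int probe_write(volatile char *addr)
{
	struct sigaction sa, old;
	int ok;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) != 0)
		return -1;

	if (sigsetjmp(probe_env, 1) == 0) {
		*addr = 'b';	/* may fault beyond the backing file's i_size */
		ok = 1;
	} else {
		ok = 0;		/* we got here via siglongjmp() from the handler */
	}

	sigaction(SIGBUS, &old, NULL);
	return ok;
}

/* Hypothetical helper: map one anonymous page with the given sharing flag
 * (MAP_SHARED or MAP_PRIVATE), grow it to two pages with mremap(), and
 * probe the second page. Returns probe_write()'s result, or -1 on error. */
static int second_page_writable(int map_flags)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *mem, *grown;
	int ok;

	mem = mmap(NULL, page, PROT_READ | PROT_WRITE,
		   map_flags | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return -1;

	grown = mremap(mem, page, 2 * page, MREMAP_MAYMOVE);
	if (grown == MAP_FAILED) {
		munmap(mem, page);
		return -1;
	}

	ok = probe_write(grown + page);
	munmap(grown, 2 * page);
	return ok;
}
```

With these helpers, `second_page_writable(MAP_PRIVATE)` should return 1, while on a kernel that behaves as described here `second_page_writable(MAP_SHARED)` would return 0.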
So, the question is -- what should the mremap() behavior be for shared anonymous mappings?
Should it truncate the file to match the grown vma length? If yes, should it also
truncate it if we mremap() the mapping to a smaller size?
I should also note that before the /proc/PID/map_files/ directory appeared in Linux, it
was impossible to fix this behavior from the application side. Now an app can (yes, it's a
hack) open the respective shmem file via this dir and manually truncate it. It does help.
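
The workaround could look roughly like the sketch below. The helper names `map_files_path` and `grow_shared_anon` are hypothetical. The sketch assumes the map_files entry is named by the mapping's start and end addresses in hex, and that the caller is allowed to open it; that may require extra privileges, which is worth checking on the target kernel.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: format the map_files entry for the mapping
 * [start, start + size) of the calling process. Returns snprintf()'s result. */
static int map_files_path(char *buf, size_t len, const void *start, size_t size)
{
	unsigned long lo = (unsigned long)start;

	return snprintf(buf, len, "/proc/self/map_files/%lx-%lx", lo, lo + size);
}

/* Hypothetical helper: grow the shmem file behind a MAP_SHARED | MAP_ANON
 * mapping to new_size, then grow the mapping itself.
 * Returns the (possibly moved) address, or MAP_FAILED on any error. */
static void *grow_shared_anon(void *mem, size_t old_size, size_t new_size)
{
	char path[64];
	int fd;

	map_files_path(path, sizeof(path), mem, old_size);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return MAP_FAILED;	/* no map_files, or not permitted */

	/* Extend the backing file first so faults on the new pages
	 * fall inside its i_size. */
	if (ftruncate(fd, (off_t)new_size) != 0) {
		close(fd);
		return MAP_FAILED;
	}
	close(fd);

	return mremap(mem, old_size, new_size, MREMAP_MAYMOVE);
}
```

A caller would replace the bare mremap() in the reproducer with `grow_shared_anon(mem, PAGE_SIZE, 2 * PAGE_SIZE)` and check the result against MAP_FAILED before touching the second page.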
Thanks,
Pavel
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Unexpected mremap + shared anon mapping behavior
From: Kirill A. Shutemov @ 2013-03-08 8:53 UTC
To: Pavel Emelyanov; +Cc: Linux MM, Hugh Dickins
On Fri, Mar 08, 2013 at 12:27:56PM +0400, Pavel Emelyanov wrote:
> So, the question is -- what should the mremap() behavior be for shared anonymous mappings?
> Should it truncate the file to match the grown vma length? If yes, should it also
> truncate it if we mremap() the mapping to a smaller size?
I think the answer is 'no' for both cases. It's an ABI change.
Should we introduce an mtruncate() syscall which will truncate the backing file
in both cases? ;)
--
Kirill A. Shutemov
* Re: Unexpected mremap + shared anon mapping behavior
From: Pavel Emelyanov @ 2013-03-08 9:04 UTC
To: Kirill A. Shutemov; +Cc: Linux MM, Hugh Dickins
>> So, the question is -- what should the mremap() behavior be for shared anonymous mappings?
>> Should it truncate the file to match the grown vma length? If yes, should it also
>> truncate it if we mremap() the mapping to a smaller size?
>
> I think the answer is 'no' for both cases. It's an ABI change.
>
> Should we introduce an mtruncate() syscall which will truncate the backing file
> in both cases? ;)
>
If we don't touch the kernel's mremap, then mtruncate can be done in glibc via /proc/pid/map_files :)
Thanks,
Pavel
* Re: Unexpected mremap + shared anon mapping behavior
From: Hugh Dickins @ 2013-03-12 2:53 UTC
To: Pavel Emelyanov; +Cc: Kirill A. Shutemov, Linux MM
On Fri, 8 Mar 2013, Pavel Emelyanov wrote:
> So, the question is -- what should the mremap() behavior be for shared anonymous mappings?
> Should it truncate the file to match the grown vma length?
I have mixed feelings. Here's a link to the discussion around 2.6.7 -
when I had more to say than I do these days!
https://lkml.org/lkml/2004/6/16/155
I feel much the same as before; but tend more against since I developed
a dislike for the way object size and mapping size get muddled up in
hugetlbfs, which has been troublesome. I'm probably over cautious;
but if it only poses a problem once in 9 years, maybe it's not worth
messing about with.
> If yes, should it also
> truncate it if we mremap() the mapping to a smaller size?
No to that. I'm amused to see Kirill lightheartedly proposing
an mtruncate(): I see I suggested the same in that thread above.
But nowadays I do sometimes think it would be useful to have an mopen():
give me a file descriptor for the file backing this area of memory (and
perhaps one day some interesting extension to anonymous memory); that
perhaps we could use to get around some of the awkwardness of SysV SHM.
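
No mopen() exists, so the following is only a sketch of a related approach, not of the proposal itself: an application that wants a file descriptor for its shared memory can create the backing object explicitly and keep the descriptor. It assumes memfd_create(), which was added to Linux after this thread (3.17); `struct shared_buf`, `shared_buf_create` and `shared_buf_resize` are hypothetical names.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: a shared mapping plus the fd of its backing object. */
struct shared_buf {
	int fd;
	char *mem;
	size_t size;
};

/* Create a shmem-backed object of the given size and map it MAP_SHARED. */
static int shared_buf_create(struct shared_buf *b, size_t size)
{
	b->fd = memfd_create("shared-buf", MFD_CLOEXEC);
	if (b->fd < 0)
		return -1;

	if (ftruncate(b->fd, (off_t)size) != 0)
		goto fail;

	b->mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, b->fd, 0);
	if (b->mem == MAP_FAILED)
		goto fail;

	b->size = size;
	return 0;
fail:
	close(b->fd);
	return -1;
}

/* Resize the mapping. When growing, extend the object first so the new
 * pages are inside i_size; when shrinking, leave the object size alone. */
static int shared_buf_resize(struct shared_buf *b, size_t new_size)
{
	void *p;

	if (new_size > b->size && ftruncate(b->fd, (off_t)new_size) != 0)
		return -1;

	p = mremap(b->mem, b->size, new_size, MREMAP_MAYMOVE);
	if (p == MAP_FAILED)
		return -1;

	b->mem = p;
	b->size = new_size;
	return 0;
}
```

The sketch keeps object size and mapping size as separate things: growing extends the object explicitly, and shrinking the mapping deliberately does not truncate the object.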
>
> I should also note that before the /proc/PID/map_files/ directory appeared in Linux, it
> was impossible to fix this behavior from the application side. Now an app can (yes, it's a
> hack) open the respective shmem file via this dir and manually truncate it. It does help.
Wow, that's interesting: so you're well ahead of me.
Perverted, and a little worrying, but interesting - I applaud you!
Hugh