linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: akpm@linux-foundation.org, riel@redhat.com,
	linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH 6/6] Mlock: make mlock error return Posixly Correct
Date: Fri, 22 Aug 2008 16:48:26 -0400	[thread overview]
Message-ID: <1219438107.9576.38.camel@lts-notebook> (raw)
In-Reply-To: <1219259092.6075.45.camel@lts-notebook>

On Wed, 2008-08-20 at 15:04 -0400, Lee Schermerhorn wrote:
> On Thu, 2008-08-21 at 02:58 +0900, KOSAKI Motohiro wrote:
> > >> mlock() need error code if vma permission failure happend.
> > >> but mmap() (and remap_pages_range(), etc..) should ignore it.
> > >>
> > >> So, mlock_vma_pages_range() should ignore __mlock_vma_pages_range()'s error code.
> > >
> > > Well, I don't know whether we can trigger a vma permission failure
> > > during mmap(MAP_LOCKED) or a remap within a VM_LOCKED vma, either of
> > > which will end up calling mlock_vma_pages_range().  However, [after
> > > rereading the man page] looks like we DO want to return any ENOMEM w/o
> > > translating to EAGAIN.
> > 
> > Linus-tree implemetation does it?
> > Can we make reproduce programs?
> 
> > So, I think implimentation compatibility is important than man pages
> > because many person think imcompatibility is bug ;-)
> 
> Currently, the upstream kernel uses make_pages_present() and ignores the
> return value.  However, the mmap(2) man page does say that it can return
> ENOMEM for "no memory is available"--which is what get_user_pages() will
> return in that situation and which I propose we pass on untranslated.
> 
> To make a reproducer, we'd need to call mmap() with MAP_LOCKED from a
> program running on a system or in a mem control group with insufficient
> available memory to satisfy the mapping.  [I think that's really the
> only get_user_pages() error we should get back from
> mlock_vma_pages_range().  We shouldn't get the 'EFAULT' as we're
> mlocking know good vma addresses and we filter out VM_IO|VM_PFNMAP
> vmas.]  If the system/container is close enough to its memory capacity
> that mmap(MAP_LOCKED) is returning ENOMEM, the application probably has
> other more important problems to deal with.
> 
> In any case, we'd only return an error when one occurs.  This might be
> different from today's behavior which is to not tell the application
> about the condition, even tho' it has occurred--e.g., insufficient
> memory to satisfy MAP_LOCKED--but I think it's better for the
> application to know about it than to pretend it didn't happen.  
> 
> So, I think we're OK, after I move the posix error return translation to
> mlock_fixup().  [I'm testing the patch now.]
> 

After further examination of the code and thinking about separating
issues, I agree with you that we should hide any pte population error
[returned by get_user_pages()] from mmap() callers and just return the
address altho' the range may not be fully populated.  

I reread the Single Unix Specification mmap() page:

	http://www.opengroup.org/onlinepubs/007908799/xsh/mmap.html

It appears that, technically, we should be returning the error code when
we can't mlock the pages, at least in the case where mlockall() has
previously been called for the process--the MAP_LOCKED flag is not
defined in the standard.  Since the current upstream code doesn't do
this, it's not a regression nor otherwise directly related to the
unevictable mlocked pages patches, so I'd like to handle it as a
separate patch.  It will require backing out the already mmap()ed vma.

mlock_vma_pages_range() still returns the error in the case where the
vma goes away when it switches the mmap_sem back to write.  This should
be VERY unlikely, and I suppose we could return the virtual address of a
missing vma to the application and let it find out via SIGSEGV that the
address is invalid when it tries to access the memory.  I chose to
return the error.

I'm sending out the revised series shortly.

Lee


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-08-22 20:48 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-19 21:05 [Patch 0/6] Mlock: doc, patch grouping and error return cleanups Lee Schermerhorn
2008-08-19 21:05 ` [PATCH 1/6] Mlock: fix __mlock_vma_pages_range comment block Lee Schermerhorn, KOSAKI Motohiro
2008-08-19 21:05 ` [PATCH 2/6] Mlock: backout locked_vm adjustment during mmap() Lee Schermerhorn, KOSAKI Motohiro
2008-08-19 21:05 ` [PATCH 3/6] Mlock: resubmit locked_vm adjustment as separate patch Lee Schermerhorn, Lee Schermerhorn
2008-08-19 21:05 ` [PATCH 4/6] Mlock: fix return value for munmap/mlock vma race Lee Schermerhorn, KOSAKI Motohiro
2008-08-20  8:31   ` KOSAKI Motohiro
2008-08-19 21:05 ` [PATCH 5/6] Mlock: revert mainline handling of mlock error return Lee Schermerhorn, Lee Schermerhorn
2008-08-20  7:20   ` KOSAKI Motohiro
2008-08-20  7:24     ` KOSAKI Motohiro
2008-08-19 21:05 ` [PATCH 6/6] Mlock: make mlock error return Posixly Correct Lee Schermerhorn, KOSAKI Motohiro
2008-08-20  8:35   ` KOSAKI Motohiro
2008-08-20 16:24     ` Lee Schermerhorn
2008-08-20 17:58       ` KOSAKI Motohiro
2008-08-20 19:04         ` Lee Schermerhorn
2008-08-22 20:48           ` Lee Schermerhorn [this message]
2008-08-20 10:17   ` Pekka Enberg
2008-08-20 16:26     ` Lee Schermerhorn
2008-08-20  7:21 ` [Patch 0/6] Mlock: doc, patch grouping and error return cleanups KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1219438107.9576.38.camel@lts-notebook \
    --to=lee.schermerhorn@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox