linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Will Deacon <will.deacon@arm.com>,
	hpa@zytor.com, luto@amacapital.net, dave.hansen@linux.intel.com,
	mingo@elte.hu, minchan@kernel.org, tglx@linutronix.de,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: LTP regressions due to 6dc296e7df4c ("mm: make sure all file VMAs have ->vm_ops set")
Date: Tue, 15 Sep 2015 17:13:18 +0200	[thread overview]
Message-ID: <20150915151318.GA15866@redhat.com> (raw)
In-Reply-To: <20150915134216.GA16093@node.dhcp.inet.fi>

On 09/15, Kirill A. Shutemov wrote:
>
> On Tue, Sep 15, 2015 at 02:12:01PM +0200, Oleg Nesterov wrote:
> > On 09/14, Kirill A. Shutemov wrote:
> > >
> > > On Mon, Sep 14, 2015 at 07:05:47PM +0200, Oleg Nesterov wrote:
> > > > On 09/14, Kirill A. Shutemov wrote:
> > > > >
> > > > > Fix is below. I don't really like it, but I cannot find any better
> > > > > solution.
> > > >
> > > > Me too...
> > > >
> > > > But this change "documents" the nasty special "vm_file && !vm_ops" case, and
> > > > I am not sure how we can remove it later...
> > > >
> > > > So perhaps we should change vma_is_anonymous() back to check ->fault too,
> > > >
> > > > 	 static inline bool vma_is_anonymous(struct vm_area_struct *vma)
> > > > 	 {
> > > > 	-	return !vma->vm_ops;
> > > > 	+	return !vma->vm_ops || !vma->vm_ops->fault;
> > >
> > > No. This would give a lot false positives from drives which setup page
> > > tables upfront and don't use ->fault at all.
> >
> > And? I mean, I am not sure I understand what exactly do you dislike.
> >
> > Firstly, I still think that (in the long term) we should change them
> > to use .faul = no_fault() which just returns VM_FAULT_SIGBUS.
>
> I would rather like to see consolidated fault path between file and anon
> with ->vm_ops set for both. So vma_is_anonymous() will be trivial
> vma->vm_ops == anon_vm_ops.

I too thought about this. Perhaps but I guess this needs another
discussion.

In particular I am not sure we should just rely on vm_ops == anon_vm_ops.
Again, it is not that I think that the VM_MPX check in arch_vma_name() is
that bad. Still I think it would be better if mpx_mmap() could install
vma->vm_ops = mpx_vm_ops with ->name(). So perhaps ->anon_fault() makes
more sense. But lets not discuss this right now.

>
> > Until then I do not see why the change above can be really bad. The
> > VM_SHARED case is fine, do_anonymous_page() will return VM_FAULT_SIGBUS.
> >
> > So afaics the only problem is that after the change above the private
> > mapping can silently get an anonymous page after (say) MADV_DONTNEED
> > instead of the nice SIGBUS from do_fault(). I agree, this is not good,
> > but see above.
>
> So, what the point to introduce vma_is_anonymous() if it often produces
> false result? vma_is_anonymous_or_maybe_not()?

Heh.

Then what the point to demand that "All file mapping must have ->vm_ops set"
if mmap(MAP_PRIVATE, "/dev/zero") has ->vm_ops == NULL ? Because this is
not actually the file mapping, yes. And this is why we want vma_is_anonymous()
to return T in this case.

vma_is_anonymous() just says that a page fault will use do_anonymous_page().
I agree, it would be nice to ensure vma_is_anonymous() can only be true
if this vma can only have the anon pages. Let me repeat that I suggested
this change as a short-term fix (at least without other changes like we
discuss above). Because the mmap_zero() hack looks worse to me. Damn, even
the ugly hack below looks better to me.

> > Whether we need to keep the vm_ops/fault check in __vma_link_rb() and
> > mmap_region() is another issue. But if we keep them, then I think we
> > should at least turn the !vma->vm_ops check in mmap_region into
> > WARN_ON() as well.
>
> It would require first fix all known cases where ->f_op->mmap() returns
> vma->vm_ops == NULL. Not subject for 4.3, I think.

Kirill, I even sent you the private email to clarify that - of course! -
I only meant "in the longer term" ;)

Oleg.

--- x/include/linux/mm.h
+++ x/include/linux/mm.h
@@ -1289,9 +1289,11 @@ static inline int vma_growsdown(struct v
 	return vma && (vma->vm_end == addr) && (vma->vm_flags & VM_GROWSDOWN);
 }
 
+#define xxx_fault	((void*)1)
+
 static inline bool vma_is_anonymous(struct vm_area_struct *vma)
 {
-	return !vma->vm_ops;
+	return !vma->vm_ops || vma->vm_ops->fault == xxx_fault;
 }
 
 static inline int stack_guard_page_start(struct vm_area_struct *vma,
--- x/drivers/char/mem.c
+++ x/drivers/char/mem.c
@@ -653,11 +653,17 @@ static ssize_t read_iter_zero(struct kio
 
 static int mmap_zero(struct file *file, struct vm_area_struct *vma)
 {
+	static const struct vm_operations_struct xxx_ops = {
+		.fault = xxx_fault,
+	};
+		}
 #ifndef CONFIG_MMU
 	return -ENOSYS;
 #endif
 	if (vma->vm_flags & VM_SHARED)
 		return shmem_zero_setup(vma);
+
+	vma->vm_ops = &xxx_ops;
 	return 0;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-09-15 15:16 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-14 10:53 Will Deacon
2015-09-14 11:57 ` Kirill A. Shutemov
2015-09-14 13:22   ` Will Deacon
2015-09-14 17:05   ` Oleg Nesterov
2015-09-14 17:46     ` Oleg Nesterov
2015-09-14 18:20     ` Kirill A. Shutemov
2015-09-15 12:12       ` Oleg Nesterov
2015-09-15 13:42         ` Kirill A. Shutemov
2015-09-15 15:13           ` Oleg Nesterov [this message]
2015-09-16 21:28           ` Andrew Morton
2015-09-17 15:37             ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150915151318.GA15866@redhat.com \
    --to=oleg@redhat.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=minchan@kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox