From: Blaisorblade <blaisorblade@yahoo.it>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>, Hugh Dickins <hugh@veritas.com>,
akpm@osdl.org, andrea@suse.de, dvhltc@us.ibm.com,
linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC] madvise(MADV_TRUNCATE)
Date: Fri, 28 Oct 2005 20:40:26 +0200
Message-ID: <200510282040.29856.blaisorblade@yahoo.it>
In-Reply-To: <43624EE6.8000605@us.ibm.com>
On Friday 28 October 2005 18:16, Badari Pulavarty wrote:
> Blaisorblade wrote:
> > On Friday 28 October 2005 05:46, Jeff Dike wrote:
> >>On Wed, Oct 26, 2005 at 03:49:55PM -0700, Badari Pulavarty wrote:
> > On the plan, however, I have a concern: VM_NONLINEAR.
> > However, looking at the patch, the implementation would boil down to
> > something like
> >
> > for each page in range {
> > start = page->index;
> > end = start + PAGE_SIZE;
> > call truncate_inode_pages_range(mapping, offset, end);
> > inode->i_op->truncate_range(inode, offset, end);
> > }
> >
> > unmap_mapping_range() should be done at once for the whole range.
>
> patch does
>
> for all the pages in the given vma {
> unmap_mapping_range(mapping, offset, end);
> truncate_inode_pages_range(mapping, offset, end);
> inode->op->truncate_range(inode, offset, end)
> }
> It operates on bunch of pages in the given VMA. Since UML has
> one page for VMA, it operates on one page at a time - do you
> see anything wrong here ?
My point was about supporting VM_NONLINEAR. In the future, UML will have one
big VMA, but different pages will be remapped with different offsets (already
in mainline) and different protections (I have patches; I sent an earlier
version and am still revising it).
In that case, a single call could cover two pages that are adjacent in the VMA
but sit one at the start of the file and one at the end, so no single
(offset, end) file range describes them. That's why it wouldn't work with
VM_NONLINEAR.
However, Jeff pointed out to me that we'd probably call madvise() on the linear
kernel mapping (the kernel maps pages from the RAM file all at once,
linearly). So you can safely just refuse to operate on VM_NONLINEAR vmas.
> > While looking at these, here's what I'd call "strange" in the patch:
> > Also, why is unmap_mapping_range done with the inode semaphore held? I
> > don't remember locking rule but conceptually this has no point, IMHO.
> I am not sure either, let me look at it. (I thought we should hold it
> for truncate()).
Ok, do_truncate() holds the semaphore around the whole operation, because it's
implemented in a radically different way (through notify_change()).
IMHO we don't need to do things that way; we don't even change i_size, not
even when punching at the end of the file, as we don't want SIGBUS.
And anyway filesystems must already handle holes at the end of a file.
Btw, when truncating, notify_change() does:

	if (ia_valid & ATTR_SIZE)
		down_write(&dentry->d_inode->i_alloc_sem);

(which I suppose is used to protect against concurrent file extensions - page
allocations in previous holes - and such). You should probably take that too
(nest it inside mapping->host->i_sem).
Also, vmtruncate() is called with the semaphore held because it must call
truncate_inode_pages(), and because even the calls to i_size_write() must be
atomic with the rest. But other than that, there's no reason. In particular,
unmap_mapping_range() only does page table operations.
> > Btw, why I don't see vm_pgoff mentioned in these lines of the patch (nor
> > anywhere else in the patch)?
> vm_pgoff - don't remember what that supposed to represent...
Call mmap() with a non-zero pgoff (i.e. offset in the file), say at the second
file page. The kernel stores that pgoff parameter in vma->vm_pgoff (in units
of pages, i.e. the byte offset shifted right by PAGE_SHIFT).
If I then ask you to truncate the first page in the VMA, how does your
code realize that it should punch the second page rather than the first?
However, Jeff said this _isn't_ the bug he's hitting - in his case the VMA has
a 0 initial offset (for the same reason we don't need VM_NONLINEAR support).
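For anyone who wants to see the effect from user space, here is a small sketch
using only POSIX calls. The helper name first_byte_of_mapping() is made up for
this example. It fills a two-page temporary file with 'A' (page 0) and 'B'
(page 1), maps a single page at the given page offset, and returns the first
byte seen through the mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map one page of a two-page temporary file, starting at file page pgoff,
 * and return the first byte of the mapping (or -1 on failure).
 */
int first_byte_of_mapping(unsigned long pgoff)
{
	long psz = sysconf(_SC_PAGESIZE);
	FILE *f = tmpfile();
	char *p;
	int c = -1;

	assert(pgoff < 2);
	if (f == NULL)
		return -1;
	if (psz <= 0)
		goto out;

	/* Page 0 is all 'A', page 1 is all 'B'. */
	for (int page = 0; page < 2; page++)
		for (long i = 0; i < psz; i++)
			if (fputc('A' + page, f) == EOF)
				goto out;
	if (fflush(f) != 0)
		goto out;

	/* The last argument is the file offset the kernel keeps as vm_pgoff. */
	p = mmap(NULL, (size_t)psz, PROT_READ, MAP_SHARED, fileno(f),
		 (off_t)pgoff * (off_t)psz);
	if (p == MAP_FAILED)
		goto out;
	c = (unsigned char)p[0];
	munmap(p, (size_t)psz);
out:
	fclose(f);
	return c;
}
```

Calling it with pgoff 1 reads from the second file page even though the byte
sits at the very start of the mapping, which is exactly the information that
(start - vma->vm_start) alone throws away.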
> > You call truncate_inode_pages_range(mapping, offset, endoff), so I think
> > you're really burned here.
> > +offset = (loff_t)(start - vma->vm_start);
> > +endoff = (loff_t)(end - vma->vm_start);
So they would become:

	offset = (loff_t)(start - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
	endoff = (loff_t)(end - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

or the equivalent with page_offset(). Btw, shouldn't this be done by some
macro in <linux/pagemap.h>, like page_offset() and linear_page_index()?
Btw, also compare with mm/rmap.c:vma_address()/page_address_in_vma().
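As a sanity check of that arithmetic, here is a user-space sketch of the
address to file offset translation. Everything in it is a stand-in:
struct model_vma keeps only the two fields involved, MODEL_PAGE_SHIFT assumes
4 KiB pages, and the function names are made up. The second function spells
out what the expression means when the shift is left unparenthesised, since in
C "+" binds tighter than "<<".

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, stripped-down stand-in for vm_area_struct. */
struct model_vma {
	unsigned long vm_start;	/* first virtual address of the mapping */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/* File offset (in bytes) backing the virtual address addr inside vma. */
long long model_file_offset(const struct model_vma *vma, unsigned long addr)
{
	assert(addr >= vma->vm_start);
	return (long long)(addr - vma->vm_start)
	       + ((long long)vma->vm_pgoff << MODEL_PAGE_SHIFT);
}

/* How "a + b << s" is parsed in C: the whole sum gets shifted. */
long long model_file_offset_unparenthesised(const struct model_vma *vma,
					    unsigned long addr)
{
	assert(addr >= vma->vm_start);
	return ((long long)(addr - vma->vm_start) + (long long)vma->vm_pgoff)
	       << MODEL_PAGE_SHIFT;
}
```

For a vma starting at 0x40000000 with vm_pgoff 1, the first function maps the
start of the vma to file offset 4096, while the second gives a different and
wrong result for any address past vm_start.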
> "end" here is not end of VMA - its end of the region we want to discard
> (in UML case its start + PAGE_SIZE). Anything wrong ?
That part is fine; my complaint was about not using ->vm_pgoff.
I wondered whether vm_pgoff came into the picture later on, but I'm sure
truncate_inode_pages{_range}() wants file offsets, so I wasn't missing
anything.
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade