From: Blaisorblade <blaisorblade@yahoo.it>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>, Hugh Dickins <hugh@veritas.com>,
	akpm@osdl.org, andrea@suse.de, dvhltc@us.ibm.com,
	linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC] madvise(MADV_TRUNCATE)
Date: Fri, 28 Oct 2005 20:40:26 +0200
Message-ID: <200510282040.29856.blaisorblade@yahoo.it>
In-Reply-To: <43624EE6.8000605@us.ibm.com>

On Friday 28 October 2005 18:16, Badari Pulavarty wrote:
> Blaisorblade wrote:
> > On Friday 28 October 2005 05:46, Jeff Dike wrote:
> >>On Wed, Oct 26, 2005 at 03:49:55PM -0700, Badari Pulavarty wrote:

> > On the plan, however, I have a concern: VM_NONLINEAR.

> > However, looking at the patch, the implementation would boil down to
> > something like
> >
> > for each page in range {
> > 	start = page->index << PAGE_SHIFT;
> > 	end = start + PAGE_SIZE;
> > 	truncate_inode_pages_range(mapping, start, end);
> > 	inode->i_op->truncate_range(inode, start, end);
> > }
> >
> > unmap_mapping_range() should be done at once for the whole range.
>
> patch does
>
> for all the pages in the given vma {
> 	unmap_mapping_range(mapping, offset, end);
> 	truncate_inode_pages_range(mapping, offset, end);
> 	inode->i_op->truncate_range(inode, offset, end);
> }

> It operates on bunch of pages in the given VMA. Since UML has
> one page for VMA, it operates on one page at a time - do you
> see anything wrong here ?

My point was about support for VM_NONLINEAR. In the future, UML will have one
big VMA, but different pages will be remapped at different offsets (already in
mainline) and with different protections (I have patches; I sent an earlier
version and am still revising it).

In that case a single call could cover pages that sit far apart in the file,
say one at the start and one at the end, so no single (offset, end) file range
describes them. That's why it wouldn't work with VM_NONLINEAR.

However, Jeff pointed out to me that we'd probably call madvise() on the
linear kernel mapping (the kernel maps pages from the RAM file all at once,
linearly). So you can safely just refuse to operate on VM_NONLINEAR vmas.

> > While looking at these, here's what I'd call "strange" in the patch:

> > Also, why is unmap_mapping_range done with the inode semaphore held? I
> > don't remember locking rule but conceptually this has no point, IMHO.

> I am not sure either, let me look at it. (I thought we should hold it
> for truncate()).

Ok, do_truncate() holds the semaphore around the whole operation, because it's
implemented in a radically different way (through notify_change()).

IMHO we don't need to do things that way; we don't even change i_size, not
even when punching at the end of the file, since we don't want SIGBUS.

And anyway, filesystems must already handle holes at the end of a file.

Btw, when truncating, notify_change does:

        if (ia_valid & ATTR_SIZE)
                down_write(&dentry->d_inode->i_alloc_sem);

(which I suppose is used to protect against concurrent file extensions - page 
allocations in previous holes - and such). You should probably take that too 
(nest it inside mapping->host->i_sem).

Also, vmtruncate is called with the semaphore held because it must call
truncate_inode_pages(), and because even the calls to i_size_write() must be
atomic with the rest. Other than that, there's no reason to hold it. In
particular, unmap_mapping_range() only performs pagetable operations.

> > Btw, why I don't see vm_pgoff mentioned in these lines of the patch (nor
> > anywhere else in the patch)?

> vm_pgoff - don't remember what that supposed to represent...

Call mmap() with a non-zero pgoff (i.e. offset into the file), say the second
file page. That pgoff parameter ends up stored in vma->vm_pgoff (in units of
PAGE_SIZE, i.e. the byte offset >> PAGE_SHIFT).
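
As a userspace illustration of such a mapping, the sketch below maps only the
second page of a scratch file and reads it back. The helper name
first_byte_of_second_page() and the temporary file path are made up for this
example; the calls themselves are plain POSIX.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create a two-page scratch file with 'A' at the start of page 0 and 'B' at
 * the start of page 1, then map ONLY page 1 (file offset == one page).
 * Returns the first byte seen through the mapping, or -1 on error.
 */
static int first_byte_of_second_page(void)
{
	char path[] = "/tmp/pgoff-demo-XXXXXX";
	long psz = sysconf(_SC_PAGESIZE);
	int result = -1;
	int fd;

	if (psz <= 0)
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on until fd is closed */

	if (ftruncate(fd, 2 * psz) == 0 &&
	    pwrite(fd, "A", 1, 0) == 1 &&
	    pwrite(fd, "B", 1, psz) == 1) {
		/* Non-zero file offset: the mapping starts at the second page. */
		char *map = mmap(NULL, psz, PROT_READ, MAP_SHARED, fd, psz);

		if (map != MAP_FAILED) {
			result = map[0];
			munmap(map, psz);
		}
	}
	close(fd);
	return result;
}
```

The first address of this mapping shows the 'B' written at file offset one
page, not the 'A' at offset 0, which is exactly the situation where address
arithmetic relative to vm_start alone picks the wrong file page.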

If I then request you to truncate the first page in the VMA, how does your 
code realize that it should punch the second page rather than the first?

However, Jeff said this _isn't_ the bug he's hitting - in his case the VMA has 
a 0 initial offset (for the same reason we don't need VM_NONLINEAR support).

> > You call truncate_inode_pages_range(mapping, offset, endoff), so I think
> > you're really burned here.

> > +offset = (loff_t)(start - vma->vm_start);
> > +endoff = (loff_t)(end - vma->vm_start);

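So they would become:

offset = (loff_t)(start - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

(and likewise for endoff; the inner parentheses are needed because + binds
tighter than <<).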

or with page_offset(). Btw, shouldn't this be done by some macro in
<linux/pagemap.h>, like page_offset() and linear_page_index()?

Btw, also compare with mm/rmap.c:vma_address()/page_address_in_vma().
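
The address-to-file-offset translation can be sketched in userspace as
follows. Everything here is a model under stated assumptions: struct
vma_offsets, SKETCH_PAGE_SHIFT (4 KiB pages assumed) and
vma_addr_to_file_offset() are hypothetical names, not existing kernel
helpers. Note that in C, a + b << c parses as (a + b) << c, so the shift of
vm_pgoff needs its own parentheses.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical model of the two VMA fields that matter for the translation. */
struct vma_offsets {
	unsigned long vm_start;	/* first user address covered by the VMA */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/*
 * Translate a user address inside the VMA to a byte offset in the backing
 * file: distance from vm_start plus the VMA's starting offset in the file.
 * Only valid for a linear mapping.
 */
static int64_t vma_addr_to_file_offset(const struct vma_offsets *vma,
				       unsigned long addr)
{
	return (int64_t)(addr - vma->vm_start) +
	       ((int64_t)vma->vm_pgoff << SKETCH_PAGE_SHIFT);
}
```

With vm_pgoff = 1 and addr equal to vm_start, the helper returns 4096, i.e.
the second file page; with vm_pgoff = 0 it degenerates to the plain
(start - vma->vm_start) computation from the patch.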

> "end" here is not end of VMA - its end of the region we want to discard
> (in UML case its start + PAGE_SIZE). Anything wrong ?

All ok for that, I was complaining about not using ->vm_pgoff.

I did wonder whether vm_pgoff entered the picture later on, but I'm sure
truncate_inode_pages{_range} takes file offsets, so I wasn't missing
anything.
-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade