From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Andrea Arcangeli <andrea@suse.de>
Cc: "Juan J. Quintela" <quintela@fi.udc.es>,
	linux-mm@kvack.org, linux-kernel@vger.rutgers.edu
Subject: Re: classzone-VM + mapped pages out of lru_cache
Date: Fri, 5 May 2000 09:01:58 +0200 (CEST)
Message-ID: <14610.29158.285011.81152@charged.uio.no>
In-Reply-To: <Pine.LNX.4.21.0005042201520.5533-100000@alpha.random>

>>>>> " " == Andrea Arcangeli <andrea@suse.de> writes:

    >> As far as NFS is concerned, that page is incorrect and should
    >> be read in again whenever we next try to access it. That is the
    >> purpose of the call to invalidate_inode_pages().  As far as I
    >> can see, your patch fundamentally breaks that concept for all
    >> files whether they are mmapped or not.

     > It breaks the concept only for mmapped files. Non-mmapped
     > files have page->count == 1, so their cache will be shrunk
     > completely as usual.

If there are pending asynchronous writes, then this is true in neither
2.2.x nor 2.3.x. Each pending writeback has to increment the
page->count in order to prevent the page from disappearing beneath
it. Since a writeback does not have to involve the whole page, we
cannot assume that a page never needs to be invalidated just because
it is dirty. Imagine the scenario:

   Process 1 (on client 1)             Process 2 (on client 2)

   Schedule asynchronous write
   on bytes 0-255 of file

                                      Write bytes 256-511 of same file.

   revalidate inode
   discover that file has changed
   try to invalidate page 0

Under your patch, the invalidation of page 0 will fail due to the
pending writeback, and hence process 1 will never see what process 2
wrote.
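
The scenario above can be played through in a toy model. This is only
a sketch in Python, not kernel code: Page, schedule_async_write(),
invalidate_skipping_busy() and cached_read() are hypothetical names
invented for the illustration, and the "skip any page with extra
references" rule is my reading of the behaviour being objected to.

```python
# Toy model only: every name here is hypothetical, none is a kernel API.
from dataclasses import dataclass

PAGE_SIZE = 4096


@dataclass
class Page:
    data: bytearray      # client's cached copy of the page
    count: int = 1       # 1 = referenced by the page cache only
    valid: bool = True   # False = must be re-read from the server


def schedule_async_write(page, offset, payload):
    """Dirty part of the cached page and pin it until the write completes."""
    page.data[offset:offset + len(payload)] = payload
    page.count += 1


def invalidate_skipping_busy(page):
    """Assumed behaviour under discussion: skip pages with extra references."""
    if page.count > 1:
        return False
    page.valid = False
    return True


def cached_read(page, server, offset, length):
    """Serve from the cache; refetch only if the page was invalidated."""
    if not page.valid:
        page.data[:] = server   # simplification: whole-page refetch
        page.valid = True
    return bytes(page.data[offset:offset + length])


server = bytearray(PAGE_SIZE)          # page 0 of the file on the server
page0 = Page(data=bytearray(server))   # client 1's cached copy

schedule_async_write(page0, 0, b"A" * 256)   # process 1: bytes 0-255
server[256:512] = b"B" * 256                 # process 2, via client 2

invalidated = invalidate_skipping_busy(page0)   # client 1 revalidates
stale = cached_read(page0, server, 256, 256)
print(invalidated, stale == b"B" * 256)
```

The pending partial write pins the page, the invalidation is skipped,
and the read of bytes 256-511 is served from the stale cached copy.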

     > Unmapping a page from the pagetable means that userspace
     > will no longer be able to read/write the page directly (only
     > the kernel will see the page, and you'll go through the page
     > in each read(2) and write(2)). A page in the cache can be
     > mapped in several ptes, and we have to unmap it from all of
     > them before we're allowed to unlink the page from the
     > pagecache, or the current VM will break.

Could this be done as part of "invalidate_inode_pages" or would that
break the VM?

    >> such a page still have to be part of an inode's i_data?

     > Mapped page-cache can't be unlinked from the cache in the
     > first place, because when you have to sync the dirty shared
     > mapping (because you run low on memory and you have to get rid
     > of dirty data in the VM) you will no longer know which inode
     > and which fs the page belongs to.

You have the vma->vm_file and hence both dentry and inode.

Don't forget that on NFS, the inode is just a pretty collection of
statistics. It contains our estimates of the data, size, creation
times...  It does *not* contain sufficient information to sync a page
to storage, and if the VM assumes that it does, then it is clearly
broken.
Under NFS all read and write operations require us to use a file
handle, which is stored in the dentry, not in the inode. So you will
always be required to use the vm_file in some form or other.
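
To make the chain of references concrete, here is a small sketch, again
a toy model in Python rather than kernel code. The field names vm_file,
dentry and inode follow the text above; the fh field, the classes and
build_write_request() are hypothetical names for illustration.

```python
# Toy model only: the classes and build_write_request() are hypothetical.
from dataclasses import dataclass


@dataclass
class Inode:
    size: int       # cached estimate of the file size
    mtime: float    # cached estimate of the modification time


@dataclass
class Dentry:
    inode: Inode
    fh: bytes       # NFS file handle, kept with the dentry per the text


@dataclass
class File:
    dentry: Dentry


@dataclass
class Vma:
    vm_file: File


def build_write_request(vma, offset, data):
    """Reach the file handle through vma -> vm_file -> dentry."""
    dentry = vma.vm_file.dentry
    return {"fh": dentry.fh, "offset": offset, "data": bytes(data)}


inode = Inode(size=512, mtime=0.0)
vma = Vma(vm_file=File(dentry=Dentry(inode=inode, fh=b"\x01\x02")))
request = build_write_request(vma, 0, b"hello")
print(request["fh"])
```

Nothing in the toy Inode would be enough to build the request; the
handle is only reachable by going through the vm_file.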

Cheers,
  Trond