linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, mason@suse.com,
	andrea@suse.de, hugh@veritas.com, axboe@suse.de,
	torvalds@osdl.org
Subject: Re: [rfc][patch] remove racy sync_page?
Date: Mon, 29 May 2006 18:32:01 -0700	[thread overview]
Message-ID: <20060529183201.0e8173bc.akpm@osdl.org> (raw)
In-Reply-To: <447B8CE6.5000208@yahoo.com.au>

On Tue, 30 May 2006 10:08:06 +1000
Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> > 
> > Try disabling kblockd completely, see what effect that has on performance.
> 
> Which is what I want to know. I don't exactly have an interesting
> disk setup.

You don't need one - just a single disk should show up such problems.  I
forget which workloads though.  Perhaps just a linear read (readahead
queues the I/O but doesn't unplug, subsequent lock_page() sulks).

> >>Can we get rid of the whole thing, confusing memory barriers and all? Nobody
> >>uses anything but the default sync_page, and if block rq plugging is terribly
> >>bad for performance, perhaps it should be reworked anyway? It shouldn't be a
> >>correctness thing, right?
> > 
> > 
> > What this means is that it is not legal to run lock_page() against a
> > pagecache page if you don't have a ref on the inode.
> 
> Yes. So set_page_dirty_lock is broken, right?

yup.

> And the wait_on_page_stuff needs an inode ref.
> Also splice seems to have broken sync_page.

Please describe the splice() problem which you've observed.

> > 
> > iirc the main (only?) offender here is direct-io reads into MAP_SHARED
> > pagecache.  (And similar things, like infiniband and nfs-direct).
> 
> Well yes, writing to a page would be the main reason to set it dirty.
> Is splice broken as well? I'm not sure that it always has a ref on the
> inode when stealing a page.

Whereabouts?

> It sounds like you think fixing the set_page_dirty_lock callers wouldn't
> be too difficult? I wouldn't know (although the ptrace one should be
> able to be turned into a set_page_dirty, because we're holding mmap_sem).

No, I think it's damn impossible ;)

get_user_pages() has gotten us a random pagecache page.  How do we
non-racily get at the address_space prior to locking that page?

I don't think we can.

> You're sure about all other lock_page()rs? I'm not, given that
> set_page_dirty_lock got it so wrong. But you'd have a better idea than
> me.

No, I'm not sure.

However it is rare for the kernel to play with pagecache pages against
which the caller doesn't have an inode ref.  Think: how did the caller look
up that page in the first place if not from the address_space in the first
place?

- get_user_pages(): the current problem

- page LRU: OK, uses trylock first.

- pagetable walk??

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-05-30  1:32 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-29  9:34 Nick Piggin
2006-05-29 19:15 ` Andrew Morton
2006-05-30  0:08   ` Nick Piggin
2006-05-30  1:32     ` Andrew Morton [this message]
2006-05-30  2:54       ` Nick Piggin
2006-05-30  3:14         ` Andrew Morton
2006-05-30  4:13           ` Nick Piggin
2006-05-30  9:05           ` Jens Axboe
2006-05-31 13:43             ` Nick Piggin
2006-05-31 15:09               ` Hugh Dickins
2006-05-31 15:22                 ` Nick Piggin
2006-05-31 17:51                   ` Jens Axboe
2006-05-31 17:50               ` Jens Axboe
2006-05-30  4:20         ` Linus Torvalds
2006-05-30  5:07           ` Nick Piggin
2006-05-30  5:21             ` Nick Piggin
2006-05-30  6:12               ` Neil Brown
2006-05-30  7:10                 ` Nick Piggin
2006-05-31  4:34                   ` Neil Brown
2006-05-30  8:24               ` Nikita Danilov
2006-05-30 17:55               ` Linus Torvalds
2006-05-31  0:32                 ` Nick Piggin
2006-05-31  0:56                   ` Linus Torvalds
2006-05-31  1:33                     ` Mark Lord
2006-05-31  6:11                       ` Jens Axboe
2006-05-31 12:55                         ` Mark Lord
2006-05-31 13:02                           ` Jens Axboe
2006-06-01 13:19                           ` NCQ performance (was Re: [rfc][patch] remove racy sync_page?) Jens Axboe
2006-06-01 14:56                             ` Avi Kivity
2006-06-01 15:03                               ` Jens Axboe
2006-06-01 18:04                                 ` Jens Axboe
2006-06-05  5:30                                   ` Avi Kivity
2006-06-05  7:59                                     ` Jens Axboe
2006-05-31 12:31                     ` [rfc][patch] remove racy sync_page? Helge Hafting
2006-05-31 12:36                       ` Arjan van de Ven
2006-05-31 13:29                     ` Nick Piggin
2006-05-31 13:41                       ` Jens Axboe
2006-05-31 13:54                         ` Nick Piggin
2006-05-31 14:43                       ` Linus Torvalds
2006-05-31 14:57                         ` Nick Piggin
2006-05-31 15:13                           ` Linus Torvalds
2006-05-31 15:09                         ` Linus Torvalds
2006-05-31 18:13                           ` Jens Axboe
2006-05-31 18:26                             ` Linus Torvalds
2006-05-30  5:36             ` Nick Piggin
2006-05-30 18:31               ` Hugh Dickins
2006-05-31  0:21                 ` Nick Piggin
2006-05-31  3:06                   ` Hugh Dickins
2006-05-31 14:30                     ` Hugh Dickins
2006-05-31 17:56                     ` Jens Axboe
2006-05-30  5:51 ` Josef Sipek
2006-05-30  6:44   ` Nick Piggin
2006-05-30  6:50     ` Nick Piggin
2006-05-30 13:12     ` Josef Sipek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060529183201.0e8173bc.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=andrea@suse.de \
    --cc=axboe@suse.de \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mason@suse.com \
    --cc=nickpiggin@yahoo.com.au \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox