From: "Stephen C. Tweedie" <sct@redhat.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>,
Savochkin Andrey Vladimirovich <saw@msu.ru>,
Andrea Arcangeli <andrea@e-mind.com>,
steve@netplus.net, "Eric W. Biederman" <ebiederm+eric@ccr.net>,
brent verner <damonbrent@earthlink.net>,
"Garst R. Reese" <reese@isn.net>,
Kalle Andersson <kalle.andersson@mbox303.swipnet.se>,
Zlatko Calusic <Zlatko.Calusic@CARNet.hr>,
Ben McCann <bmccann@indusriver.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
bredelin@ucsd.edu, linux-kernel@vger.rutgers.edu,
Rik van Riel <H.H.vanRiel@phys.uu.nl>,
linux-mm@kvack.org
Subject: Re: MM deadlock [was: Re: arca-vm-8...]
Date: Sun, 10 Jan 1999 22:18:10 GMT
Message-ID: <199901102218.WAA01598@dax.scot.redhat.com>
In-Reply-To: <Pine.LNX.3.95.990110103201.7668D-100000@penguin.transmeta.com>

Hi,

On Sun, 10 Jan 1999 10:35:10 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:

> The thing I want to make re-entrant is just semaphore accesses: at the
> point where we would otherwise deadlock on the writer semaphore it's much
> better to just allow nested writes. I suspect all filesystems can already
> handle nested writes - they are a lot easier to handle than truly
> concurrent ones.

We used to do it anyway, before inodes were locked for write, if I
remember correctly.

What I'm after is something like the patch below for a fix (don't apply
it: it should work and should fix the problem, but it's really just for
illustration). It enforces an i_atomic_allocate semaphore to lock
against truncate(). The write-page filemap code takes this semaphore,
but does _not_ take i_sem at all.

Frankly, I really don't think we want to serialise writes so
aggressively in the first place. In POSIX, O_APPEND is the only case
where we need to do this (and since that modifies i_size, it's a natural
case to do under the i_atomic_allocate semaphore in any case).

This patch should fix the problem in hand, but what I think we really
want is a read/write semaphore for i_atomic_allocate: we want normal
read and write IO to a file to guard against a concurrent truncate(),
but _not_ against each other (in situations such as threaded/async IO to
a database file, multiple outstanding IOs can be a big win). Basically,
most writes should take out a read lock on the filesize so that the file
won't disappear from under their feet; only extending or truncating the
file should take out an i_atomic_allocate write lock (assuming the same
sorts of semantics for r/w semaphores as we already have for r/w
spinlocks).
Are there really any filesystems we know can't deal with
concurrent/reentrant writes to an inode? We already have to deal with
concurrent reads with a single write in progress, after all.

--Stephen

----------------------------------------------------------------
--- fs/inode.c.~1~ Fri Jan 8 16:13:05 1999
+++ fs/inode.c Sun Jan 10 21:58:46 1999
@@ -132,6 +132,7 @@
INIT_LIST_HEAD(&inode->i_dentry);
sema_init(&inode->i_sem, 1);
sema_init(&inode->i_atomic_write, 1);
+ sema_init(&inode->i_atomic_allocate, 1);
}
static inline void write_inode(struct inode *inode)
--- fs/open.c~ Fri Jan 8 17:24:19 1999
+++ fs/open.c Sun Jan 10 21:59:49 1999
@@ -70,6 +70,7 @@
int error;
struct iattr newattrs;
+ down(&inode->i_atomic_allocate);
down(&inode->i_sem);
newattrs.ia_size = length;
newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
@@ -81,6 +82,7 @@
inode->i_op->truncate(inode);
}
up(&inode->i_sem);
+ up(&inode->i_atomic_allocate);
return error;
}
--- include/linux/fs.h.~1~ Sun Jan 10 21:56:23 1999
+++ include/linux/fs.h Sun Jan 10 21:58:39 1999
@@ -358,6 +358,7 @@
unsigned long i_nrpages;
struct semaphore i_sem;
struct semaphore i_atomic_write;
+ struct semaphore i_atomic_allocate;
struct inode_operations *i_op;
struct super_block *i_sb;
struct wait_queue *i_wait;
--- mm/filemap.c~ Fri Jan 8 16:13:06 1999
+++ mm/filemap.c Sun Jan 10 22:01:52 1999
@@ -1113,9 +1113,9 @@
* and file could be released ... increment the count to be safe.
*/
file->f_count++;
- down(&inode->i_sem);
+ down(&inode->i_atomic_allocate);
result = do_write_page(inode, file, (const char *) page, offset);
- up(&inode->i_sem);
+ up(&inode->i_atomic_allocate);
fput(file);
return result;
}