From: Hugh Dickins <hugh@veritas.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: linux-mm@kvack.org
Subject: Re: __set_page_dirty_nobuffers superfluous check
Date: Sat, 14 Aug 2004 07:46:45 +0100 (BST) [thread overview]
Message-ID: <Pine.LNX.4.44.0408140745280.20187-100000@localhost.localdomain> (raw)
In-Reply-To: <20040813180504.GB29875@logos.cnet>
On Fri, 13 Aug 2004, Marcelo Tosatti wrote:
> While wandering through mm/page-writeback.c I noticed
> __set_page_dirty_nobuffers does:
>
> int __set_page_dirty_nobuffers(struct page *page)
> {
> int ret = 0;
>
> if (!TestSetPageDirty(page)) {
> struct address_space *mapping = page_mapping(page);
>
> if (mapping) {
> spin_lock_irq(&mapping->tree_lock);
> mapping = page_mapping(page);
> if (page_mapping(page)) { /* Race with truncate? */
> BUG_ON(page_mapping(page) != mapping); <------------------
> if (!mapping->backing_dev_info->memory_backed)
> inc_page_state(nr_dirty);
> radix_tree_tag_set(&mapping->page_tree,
> page_index(page), PAGECACHE_TAG_DIRTY);
> }
>
> How could the mapping ever change if we have tree_lock?
>
> Its basically a check which assumes there might be
> buggy page->mapping writers who do so without the lock, yes?
Nicely observed - and four evaluations of page_mapping(page)
within seven lines is two too many, even if optimized away.
But actually your interpretation is wrong: because this has evolved
from a sensible check on something seriously in doubt,
to a pointless duplication of effort.
It makes sense if you look at the original, in 2.6.6 or earlier:
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page->mapping;
if (mapping) {
spin_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
BUG_ON(page->mapping != mapping);
if (!mapping->backing_dev_info->memory_backed)
What this is actually worrying about (along with truncation suddenly
setting page->mapping to NULL, won't happen without tree_lock) is
tmpfs swizzling page->mapping between a tmpfs struct address_space *
and &swapper_space (move_to/from_swap_cache in swap_state.c); or
(more distant concern) the page getting reused while we're in here,
coming in with one page->mapping and then suddenly another.
It's not doubting that tree_lock protects against that, but _which_
tree_lock? If page->mapping suddenly changes underneath us, then
the spin_lock_irq(&mapping->tree_lock) may have been done on the
wrong mapping->tree_lock - to lock "mapping->tree_lock" you have
to choose "mapping" first, but perhaps that's not stable without
its tree_lock.
In most cases, the (assumed) hold on the page in question prevents
page->mapping from changing from one non-NULL to another non-NULL
here, even without the tree_lock. But that's not enough to protect
against the tmpfs swizzling: what protects against that? Er, er,
it's the way tmpfs pages only go to swap when not in use, and are
brought back from swap before being used, and shmem.c insists on
page lock in each direction; but really it needs an audit of every
use of set_page_dirty to be sure. Though I've never heard of the
(earlier, useful) BUG_ON actually firing.
And I think you'll find that in practice it's just a waste for ramfs,
tmpfs and swap to be coming through __set_page_dirty_nobuffers at all:
since we don't do mpage operations on the "memory_backed" filesystems,
all the radix-tree tagging and dirty-inode operations on them are just
a waste of time? and they'd do better to use a .set_page_dirty which
just does SetPageDirty. But akpm does wonder from time to time whether
to reintroduce mpage operations on at least swap, so may resist such a
simplification.
Anyway, perhaps a suitable patch to make sense of that BUG_ON would be:
--- 2.6.8/mm/page-writeback.c 2004-08-10 05:40:21.000000000 +0100
+++ linux/mm/page-writeback.c 2004-08-14 07:21:58.744468256 +0100
@@ -580,12 +580,13 @@ int __set_page_dirty_nobuffers(struct pa
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
+ struct address_space *mapping2;
if (mapping) {
spin_lock_irq(&mapping->tree_lock);
- mapping = page_mapping(page);
- if (page_mapping(page)) { /* Race with truncate? */
- BUG_ON(page_mapping(page) != mapping);
+ mapping2 = page_mapping(page);
+ if (mapping2) { /* Race with truncate? */
+ BUG_ON(mapping2 != mapping);
if (!mapping->backing_dev_info->memory_backed)
inc_page_state(nr_dirty);
radix_tree_tag_set(&mapping->page_tree,
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
next prev parent reply other threads:[~2004-08-14 6:47 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-08-13 18:05 Marcelo Tosatti
2004-08-14 6:46 ` Hugh Dickins [this message]
2004-08-14 13:37 ` Marcelo Tosatti
2004-08-16 12:37 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.44.0408140745280.20187-100000@localhost.localdomain \
--to=hugh@veritas.com \
--cc=linux-mm@kvack.org \
--cc=marcelo.tosatti@cyclades.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox