From: kanoj@google.engr.sgi.com (Kanoj Sarcar)
To: Andrea Arcangeli <andrea@suse.de>
Cc: Ben LaHaise <bcrl@redhat.com>,
riel@nl.linux.org, Linus Torvalds <torvalds@transmeta.com>,
linux-mm@kvack.org
Subject: Re: [patch] take 2 Re: PG_swap_entry bug in recent kernels
Date: Sat, 8 Apr 2000 16:18:53 -0700 (PDT) [thread overview]
Message-ID: <200004082318.QAA60782@google.engr.sgi.com> (raw)
In-Reply-To: <Pine.LNX.4.21.0004090005020.18345-100000@alpha.random> from "Andrea Arcangeli" at Apr 09, 2000 01:02:05 AM
>
> On Sat, 8 Apr 2000, Kanoj Sarcar wrote:
>
> >shrink_mmap
> >-------------- __find_get_page
> >get pagemap_lru_lock ----------------
> >LockPage
> >drop pagemap_lru_lock
> >Fail if page_count(page) > 1
> >get pagecache_lock
> >get_page
> >Fail if page_count(page) != 2
> >if PageSwapCache, drop pagecache_lock
> > get pagecache_lock
> > Finds page in swapcache,
> > does get_page
> > drop pagecache_lock
> > and __delete_from_swap_cache,
> > which releases PageLock.
> > LockPage succeeds,
> > erronesouly believes he
> > has swapcache page.
> >
> >Did I miss some interlocking step that would prevent this from happening?
>
> Oh, very good point indeed, I don't think you are missing anything. Thanks
> for showing me that!
>
> It seems to me the only reason we was dropping the lock earlier for the
> swap cache was to be able to use the remove_inode_page and so avoding
> having to export a secondary remove_inode_page that doesn't grab the
> page_cache_lock. It looks the only reason was an implementation issue.
Agreed.
>
> So IMHVO it would be nicer to change the locking in shrink_mmap() instead
> of putting the page-cache check in the swap cache lookup fast path. Swap
> cache and page cache are sharing the same locking rules w.r.t. the
> hashtable. That was the only exception as far I can tell and removing it
> would give us a cleaner design IMHO.
Yes, its easy enough to do, I am sure your attached patch does it properly.
Notwithstanding, I will repeat this again. Just because you change how
shrink_mmap deletes swapcache pages, does not make it a good idea to
change lookup_swap_cache(). Lets first fix the swap cache deletion races,
then you can change lookup_swap_cache() not to lock the page ... which is
trivial anyway, once we think we have all the holes nailed.
I am off to polishing my previous swapoff patch. Andrea, it might make it
easier to understand your previous patch if you were to send out a list of
the specific races it is fixing. (Something like my above race example).
Kanoj
>
> What do you think about something like this?
>
> diff -urN swap-entry-2/include/linux/mm.h swap-entry-3/include/linux/mm.h
> --- swap-entry-2/include/linux/mm.h Sat Apr 8 19:16:28 2000
> +++ swap-entry-3/include/linux/mm.h Sun Apr 9 00:18:43 2000
> @@ -449,6 +449,7 @@
> struct zone_t;
> /* filemap.c */
> extern void remove_inode_page(struct page *);
> +extern void __remove_inode_page(struct page *);
> extern unsigned long page_unuse(struct page *);
> extern int shrink_mmap(int, int, zone_t *);
> extern void truncate_inode_pages(struct address_space *, loff_t);
> diff -urN swap-entry-2/include/linux/swap.h swap-entry-3/include/linux/swap.h
> --- swap-entry-2/include/linux/swap.h Sat Apr 8 18:08:37 2000
> +++ swap-entry-3/include/linux/swap.h Sun Apr 9 00:47:42 2000
> @@ -105,6 +105,7 @@
> /*
> * Make these inline later once they are working properly.
> */
> +extern void shrink_swap_cache(struct page *);
> extern void unlink_from_swap_cache(struct page *);
> extern void __delete_from_swap_cache(struct page *page);
> extern void delete_from_swap_cache(struct page *page);
> diff -urN swap-entry-2/mm/filemap.c swap-entry-3/mm/filemap.c
> --- swap-entry-2/mm/filemap.c Sat Apr 8 04:46:04 2000
> +++ swap-entry-3/mm/filemap.c Sun Apr 9 00:39:23 2000
> @@ -77,6 +77,13 @@
> atomic_dec(&page_cache_size);
> }
>
> +inline void __remove_inode_page(struct page *page)
> +{
> + remove_page_from_inode_queue(page);
> + remove_page_from_hash_queue(page);
> + page->mapping = NULL;
> +}
> +
> /*
> * Remove a page from the page cache and free it. Caller has to make
> * sure the page is locked and that nobody else uses it - or that usage
> @@ -88,9 +95,7 @@
> PAGE_BUG(page);
>
> spin_lock(&pagecache_lock);
> - remove_page_from_inode_queue(page);
> - remove_page_from_hash_queue(page);
> - page->mapping = NULL;
> + __remove_inode_page(page);
> spin_unlock(&pagecache_lock);
> }
>
> @@ -298,8 +303,8 @@
> * were to be marked referenced..
> */
> if (PageSwapCache(page)) {
> + shrink_swap_cache(page);
> spin_unlock(&pagecache_lock);
> - __delete_from_swap_cache(page);
> /* the page is local to us now */
> page->flags &= ~(1UL << PG_swap_entry);
> goto made_inode_progress;
> diff -urN swap-entry-2/mm/swap_state.c swap-entry-3/mm/swap_state.c
> --- swap-entry-2/mm/swap_state.c Sat Apr 8 17:29:46 2000
> +++ swap-entry-3/mm/swap_state.c Sun Apr 9 00:39:17 2000
> @@ -55,6 +55,19 @@
> return ret;
> }
>
> +static inline void __remove_from_swap_cache(struct page *page)
> +{
> + struct address_space *mapping = page->mapping;
> +
> + if (mapping != &swapper_space)
> + BUG();
> + if (!PageSwapCache(page) || !PageLocked(page))
> + PAGE_BUG(page);
> +
> + PageClearSwapCache(page);
> + __remove_inode_page(page);
> +}
> +
> static inline void remove_from_swap_cache(struct page *page)
> {
> struct address_space *mapping = page->mapping;
> @@ -76,6 +89,20 @@
> lru_cache_del(page);
> remove_from_swap_cache(page);
> __free_page(page);
> +}
> +
> +/* called by shrink_mmap() with the page_cache_lock held */
> +void shrink_swap_cache(struct page *page)
> +{
> + swp_entry_t entry;
> +
> + entry.val = page->index;
> +
> +#ifdef SWAP_CACHE_INFO
> + swap_cache_del_total++;
> +#endif
> + __remove_from_swap_cache(page);
> + swap_free(entry);
> }
>
> /*
>
>
> The other option is to keep the checks in the lookup swap cache fast path.
>
> Comments?
>
> Andrea
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
next prev parent reply other threads:[~2000-04-08 23:18 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2000-04-03 22:22 Ben LaHaise
2000-04-04 15:06 ` Andrea Arcangeli
2000-04-04 15:46 ` Rik van Riel
2000-04-04 16:50 ` Andrea Arcangeli
2000-04-04 17:06 ` Ben LaHaise
2000-04-04 18:03 ` Andrea Arcangeli
2000-04-06 22:11 ` [patch] take 2 " Ben LaHaise
2000-04-07 10:45 ` Andrea Arcangeli
2000-04-07 11:29 ` Rik van Riel
2000-04-07 12:00 ` Andrea Arcangeli
2000-04-07 12:54 ` Rik van Riel
2000-04-07 13:14 ` Andrea Arcangeli
2000-04-07 20:12 ` Kanoj Sarcar
2000-04-07 23:26 ` Andrea Arcangeli
2000-04-08 0:11 ` Kanoj Sarcar
2000-04-08 0:37 ` Kanoj Sarcar
2000-04-08 13:20 ` Andrea Arcangeli
2000-04-08 21:39 ` Kanoj Sarcar
2000-04-08 23:02 ` Andrea Arcangeli
2000-04-08 23:18 ` Kanoj Sarcar [this message]
2000-04-08 23:58 ` Andrea Arcangeli
2000-04-08 13:30 ` Andrea Arcangeli
2000-04-08 17:39 ` Andrea Arcangeli
2000-04-07 23:54 ` Andrea Arcangeli
2000-04-08 0:15 ` Kanoj Sarcar
2000-04-08 13:14 ` Andrea Arcangeli
2000-04-08 21:47 ` Kanoj Sarcar
2000-04-08 23:10 ` Andrea Arcangeli
2000-04-08 23:21 ` Kanoj Sarcar
2000-04-08 23:39 ` Andrea Arcangeli
2000-04-09 0:40 ` Kanoj Sarcar
2000-04-10 8:55 ` andrea
2000-04-11 2:45 ` Kanoj Sarcar
2000-04-11 16:22 ` Andrea Arcangeli
2000-04-11 17:40 ` Rik van Riel
2000-04-11 18:20 ` Kanoj Sarcar
2000-04-21 18:23 ` Andrea Arcangeli
2000-04-21 21:00 ` Rik van Riel
2000-04-22 1:12 ` Andrea Arcangeli
2000-04-22 1:51 ` Linus Torvalds
2000-04-22 18:29 ` Rik van Riel
2000-04-22 19:58 ` Linus Torvalds
2000-04-11 18:26 ` Kanoj Sarcar
2000-04-10 19:10 ` Stephen C. Tweedie
2000-04-08 0:04 ` Andrea Arcangeli
[not found] <yttem7xstk2.fsf@vexeta.dc.fi.udc.es>
2000-04-23 0:52 ` Andrea Arcangeli
[not found] <yttk8ho26s8.fsf@vexeta.dc.fi.udc.es>
2000-04-23 16:07 ` Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200004082318.QAA60782@google.engr.sgi.com \
--to=kanoj@google.engr.sgi.com \
--cc=andrea@suse.de \
--cc=bcrl@redhat.com \
--cc=linux-mm@kvack.org \
--cc=riel@nl.linux.org \
--cc=torvalds@transmeta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox