From mboxrd@z Thu Jan 1 00:00:00 1970 From: kanoj@google.engr.sgi.com (Kanoj Sarcar) Message-Id: <200005030526.WAA59352@google.engr.sgi.com> Subject: Re: Oops in __free_pages_ok (pre7-1) (Long) (backtrace) Date: Tue, 2 May 2000 22:26:48 -0700 (PDT) In-Reply-To: from "Linus Torvalds" at May 02, 2000 08:47:57 PM MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Linus Torvalds Cc: Rajagopal Ananthanarayanan , linux-mm@kvack.org List-ID: > > > On Tue, 2 May 2000, Rajagopal Ananthanarayanan wrote: > > ------------- > > > > I ran into a BUG in __free_pages_ok which checks: > > > > ---------- > > if (PageSwapCache(page)) > > BUG(); > > ---------- > > > > The call to free the page was from try_to_swap_out(): > > > > ---------- > > /* > > * Is the page already in the swap cache? If so, then > > * we can just drop our reference to it without doing > > * any IO - it's already up-to-date on disk. > > * > > * Return 0, as we didn't actually free any real > > * memory, and we should just continue our scan. > > */ > > if (PageSwapCache(page)) { > > entry.val = page->index; > > swap_duplicate(entry); > > set_pte(page_table, swp_entry_to_pte(entry)); > > drop_pte: > > vma->vm_mm->rss--; > > flush_tlb_page(vma, address); > > __free_page(page); > > goto out_failed; > > } > > Wow. > > That code definitely looks buggy. > > Looking at the whole try_to_swap_out() in this light shows how it messes > with a _lot_ of page information without holding the page lock. I thought > we fixed this once already, but maybe not. > > In try_to_swap_out(), earlier it does a > > if (PageLocked(page)) > goto out_failed; > > and that really is wrong - it should do a > > if (TryLockPage(page)) > goto out_failed; Umm, I am not saying this is not a good idea, but maybe code that try_to_swap_out() invokes (like filemap_swapout etc) need to be taught that the incoming page has already been locked. Nonetheless, unless you show me a possible scenario that will lead to the observed panic, I am skeptical that this is the real problem. Lets just talk about swapcache pages (since the problem happened with that type), and lets forget swapfile deletion, I am pretty sure Ananth was not trying that. In this restricted situation, I _think_ you can not theorize what the problem is. That is, if all code that add/delete pages from the swap cache make sure they never delete a "shared" page from the scache (as determined by is_page_shared). This is because the process that kswapd is looking at already ensures that the page is "shared". The only code that does delete "shared" pages from the scache is shrink_mmap, but if a process already has a page-reference, shrink_mmap can not touch that page. Also, most process level code that takes a page out from the swapcache is interlocked out because kswapd is holding the vmlist/page_table_lock. Anyway, I will try to think if there are more race conditions possible. Ananth, was there shared memory programs in your test suite? Also, if you have any success in reproducing this, let us know. Kanoj > > and do all the rest with the page locked so that there are no races on > changing the state of the page (and then unlock just before actually > returning, or freeing the page). > > As far as I can tell, this is a real bug, and has absolutely nothing to do > with the swap entry cache. It may be that the swap entry cache code just > changed timings for some people enough to show the race. > > But maybe I've overlooked something. Anybody else have comments on this? > > Linus > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux.eu.org/Linux-MM/ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/