From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from flinx.npwt.net (eric@flinx.npwt.net [208.236.161.237]) by kvack.org (8.8.7/8.8.7) with ESMTP id LAA15849 for ; Wed, 27 May 1998 11:32:02 -0400
Subject: Re: Q: Swap Locking Reinstatement
References: <199805192246.XAA03125@dax.dcs.ed.ac.uk>
From: ebiederm+eric@npwt.net (Eric W. Biederman)
Date: 27 May 1998 10:15:19 -0500
In-Reply-To: "Stephen C. Tweedie"'s message of Tue, 19 May 1998 23:46:01 +0100
Message-ID:
Sender: owner-linux-mm@kvack.org
To: "Stephen C. Tweedie"
Cc: linux-mm@kvack.org
List-ID:

>>>>> "ST" == Stephen C Tweedie writes:

ST> Hi,

ST> On 12 May 1998 20:57:05 -0500, ebiederm+eric@npwt.net (Eric
ST> W. Biederman) said:

>> Recently the swap lockmap has been readded.
>> Was that just as a low cost sanity check, to use especially while
>> there were bugs in some of the low level disk drivers?
>> Was there something that really needs the swap lockmap?

ST> Yes, there was a bug. The problem occurs when:

ST>   page X is owned by process A
ST>   process B tries to swap out page X from A's address space
ST>   process A exits or execs
ST>   process B's swap IO completes.

ST> The IO completion is an interrupt (we perform async swaps where
ST> possible). Now, if we dereference the swap entry belonging to page X
ST> at IO completion time, then the entry is protected against reuse while
ST> the IO is in flight. However, that requires making the entire swap map
ST> interrupt safe. It is much more efficient to keep the lock map separate
ST> and to use atomic bitops on it to allow us to do the IO completion
ST> unlock in an interrupt-safe manner.

ST> A similar race occurs when

ST>   process B tries to swap out page X from A's address space
ST>   process A tries to swap it back in
ST>   process B's swap IO completes.

ST> Now process A may, or may not, get the right data from disk depending on
ST> the (undefined) ordering of the IOs submitted by A and B.

Here is how I'm going to code it.
I'm going to modify swap_out to never remove a page from the page
cache until I/O is complete on it. This should only affect
asynchronous pages.

I'm going to modify shrink_mmap and friends so that when they remove a
swapper page from the page cache they will decrement the swap use
count of the page (via a new generic inode function).

This should both remove the need for the swap lock map, and increase
performance on the second race condition you mentioned (because it
doesn't have to read the page back in).

Hopefully when we get reverse pte maps working we can remove the swap
use counts as well, and only worry if a swap page is in use.

Eric
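
To make the plan above concrete, here is a rough user-space sketch of it.
This is a toy model, not kernel code: every name in it (toy_page,
toy_swap_out, toy_shrink_page and so on) is hypothetical and made up
for illustration, and the real swap_out and shrink_mmap look nothing
like this. It only shows the reference counting idea, assuming the
page cache holds its own count on the swap entry until the page is
dropped from the cache, and that the page is never dropped while I/O
is in flight.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SWAP_ENTRIES 4

/* Hypothetical stand-in for the per-entry swap use counts. */
static unsigned int swap_use_count[NR_SWAP_ENTRIES];

/* Hypothetical stand-in for a page sitting in the swap cache. */
struct toy_page {
    int  swap_entry;     /* swap slot backing this page, or -1 */
    bool in_page_cache;  /* still reachable through the page cache */
    bool io_in_flight;   /* async write has not completed yet */
};

/* Take the first slot whose use count is zero; returns -1 if none. */
static int toy_get_swap_entry(void)
{
    for (int i = 0; i < NR_SWAP_ENTRIES; i++) {
        if (swap_use_count[i] == 0) {
            swap_use_count[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Drop one reference on a swap slot. */
static void toy_swap_free(int entry)
{
    assert(entry >= 0 && entry < NR_SWAP_ENTRIES);
    assert(swap_use_count[entry] > 0);
    swap_use_count[entry]--;
}

/* Swap out: one count for the pte, one for the page cache, start async I/O. */
static int toy_swap_out(struct toy_page *page)
{
    int entry = toy_get_swap_entry();   /* reference held by the pte */
    if (entry < 0)
        return -1;
    swap_use_count[entry]++;            /* reference held by the page cache */
    page->swap_entry = entry;
    page->in_page_cache = true;
    page->io_in_flight = true;
    return entry;
}

/* I/O completion only clears a flag; it never touches the swap counts. */
static void toy_io_complete(struct toy_page *page)
{
    page->io_in_flight = false;
}

/* A swap-in that finds the page in the cache needs no disk read. */
static bool toy_swap_in_hits_cache(const struct toy_page *page)
{
    return page->in_page_cache;
}

/* Reclaim: refuse while I/O is in flight, else drop the cache's count. */
static bool toy_shrink_page(struct toy_page *page)
{
    if (!page->in_page_cache || page->io_in_flight)
        return false;
    page->in_page_cache = false;
    toy_swap_free(page->swap_entry);
    page->swap_entry = -1;
    return true;
}

/* Replays the exit-during-swapout race; returns 1 if the slot stayed safe. */
static int toy_replay_exit_race(void)
{
    struct toy_page page = { -1, false, false };
    int entry, other;

    entry = toy_swap_out(&page);        /* B swaps out page X */
    if (entry < 0)
        return 0;
    toy_swap_free(entry);               /* A exits and drops its pte count */
    if (toy_shrink_page(&page))         /* must be refused: I/O in flight */
        return 0;
    other = toy_get_swap_entry();       /* a new allocation must not reuse it */
    if (other == entry)
        return 0;
    if (other >= 0)
        toy_swap_free(other);
    if (!toy_swap_in_hits_cache(&page)) /* second race: no read from disk */
        return 0;
    toy_io_complete(&page);             /* B's swap I/O completes */
    if (!toy_shrink_page(&page))        /* now the page can leave the cache */
        return 0;
    return swap_use_count[entry] == 0;  /* and the slot is finally free */
}
```

Calling toy_replay_exit_race() from a small main() walks through the
first race from Stephen's mail. The slot cannot be reused while the
write is in flight, because the page cache still holds a count on it,
and the completion path only clears a flag, so nothing in this model
needs to make the swap counts interrupt safe.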