linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: kanoj@google.engr.sgi.com (Kanoj Sarcar)
To: "Stephen C. Tweedie" <sct@redhat.com>
Cc: manfreds@colorfullife.com, viro@math.psu.edu, andrea@suse.de,
	linux-kernel@vger.rutgers.edu, mingo@chiara.csoma.elte.hu,
	linux-mm@kvack.org
Subject: Re: locking question: do_mmap(), do_munmap()
Date: Tue, 12 Oct 1999 18:25:42 -0700 (PDT)	[thread overview]
Message-ID: <199910130125.SAA66579@google.engr.sgi.com> (raw)
In-Reply-To: <14338.25466.233239.59715@dukat.scot.redhat.com> from "Stephen C. Tweedie" at Oct 11, 99 11:23:54 pm

> 
> Hi,
> 
> On Mon, 11 Oct 1999 12:07:08 -0700 (PDT), kanoj@google.engr.sgi.com
> (Kanoj Sarcar) said:
> 
> >> What about something like a rw-semaphore which protects the vma list:
> >> vma-list modifiers [ie merge_segments(), insert_vm_struct() and
> >> do_munmap()] grab it exclusive, swapper grabs it "shared, starve
> >> exclusive".
> >> All other vma-list readers are protected by mm->mmap_sem.
> 
> It will deadlock.
> 

Not sure why you say that. Yes, I did see the nesting semdaphore grabbing
scenario that you posted, but I think you can prevent that from 
happening if you follow some rules. 

Here's a primitive patch showing the direction I am thinking of. I do not
have any problem with a spinning lock, but I coded this against 2.2.10,
where insert_vm_struct could go to sleep, hence I had to use sleeping
locks to protect the vma chain. Slight change is needed to vmscan.c
to use spinning locks.

Kanoj

This is a skeleton of the solution that prevents kswapd from walking
down a vma chain without protections. I am trying to get comments on
this approach before I try a full blown implementation.

The rules:
1. To modify the vmlist (add/delete), you must hold mmap_sem to 
guard against clones doing mmap/munmap/faults, (ie all vm system 
calls and faults), and from ptrace, swapin due to swap deletion
etc.
2. To modify the vmlist (add/delete), you must also hold
vmlist_modify_lock, to guard against page stealers scanning the
list.
3. To scan the vmlist, you must either 
	a. grab mmap_sem, which should be all cases except page stealer.
or
	b. grab vmlist_access_lock, only done by page stealer.
4. While holding the vmlist_modify_lock, you must be able to guarantee
that no code path will lead to page stealing.
5. You must be able to guarantee that while holding vmlist_modify_lock
or vmlist_access_lock of mm A, you will not try to get either lock
for mm B.

The assumptions:
1. No code path reachable thru insert_vm_struct and merge_segments
sleep for memory.


--- /usr/tmp/p_rdiff_a00368/vmscan.c	Tue Oct 12 17:58:50 1999
+++ mm/vmscan.c	Tue Oct 12 16:55:13 1999
@@ -295,6 +295,7 @@
 	/*
 	 * Find the proper vm-area
 	 */
+	vmlist_access_lock(mm);
 	vma = find_vma(mm, address);
 	if (vma) {
 		if (address < vma->vm_start)
@@ -302,8 +303,10 @@
 
 		for (;;) {
 			int result = swap_out_vma(vma, address, gfp_mask);
-			if (result)
+			if (result) {
+				vmlist_access_unlock(mm);
 				return result;
+			}
 			vma = vma->vm_next;
 			if (!vma)
 				break;
@@ -310,6 +313,7 @@
 			address = vma->vm_start;
 		}
 	}
+	vmlist_access_unlock(mm);
 
 	/* We didn't find anything for the process */
 	mm->swap_cnt = 0;
--- /usr/tmp/p_rdiff_a0036H/mlock.c	Tue Oct 12 17:59:06 1999
+++ mm/mlock.c	Tue Oct 12 16:35:25 1999
@@ -13,7 +13,9 @@
 
 static inline int mlock_fixup_all(struct vm_area_struct * vma, int newflags)
 {
+	vmlist_modify_lock(vma->vm_mm);
 	vma->vm_flags = newflags;
+	vmlist_modify_unlock(vma->vm_mm);
 	return 0;
 }
 
@@ -26,15 +28,17 @@
 	if (!n)
 		return -EAGAIN;
 	*n = *vma;
-	vma->vm_start = end;
 	n->vm_end = end;
-	vma->vm_offset += vma->vm_start - n->vm_start;
 	n->vm_flags = newflags;
 	if (n->vm_file)
 		get_file(n->vm_file);
 	if (n->vm_ops && n->vm_ops->open)
 		n->vm_ops->open(n);
+	vmlist_modify_lock(vma->vm_mm);
+	vma->vm_offset += end - vma->vm_start;
+	vma->vm_start = end;
 	insert_vm_struct(current->mm, n);
+	vmlist_modify_unlock(vma->vm_mm);
 	return 0;
 }
 
@@ -47,7 +51,6 @@
 	if (!n)
 		return -EAGAIN;
 	*n = *vma;
-	vma->vm_end = start;
 	n->vm_start = start;
 	n->vm_offset += n->vm_start - vma->vm_start;
 	n->vm_flags = newflags;
@@ -55,7 +58,10 @@
 		get_file(n->vm_file);
 	if (n->vm_ops && n->vm_ops->open)
 		n->vm_ops->open(n);
+	vmlist_modify_lock(vma->vm_mm);
+	vma->vm_end = start;
 	insert_vm_struct(current->mm, n);
+	vmlist_modify_unlock(vma->vm_mm);
 	return 0;
 }
 
@@ -75,10 +81,7 @@
 	*left = *vma;
 	*right = *vma;
 	left->vm_end = start;
-	vma->vm_start = start;
-	vma->vm_end = end;
 	right->vm_start = end;
-	vma->vm_offset += vma->vm_start - left->vm_start;
 	right->vm_offset += right->vm_start - left->vm_start;
 	vma->vm_flags = newflags;
 	if (vma->vm_file)
@@ -88,8 +91,14 @@
 		vma->vm_ops->open(left);
 		vma->vm_ops->open(right);
 	}
+	vmlist_modify_lock(vma->vm_mm);
+	vma->vm_offset += start - vma->vm_start;
+	vma->vm_start = start;
+	vma->vm_end = end;
+	vma->vm_flags = newflags;
 	insert_vm_struct(current->mm, left);
 	insert_vm_struct(current->mm, right);
+	vmlist_modify_unlock(vma->vm_mm);
 	return 0;
 }
 
@@ -168,7 +177,9 @@
 			break;
 		}
 	}
+	vmlist_modify_lock(current->mm);
 	merge_segments(current->mm, start, end);
+	vmlist_modify_unlock(current->mm);
 	return error;
 }
 
@@ -240,7 +251,9 @@
 		if (error)
 			break;
 	}
+	vmlist_modify_lock(current->mm);
 	merge_segments(current->mm, 0, TASK_SIZE);
+	vmlist_modify_unlock(current->mm);
 	return error;
 }
 
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://humbolt.geo.uu.nl/Linux-MM/

  reply	other threads:[~1999-10-13  1:25 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <Pine.LNX.4.10.9910101713010.364-100000@alpha.random>
1999-10-10 15:52 ` Manfred Spraul
1999-10-10 16:07   ` Alexander Viro
1999-10-10 16:25     ` Alexander Viro
1999-10-10 16:45       ` Manfred Spraul
1999-10-10 17:25         ` Alexander Viro
1999-10-10 17:12       ` Andrea Arcangeli
1999-10-10 17:48         ` Alexander Viro
1999-10-10 18:42           ` Manfred Spraul
1999-10-10 19:03             ` Alexander Viro
1999-10-10 21:31               ` Manfred Spraul
1999-10-10 21:53               ` Andrea Arcangeli
1999-10-10 22:34                 ` Alexander Viro
1999-10-10 23:28                   ` Andrea Arcangeli
1999-10-11 15:50               ` Stephen C. Tweedie
1999-10-11 16:05                 ` Alexander Viro
1999-10-11 18:02                   ` Manfred Spraul
1999-10-11 19:07                     ` Kanoj Sarcar
1999-10-11 22:23                       ` Stephen C. Tweedie
1999-10-13  1:25                         ` Kanoj Sarcar [this message]
1999-10-13  7:32                           ` Manfred Spraul
1999-10-15  9:58                             ` Ralf Baechle
1999-10-15 17:50                               ` Kanoj Sarcar
1999-10-13 10:45                           ` Stephen C. Tweedie
1999-10-11 20:15                     ` Stephen C. Tweedie
1999-10-11 21:14                       ` Manfred Spraul
1999-10-11 21:37                     ` Alexander Viro
1999-10-11 22:13                       ` Manfred Spraul
1999-10-11 22:22                     ` Stephen C. Tweedie
1999-10-11 23:01                       ` Alexander Viro
1999-10-12 14:06                         ` [more fun] " Alexander Viro
1999-10-13  7:35                           ` Manfred Spraul
1999-10-13 18:34                             ` Kanoj Sarcar
1999-10-13 10:16                         ` Stephen C. Tweedie
1999-10-11 20:13                   ` Stephen C. Tweedie
1999-10-11 21:40                     ` Alexander Viro
1999-10-11 22:20                       ` Stephen C. Tweedie
1999-10-11 22:31                         ` Alexander Viro
1999-10-13 10:25                           ` Stephen C. Tweedie
1999-10-11 15:47             ` Stephen C. Tweedie
1999-10-11 15:43         ` Stephen C. Tweedie
1999-10-10 16:56     ` Andrea Arcangeli
1999-10-11 15:41     ` Stephen C. Tweedie
1999-10-11 15:52       ` Alexander Viro
1999-10-09 12:48 Manfred Spraul
1999-10-09 13:12 ` Alexander Viro
1999-10-09 13:17   ` Manfred Spraul
1999-10-09 13:38     ` Alexander Viro
1999-10-09 16:01   ` Andrea Arcangeli
1999-10-10 13:05     ` Manfred Spraul
1999-10-11 15:09       ` Stephen C. Tweedie
1999-10-11 15:05     ` Stephen C. Tweedie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=199910130125.SAA66579@google.engr.sgi.com \
    --to=kanoj@google.engr.sgi.com \
    --cc=andrea@suse.de \
    --cc=linux-kernel@vger.rutgers.edu \
    --cc=linux-mm@kvack.org \
    --cc=manfreds@colorfullife.com \
    --cc=mingo@chiara.csoma.elte.hu \
    --cc=sct@redhat.com \
    --cc=viro@math.psu.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox