* [PATCH] fix split_vma vs. invalidate_mmap_range_list race
@ 2003-10-04 3:58 V. Rajesh
2003-10-04 5:29 ` David S. Miller
0 siblings, 1 reply; 5+ messages in thread
From: V. Rajesh @ 2003-10-04 3:58 UTC (permalink / raw)
To: akpm; +Cc: davem, hch, linux-mm
If a vma is already present in an i_mmap list of a mapping,
then it is racy to update the vm_start, vm_end, and vm_pgoff
members of the vma without holding the mapping's i_shared_sem.
This is because the updates can race with invalidate_mmap_range_list.
I audited all the places that assign vm_start, vm_end, and vm_pgoff.
AFAIK, the following is the list of questionable places:
1) This patch fixes the racy split_vma. Kernel 2.4 does the
right thing, but the following changesets introduced a race.
http://linux.bkbits.net:8080/linux-2.5/patch@1.536.34.4
http://linux.bkbits.net:8080/linux-2.5/patch@1.536.34.5
You can use the patch and programs in the following URL to
trigger the race.
http://www-personal.engin.umich.edu/~vrajesh/linux/truncate-race/
2) This patch also locks a small racy window in vma_merge.
3) In few cases vma_merge and do_mremap expand a vma by adding
extra length to vm_end without holding i_shared_sem. I think
that's fine.
4) In arch/sparc64, vm_end is updated without holding i_shared_sem.
Check make_hugetlb_page_present. I hope that is fine, but
I am not sure.
Please teach me if I am wrong.
Thanks,
Rajesh
mm/mmap.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
1 files changed, 47 insertions(+), 7 deletions(-)
diff -puN mm/mmap.c~truncate_race mm/mmap.c
--- dev-2.6/mm/mmap.c~truncate_race 2003-10-03 16:01:29.000000000 -0400
+++ dev-2.6-jaya/mm/mmap.c 2003-10-03 21:01:28.000000000 -0400
@@ -277,6 +277,26 @@ static void vma_link(struct mm_struct *m
validate_mm(mm);
}
+/* Insert vm structure into process list sorted by address
+ * and into the inode's i_mmap ring. The caller should
+ * hold mm->page_table_lock and ->f_mappping->i_shared_sem
+ * if vm_file is non-NULL.
+ */
+static void
+__insert_vm_struct(struct mm_struct * mm, struct vm_area_struct * vma)
+{
+ struct vm_area_struct * __vma, * prev;
+ struct rb_node ** rb_link, * rb_parent;
+
+ __vma = find_vma_prepare(mm,vma->vm_start,&prev,&rb_link,&rb_parent);
+ if (__vma && __vma->vm_start < vma->vm_end)
+ BUG();
+ __vma_link(mm, vma, prev, rb_link, rb_parent);
+ mark_mm_hugetlb(mm, vma);
+ mm->map_count++;
+ validate_mm(mm);
+}
+
/*
* If the vma has a ->close operation then the driver probably needs to release
* per-vma resources, so we don't attempt to merge those.
@@ -417,10 +437,14 @@ static int vma_merge(struct mm_struct *m
pgoff, (end - addr) >> PAGE_SHIFT))
return 0;
if (end == prev->vm_start) {
+ if (file)
+ down(&file->f_mapping->i_shared_sem);
spin_lock(lock);
prev->vm_start = addr;
prev->vm_pgoff -= (end - addr) >> PAGE_SHIFT;
spin_unlock(lock);
+ if (file)
+ up(&file->f_mapping->i_shared_sem);
return 1;
}
}
@@ -1134,6 +1158,7 @@ int split_vma(struct mm_struct * mm, str
unsigned long addr, int new_below)
{
struct vm_area_struct *new;
+ struct address_space *mapping = NULL;
if (mm->map_count >= MAX_MAP_COUNT)
return -ENOMEM;
@@ -1147,12 +1172,9 @@ int split_vma(struct mm_struct * mm, str
INIT_LIST_HEAD(&new->shared);
- if (new_below) {
+ if (new_below)
new->vm_end = addr;
- vma->vm_start = addr;
- vma->vm_pgoff += ((addr - new->vm_start) >> PAGE_SHIFT);
- } else {
- vma->vm_end = addr;
+ else {
new->vm_start = addr;
new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
}
@@ -1163,7 +1185,25 @@ int split_vma(struct mm_struct * mm, str
if (new->vm_ops && new->vm_ops->open)
new->vm_ops->open(new);
- insert_vm_struct(mm, new);
+ if (vma->vm_file)
+ mapping = vma->vm_file->f_mapping;
+
+ if (mapping)
+ down(&mapping->i_shared_sem);
+ spin_lock(&mm->page_table_lock);
+
+ if (new_below) {
+ vma->vm_start = addr;
+ vma->vm_pgoff += ((addr - new->vm_start) >> PAGE_SHIFT);
+ } else
+ vma->vm_end = addr;
+
+ __insert_vm_struct(mm, new);
+
+ spin_unlock(&mm->page_table_lock);
+ if (mapping)
+ up(&mapping->i_shared_sem);
+
return 0;
}
@@ -1408,7 +1448,7 @@ void exit_mmap(struct mm_struct *mm)
/* Insert vm structure into process list sorted by address
* and into the inode's i_mmap ring. If vm_file is non-NULL
- * then i_shared_sem is taken here.
+ * then i_shared_sem is taken here.
*/
void insert_vm_struct(struct mm_struct * mm, struct vm_area_struct * vma)
{
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix split_vma vs. invalidate_mmap_range_list race
2003-10-04 3:58 [PATCH] fix split_vma vs. invalidate_mmap_range_list race V. Rajesh
@ 2003-10-04 5:29 ` David S. Miller
2003-10-04 5:40 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: David S. Miller @ 2003-10-04 5:29 UTC (permalink / raw)
To: V. Rajesh; +Cc: akpm, hch, linux-mm
On Fri, 3 Oct 2003 23:58:04 -0400 (EDT)
"V. Rajesh" <vrajesh@eecs.umich.edu> wrote:
> 4) In arch/sparc64, vm_end is updated without holding i_shared_sem.
> Check make_hugetlb_page_present. I hope that is fine, but
> I am not sure.
You can ignore this case for now if you want, the sparc64 hugetlb code
is in a state of disrepair. I plan to nearly rewrite it over the
weekend so it's more uptodate and in line with the current x86 and
ia64 hugetlb support code.
I think you are right about these races, they exist and they are
real. Someone should just make sure you haven't added any deadlock
or semaphore taking with spinlocks held in higher level callers.
I don't think your patch does, but it's something to audit.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix split_vma vs. invalidate_mmap_range_list race
2003-10-04 5:29 ` David S. Miller
@ 2003-10-04 5:40 ` Andrew Morton
2003-10-04 6:31 ` Rajesh Venkatasubramanian
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2003-10-04 5:40 UTC (permalink / raw)
To: David S. Miller; +Cc: vrajesh, hch, linux-mm
"David S. Miller" <davem@redhat.com> wrote:
>
> I think you are right about these races, they exist and they are
> real. Someone should just make sure you haven't added any deadlock
> or semaphore taking with spinlocks held in higher level callers.
> I don't think your patch does, but it's something to audit.
It looks OK. I updated the VM lock ranking docco to cover this.
mm/filemap.c | 3 +++
mm/mmap.c | 16 +++++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)
diff -puN mm/mmap.c~vma-split-truncate-race-fix-tweaks mm/mmap.c
--- 25/mm/mmap.c~vma-split-truncate-race-fix-tweaks 2003-10-03 21:50:39.000000000 -0700
+++ 25-akpm/mm/mmap.c 2003-10-03 21:53:09.000000000 -0700
@@ -369,7 +369,8 @@ static int vma_merge(struct mm_struct *m
unsigned long end, unsigned long vm_flags,
struct file *file, unsigned long pgoff)
{
- spinlock_t * lock = &mm->page_table_lock;
+ spinlock_t *lock = &mm->page_table_lock;
+ struct semaphore *i_shared_sem;
/*
* We later require that vma->vm_flags == vm_flags, so this tests
@@ -378,6 +379,8 @@ static int vma_merge(struct mm_struct *m
if (vm_flags & VM_SPECIAL)
return 0;
+ i_shared_sem = file ? &file->f_mapping->i_shared_sem : NULL;
+
if (!prev) {
prev = rb_entry(rb_parent, struct vm_area_struct, vm_rb);
goto merge_next;
@@ -395,7 +398,7 @@ static int vma_merge(struct mm_struct *m
if (unlikely(file && prev->vm_next &&
prev->vm_next->vm_file == file)) {
- down(&file->f_mapping->i_shared_sem);
+ down(i_shared_sem);
need_up = 1;
}
spin_lock(lock);
@@ -413,7 +416,7 @@ static int vma_merge(struct mm_struct *m
__remove_shared_vm_struct(next, inode);
spin_unlock(lock);
if (need_up)
- up(&file->f_mapping->i_shared_sem);
+ up(i_shared_sem);
if (file)
fput(file);
@@ -423,7 +426,7 @@ static int vma_merge(struct mm_struct *m
}
spin_unlock(lock);
if (need_up)
- up(&file->f_mapping->i_shared_sem);
+ up(i_shared_sem);
return 1;
}
@@ -438,17 +441,16 @@ static int vma_merge(struct mm_struct *m
return 0;
if (end == prev->vm_start) {
if (file)
- down(&file->f_mapping->i_shared_sem);
+ down(i_shared_sem); /* invalidate_mmap_range */
spin_lock(lock);
prev->vm_start = addr;
prev->vm_pgoff -= (end - addr) >> PAGE_SHIFT;
spin_unlock(lock);
if (file)
- up(&file->f_mapping->i_shared_sem);
+ up(i_shared_sem);
return 1;
}
}
-
return 0;
}
diff -puN mm/filemap.c~vma-split-truncate-race-fix-tweaks mm/filemap.c
--- 25/mm/filemap.c~vma-split-truncate-race-fix-tweaks 2003-10-03 21:59:15.000000000 -0700
+++ 25-akpm/mm/filemap.c 2003-10-03 22:02:01.000000000 -0700
@@ -61,6 +61,9 @@
* ->swap_device_lock (exclusive_swap_page, others)
* ->mapping->page_lock
*
+ * ->i_sem
+ * ->i_shared_sem (truncate->invalidate_mmap_range)
+ *
* ->mmap_sem
* ->i_shared_sem (various places)
*
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix split_vma vs. invalidate_mmap_range_list race
2003-10-04 5:40 ` Andrew Morton
@ 2003-10-04 6:31 ` Rajesh Venkatasubramanian
2003-10-04 6:41 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: Rajesh Venkatasubramanian @ 2003-10-04 6:31 UTC (permalink / raw)
To: Andrew Morton; +Cc: David S. Miller, hch, linux-mm
>
> It looks OK. I updated the VM lock ranking docco to cover this.
> *
> + * ->i_sem
> + * ->i_shared_sem (truncate->invalidate_mmap_range)
> + *
I don't understand how my patch introduced this new(?) ordering.
My patch does not introduce any new semaphores or spin_locks in
any code path. Am I missing something ? Please give a clue if so.
Thanks,
Rajesh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix split_vma vs. invalidate_mmap_range_list race
2003-10-04 6:31 ` Rajesh Venkatasubramanian
@ 2003-10-04 6:41 ` Andrew Morton
0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2003-10-04 6:41 UTC (permalink / raw)
To: Rajesh Venkatasubramanian; +Cc: davem, hch, linux-mm
Rajesh Venkatasubramanian <vrajesh@eecs.umich.edu> wrote:
>
>
> >
> > It looks OK. I updated the VM lock ranking docco to cover this.
>
> > *
> > + * ->i_sem
> > + * ->i_shared_sem (truncate->invalidate_mmap_range)
> > + *
>
> I don't understand how my patch introduced this new(?) ordering.
It did not: I just added this as I was reviewing the code, because it was
missing.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2003-10-04 6:41 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-10-04 3:58 [PATCH] fix split_vma vs. invalidate_mmap_range_list race V. Rajesh
2003-10-04 5:29 ` David S. Miller
2003-10-04 5:40 ` Andrew Morton
2003-10-04 6:31 ` Rajesh Venkatasubramanian
2003-10-04 6:41 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox