* [PATCH 0/5] mm: further simplify VMA merge operation
@ 2025-01-27 15:50 Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 1/5] mm: simplify vma merge structure and expand comments Lorenzo Stoakes
` (4 more replies)
0 siblings, 5 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-mm, linux-kernel
While significant efforts have been made to improve the VMA merge
operation, there remain remnants of the bad (or rather, confusing) old
days, which make the code difficult to understand, more bug-prone and
thus harder to modify.
This series attempts to significantly improve matters in a number of
respects - with a focus on simplifying the commit_merge() function which
actually actions the merge operation - and importantly, adjusting the two
most confusing merge cases - those in which we 'adjust' the VMA immediately
adjacent to the one being merged.
One source of confusion is the VMAs being threaded through the operation
themselves - vmg->prev, vmg->vma and vmg->next.
At the start of the operation, vmg->vma is either NULL, if a new VMA is
proposed to be added, or otherwise a pointer to an existing VMA being
modified, while prev/next are the (possibly absent) VMAs sat immediately
before and after the range specified in vmg->start, end, respectively.
However, during the VMA merge operation, we change vmg->start, end and
pgoff to span the newly merged range and vmg->vma to either be:
a. The ultimately returned VMA (in most cases) or
b. A VMA which we will manipulate, but ultimately instead return vmg->next.
Case b. here is especially confusing for somebody reading this code, and
the fact that we update this state, along with vmg->start, end and pgoff,
only makes matters worse.
We simplify things by replacing vmg->vma with vmg->middle and never
changing it - this is always either NULL (for a new VMA) or the VMA being
modified between vmg->prev and vmg->next.
We further simplify by placing the merged VMA in a new vmg->target field -
whether or not case b. above applies. The reader of the code can now
simply rely on vmg->middle being the middle VMA and vmg->target being the
ultimately merged VMA.
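To make this concrete, the distinction can be sketched in plain C - the struct and helper below are simplified, hypothetical stand-ins for illustration only, not the kernel's actual vma_merge_struct or merge logic:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel structures, reduced to the
 * fields discussed above. */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

struct vmg {
	struct vma *prev;
	struct vma *middle;  /* NULL for a new VMA; never changed by the merge */
	struct vma *next;
	struct vma *target;  /* the VMA the merge ultimately operates on */
	unsigned long start;
	unsigned long end;
};

/* Hypothetical helper: select the target without ever touching middle. */
static void choose_target(struct vmg *vmg, int merge_left, int merge_right)
{
	if (merge_left) {
		vmg->target = vmg->prev;
		vmg->start = vmg->prev->vm_start;
		if (merge_right)
			vmg->end = vmg->next->vm_end;
	} else if (merge_right) {
		vmg->target = vmg->next;
		vmg->end = vmg->next->vm_end;
	} else {
		vmg->target = vmg->middle;
	}
}
```

The point is only the invariant: vmg->middle stays fixed throughout, while vmg->target records which VMA becomes the merged result.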
We additionally tackle the confusing cases where we 'adjust' VMAs other
than the one we ultimately return as the merged VMA (this includes case
b. above). These are:
(1)

      merge
  <----------->
  |------||--------|      |------------|---|
  | prev || middle |  ->  |   target   | m |
  |------||--------|      |------------|---|

In this case middle must be adjusted so that middle->vm_start is
increased, in addition to performing the merge.
(2) (equivalent to case b. above)

      <------------->
  |---------||------|      |---|-------------|
  | middle  || next |  ->  | m |   target    |
  |---------||------|      |---|-------------|

In this case next must be adjusted so that next->vm_start is
decreased, in addition to performing the merge.
These cases have previously been handled by calculating and passing around
a dubious and confusing 'adj_start' parameter alongside a pointer to an
'adjust' VMA indicating which VMA requires additional adjustment (middle in
case 1 and next in case 2).
With the VMG structure in place we are able to avoid this by simply setting
a merge flag to describe each case:
(1) Sets the __VMG_FLAG_ADJUST_MIDDLE_START flag
(2) Sets the __VMG_FLAG_ADJUST_NEXT_START flag
By doing so, it turns out we can vastly simplify the logic and calculate
what is required to perform the operation directly from these flags.
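As a rough illustration of the flag-based approach - the flag values and the helper below are hypothetical sketches, not taken from the series itself:

```c
#include <assert.h>

/* Hypothetical sketch: each confusing adjust case gets an explicit,
 * self-describing flag instead of an 'adjust' pointer + 'adj_start'
 * value. Flag bit positions are illustrative, not the kernel's. */
enum vmg_merge_flags {
	__VMG_FLAG_ADJUST_MIDDLE_START = 1 << 1, /* case (1) above */
	__VMG_FLAG_ADJUST_NEXT_START   = 1 << 2, /* case (2) above */
};

enum adjust_which { ADJUST_NONE, ADJUST_MIDDLE_START, ADJUST_NEXT_START };

/* Which adjacent VMA, if any, needs its vm_start moved? */
static enum adjust_which which_adjust(unsigned int merge_flags)
{
	if (merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START)
		return ADJUST_MIDDLE_START;
	if (merge_flags & __VMG_FLAG_ADJUST_NEXT_START)
		return ADJUST_NEXT_START;
	return ADJUST_NONE;
}
```

With state expressed this way, the commit step can derive everything it needs from the flags rather than from an externally computed offset.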
Taken together the refactorings make it far easier to understand what is
being done even in these more confusing cases, make the code far more
maintainable, debuggable, and testable, providing more internal state
indicating what is happening in the merge operation.
The changes have no functional net impact on the merge operation and
everything should still behave as it did before.
Lorenzo Stoakes (5):
mm: simplify vma merge structure and expand comments
mm: further refactor commit_merge()
mm: eliminate adj_start parameter from commit_merge()
mm: make vmg->target consistent and further simplify commit_merge()
mm: completely abstract unnecessary adj_start calculation
include/linux/huge_mm.h | 2 +-
mm/debug.c | 18 +-
mm/huge_memory.c | 19 +-
mm/mmap.c | 2 +-
mm/vma.c | 322 +++++++++++++++++--------------
mm/vma.h | 58 +++++-
tools/testing/vma/vma.c | 55 +++---
tools/testing/vma/vma_internal.h | 4 +-
8 files changed, 276 insertions(+), 204 deletions(-)
--
2.48.0
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 1/5] mm: simplify vma merge structure and expand comments
2025-01-27 15:50 [PATCH 0/5] mm: further simplify VMA merge operation Lorenzo Stoakes
@ 2025-01-27 15:50 ` Lorenzo Stoakes
2025-01-28 11:38 ` Vlastimil Babka
2025-01-27 15:50 ` [PATCH 2/5] mm: further refactor commit_merge() Lorenzo Stoakes
` (3 subsequent siblings)
4 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-mm, linux-kernel
The merge code, while much improved, still has a number of points of
confusion. As part of a broader series cleaning this up to make this more
maintainable, we start by addressing some confusion around vma_merge_struct
fields.
So far, the caller either provides no vmg->vma (a new VMA) or supplies the
existing VMA which is being altered, setting vmg->start,end,pgoff to the
proposed VMA dimensions.
vmg->vma is then updated, as are vmg->start,end,pgoff as the merge process
proceeds and the appropriate merge strategy is determined.
This is rather confusing, as vmg->vma starts off as the 'middle' VMA
between vmg->prev,next, but becomes the 'target' VMA, except in one
specific edge case (merge next, shrink middle).
In this patch we introduce vmg->middle to describe the VMA that sits
between vmg->prev and vmg->next, and which does NOT change during the
merge operation.
We replace vmg->vma with vmg->target, and use this only during the merge
operation itself.
Aside from the merge right, shrink middle case, this becomes the VMA that
forms the basis of the merged VMA that is returned. This edge case can be
addressed in a future commit.
We also add a number of comments to explain what is going on.
Finally, we adjust the ASCII diagrams showing each merge case in
vma_merge_existing_range() to be clearer - the arrow range previously
showed the vmg->start, end spanned area, but it is clearer to change this
to show the final merged VMA.
This patch has no change in functional behaviour.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/debug.c | 18 ++---
mm/mmap.c | 2 +-
mm/vma.c | 166 +++++++++++++++++++++-------------------
mm/vma.h | 42 ++++++++--
tools/testing/vma/vma.c | 52 ++++++-------
5 files changed, 159 insertions(+), 121 deletions(-)
diff --git a/mm/debug.c b/mm/debug.c
index 8d2acf432385..c9e07651677b 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -261,7 +261,7 @@ void dump_vmg(const struct vma_merge_struct *vmg, const char *reason)
pr_warn("vmg %px state: mm %px pgoff %lx\n"
"vmi %px [%lx,%lx)\n"
- "prev %px next %px vma %px\n"
+ "prev %px middle %px next %px target %px\n"
"start %lx end %lx flags %lx\n"
"file %px anon_vma %px policy %px\n"
"uffd_ctx %px\n"
@@ -270,7 +270,7 @@ void dump_vmg(const struct vma_merge_struct *vmg, const char *reason)
vmg, vmg->mm, vmg->pgoff,
vmg->vmi, vmg->vmi ? vma_iter_addr(vmg->vmi) : 0,
vmg->vmi ? vma_iter_end(vmg->vmi) : 0,
- vmg->prev, vmg->next, vmg->vma,
+ vmg->prev, vmg->middle, vmg->next, vmg->target,
vmg->start, vmg->end, vmg->flags,
vmg->file, vmg->anon_vma, vmg->policy,
#ifdef CONFIG_USERFAULTFD
@@ -288,13 +288,6 @@ void dump_vmg(const struct vma_merge_struct *vmg, const char *reason)
pr_warn("vmg %px mm: (NULL)\n", vmg);
}
- if (vmg->vma) {
- pr_warn("vmg %px vma:\n", vmg);
- dump_vma(vmg->vma);
- } else {
- pr_warn("vmg %px vma: (NULL)\n", vmg);
- }
-
if (vmg->prev) {
pr_warn("vmg %px prev:\n", vmg);
dump_vma(vmg->prev);
@@ -302,6 +295,13 @@ void dump_vmg(const struct vma_merge_struct *vmg, const char *reason)
pr_warn("vmg %px prev: (NULL)\n", vmg);
}
+ if (vmg->middle) {
+ pr_warn("vmg %px middle:\n", vmg);
+ dump_vma(vmg->middle);
+ } else {
+ pr_warn("vmg %px middle: (NULL)\n", vmg);
+ }
+
if (vmg->next) {
pr_warn("vmg %px next:\n", vmg);
dump_vma(vmg->next);
diff --git a/mm/mmap.c b/mm/mmap.c
index cda01071c7b1..6401a1d73f4a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1707,7 +1707,7 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
/*
* cover the whole range: [new_start, old_end)
*/
- vmg.vma = vma;
+ vmg.middle = vma;
if (vma_expand(&vmg))
return -ENOMEM;
diff --git a/mm/vma.c b/mm/vma.c
index af1d549b179c..68a301a76297 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -52,7 +52,7 @@ struct mmap_state {
.pgoff = (map_)->pgoff, \
.file = (map_)->file, \
.prev = (map_)->prev, \
- .vma = vma_, \
+ .middle = vma_, \
.next = (vma_) ? NULL : (map_)->next, \
.state = VMA_MERGE_START, \
.merge_flags = VMG_FLAG_DEFAULT, \
@@ -639,7 +639,7 @@ static int commit_merge(struct vma_merge_struct *vmg,
{
struct vma_prepare vp;
- init_multi_vma_prep(&vp, vmg->vma, adjust, remove, remove2);
+ init_multi_vma_prep(&vp, vmg->target, adjust, remove, remove2);
VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
vp.anon_vma != adjust->anon_vma);
@@ -652,15 +652,15 @@ static int commit_merge(struct vma_merge_struct *vmg,
adjust->vm_end);
}
- if (vma_iter_prealloc(vmg->vmi, vmg->vma))
+ if (vma_iter_prealloc(vmg->vmi, vmg->target))
return -ENOMEM;
vma_prepare(&vp);
- vma_adjust_trans_huge(vmg->vma, vmg->start, vmg->end, adj_start);
- vma_set_range(vmg->vma, vmg->start, vmg->end, vmg->pgoff);
+ vma_adjust_trans_huge(vmg->target, vmg->start, vmg->end, adj_start);
+ vma_set_range(vmg->target, vmg->start, vmg->end, vmg->pgoff);
if (expanded)
- vma_iter_store(vmg->vmi, vmg->vma);
+ vma_iter_store(vmg->vmi, vmg->target);
if (adj_start) {
adjust->vm_start += adj_start;
@@ -671,7 +671,7 @@ static int commit_merge(struct vma_merge_struct *vmg,
}
}
- vma_complete(&vp, vmg->vmi, vmg->vma->vm_mm);
+ vma_complete(&vp, vmg->vmi, vmg->target->vm_mm);
return 0;
}
@@ -694,8 +694,9 @@ static bool can_merge_remove_vma(struct vm_area_struct *vma)
* identical properties.
*
* This function checks for the existence of any such mergeable VMAs and updates
- * the maple tree describing the @vmg->vma->vm_mm address space to account for
- * this, as well as any VMAs shrunk/expanded/deleted as a result of this merge.
+ * the maple tree describing the @vmg->middle->vm_mm address space to account
+ * for this, as well as any VMAs shrunk/expanded/deleted as a result of this
+ * merge.
*
* As part of this operation, if a merge occurs, the @vmg object will have its
* vma, start, end, and pgoff fields modified to execute the merge. Subsequent
@@ -704,45 +705,47 @@ static bool can_merge_remove_vma(struct vm_area_struct *vma)
* Returns: The merged VMA if merge succeeds, or NULL otherwise.
*
* ASSUMPTIONS:
- * - The caller must assign the VMA to be modifed to @vmg->vma.
+ * - The caller must assign the VMA to be modifed to @vmg->middle.
* - The caller must have set @vmg->prev to the previous VMA, if there is one.
* - The caller must not set @vmg->next, as we determine this.
* - The caller must hold a WRITE lock on the mm_struct->mmap_lock.
- * - vmi must be positioned within [@vmg->vma->vm_start, @vmg->vma->vm_end).
+ * - vmi must be positioned within [@vmg->middle->vm_start, @vmg->middle->vm_end).
*/
static __must_check struct vm_area_struct *vma_merge_existing_range(
struct vma_merge_struct *vmg)
{
- struct vm_area_struct *vma = vmg->vma;
+ struct vm_area_struct *middle = vmg->middle;
struct vm_area_struct *prev = vmg->prev;
struct vm_area_struct *next, *res;
struct vm_area_struct *anon_dup = NULL;
struct vm_area_struct *adjust = NULL;
unsigned long start = vmg->start;
unsigned long end = vmg->end;
- bool left_side = vma && start == vma->vm_start;
- bool right_side = vma && end == vma->vm_end;
+ bool left_side = middle && start == middle->vm_start;
+ bool right_side = middle && end == middle->vm_end;
int err = 0;
long adj_start = 0;
- bool merge_will_delete_vma, merge_will_delete_next;
+ bool merge_will_delete_middle, merge_will_delete_next;
bool merge_left, merge_right, merge_both;
bool expanded;
mmap_assert_write_locked(vmg->mm);
- VM_WARN_ON_VMG(!vma, vmg); /* We are modifying a VMA, so caller must specify. */
+ VM_WARN_ON_VMG(!middle, vmg); /* We are modifying a VMA, so caller must specify. */
VM_WARN_ON_VMG(vmg->next, vmg); /* We set this. */
VM_WARN_ON_VMG(prev && start <= prev->vm_start, vmg);
VM_WARN_ON_VMG(start >= end, vmg);
/*
- * If vma == prev, then we are offset into a VMA. Otherwise, if we are
+ * If middle == prev, then we are offset into a VMA. Otherwise, if we are
* not, we must span a portion of the VMA.
*/
- VM_WARN_ON_VMG(vma && ((vma != prev && vmg->start != vma->vm_start) ||
- vmg->end > vma->vm_end), vmg);
- /* The vmi must be positioned within vmg->vma. */
- VM_WARN_ON_VMG(vma && !(vma_iter_addr(vmg->vmi) >= vma->vm_start &&
- vma_iter_addr(vmg->vmi) < vma->vm_end), vmg);
+ VM_WARN_ON_VMG(middle &&
+ ((middle != prev && vmg->start != middle->vm_start) ||
+ vmg->end > middle->vm_end), vmg);
+ /* The vmi must be positioned within vmg->middle. */
+ VM_WARN_ON_VMG(middle &&
+ !(vma_iter_addr(vmg->vmi) >= middle->vm_start &&
+ vma_iter_addr(vmg->vmi) < middle->vm_end), vmg);
vmg->state = VMA_MERGE_NOMERGE;
@@ -776,13 +779,13 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
merge_both = merge_left && merge_right;
/* If we span the entire VMA, a merge implies it will be deleted. */
- merge_will_delete_vma = left_side && right_side;
+ merge_will_delete_middle = left_side && right_side;
/*
- * If we need to remove vma in its entirety but are unable to do so,
+ * If we need to remove middle in its entirety but are unable to do so,
* we have no sensible recourse but to abort the merge.
*/
- if (merge_will_delete_vma && !can_merge_remove_vma(vma))
+ if (merge_will_delete_middle && !can_merge_remove_vma(middle))
return NULL;
/*
@@ -793,7 +796,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
/*
* If we cannot delete next, then we can reduce the operation to merging
- * prev and vma (thereby deleting vma).
+ * prev and middle (thereby deleting middle).
*/
if (merge_will_delete_next && !can_merge_remove_vma(next)) {
merge_will_delete_next = false;
@@ -801,8 +804,8 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
merge_both = false;
}
- /* No matter what happens, we will be adjusting vma. */
- vma_start_write(vma);
+ /* No matter what happens, we will be adjusting middle. */
+ vma_start_write(middle);
if (merge_left)
vma_start_write(prev);
@@ -812,13 +815,13 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
if (merge_both) {
/*
- * |<----->|
- * |-------*********-------|
- * prev vma next
- * extend delete delete
+ * |<-------------------->|
+ * |-------********-------|
+ * prev middle next
+ * extend delete delete
*/
- vmg->vma = prev;
+ vmg->target = prev;
vmg->start = prev->vm_start;
vmg->end = next->vm_end;
vmg->pgoff = prev->vm_pgoff;
@@ -826,78 +829,79 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
/*
* We already ensured anon_vma compatibility above, so now it's
* simply a case of, if prev has no anon_vma object, which of
- * next or vma contains the anon_vma we must duplicate.
+ * next or middle contains the anon_vma we must duplicate.
*/
- err = dup_anon_vma(prev, next->anon_vma ? next : vma, &anon_dup);
+ err = dup_anon_vma(prev, next->anon_vma ? next : middle,
+ &anon_dup);
} else if (merge_left) {
/*
- * |<----->| OR
- * |<--------->|
+ * |<------------>| OR
+ * |<----------------->|
* |-------*************
- * prev vma
+ * prev middle
* extend shrink/delete
*/
- vmg->vma = prev;
+ vmg->target = prev;
vmg->start = prev->vm_start;
vmg->pgoff = prev->vm_pgoff;
- if (!merge_will_delete_vma) {
- adjust = vma;
- adj_start = vmg->end - vma->vm_start;
+ if (!merge_will_delete_middle) {
+ adjust = middle;
+ adj_start = vmg->end - middle->vm_start;
}
- err = dup_anon_vma(prev, vma, &anon_dup);
+ err = dup_anon_vma(prev, middle, &anon_dup);
} else { /* merge_right */
/*
- * |<----->| OR
- * |<--------->|
+ * |<------------->| OR
+ * |<----------------->|
* *************-------|
- * vma next
+ * middle next
* shrink/delete extend
*/
pgoff_t pglen = PHYS_PFN(vmg->end - vmg->start);
VM_WARN_ON_VMG(!merge_right, vmg);
- /* If we are offset into a VMA, then prev must be vma. */
- VM_WARN_ON_VMG(vmg->start > vma->vm_start && prev && vma != prev, vmg);
+ /* If we are offset into a VMA, then prev must be middle. */
+ VM_WARN_ON_VMG(vmg->start > middle->vm_start && prev && middle != prev, vmg);
- if (merge_will_delete_vma) {
- vmg->vma = next;
+ if (merge_will_delete_middle) {
+ vmg->target = next;
vmg->end = next->vm_end;
vmg->pgoff = next->vm_pgoff - pglen;
} else {
/*
- * We shrink vma and expand next.
+ * We shrink middle and expand next.
*
* IMPORTANT: This is the ONLY case where the final
- * merged VMA is NOT vmg->vma, but rather vmg->next.
+ * merged VMA is NOT vmg->target, but rather vmg->next.
*/
-
- vmg->start = vma->vm_start;
+ vmg->target = middle;
+ vmg->start = middle->vm_start;
vmg->end = start;
- vmg->pgoff = vma->vm_pgoff;
+ vmg->pgoff = middle->vm_pgoff;
adjust = next;
- adj_start = -(vma->vm_end - start);
+ adj_start = -(middle->vm_end - start);
}
- err = dup_anon_vma(next, vma, &anon_dup);
+ err = dup_anon_vma(next, middle, &anon_dup);
}
if (err)
goto abort;
/*
- * In nearly all cases, we expand vmg->vma. There is one exception -
+ * In nearly all cases, we expand vmg->middle. There is one exception -
* merge_right where we partially span the VMA. In this case we shrink
- * the end of vmg->vma and adjust the start of vmg->next accordingly.
+ * the end of vmg->middle and adjust the start of vmg->next accordingly.
*/
- expanded = !merge_right || merge_will_delete_vma;
+ expanded = !merge_right || merge_will_delete_middle;
if (commit_merge(vmg, adjust,
- merge_will_delete_vma ? vma : NULL,
+ merge_will_delete_middle ? middle : NULL,
merge_will_delete_next ? next : NULL,
adj_start, expanded)) {
if (anon_dup)
@@ -973,7 +977,7 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
bool just_expand = vmg->merge_flags & VMG_FLAG_JUST_EXPAND;
mmap_assert_write_locked(vmg->mm);
- VM_WARN_ON_VMG(vmg->vma, vmg);
+ VM_WARN_ON_VMG(vmg->middle, vmg);
/* vmi must point at or before the gap. */
VM_WARN_ON_VMG(vma_iter_addr(vmg->vmi) > end, vmg);
@@ -989,13 +993,13 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
/* If we can merge with the next VMA, adjust vmg accordingly. */
if (can_merge_right) {
vmg->end = next->vm_end;
- vmg->vma = next;
+ vmg->middle = next;
}
/* If we can merge with the previous VMA, adjust vmg accordingly. */
if (can_merge_left) {
vmg->start = prev->vm_start;
- vmg->vma = prev;
+ vmg->middle = prev;
vmg->pgoff = prev->vm_pgoff;
/*
@@ -1017,10 +1021,10 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
* Now try to expand adjacent VMA(s). This takes care of removing the
* following VMA if we have VMAs on both sides.
*/
- if (vmg->vma && !vma_expand(vmg)) {
- khugepaged_enter_vma(vmg->vma, vmg->flags);
+ if (vmg->middle && !vma_expand(vmg)) {
+ khugepaged_enter_vma(vmg->middle, vmg->flags);
vmg->state = VMA_MERGE_SUCCESS;
- return vmg->vma;
+ return vmg->middle;
}
return NULL;
@@ -1032,44 +1036,46 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
* @vmg: Describes a VMA expansion operation.
*
* Expand @vma to vmg->start and vmg->end. Can expand off the start and end.
- * Will expand over vmg->next if it's different from vmg->vma and vmg->end ==
- * vmg->next->vm_end. Checking if the vmg->vma can expand and merge with
+ * Will expand over vmg->next if it's different from vmg->middle and vmg->end ==
+ * vmg->next->vm_end. Checking if the vmg->middle can expand and merge with
* vmg->next needs to be handled by the caller.
*
* Returns: 0 on success.
*
* ASSUMPTIONS:
- * - The caller must hold a WRITE lock on vmg->vma->mm->mmap_lock.
- * - The caller must have set @vmg->vma and @vmg->next.
+ * - The caller must hold a WRITE lock on vmg->middle->mm->mmap_lock.
+ * - The caller must have set @vmg->middle and @vmg->next.
*/
int vma_expand(struct vma_merge_struct *vmg)
{
struct vm_area_struct *anon_dup = NULL;
bool remove_next = false;
- struct vm_area_struct *vma = vmg->vma;
+ struct vm_area_struct *middle = vmg->middle;
struct vm_area_struct *next = vmg->next;
mmap_assert_write_locked(vmg->mm);
- vma_start_write(vma);
- if (next && (vma != next) && (vmg->end == next->vm_end)) {
+ vma_start_write(middle);
+ if (next && (middle != next) && (vmg->end == next->vm_end)) {
int ret;
remove_next = true;
/* This should already have been checked by this point. */
VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
vma_start_write(next);
- ret = dup_anon_vma(vma, next, &anon_dup);
+ ret = dup_anon_vma(middle, next, &anon_dup);
if (ret)
return ret;
}
/* Not merging but overwriting any part of next is not handled. */
VM_WARN_ON_VMG(next && !remove_next &&
- next != vma && vmg->end > next->vm_start, vmg);
+ next != middle && vmg->end > next->vm_start, vmg);
/* Only handles expanding */
- VM_WARN_ON_VMG(vma->vm_start < vmg->start || vma->vm_end > vmg->end, vmg);
+ VM_WARN_ON_VMG(middle->vm_start < vmg->start ||
+ middle->vm_end > vmg->end, vmg);
+ vmg->target = middle;
if (commit_merge(vmg, NULL, remove_next ? next : NULL, NULL, 0, true))
goto nomem;
@@ -1508,7 +1514,7 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
*/
static struct vm_area_struct *vma_modify(struct vma_merge_struct *vmg)
{
- struct vm_area_struct *vma = vmg->vma;
+ struct vm_area_struct *vma = vmg->middle;
struct vm_area_struct *merged;
/* First, try to merge. */
@@ -1605,7 +1611,7 @@ struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi,
VMG_VMA_STATE(vmg, vmi, vma, vma, vma->vm_end, vma->vm_end + delta);
vmg.next = vma_iter_next_rewind(vmi, NULL);
- vmg.vma = NULL; /* We use the VMA to populate VMG fields only. */
+ vmg.middle = NULL; /* We use the VMA to populate VMG fields only. */
return vma_merge_new_range(&vmg);
}
@@ -1726,7 +1732,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
if (new_vma && new_vma->vm_start < addr + len)
return NULL; /* should never get here */
- vmg.vma = NULL; /* New VMA range. */
+ vmg.middle = NULL; /* New VMA range. */
vmg.pgoff = pgoff;
vmg.next = vma_iter_next_rewind(&vmi, NULL);
new_vma = vma_merge_new_range(&vmg);
diff --git a/mm/vma.h b/mm/vma.h
index a2e8710b8c47..5b5dd07e478c 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -69,16 +69,48 @@ enum vma_merge_flags {
VMG_FLAG_JUST_EXPAND = 1 << 0,
};
-/* Represents a VMA merge operation. */
+/*
+ * Describes a VMA merge operation and is threaded throughout it.
+ *
+ * Any of the fields may be mutated by the merge operation, so no guarantees are
+ * made to the contents of this structure after a merge operation has completed.
+ */
struct vma_merge_struct {
struct mm_struct *mm;
struct vma_iterator *vmi;
- pgoff_t pgoff;
+ /*
+ * Adjacent VMAs, any of which may be NULL if not present:
+ *
+ * |------|--------|------|
+ * | prev | middle | next |
+ * |------|--------|------|
+ *
+ * middle may not yet exist in the case of a proposed new VMA being
+ * merged, or it may be an existing VMA.
+ *
+ * next may be assigned by the caller.
+ */
struct vm_area_struct *prev;
- struct vm_area_struct *next; /* Modified by vma_merge(). */
- struct vm_area_struct *vma; /* Either a new VMA or the one being modified. */
+ struct vm_area_struct *middle;
+ struct vm_area_struct *next;
+ /*
+ * This is the VMA we ultimately target to become the merged VMA, except
+ * for the one exception of merge right, shrink next (for details of
+ * this scenario see vma_merge_existing_range()).
+ */
+ struct vm_area_struct *target;
+ /*
+ * Initially, the start, end, pgoff fields are provided by the caller
+ * and describe the proposed new VMA range, whether modifying an
+ * existing VMA (which will be 'middle'), or adding a new one.
+ *
+ * During the merge process these fields are updated to describe the new
+ * range _including those VMAs which will be merged_.
+ */
unsigned long start;
unsigned long end;
+ pgoff_t pgoff;
+
unsigned long flags;
struct file *file;
struct anon_vma *anon_vma;
@@ -118,8 +150,8 @@ static inline pgoff_t vma_pgoff_offset(struct vm_area_struct *vma,
.mm = vma_->vm_mm, \
.vmi = vmi_, \
.prev = prev_, \
+ .middle = vma_, \
.next = NULL, \
- .vma = vma_, \
.start = start_, \
.end = end_, \
.flags = vma_->vm_flags, \
diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
index 04ab45e27fb8..3c0572120e94 100644
--- a/tools/testing/vma/vma.c
+++ b/tools/testing/vma/vma.c
@@ -147,8 +147,8 @@ static void vmg_set_range(struct vma_merge_struct *vmg, unsigned long start,
vma_iter_set(vmg->vmi, start);
vmg->prev = NULL;
+ vmg->middle = NULL;
vmg->next = NULL;
- vmg->vma = NULL;
vmg->start = start;
vmg->end = end;
@@ -338,7 +338,7 @@ static bool test_simple_expand(void)
VMA_ITERATOR(vmi, &mm, 0);
struct vma_merge_struct vmg = {
.vmi = &vmi,
- .vma = vma,
+ .middle = vma,
.start = 0,
.end = 0x3000,
.pgoff = 0,
@@ -631,7 +631,7 @@ static bool test_vma_merge_special_flags(void)
*/
vma = alloc_and_link_vma(&mm, 0x3000, 0x4000, 3, flags);
ASSERT_NE(vma, NULL);
- vmg.vma = vma;
+ vmg.middle = vma;
for (i = 0; i < ARRAY_SIZE(special_flags); i++) {
vm_flags_t special_flag = special_flags[i];
@@ -760,7 +760,7 @@ static bool test_vma_merge_with_close(void)
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
/*
* The VMA being modified in a way that would otherwise merge should
@@ -787,7 +787,7 @@ static bool test_vma_merge_with_close(void)
vma->vm_ops = &vm_ops;
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
/*
* Initially this is misapprehended as an out of memory report, as the
@@ -817,7 +817,7 @@ static bool test_vma_merge_with_close(void)
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
@@ -843,7 +843,7 @@ static bool test_vma_merge_with_close(void)
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -940,7 +940,7 @@ static bool test_merge_existing(void)
vma_next = alloc_and_link_vma(&mm, 0x6000, 0x9000, 6, flags);
vma_next->vm_ops = &vm_ops; /* This should have no impact. */
vmg_set_range(&vmg, 0x3000, 0x6000, 3, flags);
- vmg.vma = vma;
+ vmg.middle = vma;
vmg.prev = vma;
vma->anon_vma = &dummy_anon_vma;
ASSERT_EQ(merge_existing(&vmg), vma_next);
@@ -973,7 +973,7 @@ static bool test_merge_existing(void)
vma_next = alloc_and_link_vma(&mm, 0x6000, 0x9000, 6, flags);
vma_next->vm_ops = &vm_ops; /* This should have no impact. */
vmg_set_range(&vmg, 0x2000, 0x6000, 2, flags);
- vmg.vma = vma;
+ vmg.middle = vma;
vma->anon_vma = &dummy_anon_vma;
ASSERT_EQ(merge_existing(&vmg), vma_next);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1003,7 +1003,7 @@ static bool test_merge_existing(void)
vma->vm_ops = &vm_ops; /* This should have no impact. */
vmg_set_range(&vmg, 0x3000, 0x6000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
vma->anon_vma = &dummy_anon_vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
@@ -1037,7 +1037,7 @@ static bool test_merge_existing(void)
vma = alloc_and_link_vma(&mm, 0x3000, 0x7000, 3, flags);
vmg_set_range(&vmg, 0x3000, 0x7000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
vma->anon_vma = &dummy_anon_vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1067,7 +1067,7 @@ static bool test_merge_existing(void)
vma_next = alloc_and_link_vma(&mm, 0x7000, 0x9000, 7, flags);
vmg_set_range(&vmg, 0x3000, 0x7000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
vma->anon_vma = &dummy_anon_vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1102,37 +1102,37 @@ static bool test_merge_existing(void)
vmg_set_range(&vmg, 0x4000, 0x5000, 4, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
vmg_set_range(&vmg, 0x5000, 0x6000, 5, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
vmg_set_range(&vmg, 0x6000, 0x7000, 6, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
vmg_set_range(&vmg, 0x4000, 0x7000, 4, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
vmg_set_range(&vmg, 0x4000, 0x6000, 4, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
vmg_set_range(&vmg, 0x5000, 0x6000, 5, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), NULL);
ASSERT_EQ(vmg.state, VMA_MERGE_NOMERGE);
@@ -1197,7 +1197,7 @@ static bool test_anon_vma_non_mergeable(void)
vmg_set_range(&vmg, 0x3000, 0x7000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1277,7 +1277,7 @@ static bool test_dup_anon_vma(void)
vma_next->anon_vma = &dummy_anon_vma;
vmg_set_range(&vmg, 0, 0x5000, 0, flags);
- vmg.vma = vma_prev;
+ vmg.middle = vma_prev;
vmg.next = vma_next;
ASSERT_EQ(expand_existing(&vmg), 0);
@@ -1309,7 +1309,7 @@ static bool test_dup_anon_vma(void)
vma_next->anon_vma = &dummy_anon_vma;
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1338,7 +1338,7 @@ static bool test_dup_anon_vma(void)
vma->anon_vma = &dummy_anon_vma;
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1366,7 +1366,7 @@ static bool test_dup_anon_vma(void)
vma->anon_vma = &dummy_anon_vma;
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), vma_prev);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1394,7 +1394,7 @@ static bool test_dup_anon_vma(void)
vma->anon_vma = &dummy_anon_vma;
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma;
- vmg.vma = vma;
+ vmg.middle = vma;
ASSERT_EQ(merge_existing(&vmg), vma_next);
ASSERT_EQ(vmg.state, VMA_MERGE_SUCCESS);
@@ -1432,7 +1432,7 @@ static bool test_vmi_prealloc_fail(void)
vmg_set_range(&vmg, 0x3000, 0x5000, 3, flags);
vmg.prev = vma_prev;
- vmg.vma = vma;
+ vmg.middle = vma;
fail_prealloc = true;
@@ -1458,7 +1458,7 @@ static bool test_vmi_prealloc_fail(void)
vma->anon_vma = &dummy_anon_vma;
vmg_set_range(&vmg, 0, 0x5000, 3, flags);
- vmg.vma = vma_prev;
+ vmg.middle = vma_prev;
vmg.next = vma;
fail_prealloc = true;
--
2.48.0
* [PATCH 2/5] mm: further refactor commit_merge()
2025-01-27 15:50 [PATCH 0/5] mm: further simplify VMA merge operation Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 1/5] mm: simplify vma merge structure and expand comments Lorenzo Stoakes
@ 2025-01-27 15:50 ` Lorenzo Stoakes
2025-01-28 15:05 ` Vlastimil Babka
2025-01-28 15:45 ` Vlastimil Babka
2025-01-27 15:50 ` [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge() Lorenzo Stoakes
` (2 subsequent siblings)
4 siblings, 2 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-mm, linux-kernel
The current VMA merge code contains a number of confusing mechanisms
around removal of VMAs on merge and the shrinking of the VMA adjacent to
vmg->target in the case of merges which result in a partial merge with that
adjacent VMA.
Since we now have a STABLE set of VMAs - prev, middle, next - we are
able to have the caller of commit_merge() explicitly tell us which VMAs
need deleting, using newly introduced internal VMA merge flags.
Doing so allows us to embed this state within the VMG and remove the
confusing remove, remove2 parameters from commit_merge().
We additionally are able to eliminate the highly confusing and misleading
'expanded' parameter - a parameter that in reality refers to whether or not
the returned VMA is the target one or the one immediately adjacent.
We can infer which is the case from whether or not the adj_start parameter
is negative. This also allows us to simplify further logic around iterator
configuration and VMA iterator stores.
Doing so means we can also eliminate the adjust parameter, as we are able
to infer which VMA ought to be adjusted from adj_start - a positive value
implies we adjust the start of 'middle', a negative one implies we adjust
the start of 'next'.
We are then able to have commit_merge() explicitly return the merged VMA,
or NULL if it is unable to pre-allocate memory. Errors were previously
filtered so behaviour does not change.
This patch has no change in functional behaviour.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/vma.c | 83 +++++++++++++++++++++++------------------
mm/vma.h | 10 +++++
tools/testing/vma/vma.c | 3 ++
3 files changed, 60 insertions(+), 36 deletions(-)
diff --git a/mm/vma.c b/mm/vma.c
index 68a301a76297..955c5ebd5739 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -120,8 +120,8 @@ static void init_multi_vma_prep(struct vma_prepare *vp,
memset(vp, 0, sizeof(struct vma_prepare));
vp->vma = vma;
vp->anon_vma = vma->anon_vma;
- vp->remove = remove;
- vp->remove2 = remove2;
+ vp->remove = remove ? remove : remove2;
+ vp->remove2 = remove ? remove2 : NULL;
vp->adj_next = next;
if (!vp->anon_vma && next)
vp->anon_vma = next->anon_vma;
@@ -129,7 +129,6 @@ static void init_multi_vma_prep(struct vma_prepare *vp,
vp->file = vma->vm_file;
if (vp->file)
vp->mapping = vma->vm_file->f_mapping;
-
}
/*
@@ -629,22 +628,40 @@ void validate_mm(struct mm_struct *mm)
}
#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
-/* Actually perform the VMA merge operation. */
-static int commit_merge(struct vma_merge_struct *vmg,
- struct vm_area_struct *adjust,
- struct vm_area_struct *remove,
- struct vm_area_struct *remove2,
- long adj_start,
- bool expanded)
+/*
+ * Actually perform the VMA merge operation.
+ *
+ * On success, returns the merged VMA. Otherwise returns NULL.
+ */
+static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg,
+ long adj_start)
{
struct vma_prepare vp;
+ struct vm_area_struct *remove = NULL;
+ struct vm_area_struct *remove2 = NULL;
+ struct vm_area_struct *adjust = NULL;
+ /*
+ * In all cases but that of merge right, shrink next, we write
+ * vmg->target to the maple tree and return this as the merged VMA.
+ */
+ bool merge_target = adj_start >= 0;
+
+ if (vmg->merge_flags & __VMG_FLAG_REMOVE_MIDDLE)
+ remove = vmg->middle;
+ if (vmg->merge_flags & __VMG_FLAG_REMOVE_NEXT)
+ remove2 = vmg->next;
+
+ if (adj_start > 0)
+ adjust = vmg->middle;
+ else if (adj_start < 0)
+ adjust = vmg->next;
init_multi_vma_prep(&vp, vmg->target, adjust, remove, remove2);
VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
vp.anon_vma != adjust->anon_vma);
- if (expanded) {
+ if (merge_target) {
/* Note: vma iterator must be pointing to 'start'. */
vma_iter_config(vmg->vmi, vmg->start, vmg->end);
} else {
@@ -653,27 +670,26 @@ static int commit_merge(struct vma_merge_struct *vmg,
}
if (vma_iter_prealloc(vmg->vmi, vmg->target))
- return -ENOMEM;
+ return NULL;
vma_prepare(&vp);
vma_adjust_trans_huge(vmg->target, vmg->start, vmg->end, adj_start);
vma_set_range(vmg->target, vmg->start, vmg->end, vmg->pgoff);
- if (expanded)
+ if (merge_target)
vma_iter_store(vmg->vmi, vmg->target);
if (adj_start) {
adjust->vm_start += adj_start;
adjust->vm_pgoff += PHYS_PFN(adj_start);
- if (adj_start < 0) {
- WARN_ON(expanded);
+
+ if (!merge_target)
vma_iter_store(vmg->vmi, adjust);
- }
}
vma_complete(&vp, vmg->vmi, vmg->target->vm_mm);
- return 0;
+ return merge_target ? vmg->target : vmg->next;
}
/* We can only remove VMAs when merging if they do not have a close hook. */
@@ -718,7 +734,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
struct vm_area_struct *prev = vmg->prev;
struct vm_area_struct *next, *res;
struct vm_area_struct *anon_dup = NULL;
- struct vm_area_struct *adjust = NULL;
unsigned long start = vmg->start;
unsigned long end = vmg->end;
bool left_side = middle && start == middle->vm_start;
@@ -727,7 +742,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
long adj_start = 0;
bool merge_will_delete_middle, merge_will_delete_next;
bool merge_left, merge_right, merge_both;
- bool expanded;
mmap_assert_write_locked(vmg->mm);
VM_WARN_ON_VMG(!middle, vmg); /* We are modifying a VMA, so caller must specify. */
@@ -846,10 +860,11 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
vmg->start = prev->vm_start;
vmg->pgoff = prev->vm_pgoff;
- if (!merge_will_delete_middle) {
- adjust = middle;
+ /*
+ * We both expand prev and shrink middle.
+ */
+ if (!merge_will_delete_middle)
adj_start = vmg->end - middle->vm_start;
- }
err = dup_anon_vma(prev, middle, &anon_dup);
} else { /* merge_right */
@@ -883,7 +898,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
vmg->end = start;
vmg->pgoff = middle->vm_pgoff;
- adjust = next;
adj_start = -(middle->vm_end - start);
}
@@ -893,17 +907,13 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
if (err)
goto abort;
- /*
- * In nearly all cases, we expand vmg->middle. There is one exception -
- * merge_right where we partially span the VMA. In this case we shrink
- * the end of vmg->middle and adjust the start of vmg->next accordingly.
- */
- expanded = !merge_right || merge_will_delete_middle;
+ if (merge_will_delete_middle)
+ vmg->merge_flags |= __VMG_FLAG_REMOVE_MIDDLE;
+ if (merge_will_delete_next)
+ vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
- if (commit_merge(vmg, adjust,
- merge_will_delete_middle ? middle : NULL,
- merge_will_delete_next ? next : NULL,
- adj_start, expanded)) {
+ res = commit_merge(vmg, adj_start);
+ if (!res) {
if (anon_dup)
unlink_anon_vmas(anon_dup);
@@ -911,9 +921,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
return NULL;
}
- res = merge_left ? prev : next;
khugepaged_enter_vma(res, vmg->flags);
-
vmg->state = VMA_MERGE_SUCCESS;
return res;
@@ -1076,7 +1084,10 @@ int vma_expand(struct vma_merge_struct *vmg)
middle->vm_end > vmg->end, vmg);
vmg->target = middle;
- if (commit_merge(vmg, NULL, remove_next ? next : NULL, NULL, 0, true))
+ if (remove_next)
+ vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
+
+ if (!commit_merge(vmg, 0))
goto nomem;
return 0;
diff --git a/mm/vma.h b/mm/vma.h
index 5b5dd07e478c..ffbfefb9a83d 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -67,6 +67,16 @@ enum vma_merge_flags {
* at the gap.
*/
VMG_FLAG_JUST_EXPAND = 1 << 0,
+ /*
+ * Internal flag used during the merge operation to indicate we will
+ * remove vmg->middle.
+ */
+ __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
+ /*
+ * Internal flag used during the merge operation to indicate we will
+ * remove vmg->next.
+ */
+ __VMG_FLAG_REMOVE_NEXT = 1 << 2,
};
/*
diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
index 3c0572120e94..8cce67237d86 100644
--- a/tools/testing/vma/vma.c
+++ b/tools/testing/vma/vma.c
@@ -154,6 +154,9 @@ static void vmg_set_range(struct vma_merge_struct *vmg, unsigned long start,
vmg->end = end;
vmg->pgoff = pgoff;
vmg->flags = flags;
+
+ vmg->merge_flags = VMG_FLAG_DEFAULT;
+ vmg->target = NULL;
}
/*
--
2.48.0
* [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge()
2025-01-27 15:50 [PATCH 0/5] mm: further simplify VMA merge operation Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 1/5] mm: simplify vma merge structure and expand comments Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 2/5] mm: further refactor commit_merge() Lorenzo Stoakes
@ 2025-01-27 15:50 ` Lorenzo Stoakes
2025-01-29 14:13 ` Vlastimil Babka
2025-01-27 15:50 ` [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge() Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
4 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-mm, linux-kernel
Introduce internal __VMG_FLAG_ADJUST_MIDDLE_START and
__VMG_FLAG_ADJUST_NEXT_START merge flags, enabling us to indicate to
commit_merge() that we are performing a merge which either spans only part
of vmg->middle, or part of vmg->next respectively.
In the former instance, we change the start of vmg->middle to match the
attributes of vmg->prev, without spanning all of vmg->middle.
This implies that vmg->prev->vm_end and vmg->middle->vm_start are both
increased to form the new merged VMA (vmg->prev) and the new subsequent VMA
(vmg->middle).
In the latter case, we change the end of vmg->middle to match the
attributes of vmg->next, without spanning all of vmg->next.
This implies that vmg->middle->vm_end and vmg->next->vm_start are both
decreased to form the new merged VMA (vmg->next) and the new prior VMA
(vmg->middle).
Since we now have a stable set of prev, middle, next VMAs threaded through
vmg, and with these flags set we know what is happening, we can perform
the calculation in commit_merge() instead.
This allows us to drop the confusing adj_start parameter and instead pass
semantic information to commit_merge().
In the latter case the -(middle->vm_end - start) calculation becomes
-(middle->vm_end - vmg->end), however this is correct as vmg->end is set to
the start parameter.
This is because in this case (rather confusingly), we manipulate
vmg->middle, but ultimately return vmg->next, whose range will be correctly
specified. At this point vmg->start, end is the new range for the prior VMA
rather than the merged one.
This patch has no change in functional behaviour.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/vma.c | 53 ++++++++++++++++++++++++++++++++---------------------
mm/vma.h | 14 ++++++++++++--
2 files changed, 44 insertions(+), 23 deletions(-)
diff --git a/mm/vma.c b/mm/vma.c
index 955c5ebd5739..e78d65de734b 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -633,29 +633,45 @@ void validate_mm(struct mm_struct *mm)
*
* On success, returns the merged VMA. Otherwise returns NULL.
*/
-static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg,
- long adj_start)
+static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
{
- struct vma_prepare vp;
struct vm_area_struct *remove = NULL;
struct vm_area_struct *remove2 = NULL;
+ unsigned long flags = vmg->merge_flags;
+ struct vma_prepare vp;
struct vm_area_struct *adjust = NULL;
+ long adj_start;
+ bool merge_target;
+
/*
- * In all cases but that of merge right, shrink next, we write
- * vmg->target to the maple tree and return this as the merged VMA.
+ * If modifying an existing VMA and we don't remove vmg->middle, then we
+ * shrink the adjacent VMA.
*/
- bool merge_target = adj_start >= 0;
+ if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
+ adjust = vmg->middle;
+ /* The POSITIVE value by which we offset vmg->middle->vm_start. */
+ adj_start = vmg->end - vmg->middle->vm_start;
+ merge_target = true;
+ } else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
+ adjust = vmg->next;
+ /* The NEGATIVE value by which we offset vmg->next->vm_start. */
+ adj_start = -(vmg->middle->vm_end - vmg->end);
+ /*
+ * In all cases but this - merge right, shrink next - we write
+ * vmg->target to the maple tree and return this as the merged VMA.
+ */
+ merge_target = false;
+ } else {
+ adjust = NULL;
+ adj_start = 0;
+ merge_target = true;
+ }
- if (vmg->merge_flags & __VMG_FLAG_REMOVE_MIDDLE)
+ if (flags & __VMG_FLAG_REMOVE_MIDDLE)
remove = vmg->middle;
if (vmg->merge_flags & __VMG_FLAG_REMOVE_NEXT)
remove2 = vmg->next;
- if (adj_start > 0)
- adjust = vmg->middle;
- else if (adj_start < 0)
- adjust = vmg->next;
-
init_multi_vma_prep(&vp, vmg->target, adjust, remove, remove2);
VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
@@ -739,7 +755,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
bool left_side = middle && start == middle->vm_start;
bool right_side = middle && end == middle->vm_end;
int err = 0;
- long adj_start = 0;
bool merge_will_delete_middle, merge_will_delete_next;
bool merge_left, merge_right, merge_both;
@@ -860,11 +875,8 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
vmg->start = prev->vm_start;
vmg->pgoff = prev->vm_pgoff;
- /*
- * We both expand prev and shrink middle.
- */
if (!merge_will_delete_middle)
- adj_start = vmg->end - middle->vm_start;
+ vmg->merge_flags |= __VMG_FLAG_ADJUST_MIDDLE_START;
err = dup_anon_vma(prev, middle, &anon_dup);
} else { /* merge_right */
@@ -893,12 +905,11 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
* IMPORTANT: This is the ONLY case where the final
* merged VMA is NOT vmg->target, but rather vmg->next.
*/
+ vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
vmg->target = middle;
vmg->start = middle->vm_start;
vmg->end = start;
vmg->pgoff = middle->vm_pgoff;
-
- adj_start = -(middle->vm_end - start);
}
err = dup_anon_vma(next, middle, &anon_dup);
@@ -912,7 +923,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
if (merge_will_delete_next)
vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
- res = commit_merge(vmg, adj_start);
+ res = commit_merge(vmg);
if (!res) {
if (anon_dup)
unlink_anon_vmas(anon_dup);
@@ -1087,7 +1098,7 @@ int vma_expand(struct vma_merge_struct *vmg)
if (remove_next)
vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
- if (!commit_merge(vmg, 0))
+ if (!commit_merge(vmg))
goto nomem;
return 0;
diff --git a/mm/vma.h b/mm/vma.h
index ffbfefb9a83d..ddf567359880 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -67,16 +67,26 @@ enum vma_merge_flags {
* at the gap.
*/
VMG_FLAG_JUST_EXPAND = 1 << 0,
+ /*
+ * Internal flag indicating the merge increases vmg->middle->vm_start
+ * (and thereby, vmg->prev->vm_end).
+ */
+ __VMG_FLAG_ADJUST_MIDDLE_START = 1 << 1,
+ /*
+ * Internal flag indicating the merge decreases vmg->next->vm_start
+ * (and thereby, vmg->middle->vm_end).
+ */
+ __VMG_FLAG_ADJUST_NEXT_START = 1 << 2,
/*
* Internal flag used during the merge operation to indicate we will
* remove vmg->middle.
*/
- __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
+ __VMG_FLAG_REMOVE_MIDDLE = 1 << 3,
/*
* Internal flag used during the merge operation to indicate we will
* remove vmg->next.
*/
- __VMG_FLAG_REMOVE_NEXT = 1 << 2,
+ __VMG_FLAG_REMOVE_NEXT = 1 << 4,
};
/*
--
2.48.0
* [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge()
2025-01-27 15:50 [PATCH 0/5] mm: further simplify VMA merge operation Lorenzo Stoakes
` (2 preceding siblings ...)
2025-01-27 15:50 ` [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge() Lorenzo Stoakes
@ 2025-01-27 15:50 ` Lorenzo Stoakes
2025-01-29 14:45 ` Vlastimil Babka
2025-01-29 15:19 ` Vlastimil Babka
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
4 siblings, 2 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-mm, linux-kernel
It is confusing for vmg->target to sometimes be the target merged VMA and
in one case not.
Fix this by having commit_merge() use its awareness of the
__VMG_FLAG_ADJUST_NEXT_START case to know that it is manipulating a
separate vma, abstracted in the 'vma' local variable.
Place removal and adjust VMA determination logic into
init_multi_vma_prep(), as the flags give us enough information to do so,
and since this is the function that sets up the vma_prepare struct it makes
sense to do so here.
Doing this significantly simplifies commit_merge(), allowing us to
eliminate the 'merge_target' handling, initialise the VMA iterator in a
more sensible place and simply return vmg->target consistently.
This also allows us to simplify setting vmg->target in
vma_merge_existing_range() since we are then left only with two cases -
merge left (or both) where the target is vmg->prev or merge right in which
the target is vmg->next.
This makes it easy for somebody reading the code to know what VMA will
actually be the one returned and merged into and removes a great deal of
the confusing 'adjust' nonsense.
This patch has no change in functional behaviour.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/vma.c | 122 ++++++++++++++++++++++++++++---------------------------
mm/vma.h | 6 +--
2 files changed, 64 insertions(+), 64 deletions(-)
diff --git a/mm/vma.c b/mm/vma.c
index e78d65de734b..cfcadca7b116 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -107,24 +107,41 @@ static inline bool are_anon_vmas_compatible(struct vm_area_struct *vma1,
* init_multi_vma_prep() - Initializer for struct vma_prepare
* @vp: The vma_prepare struct
* @vma: The vma that will be altered once locked
- * @next: The next vma if it is to be adjusted
- * @remove: The first vma to be removed
- * @remove2: The second vma to be removed
+ * @vmg: The merge state that will be used to determine adjustment and VMA
+ * removal.
*/
static void init_multi_vma_prep(struct vma_prepare *vp,
struct vm_area_struct *vma,
- struct vm_area_struct *next,
- struct vm_area_struct *remove,
- struct vm_area_struct *remove2)
+ struct vma_merge_struct *vmg)
{
+ struct vm_area_struct *adjust;
+ struct vm_area_struct **remove = &vp->remove;
+ unsigned long flags = vmg ? vmg->merge_flags : 0;
+
memset(vp, 0, sizeof(struct vma_prepare));
vp->vma = vma;
vp->anon_vma = vma->anon_vma;
- vp->remove = remove ? remove : remove2;
- vp->remove2 = remove ? remove2 : NULL;
- vp->adj_next = next;
- if (!vp->anon_vma && next)
- vp->anon_vma = next->anon_vma;
+
+ if (flags & __VMG_FLAG_REMOVE_MIDDLE) {
+ *remove = vmg->middle;
+ remove = &vp->remove2;
+ }
+ if (flags & __VMG_FLAG_REMOVE_NEXT)
+ *remove = vmg->next;
+
+ if (flags & __VMG_FLAG_ADJUST_MIDDLE_START)
+ adjust = vmg->middle;
+ else if (flags & __VMG_FLAG_ADJUST_NEXT_START)
+ adjust = vmg->next;
+ else
+ adjust = NULL;
+
+ vp->adj_next = adjust;
+ if (!vp->anon_vma && adjust)
+ vp->anon_vma = adjust->anon_vma;
+
+ VM_WARN_ON(vp->anon_vma && adjust && adjust->anon_vma &&
+ vp->anon_vma != adjust->anon_vma);
vp->file = vma->vm_file;
if (vp->file)
@@ -361,7 +378,7 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
*/
static void init_vma_prep(struct vma_prepare *vp, struct vm_area_struct *vma)
{
- init_multi_vma_prep(vp, vma, NULL, NULL, NULL);
+ init_multi_vma_prep(vp, vma, NULL);
}
/*
@@ -635,77 +652,64 @@ void validate_mm(struct mm_struct *mm)
*/
static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
{
- struct vm_area_struct *remove = NULL;
- struct vm_area_struct *remove2 = NULL;
- unsigned long flags = vmg->merge_flags;
+ struct vm_area_struct *vma;
struct vma_prepare vp;
- struct vm_area_struct *adjust = NULL;
+ struct vm_area_struct *adjust;
long adj_start;
- bool merge_target;
+ unsigned long flags = vmg->merge_flags;
/*
* If modifying an existing VMA and we don't remove vmg->middle, then we
* shrink the adjacent VMA.
*/
if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
+ vma = vmg->target;
adjust = vmg->middle;
/* The POSITIVE value by which we offset vmg->middle->vm_start. */
adj_start = vmg->end - vmg->middle->vm_start;
- merge_target = true;
+
+ /* Note: vma iterator must be pointing to 'start'. */
+ vma_iter_config(vmg->vmi, vmg->start, vmg->end);
} else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
+ /*
+ * In this case alone, the VMA we manipulate is vmg->middle, but
+ * we ultimately return vmg->next.
+ */
+ vma = vmg->middle;
adjust = vmg->next;
/* The NEGATIVE value by which we offset vmg->next->vm_start. */
adj_start = -(vmg->middle->vm_end - vmg->end);
- /*
- * In all cases but this - merge right, shrink next - we write
- * vmg->target to the maple tree and return this as the merged VMA.
- */
- merge_target = false;
+
+ vma_iter_config(vmg->vmi, vmg->next->vm_start + adj_start,
+ vmg->next->vm_end);
} else {
+ vma = vmg->target;
adjust = NULL;
adj_start = 0;
- merge_target = true;
- }
-
- if (flags & __VMG_FLAG_REMOVE_MIDDLE)
- remove = vmg->middle;
- if (vmg->merge_flags & __VMG_FLAG_REMOVE_NEXT)
- remove2 = vmg->next;
-
- init_multi_vma_prep(&vp, vmg->target, adjust, remove, remove2);
-
- VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
- vp.anon_vma != adjust->anon_vma);
- if (merge_target) {
- /* Note: vma iterator must be pointing to 'start'. */
+ /* Note: vma iterator must be pointing to 'start'. */
vma_iter_config(vmg->vmi, vmg->start, vmg->end);
- } else {
- vma_iter_config(vmg->vmi, adjust->vm_start + adj_start,
- adjust->vm_end);
}
- if (vma_iter_prealloc(vmg->vmi, vmg->target))
+ init_multi_vma_prep(&vp, vma, vmg);
+
+ if (vma_iter_prealloc(vmg->vmi, vma))
return NULL;
vma_prepare(&vp);
- vma_adjust_trans_huge(vmg->target, vmg->start, vmg->end, adj_start);
- vma_set_range(vmg->target, vmg->start, vmg->end, vmg->pgoff);
-
- if (merge_target)
- vma_iter_store(vmg->vmi, vmg->target);
+ vma_adjust_trans_huge(vma, vmg->start, vmg->end, adj_start);
+ vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
if (adj_start) {
adjust->vm_start += adj_start;
adjust->vm_pgoff += PHYS_PFN(adj_start);
-
- if (!merge_target)
- vma_iter_store(vmg->vmi, adjust);
}
- vma_complete(&vp, vmg->vmi, vmg->target->vm_mm);
+ vma_iter_store(vmg->vmi, vmg->target);
+
+ vma_complete(&vp, vmg->vmi, vma->vm_mm);
- return merge_target ? vmg->target : vmg->next;
+ return vmg->target;
}
/* We can only remove VMAs when merging if they do not have a close hook. */
@@ -836,11 +840,15 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
/* No matter what happens, we will be adjusting middle. */
vma_start_write(middle);
- if (merge_left)
- vma_start_write(prev);
-
- if (merge_right)
+ if (merge_right) {
vma_start_write(next);
+ vmg->target = next;
+ }
+
+ if (merge_left) {
+ vma_start_write(prev);
+ vmg->target = prev;
+ }
if (merge_both) {
/*
@@ -850,7 +858,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
* extend delete delete
*/
- vmg->target = prev;
vmg->start = prev->vm_start;
vmg->end = next->vm_end;
vmg->pgoff = prev->vm_pgoff;
@@ -871,7 +878,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
* extend shrink/delete
*/
- vmg->target = prev;
vmg->start = prev->vm_start;
vmg->pgoff = prev->vm_pgoff;
@@ -895,7 +901,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
VM_WARN_ON_VMG(vmg->start > middle->vm_start && prev && middle != prev, vmg);
if (merge_will_delete_middle) {
- vmg->target = next;
vmg->end = next->vm_end;
vmg->pgoff = next->vm_pgoff - pglen;
} else {
@@ -906,7 +911,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
* merged VMA is NOT vmg->target, but rather vmg->next.
*/
vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
- vmg->target = middle;
vmg->start = middle->vm_start;
vmg->end = start;
vmg->pgoff = middle->vm_pgoff;
diff --git a/mm/vma.h b/mm/vma.h
index ddf567359880..5be43e2bba3f 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -113,11 +113,7 @@ struct vma_merge_struct {
struct vm_area_struct *prev;
struct vm_area_struct *middle;
struct vm_area_struct *next;
- /*
- * This is the VMA we ultimately target to become the merged VMA, except
- * for the one exception of merge right, shrink next (for details of
- * this scenario see vma_merge_existing_range()).
- */
+ /* This is the VMA we ultimately target to become the merged VMA. */
struct vm_area_struct *target;
/*
* Initially, the start, end, pgoff fields are provided by the caller
--
2.48.0
* [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 15:50 [PATCH 0/5] mm: further simplify VMA merge operation Lorenzo Stoakes
` (3 preceding siblings ...)
2025-01-27 15:50 ` [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge() Lorenzo Stoakes
@ 2025-01-27 15:50 ` Lorenzo Stoakes
2025-01-27 18:50 ` kernel test robot
` (4 more replies)
4 siblings, 5 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-mm, linux-kernel
The adj_start calculation has been a constant source of confusion in the
VMA merge code.
There are two cases to consider, one where we adjust the start of the
vmg->middle VMA (i.e. the __VMG_FLAG_ADJUST_MIDDLE_START merge flag is
set), in which case adj_start is calculated as:
(1) adj_start = vmg->end - vmg->middle->vm_start
And the case where we adjust the start of the vmg->next VMA (i.e. the
__VMG_FLAG_ADJUST_NEXT_START merge flag is set), in which case adj_start is
calculated as:
(2) adj_start = -(vmg->middle->vm_end - vmg->end)
We apply (1) thusly:
vmg->middle->vm_start =
vmg->middle->vm_start + vmg->end - vmg->middle->vm_start
Which simplifies to:
vmg->middle->vm_start = vmg->end
Similarly, we apply (2) as:
vmg->next->vm_start =
vmg->next->vm_start + -(vmg->middle->vm_end - vmg->end)
Noting that for these VMAs to be mergeable vmg->middle->vm_end ==
vmg->next->vm_start and so this simplifies to:
vmg->next->vm_start =
vmg->next->vm_start + -(vmg->next->vm_start - vmg->end)
Which simplifies to:
vmg->next->vm_start = vmg->end
Therefore in each case, we simply need to adjust the start of the VMA to
vmg->end (!) and can do away with this adj_start calculation. The only
caveat is that we must ensure we update the vm_pgoff field correctly.
We therefore abstract this entire calculation to a new function
vmg_adjust_set_range() which performs this calculation and sets the
adjusted VMA's new range using the general vma_set_range() function.
We also must update vma_adjust_trans_huge() which expects the
now-abstracted adj_start parameter. It turns out this is wholly
unnecessary.
In vma_adjust_trans_huge() the relevant code is:
if (adjust_next > 0) {
struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end);
unsigned long nstart = next->vm_start;
nstart += adjust_next;
split_huge_pmd_if_needed(next, nstart);
}
The only case where this is relevant is when __VMG_FLAG_ADJUST_MIDDLE_START
is specified (in which case adjust_next would have been positive), i.e. the
one in which the vma specified is vmg->prev and thus the sought 'next' VMA
would be vmg->middle.
We can therefore eliminate the find_vma() invocation altogether and simply
provide the vmg->middle VMA in this instance, or NULL otherwise.
Again we have an adjust_next offset calculation:
next->vm_start + vmg->end - vmg->middle->vm_start
Where next == vmg->middle this simplifies to vmg->end as previously
demonstrated.
Therefore nstart is equal to vmg->end, which is already passed to
vma_adjust_trans_huge() via the 'end' parameter and so this code (rather
delightfully) simplifies to:
if (next)
split_huge_pmd_if_needed(next, end);
With these changes in place, it becomes silly for commit_merge() to return
vmg->target, as it is always the same and threaded through vmg, so we
finally change commit_merge() to return an error value once again.
This patch has no change in functional behaviour.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
include/linux/huge_mm.h | 2 +-
mm/huge_memory.c | 19 ++----
mm/vma.c | 102 +++++++++++++++----------------
tools/testing/vma/vma_internal.h | 4 +-
4 files changed, 58 insertions(+), 69 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 93e509b6c00e..43f76b54b522 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -404,7 +404,7 @@ int madvise_collapse(struct vm_area_struct *vma,
struct vm_area_struct **prev,
unsigned long start, unsigned long end);
void vma_adjust_trans_huge(struct vm_area_struct *vma, unsigned long start,
- unsigned long end, long adjust_next);
+ unsigned long end, struct vm_area_struct *next);
spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma);
spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d3ebdc002d5..c44599f9ad09 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3017,9 +3017,9 @@ static inline void split_huge_pmd_if_needed(struct vm_area_struct *vma, unsigned
}
void vma_adjust_trans_huge(struct vm_area_struct *vma,
- unsigned long start,
- unsigned long end,
- long adjust_next)
+ unsigned long start,
+ unsigned long end,
+ struct vm_area_struct *next)
{
/* Check if we need to split start first. */
split_huge_pmd_if_needed(vma, start);
@@ -3027,16 +3027,9 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma,
/* Check if we need to split end next. */
split_huge_pmd_if_needed(vma, end);
- /*
- * If we're also updating the next vma vm_start,
- * check if we need to split it.
- */
- if (adjust_next > 0) {
- struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end);
- unsigned long nstart = next->vm_start;
- nstart += adjust_next;
- split_huge_pmd_if_needed(next, nstart);
- }
+ /* If we're incrementing next->vm_start, we might need to split it. */
+ if (next)
+ split_huge_pmd_if_needed(next, end);
}
static void unmap_folio(struct folio *folio)
diff --git a/mm/vma.c b/mm/vma.c
index cfcadca7b116..48a7acc83a82 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -515,7 +515,7 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
init_vma_prep(&vp, vma);
vp.insert = new;
vma_prepare(&vp);
- vma_adjust_trans_huge(vma, vma->vm_start, addr, 0);
+ vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
if (new_below) {
vma->vm_start = addr;
@@ -646,47 +646,46 @@ void validate_mm(struct mm_struct *mm)
#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
/*
- * Actually perform the VMA merge operation.
- *
- * On success, returns the merged VMA. Otherwise returns NULL.
+ * Based on the vmg flag indicating whether we need to adjust the vm_start field
+ * for the middle or next VMA, we calculate what the range of the newly adjusted
+ * VMA ought to be, and set the VMA's range accordingly.
*/
-static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
+static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
{
- struct vm_area_struct *vma;
- struct vma_prepare vp;
- struct vm_area_struct *adjust;
- long adj_start;
unsigned long flags = vmg->merge_flags;
+ struct vm_area_struct *adjust;
+ pgoff_t pgoff;
- /*
- * If modifying an existing VMA and we don't remove vmg->middle, then we
- * shrink the adjacent VMA.
- */
if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
- vma = vmg->target;
adjust = vmg->middle;
- /* The POSITIVE value by which we offset vmg->middle->vm_start. */
- adj_start = vmg->end - vmg->middle->vm_start;
-
- /* Note: vma iterator must be pointing to 'start'. */
- vma_iter_config(vmg->vmi, vmg->start, vmg->end);
+ pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
} else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
- /*
- * In this case alone, the VMA we manipulate is vmg->middle, but
- * we ultimately return vmg->next.
- */
- vma = vmg->middle;
adjust = vmg->next;
- /* The NEGATIVE value by which we offset vmg->next->vm_start. */
- adj_start = -(vmg->middle->vm_end - vmg->end);
+ pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
+ } else {
+ return;
+ }
+
+ vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
+}
+
+/*
+ * Actually perform the VMA merge operation.
+ *
+ * Returns 0 on success, or an error code (currently only -ENOMEM) on failure.
+ */
+static int commit_merge(struct vma_merge_struct *vmg)
+{
+ struct vm_area_struct *vma;
+ struct vma_prepare vp;
+ bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
- vma_iter_config(vmg->vmi, vmg->next->vm_start + adj_start,
- vmg->next->vm_end);
+ if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
+ /* In this case we manipulate middle and return next. */
+ vma = vmg->middle;
+ vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
} else {
vma = vmg->target;
- adjust = NULL;
- adj_start = 0;
-
/* Note: vma iterator must be pointing to 'start'. */
vma_iter_config(vmg->vmi, vmg->start, vmg->end);
}
@@ -694,22 +693,22 @@ static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
init_multi_vma_prep(&vp, vma, vmg);
if (vma_iter_prealloc(vmg->vmi, vma))
- return NULL;
+ return -ENOMEM;
vma_prepare(&vp);
- vma_adjust_trans_huge(vma, vmg->start, vmg->end, adj_start);
+ /*
+ * THP pages may need additional splits if we increase
+ * middle->vm_start.
+ */
+ vma_adjust_trans_huge(vma, vmg->start, vmg->end,
+ adj_middle ? vmg->middle : NULL);
vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
-
- if (adj_start) {
- adjust->vm_start += adj_start;
- adjust->vm_pgoff += PHYS_PFN(adj_start);
- }
-
+ vmg_adjust_set_range(vmg);
vma_iter_store(vmg->vmi, vmg->target);
vma_complete(&vp, vmg->vmi, vma->vm_mm);
- return vmg->target;
+ return 0;
}
/* We can only remove VMAs when merging if they do not have a close hook. */
@@ -752,7 +751,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
{
struct vm_area_struct *middle = vmg->middle;
struct vm_area_struct *prev = vmg->prev;
- struct vm_area_struct *next, *res;
+ struct vm_area_struct *next;
struct vm_area_struct *anon_dup = NULL;
unsigned long start = vmg->start;
unsigned long end = vmg->end;
@@ -904,12 +903,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
vmg->end = next->vm_end;
vmg->pgoff = next->vm_pgoff - pglen;
} else {
- /*
- * We shrink middle and expand next.
- *
- * IMPORTANT: This is the ONLY case where the final
- * merged VMA is NOT vmg->target, but rather vmg->next.
- */
+ /* We shrink middle and expand next. */
vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
vmg->start = middle->vm_start;
vmg->end = start;
@@ -927,8 +921,10 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
if (merge_will_delete_next)
vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
- res = commit_merge(vmg);
- if (!res) {
+ err = commit_merge(vmg);
+ if (err) {
+ VM_WARN_ON(err != -ENOMEM);
+
if (anon_dup)
unlink_anon_vmas(anon_dup);
@@ -936,9 +932,9 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
return NULL;
}
- khugepaged_enter_vma(res, vmg->flags);
+ khugepaged_enter_vma(vmg->target, vmg->flags);
vmg->state = VMA_MERGE_SUCCESS;
- return res;
+ return vmg->target;
abort:
vma_iter_set(vmg->vmi, start);
@@ -1102,7 +1098,7 @@ int vma_expand(struct vma_merge_struct *vmg)
if (remove_next)
vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
- if (!commit_merge(vmg))
+ if (commit_merge(vmg))
goto nomem;
return 0;
@@ -1142,7 +1138,7 @@ int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
init_vma_prep(&vp, vma);
vma_prepare(&vp);
- vma_adjust_trans_huge(vma, start, end, 0);
+ vma_adjust_trans_huge(vma, start, end, NULL);
vma_iter_clear(vmi);
vma_set_range(vma, start, end, pgoff);
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 1eae23039854..bb273927af0f 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -796,12 +796,12 @@ static inline void vma_start_write(struct vm_area_struct *vma)
static inline void vma_adjust_trans_huge(struct vm_area_struct *vma,
unsigned long start,
unsigned long end,
- long adjust_next)
+ struct vm_area_struct *next)
{
(void)vma;
(void)start;
(void)end;
- (void)adjust_next;
+ (void)next;
}
static inline void vma_iter_free(struct vma_iterator *vmi)
--
2.48.0
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
@ 2025-01-27 18:50 ` kernel test robot
2025-01-27 18:56 ` Lorenzo Stoakes
2025-01-27 19:58 ` Lorenzo Stoakes
2025-01-27 20:13 ` kernel test robot
` (3 subsequent siblings)
4 siblings, 2 replies; 26+ messages in thread
From: kernel test robot @ 2025-01-27 18:50 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: oe-kbuild-all, Linux Memory Management List, Liam R . Howlett,
Vlastimil Babka, Jann Horn, linux-kernel
Hi Lorenzo,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: um-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501280232.qYHi6Rkt-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from include/uapi/linux/posix_types.h:5,
from include/uapi/linux/types.h:14,
from include/linux/types.h:6,
from include/linux/kasan-checks.h:5,
from include/asm-generic/rwonce.h:26,
from ./arch/x86/include/generated/asm/rwonce.h:1,
from include/linux/compiler.h:339,
from include/linux/array_size.h:5,
from include/linux/kernel.h:16,
from include/linux/backing-dev.h:12,
from mm/vma_internal.h:12,
from mm/vma.c:7:
mm/vma.c: In function '__split_vma':
>> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
8 | #define NULL ((void *)0)
| ^~~~~~~~~~~
| |
| void *
mm/vma.c:518:57: note: in expansion of macro 'NULL'
518 | vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
| ^~~~
In file included from include/linux/mm.h:1133,
from arch/um/include/asm/tlbflush.h:9,
from arch/um/include/asm/cacheflush.h:4,
from include/linux/cacheflush.h:5,
from include/linux/highmem.h:8,
from include/linux/bvec.h:10,
from include/linux/blk_types.h:10,
from include/linux/writeback.h:13,
from include/linux/backing-dev.h:16:
include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
574 | long adjust_next)
| ~~~~~^~~~~~~~~~~
mm/vma.c: In function 'commit_merge':
>> mm/vma.c:704:56: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
704 | adj_middle ? vmg->middle : NULL);
include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'struct vm_area_struct *'
574 | long adjust_next)
| ~~~~~^~~~~~~~~~~
mm/vma.c: In function 'vma_shrink':
>> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
8 | #define NULL ((void *)0)
| ^~~~~~~~~~~
| |
| void *
mm/vma.c:1141:48: note: in expansion of macro 'NULL'
1141 | vma_adjust_trans_huge(vma, start, end, NULL);
| ^~~~
include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
574 | long adjust_next)
| ~~~~~^~~~~~~~~~~
vim +/vma_adjust_trans_huge +8 include/linux/stddef.h
^1da177e4c3f41 Linus Torvalds 2005-04-16 6
^1da177e4c3f41 Linus Torvalds 2005-04-16 7 #undef NULL
^1da177e4c3f41 Linus Torvalds 2005-04-16 @8 #define NULL ((void *)0)
6e218287432472 Richard Knutsson 2006-09-30 9
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 18:50 ` kernel test robot
@ 2025-01-27 18:56 ` Lorenzo Stoakes
2025-01-27 19:58 ` Lorenzo Stoakes
1 sibling, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 18:56 UTC (permalink / raw)
To: kernel test robot
Cc: Andrew Morton, oe-kbuild-all, Linux Memory Management List,
Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel
On Tue, Jan 28, 2025 at 02:50:55AM +0800, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on akpm-mm/mm-everything]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
> base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link: https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
> patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
> config: um-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202501280232.qYHi6Rkt-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
> In file included from include/uapi/linux/posix_types.h:5,
> from include/uapi/linux/types.h:14,
> from include/linux/types.h:6,
> from include/linux/kasan-checks.h:5,
> from include/asm-generic/rwonce.h:26,
> from ./arch/x86/include/generated/asm/rwonce.h:1,
> from include/linux/compiler.h:339,
> from include/linux/array_size.h:5,
> from include/linux/kernel.h:16,
> from include/linux/backing-dev.h:12,
> from mm/vma_internal.h:12,
> from mm/vma.c:7:
> mm/vma.c: In function '__split_vma':
> >> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
> 8 | #define NULL ((void *)0)
> | ^~~~~~~~~~~
> | |
> | void *
> mm/vma.c:518:57: note: in expansion of macro 'NULL'
> 518 | vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
> | ^~~~
> In file included from include/linux/mm.h:1133,
> from arch/um/include/asm/tlbflush.h:9,
> from arch/um/include/asm/cacheflush.h:4,
> from include/linux/cacheflush.h:5,
> from include/linux/highmem.h:8,
> from include/linux/bvec.h:10,
> from include/linux/blk_types.h:10,
> from include/linux/writeback.h:13,
> from include/linux/backing-dev.h:16:
> include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
> 574 | long adjust_next)
> | ~~~~~^~~~~~~~~~~
Doh! My bad.
For some reason I didn't change the header, and my local compiles must have
made this a warning only and had no problem with it... That'll teach me for
disabling CONFIG_WERROR :)
I'll fix that.
Thanks!
> mm/vma.c: In function 'commit_merge':
> >> mm/vma.c:704:56: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
> 704 | adj_middle ? vmg->middle : NULL);
> include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'struct vm_area_struct *'
> 574 | long adjust_next)
> | ~~~~~^~~~~~~~~~~
> mm/vma.c: In function 'vma_shrink':
> >> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
> 8 | #define NULL ((void *)0)
> | ^~~~~~~~~~~
> | |
> | void *
> mm/vma.c:1141:48: note: in expansion of macro 'NULL'
> 1141 | vma_adjust_trans_huge(vma, start, end, NULL);
> | ^~~~
> include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
> 574 | long adjust_next)
> | ~~~~~^~~~~~~~~~~
>
>
> vim +/vma_adjust_trans_huge +8 include/linux/stddef.h
>
> ^1da177e4c3f41 Linus Torvalds 2005-04-16 6
> ^1da177e4c3f41 Linus Torvalds 2005-04-16 7 #undef NULL
> ^1da177e4c3f41 Linus Torvalds 2005-04-16 @8 #define NULL ((void *)0)
> 6e218287432472 Richard Knutsson 2006-09-30 9
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 18:50 ` kernel test robot
2025-01-27 18:56 ` Lorenzo Stoakes
@ 2025-01-27 19:58 ` Lorenzo Stoakes
1 sibling, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-27 19:58 UTC (permalink / raw)
To: kernel test robot, Andrew Morton
Cc: oe-kbuild-all, Linux Memory Management List, Liam R . Howlett,
Vlastimil Babka, Jann Horn, linux-kernel
On Tue, Jan 28, 2025 at 02:50:55AM +0800, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on akpm-mm/mm-everything]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
> base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link: https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
> patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
> config: um-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/reproduce)
>
[snip]
Hm, this is actually only applicable in the case that
CONFIG_TRANSPARENT_HUGEPAGE is specified; I missed the stub.
Andrew - in case this doesn't go for a respin before the merge window ends, I
attach a fix-patch which should resolve this.
Thanks!
----8<----
From 47cf78f1975133fea2862830b43ab9a6937fef78 Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Mon, 27 Jan 2025 19:56:13 +0000
Subject: [PATCH] fixup
---
include/linux/huge_mm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 43f76b54b522..e1bea54820ff 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -571,7 +571,7 @@ static inline int madvise_collapse(struct vm_area_struct *vma,
static inline void vma_adjust_trans_huge(struct vm_area_struct *vma,
unsigned long start,
unsigned long end,
- long adjust_next)
+ struct vm_area_struct *next)
{
}
static inline int is_swap_pmd(pmd_t pmd)
--
2.48.0
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
2025-01-27 18:50 ` kernel test robot
@ 2025-01-27 20:13 ` kernel test robot
2025-01-27 20:35 ` kernel test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 26+ messages in thread
From: kernel test robot @ 2025-01-27 20:13 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: llvm, oe-kbuild-all, Linux Memory Management List,
Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel
Hi Lorenzo,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: hexagon-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280337.7bKYRAYQ-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 19306351a2c45e266fa11b41eb1362b20b6ca56d)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280337.7bKYRAYQ-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501280337.7bKYRAYQ-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from mm/vma.c:7:
In file included from mm/vma_internal.h:29:
include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
47 | __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
| ~~~~~~~~~~~ ^ ~~~
include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
49 | NR_ZONE_LRU_BASE + lru, nr_pages);
| ~~~~~~~~~~~~~~~~ ^ ~~~
>> mm/vma.c:518:50: error: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
518 | vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
| ^~~~
include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
8 | #define NULL ((void *)0)
| ^~~~~~~~~~~
include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
574 | long adjust_next)
| ^
>> mm/vma.c:704:10: error: incompatible pointer to integer conversion passing 'struct vm_area_struct *' to parameter of type 'long' [-Wint-conversion]
704 | adj_middle ? vmg->middle : NULL);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
574 | long adjust_next)
| ^
mm/vma.c:1141:41: error: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
1141 | vma_adjust_trans_huge(vma, start, end, NULL);
| ^~~~
include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
8 | #define NULL ((void *)0)
| ^~~~~~~~~~~
include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
574 | long adjust_next)
| ^
2 warnings and 3 errors generated.
vim +518 mm/vma.c
459
460 /*
461 * __split_vma() bypasses sysctl_max_map_count checking. We use this where it
462 * has already been checked or doesn't make sense to fail.
463 * VMA Iterator will point to the original VMA.
464 */
465 static __must_check int
466 __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
467 unsigned long addr, int new_below)
468 {
469 struct vma_prepare vp;
470 struct vm_area_struct *new;
471 int err;
472
473 WARN_ON(vma->vm_start >= addr);
474 WARN_ON(vma->vm_end <= addr);
475
476 if (vma->vm_ops && vma->vm_ops->may_split) {
477 err = vma->vm_ops->may_split(vma, addr);
478 if (err)
479 return err;
480 }
481
482 new = vm_area_dup(vma);
483 if (!new)
484 return -ENOMEM;
485
486 if (new_below) {
487 new->vm_end = addr;
488 } else {
489 new->vm_start = addr;
490 new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
491 }
492
493 err = -ENOMEM;
494 vma_iter_config(vmi, new->vm_start, new->vm_end);
495 if (vma_iter_prealloc(vmi, new))
496 goto out_free_vma;
497
498 err = vma_dup_policy(vma, new);
499 if (err)
500 goto out_free_vmi;
501
502 err = anon_vma_clone(new, vma);
503 if (err)
504 goto out_free_mpol;
505
506 if (new->vm_file)
507 get_file(new->vm_file);
508
509 if (new->vm_ops && new->vm_ops->open)
510 new->vm_ops->open(new);
511
512 vma_start_write(vma);
513 vma_start_write(new);
514
515 init_vma_prep(&vp, vma);
516 vp.insert = new;
517 vma_prepare(&vp);
> 518 vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
519
520 if (new_below) {
521 vma->vm_start = addr;
522 vma->vm_pgoff += (addr - new->vm_start) >> PAGE_SHIFT;
523 } else {
524 vma->vm_end = addr;
525 }
526
527 /* vma_complete stores the new vma */
528 vma_complete(&vp, vmi, vma->vm_mm);
529 validate_mm(vma->vm_mm);
530
531 /* Success. */
532 if (new_below)
533 vma_next(vmi);
534 else
535 vma_prev(vmi);
536
537 return 0;
538
539 out_free_mpol:
540 mpol_put(vma_policy(new));
541 out_free_vmi:
542 vma_iter_free(vmi);
543 out_free_vma:
544 vm_area_free(new);
545 return err;
546 }
547
548 /*
549 * Split a vma into two pieces at address 'addr', a new vma is allocated
550 * either for the first part or the tail.
551 */
552 static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
553 unsigned long addr, int new_below)
554 {
555 if (vma->vm_mm->map_count >= sysctl_max_map_count)
556 return -ENOMEM;
557
558 return __split_vma(vmi, vma, addr, new_below);
559 }
560
561 /*
562 * dup_anon_vma() - Helper function to duplicate anon_vma
563 * @dst: The destination VMA
564 * @src: The source VMA
565 * @dup: Pointer to the destination VMA when successful.
566 *
567 * Returns: 0 on success.
568 */
569 static int dup_anon_vma(struct vm_area_struct *dst,
570 struct vm_area_struct *src, struct vm_area_struct **dup)
571 {
572 /*
573 * Easily overlooked: when mprotect shifts the boundary, make sure the
574 * expanding vma has anon_vma set if the shrinking vma had, to cover any
575 * anon pages imported.
576 */
577 if (src->anon_vma && !dst->anon_vma) {
578 int ret;
579
580 vma_assert_write_locked(dst);
581 dst->anon_vma = src->anon_vma;
582 ret = anon_vma_clone(dst, src);
583 if (ret)
584 return ret;
585
586 *dup = dst;
587 }
588
589 return 0;
590 }
591
592 #ifdef CONFIG_DEBUG_VM_MAPLE_TREE
593 void validate_mm(struct mm_struct *mm)
594 {
595 int bug = 0;
596 int i = 0;
597 struct vm_area_struct *vma;
598 VMA_ITERATOR(vmi, mm, 0);
599
600 mt_validate(&mm->mm_mt);
601 for_each_vma(vmi, vma) {
602 #ifdef CONFIG_DEBUG_VM_RB
603 struct anon_vma *anon_vma = vma->anon_vma;
604 struct anon_vma_chain *avc;
605 #endif
606 unsigned long vmi_start, vmi_end;
607 bool warn = 0;
608
609 vmi_start = vma_iter_addr(&vmi);
610 vmi_end = vma_iter_end(&vmi);
611 if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm))
612 warn = 1;
613
614 if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm))
615 warn = 1;
616
617 if (warn) {
618 pr_emerg("issue in %s\n", current->comm);
619 dump_stack();
620 dump_vma(vma);
621 pr_emerg("tree range: %px start %lx end %lx\n", vma,
622 vmi_start, vmi_end - 1);
623 vma_iter_dump_tree(&vmi);
624 }
625
626 #ifdef CONFIG_DEBUG_VM_RB
627 if (anon_vma) {
628 anon_vma_lock_read(anon_vma);
629 list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
630 anon_vma_interval_tree_verify(avc);
631 anon_vma_unlock_read(anon_vma);
632 }
633 #endif
  634		/* Check for an infinite loop */
635 if (++i > mm->map_count + 10) {
636 i = -1;
637 break;
638 }
639 }
640 if (i != mm->map_count) {
641 pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
642 bug = 1;
643 }
644 VM_BUG_ON_MM(bug, mm);
645 }
646 #endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
647
648 /*
649 * Based on the vmg flag indicating whether we need to adjust the vm_start field
650 * for the middle or next VMA, we calculate what the range of the newly adjusted
651 * VMA ought to be, and set the VMA's range accordingly.
652 */
653 static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
654 {
655 unsigned long flags = vmg->merge_flags;
656 struct vm_area_struct *adjust;
657 pgoff_t pgoff;
658
659 if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
660 adjust = vmg->middle;
661 pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
662 } else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
663 adjust = vmg->next;
664 pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
665 } else {
666 return;
667 }
668
669 vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
670 }
671
672 /*
673 * Actually perform the VMA merge operation.
674 *
675 * On success, returns the merged VMA. Otherwise returns NULL.
676 */
677 static int commit_merge(struct vma_merge_struct *vmg)
678 {
679 struct vm_area_struct *vma;
680 struct vma_prepare vp;
681 bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
682
683 if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
684 /* In this case we manipulate middle and return next. */
685 vma = vmg->middle;
686 vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
687 } else {
688 vma = vmg->target;
689 /* Note: vma iterator must be pointing to 'start'. */
690 vma_iter_config(vmg->vmi, vmg->start, vmg->end);
691 }
692
693 init_multi_vma_prep(&vp, vma, vmg);
694
695 if (vma_iter_prealloc(vmg->vmi, vma))
696 return -ENOMEM;
697
698 vma_prepare(&vp);
699 /*
700 * THP pages may need to do additional splits if we increase
701 * middle->vm_start.
702 */
703 vma_adjust_trans_huge(vma, vmg->start, vmg->end,
> 704 adj_middle ? vmg->middle : NULL);
705 vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
706 vmg_adjust_set_range(vmg);
707 vma_iter_store(vmg->vmi, vmg->target);
708
709 vma_complete(&vp, vmg->vmi, vma->vm_mm);
710
711 return 0;
712 }
713
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
2025-01-27 18:50 ` kernel test robot
2025-01-27 20:13 ` kernel test robot
@ 2025-01-27 20:35 ` kernel test robot
2025-01-28 7:54 ` kernel test robot
2025-01-29 15:42 ` Vlastimil Babka
4 siblings, 0 replies; 26+ messages in thread
From: kernel test robot @ 2025-01-27 20:35 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: llvm, oe-kbuild-all, Linux Memory Management List,
Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel
Hi Lorenzo,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: hexagon-randconfig-002-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280408.9NtSz2Lt-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280408.9NtSz2Lt-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501280408.9NtSz2Lt-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> mm/vma.c:518:50: warning: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
^~~~
include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
#define NULL ((void *)0)
^~~~~~~~~~~
include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
long adjust_next)
^
>> mm/vma.c:704:10: warning: incompatible pointer to integer conversion passing 'struct vm_area_struct *' to parameter of type 'long' [-Wint-conversion]
adj_middle ? vmg->middle : NULL);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
long adjust_next)
^
mm/vma.c:1141:41: warning: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
vma_adjust_trans_huge(vma, start, end, NULL);
^~~~
include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
#define NULL ((void *)0)
^~~~~~~~~~~
include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
long adjust_next)
^
3 warnings generated.
vim +518 mm/vma.c
459
460 /*
461 * __split_vma() bypasses sysctl_max_map_count checking. We use this where it
462 * has already been checked or doesn't make sense to fail.
463 * VMA Iterator will point to the original VMA.
464 */
465 static __must_check int
466 __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
467 unsigned long addr, int new_below)
468 {
469 struct vma_prepare vp;
470 struct vm_area_struct *new;
471 int err;
472
473 WARN_ON(vma->vm_start >= addr);
474 WARN_ON(vma->vm_end <= addr);
475
476 if (vma->vm_ops && vma->vm_ops->may_split) {
477 err = vma->vm_ops->may_split(vma, addr);
478 if (err)
479 return err;
480 }
481
482 new = vm_area_dup(vma);
483 if (!new)
484 return -ENOMEM;
485
486 if (new_below) {
487 new->vm_end = addr;
488 } else {
489 new->vm_start = addr;
490 new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
491 }
492
493 err = -ENOMEM;
494 vma_iter_config(vmi, new->vm_start, new->vm_end);
495 if (vma_iter_prealloc(vmi, new))
496 goto out_free_vma;
497
498 err = vma_dup_policy(vma, new);
499 if (err)
500 goto out_free_vmi;
501
502 err = anon_vma_clone(new, vma);
503 if (err)
504 goto out_free_mpol;
505
506 if (new->vm_file)
507 get_file(new->vm_file);
508
509 if (new->vm_ops && new->vm_ops->open)
510 new->vm_ops->open(new);
511
512 vma_start_write(vma);
513 vma_start_write(new);
514
515 init_vma_prep(&vp, vma);
516 vp.insert = new;
517 vma_prepare(&vp);
> 518 vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
519
520 if (new_below) {
521 vma->vm_start = addr;
522 vma->vm_pgoff += (addr - new->vm_start) >> PAGE_SHIFT;
523 } else {
524 vma->vm_end = addr;
525 }
526
527 /* vma_complete stores the new vma */
528 vma_complete(&vp, vmi, vma->vm_mm);
529 validate_mm(vma->vm_mm);
530
531 /* Success. */
532 if (new_below)
533 vma_next(vmi);
534 else
535 vma_prev(vmi);
536
537 return 0;
538
539 out_free_mpol:
540 mpol_put(vma_policy(new));
541 out_free_vmi:
542 vma_iter_free(vmi);
543 out_free_vma:
544 vm_area_free(new);
545 return err;
546 }
547
548 /*
549 * Split a vma into two pieces at address 'addr', a new vma is allocated
550 * either for the first part or the tail.
551 */
552 static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
553 unsigned long addr, int new_below)
554 {
555 if (vma->vm_mm->map_count >= sysctl_max_map_count)
556 return -ENOMEM;
557
558 return __split_vma(vmi, vma, addr, new_below);
559 }
560
561 /*
562 * dup_anon_vma() - Helper function to duplicate anon_vma
563 * @dst: The destination VMA
564 * @src: The source VMA
565 * @dup: Pointer to the destination VMA when successful.
566 *
567 * Returns: 0 on success.
568 */
569 static int dup_anon_vma(struct vm_area_struct *dst,
570 struct vm_area_struct *src, struct vm_area_struct **dup)
571 {
572 /*
573 * Easily overlooked: when mprotect shifts the boundary, make sure the
574 * expanding vma has anon_vma set if the shrinking vma had, to cover any
575 * anon pages imported.
576 */
577 if (src->anon_vma && !dst->anon_vma) {
578 int ret;
579
580 vma_assert_write_locked(dst);
581 dst->anon_vma = src->anon_vma;
582 ret = anon_vma_clone(dst, src);
583 if (ret)
584 return ret;
585
586 *dup = dst;
587 }
588
589 return 0;
590 }
591
592 #ifdef CONFIG_DEBUG_VM_MAPLE_TREE
593 void validate_mm(struct mm_struct *mm)
594 {
595 int bug = 0;
596 int i = 0;
597 struct vm_area_struct *vma;
598 VMA_ITERATOR(vmi, mm, 0);
599
600 mt_validate(&mm->mm_mt);
601 for_each_vma(vmi, vma) {
602 #ifdef CONFIG_DEBUG_VM_RB
603 struct anon_vma *anon_vma = vma->anon_vma;
604 struct anon_vma_chain *avc;
605 #endif
606 unsigned long vmi_start, vmi_end;
607 bool warn = 0;
608
609 vmi_start = vma_iter_addr(&vmi);
610 vmi_end = vma_iter_end(&vmi);
611 if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm))
612 warn = 1;
613
614 if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm))
615 warn = 1;
616
617 if (warn) {
618 pr_emerg("issue in %s\n", current->comm);
619 dump_stack();
620 dump_vma(vma);
621 pr_emerg("tree range: %px start %lx end %lx\n", vma,
622 vmi_start, vmi_end - 1);
623 vma_iter_dump_tree(&vmi);
624 }
625
626 #ifdef CONFIG_DEBUG_VM_RB
627 if (anon_vma) {
628 anon_vma_lock_read(anon_vma);
629 list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
630 anon_vma_interval_tree_verify(avc);
631 anon_vma_unlock_read(anon_vma);
632 }
633 #endif
634 /* Check for an infinite loop */
635 if (++i > mm->map_count + 10) {
636 i = -1;
637 break;
638 }
639 }
640 if (i != mm->map_count) {
641 pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
642 bug = 1;
643 }
644 VM_BUG_ON_MM(bug, mm);
645 }
646 #endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
647
648 /*
649 * Based on the vmg flag indicating whether we need to adjust the vm_start field
650 * for the middle or next VMA, we calculate what the range of the newly adjusted
651 * VMA ought to be, and set the VMA's range accordingly.
652 */
653 static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
654 {
655 unsigned long flags = vmg->merge_flags;
656 struct vm_area_struct *adjust;
657 pgoff_t pgoff;
658
659 if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
660 adjust = vmg->middle;
661 pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
662 } else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
663 adjust = vmg->next;
664 pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
665 } else {
666 return;
667 }
668
669 vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
670 }
671
672 /*
673 * Actually perform the VMA merge operation.
674 *
675 * Returns 0 on success, an error code otherwise.
676 */
677 static int commit_merge(struct vma_merge_struct *vmg)
678 {
679 struct vm_area_struct *vma;
680 struct vma_prepare vp;
681 bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
682
683 if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
684 /* In this case we manipulate middle and return next. */
685 vma = vmg->middle;
686 vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
687 } else {
688 vma = vmg->target;
689 /* Note: vma iterator must be pointing to 'start'. */
690 vma_iter_config(vmg->vmi, vmg->start, vmg->end);
691 }
692
693 init_multi_vma_prep(&vp, vma, vmg);
694
695 if (vma_iter_prealloc(vmg->vmi, vma))
696 return -ENOMEM;
697
698 vma_prepare(&vp);
699 /*
700 * THP pages may need to do additional splits if we increase
701 * middle->vm_start.
702 */
703 vma_adjust_trans_huge(vma, vmg->start, vmg->end,
> 704 adj_middle ? vmg->middle : NULL);
705 vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
706 vmg_adjust_set_range(vmg);
707 vma_iter_store(vmg->vmi, vmg->target);
708
709 vma_complete(&vp, vmg->vmi, vma->vm_mm);
710
711 return 0;
712 }
713
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
` (2 preceding siblings ...)
2025-01-27 20:35 ` kernel test robot
@ 2025-01-28 7:54 ` kernel test robot
2025-01-29 15:42 ` Vlastimil Babka
4 siblings, 0 replies; 26+ messages in thread
From: kernel test robot @ 2025-01-28 7:54 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: oe-kbuild-all, Linux Memory Management List, Liam R . Howlett,
Vlastimil Babka, Jann Horn, linux-kernel
Hi Lorenzo,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: x86_64-randconfig-122-20250128 (https://download.01.org/0day-ci/archive/20250128/202501281552.zDLVA5oq-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501281552.zDLVA5oq-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501281552.zDLVA5oq-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
>> mm/vma.c:518:57: sparse: sparse: incorrect type in argument 4 (different base types) @@ expected long adjust_next @@ got void * @@
mm/vma.c:518:57: sparse: expected long adjust_next
mm/vma.c:518:57: sparse: got void *
>> mm/vma.c:704:42: sparse: sparse: incorrect type in argument 4 (different base types) @@ expected long adjust_next @@ got struct vm_area_struct * @@
mm/vma.c:704:42: sparse: expected long adjust_next
mm/vma.c:704:42: sparse: got struct vm_area_struct *
mm/vma.c:1141:48: sparse: sparse: incorrect type in argument 4 (different base types) @@ expected long adjust_next @@ got void * @@
mm/vma.c:1141:48: sparse: expected long adjust_next
mm/vma.c:1141:48: sparse: got void *
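The warnings boil down to a declaration/definition mismatch: the fourth argument of vma_adjust_trans_huge() is now being passed a VMA pointer (or NULL) while a stale prototype still declares `long adjust_next`. A minimal user-space sketch of the corrected shape, using hypothetical stand-in types and names (not the actual kernel definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the kernel VMA type. */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Sketch: the fix for the sparse warnings is to keep the declaration in
 * sync with the definition, so the fourth parameter is a pointer, with
 * NULL meaning "no adjacent VMA needs fixup".
 */
static void adjust_trans_huge_sketch(struct vma *vma, unsigned long start,
				     unsigned long end,
				     struct vma *adjust_next)
{
	/* Clamp the primary VMA to the new range. */
	vma->vm_start = start;
	vma->vm_end = end;

	/* NULL simply means there is no adjacent VMA to adjust. */
	if (adjust_next != NULL)
		adjust_next->vm_start = end;
}
```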
vim +518 mm/vma.c
459
460 /*
461 * __split_vma() bypasses sysctl_max_map_count checking. We use this where it
462 * has already been checked or doesn't make sense to fail.
463 * VMA Iterator will point to the original VMA.
464 */
465 static __must_check int
466 __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
467 unsigned long addr, int new_below)
468 {
469 struct vma_prepare vp;
470 struct vm_area_struct *new;
471 int err;
472
473 WARN_ON(vma->vm_start >= addr);
474 WARN_ON(vma->vm_end <= addr);
475
476 if (vma->vm_ops && vma->vm_ops->may_split) {
477 err = vma->vm_ops->may_split(vma, addr);
478 if (err)
479 return err;
480 }
481
482 new = vm_area_dup(vma);
483 if (!new)
484 return -ENOMEM;
485
486 if (new_below) {
487 new->vm_end = addr;
488 } else {
489 new->vm_start = addr;
490 new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
491 }
492
493 err = -ENOMEM;
494 vma_iter_config(vmi, new->vm_start, new->vm_end);
495 if (vma_iter_prealloc(vmi, new))
496 goto out_free_vma;
497
498 err = vma_dup_policy(vma, new);
499 if (err)
500 goto out_free_vmi;
501
502 err = anon_vma_clone(new, vma);
503 if (err)
504 goto out_free_mpol;
505
506 if (new->vm_file)
507 get_file(new->vm_file);
508
509 if (new->vm_ops && new->vm_ops->open)
510 new->vm_ops->open(new);
511
512 vma_start_write(vma);
513 vma_start_write(new);
514
515 init_vma_prep(&vp, vma);
516 vp.insert = new;
517 vma_prepare(&vp);
> 518 vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
519
520 if (new_below) {
521 vma->vm_start = addr;
522 vma->vm_pgoff += (addr - new->vm_start) >> PAGE_SHIFT;
523 } else {
524 vma->vm_end = addr;
525 }
526
527 /* vma_complete stores the new vma */
528 vma_complete(&vp, vmi, vma->vm_mm);
529 validate_mm(vma->vm_mm);
530
531 /* Success. */
532 if (new_below)
533 vma_next(vmi);
534 else
535 vma_prev(vmi);
536
537 return 0;
538
539 out_free_mpol:
540 mpol_put(vma_policy(new));
541 out_free_vmi:
542 vma_iter_free(vmi);
543 out_free_vma:
544 vm_area_free(new);
545 return err;
546 }
547
548 /*
549 * Split a vma into two pieces at address 'addr', a new vma is allocated
550 * either for the first part or the tail.
551 */
552 static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
553 unsigned long addr, int new_below)
554 {
555 if (vma->vm_mm->map_count >= sysctl_max_map_count)
556 return -ENOMEM;
557
558 return __split_vma(vmi, vma, addr, new_below);
559 }
560
561 /*
562 * dup_anon_vma() - Helper function to duplicate anon_vma
563 * @dst: The destination VMA
564 * @src: The source VMA
565 * @dup: Pointer to the destination VMA when successful.
566 *
567 * Returns: 0 on success.
568 */
569 static int dup_anon_vma(struct vm_area_struct *dst,
570 struct vm_area_struct *src, struct vm_area_struct **dup)
571 {
572 /*
573 * Easily overlooked: when mprotect shifts the boundary, make sure the
574 * expanding vma has anon_vma set if the shrinking vma had, to cover any
575 * anon pages imported.
576 */
577 if (src->anon_vma && !dst->anon_vma) {
578 int ret;
579
580 vma_assert_write_locked(dst);
581 dst->anon_vma = src->anon_vma;
582 ret = anon_vma_clone(dst, src);
583 if (ret)
584 return ret;
585
586 *dup = dst;
587 }
588
589 return 0;
590 }
591
592 #ifdef CONFIG_DEBUG_VM_MAPLE_TREE
593 void validate_mm(struct mm_struct *mm)
594 {
595 int bug = 0;
596 int i = 0;
597 struct vm_area_struct *vma;
598 VMA_ITERATOR(vmi, mm, 0);
599
600 mt_validate(&mm->mm_mt);
601 for_each_vma(vmi, vma) {
602 #ifdef CONFIG_DEBUG_VM_RB
603 struct anon_vma *anon_vma = vma->anon_vma;
604 struct anon_vma_chain *avc;
605 #endif
606 unsigned long vmi_start, vmi_end;
607 bool warn = 0;
608
609 vmi_start = vma_iter_addr(&vmi);
610 vmi_end = vma_iter_end(&vmi);
611 if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm))
612 warn = 1;
613
614 if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm))
615 warn = 1;
616
617 if (warn) {
618 pr_emerg("issue in %s\n", current->comm);
619 dump_stack();
620 dump_vma(vma);
621 pr_emerg("tree range: %px start %lx end %lx\n", vma,
622 vmi_start, vmi_end - 1);
623 vma_iter_dump_tree(&vmi);
624 }
625
626 #ifdef CONFIG_DEBUG_VM_RB
627 if (anon_vma) {
628 anon_vma_lock_read(anon_vma);
629 list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
630 anon_vma_interval_tree_verify(avc);
631 anon_vma_unlock_read(anon_vma);
632 }
633 #endif
634 /* Check for an infinite loop */
635 if (++i > mm->map_count + 10) {
636 i = -1;
637 break;
638 }
639 }
640 if (i != mm->map_count) {
641 pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
642 bug = 1;
643 }
644 VM_BUG_ON_MM(bug, mm);
645 }
646 #endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
647
648 /*
649 * Based on the vmg flag indicating whether we need to adjust the vm_start field
650 * for the middle or next VMA, we calculate what the range of the newly adjusted
651 * VMA ought to be, and set the VMA's range accordingly.
652 */
653 static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
654 {
655 unsigned long flags = vmg->merge_flags;
656 struct vm_area_struct *adjust;
657 pgoff_t pgoff;
658
659 if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
660 adjust = vmg->middle;
661 pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
662 } else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
663 adjust = vmg->next;
664 pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
665 } else {
666 return;
667 }
668
669 vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
670 }
671
672 /*
673 * Actually perform the VMA merge operation.
674 *
675 * Returns 0 on success, an error code otherwise.
676 */
677 static int commit_merge(struct vma_merge_struct *vmg)
678 {
679 struct vm_area_struct *vma;
680 struct vma_prepare vp;
681 bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
682
683 if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
684 /* In this case we manipulate middle and return next. */
685 vma = vmg->middle;
686 vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
687 } else {
688 vma = vmg->target;
689 /* Note: vma iterator must be pointing to 'start'. */
690 vma_iter_config(vmg->vmi, vmg->start, vmg->end);
691 }
692
693 init_multi_vma_prep(&vp, vma, vmg);
694
695 if (vma_iter_prealloc(vmg->vmi, vma))
696 return -ENOMEM;
697
698 vma_prepare(&vp);
699 /*
700 * THP pages may need to do additional splits if we increase
701 * middle->vm_start.
702 */
703 vma_adjust_trans_huge(vma, vmg->start, vmg->end,
> 704 adj_middle ? vmg->middle : NULL);
705 vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
706 vmg_adjust_set_range(vmg);
707 vma_iter_store(vmg->vmi, vmg->target);
708
709 vma_complete(&vp, vmg->vmi, vma->vm_mm);
710
711 return 0;
712 }
713
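The pgoff arithmetic in vmg_adjust_set_range() above can be sanity-checked in isolation: when a VMA's start moves forward, vm_pgoff must advance by the same number of pages, and when it moves backward, retract accordingly. A user-space sketch with hypothetical types (4 KiB pages assumed, mirroring the two branches of the helper):

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PHYS_PFN(x) ((unsigned long)(x) >> PAGE_SHIFT)

/* Hypothetical stand-in for the kernel VMA type. */
struct vma {
	unsigned long vm_start, vm_end;
	unsigned long vm_pgoff; /* offset into the backing file, in pages */
};

/*
 * Move `adjust`'s start to `end`, keeping its file offset consistent.
 * If the start moves forward (the adjust-middle case) pgoff grows; if
 * it moves backward (the adjust-next case) pgoff shrinks.
 */
static void adjust_start_sketch(struct vma *adjust, unsigned long end)
{
	if (end >= adjust->vm_start)
		adjust->vm_pgoff += PHYS_PFN(end - adjust->vm_start);
	else
		adjust->vm_pgoff -= PHYS_PFN(adjust->vm_start - end);
	adjust->vm_start = end;
}
```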
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 1/5] mm: simplify vma merge structure and expand comments
2025-01-27 15:50 ` [PATCH 1/5] mm: simplify vma merge structure and expand comments Lorenzo Stoakes
@ 2025-01-28 11:38 ` Vlastimil Babka
2025-01-28 14:05 ` Lorenzo Stoakes
0 siblings, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-28 11:38 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> The merge code, while much improved, still has a number of points of
> confusion. As part of a broader series cleaning this up to make this more
> maintainable, we start by addressing some confusion around vma_merge_struct
> fields.
>
> So far, the caller either provides no vmg->vma (a new VMA) or supplies the
> existing VMA which is being altered, setting vmg->start,end,pgoff to the
> proposed VMA dimensions.
>
> vmg->vma is then updated, as are vmg->start,end,pgoff as the merge process
> proceeds and the appropriate merge strategy is determined.
>
> This is rather confusing, as vmg->vma starts off as the 'middle' VMA
> between vmg->prev,next, but becomes the 'target' VMA, except in one
> specific edge case (merge next, shrink middle).
>
> In this patch we introduce vmg->middle to describe the VMA that is between
> vmg->prev and vmg->next, and does NOT change during the merge operation.
>
> We replace vmg->vma with vmg->target, and use this only during the merge
> operation itself.
Yeah that's much better.
> Aside from the merge right, shrink middle case, this becomes the VMA that
> forms the basis of the VMA that is returned. This edge case can be
> addressed in a future commit.
>
> We also add a number of comments to explain what is going on.
>
> Finally, we adjust the ASCII diagrams showing each merge case in
> vma_merge_existing_range() to be clearer - the arrow range previously
> showed the vmg->start, end spanned area, but it is clearer to change this
> to show the final merged VMA.
>
> This patch has no change in functional behaviour.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -69,16 +69,48 @@ enum vma_merge_flags {
> VMG_FLAG_JUST_EXPAND = 1 << 0,
> };
>
> -/* Represents a VMA merge operation. */
> +/*
> + * Describes a VMA merge operation and is threaded throughout it.
> + *
> + * Any of the fields may be mutated by the merge operation, so no guarantees are
> + * made to the contents of this structure after a merge operation has completed.
> + */
Well this patch seems like a step in the direction to limit what's mutated,
and perhaps defining some of the guarantees (via const?) could then be possible?
> struct vma_merge_struct {
> struct mm_struct *mm;
> struct vma_iterator *vmi;
> - pgoff_t pgoff;
> + /*
> + * Adjacent VMAs, any of which may be NULL if not present:
> + *
> + * |------|--------|------|
> + * | prev | middle | next |
> + * |------|--------|------|
> + *
> + * middle may not yet exist in the case of a proposed new VMA being
> + * merged, or it may be an existing VMA.
> + *
> + * next may be assigned by the caller.
Caller of what?
* Re: [PATCH 1/5] mm: simplify vma merge structure and expand comments
2025-01-28 11:38 ` Vlastimil Babka
@ 2025-01-28 14:05 ` Lorenzo Stoakes
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-28 14:05 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Tue, Jan 28, 2025 at 12:38:48PM +0100, Vlastimil Babka wrote:
> On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > The merge code, while much improved, still has a number of points of
> > confusion. As part of a broader series cleaning this up to make this more
> > maintainable, we start by addressing some confusion around vma_merge_struct
> > fields.
> >
> > So far, the caller either provides no vmg->vma (a new VMA) or supplies the
> > existing VMA which is being altered, setting vmg->start,end,pgoff to the
> > proposed VMA dimensions.
> >
> > vmg->vma is then updated, as are vmg->start,end,pgoff as the merge process
> > proceeds and the appropriate merge strategy is determined.
> >
> > This is rather confusing, as vmg->vma starts off as the 'middle' VMA
> > between vmg->prev,next, but becomes the 'target' VMA, except in one
> > specific edge case (merge next, shrink middle).
> >
> > In this patch we introduce vmg->middle to describe the VMA that is between
> > vmg->prev and vmg->next, and does NOT change during the merge operation.
> >
> > We replace vmg->vma with vmg->target, and use this only during the merge
> > operation itself.
>
> Yeah that's much better.
Yes, and part of a number of steps that gradually improve things (though
some of the incremental states are not quite beautiful, the final result is
good :)
>
> > Aside from the merge right, shrink middle case, this becomes the VMA that
> > forms the basis of the VMA that is returned. This edge case can be
> > addressed in a future commit.
> >
> > We also add a number of comments to explain what is going on.
> >
> > Finally, we adjust the ASCII diagrams showing each merge case in
> > vma_merge_existing_range() to be clearer - the arrow range previously
> > showed the vmg->start, end spanned area, but it is clearer to change this
> > to show the final merged VMA.
> >
> > This patch has no change in functional behaviour.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks!
>
> > --- a/mm/vma.h
> > +++ b/mm/vma.h
> > @@ -69,16 +69,48 @@ enum vma_merge_flags {
> > VMG_FLAG_JUST_EXPAND = 1 << 0,
> > };
> >
> > -/* Represents a VMA merge operation. */
> > +/*
> > + * Describes a VMA merge operation and is threaded throughout it.
> > + *
> > + * Any of the fields may be mutated by the merge operation, so no guarantees are
> > + * made to the contents of this structure after a merge operation has completed.
> > + */
>
> Well this patch seems like a step in the direction to limit what's mutated,
> and perhaps defining some of the guarantees (via const?) could be then possible?
Yeah, I was thinking about doing this, but perhaps as a follow-up. We'd
have to differentiate between:
const struct foo *bar;
and
struct foo * const bar;
To indicate in some cases we are fine with changing the pointer but not the
underlying struct and vice-versa.
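For illustration, the two placements constrain different things; a hypothetical sketch (struct and names invented here):

```c
#include <assert.h>
#include <stddef.h>

struct foo { int x; };

/* Sketch of the two const placements being discussed. */
static int demo(struct foo *p)
{
	const struct foo *a = p; /* pointee read-only, pointer reseatable */
	struct foo *const b = p; /* pointer fixed, pointee mutable */

	a = NULL;	/* OK: the pointer itself may be changed */
	/* a->x = 1; */	/* would not compile: the pointee is const */
	(void)a;

	b->x = 1;	/* OK: the pointee may be mutated */
	/* b = NULL; */	/* would not compile: the pointer is const */

	return p->x;
}
```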
>
> > struct vma_merge_struct {
> > struct mm_struct *mm;
> > struct vma_iterator *vmi;
> > - pgoff_t pgoff;
> > + /*
> > + * Adjacent VMAs, any of which may be NULL if not present:
> > + *
> > + * |------|--------|------|
> > + * | prev | middle | next |
> > + * |------|--------|------|
> > + *
> > + * middle may not yet exist in the case of a proposed new VMA being
> > + * merged, or it may be an existing VMA.
> > + *
> > + * next may be assigned by the caller.
>
> Caller of what?
vma_merge_new_range() requires you to specify it; we document this there:
* ASSUMPTIONS:
...
* - The caller must have specified the next vma in @vmg->next.
In the case of vma_merge_existing_range(), the caller should _not_ specify
next:
* ASSUMPTIONS:
* - The caller must not set @vmg->next, as we determine this.
Yes, this is insane, and I tried in the original series to avoid the need
for this stupid situation, but it ended up being more complicated than just
documenting and checking for this in the vma_merge_existing_range() case
(in the vma_merge_new_range() case we can't, since it may legitimately be
NULL if the proposed VMA is the last in the virtual address space).
Another one for a future fixup :)
>
>
* Re: [PATCH 2/5] mm: further refactor commit_merge()
2025-01-27 15:50 ` [PATCH 2/5] mm: further refactor commit_merge() Lorenzo Stoakes
@ 2025-01-28 15:05 ` Vlastimil Babka
2025-01-28 15:45 ` Vlastimil Babka
1 sibling, 0 replies; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-28 15:05 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> The current VMA merge mechanism contains a number of confusing mechanisms
> around removal of VMAs on merge and the shrinking of the VMA adjacent to
> vma->target in the case of merges which result in a partial merge with that
> adjacent VMA.
>
> Since we now have a STABLE set of VMAs - prev, middle, next - we are now
> able to have the caller of commit_merge() explicitly tell us which VMAs
> need deleting, using newly introduced internal VMA merge flags.
>
> Doing so allows us to embed this state within the VMG and remove the
> confusing remove, remove2 parameters from commit_merge().
>
> We additionally are able to eliminate the highly confusing and misleading
> 'expanded' parameter - a parameter that in reality refers to whether or not
> the return VMA is the target one or the one immediately adjacent.
>
> We can infer which is the case from whether or not the adj_start parameter
> is negative. This also allows us to simplify further logic around iterator
> configuration and VMA iterator stores.
>
> Doing so means we can also eliminate the adjust parameter, as we are able
> to infer which VMA ought to be adjusted from adj_start - a positive value
> implies we adjust the start of 'middle', a negative one implies we adjust
> the start of 'next'.
>
> We are then able to have commit_merge() explicitly return the target VMA,
> or NULL on inability to pre-allocate memory. Errors were previously
> filtered so behaviour does not change.
>
> This patch has no change in functional behaviour.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Some of the parts would be quite a hack for a final state of things, but I
understand it's an intermediate step for review purposes.
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
* Re: [PATCH 2/5] mm: further refactor commit_merge()
2025-01-27 15:50 ` [PATCH 2/5] mm: further refactor commit_merge() Lorenzo Stoakes
2025-01-28 15:05 ` Vlastimil Babka
@ 2025-01-28 15:45 ` Vlastimil Babka
2025-01-28 16:07 ` Lorenzo Stoakes
1 sibling, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-28 15:45 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -67,6 +67,16 @@ enum vma_merge_flags {
> * at the gap.
> */
> VMG_FLAG_JUST_EXPAND = 1 << 0,
> + /*
> + * Internal flag used during the merge operation to indicate we will
> + * remove vmg->middle.
> + */
> + __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
> + /*
> + * Internal flag used during the merge operation to indicate we will
> + * remove vmg->next.
> + */
> + __VMG_FLAG_REMOVE_NEXT = 1 << 2,
> };
Hm this is actually kinda weird? It's an enum, but the values of it are
defined as different bits. And then struct vma_merge_struct has a "enum
vma_merge_flags merge_flags;" but we don't store to it a single "enum
vma_merge_flags" value defined above, but a combination of those. Is that
even legal to do in C?
AFAIK the more common pattern is enum that has normal incremental values
that are used for the shifts.
But I don't think we need all of this at all here? Just have bitfields in
struct vma_merge_struct?
bool just_expand : 1;
bool remove_middle : 1;
...
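For reference, combining enumerators with `|` is legal C: the result is an ordinary integer value, it just need not equal any single named enumerator, which is also what makes masking work. A hypothetical sketch of the pattern under review (names invented here, not the actual kernel flags):

```c
#include <assert.h>

/* Power-of-two enumerators, as in the pattern under review. */
enum merge_flags {
	MF_JUST_EXPAND   = 1 << 0,
	MF_REMOVE_MIDDLE = 1 << 1,
	MF_REMOVE_NEXT   = 1 << 2,
};

/*
 * A combined value such as (MF_REMOVE_MIDDLE | MF_REMOVE_NEXT) is not any
 * named enumerator, but masking against it works as with plain integers.
 */
static int removes_both(enum merge_flags flags)
{
	return (flags & (MF_REMOVE_MIDDLE | MF_REMOVE_NEXT)) ==
	       (MF_REMOVE_MIDDLE | MF_REMOVE_NEXT);
}
```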
> /*
> diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
> index 3c0572120e94..8cce67237d86 100644
> --- a/tools/testing/vma/vma.c
> +++ b/tools/testing/vma/vma.c
> @@ -154,6 +154,9 @@ static void vmg_set_range(struct vma_merge_struct *vmg, unsigned long start,
> vmg->end = end;
> vmg->pgoff = pgoff;
> vmg->flags = flags;
> +
> + vmg->merge_flags = VMG_FLAG_DEFAULT;
> + vmg->target = NULL;
> }
>
> /*
* Re: [PATCH 2/5] mm: further refactor commit_merge()
2025-01-28 15:45 ` Vlastimil Babka
@ 2025-01-28 16:07 ` Lorenzo Stoakes
2025-01-28 16:42 ` Lorenzo Stoakes
0 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-28 16:07 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Tue, Jan 28, 2025 at 04:45:01PM +0100, Vlastimil Babka wrote:
> On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > --- a/mm/vma.h
> > +++ b/mm/vma.h
> > @@ -67,6 +67,16 @@ enum vma_merge_flags {
> > * at the gap.
> > */
> > VMG_FLAG_JUST_EXPAND = 1 << 0,
> > + /*
> > + * Internal flag used during the merge operation to indicate we will
> > + * remove vmg->middle.
> > + */
> > + __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
> > + /*
> > + * Internal flag used during the merge operation to indicate we will
> > + * remove vmg->next.
> > + */
> > + __VMG_FLAG_REMOVE_NEXT = 1 << 2,
> > };
>
> Hm this is actually kinda weird? It's an enum, but the values of it are
> defined as different bits. And then struct vma_merge_struct has a "enum
> vma_merge_flags merge_flags;" but we don't store to it a single "enum
> vma_merge_flags" value defined above, but a combination of those. Is that
> even legal to do in C?
Yes it's legal to do. And we already did it. And other parts of the kernel do
it.
I get that it breaks a switch (enum val) { case ... } statement but we don't do
that.
>
> AFAIK the more common pattern is enum that has normal incremental values
> that are used for the shifts.
>
> But I don't think we need all of this at all here? Just have bitfields in
> struct vma_merge_struct?
>
> bool just_expand : 1;
> bool remove_middle : 1;
I find that ugly, and it necessitates the addition of a new field for every new
flag.
It also prevents any masking stuff going forward and clutters everything.
It also makes the interface confusing, because now you have users having to
know there's a field that lets you do X rather than just a simple flags field
that can encapsulate all state.
And some of those fields are now internal...
If you were to insist we have to change this, then I'd prefer a set of defines,
but then it'd be a question of whether we typedef something for that or
just pass an unsigned long.
I prefer having the type safety of the enum even if it is pedantically 'not
correct'.
C doesn't give you many sane choices for this. I am doing my part to make rust
more of a thing in mm which will help on this front ;)
> ...
>
> > /*
> > diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
> > index 3c0572120e94..8cce67237d86 100644
> > --- a/tools/testing/vma/vma.c
> > +++ b/tools/testing/vma/vma.c
> > @@ -154,6 +154,9 @@ static void vmg_set_range(struct vma_merge_struct *vmg, unsigned long start,
> > vmg->end = end;
> > vmg->pgoff = pgoff;
> > vmg->flags = flags;
> > +
> > + vmg->merge_flags = VMG_FLAG_DEFAULT;
> > + vmg->target = NULL;
> > }
> >
> > /*
>
* Re: [PATCH 2/5] mm: further refactor commit_merge()
2025-01-28 16:07 ` Lorenzo Stoakes
@ 2025-01-28 16:42 ` Lorenzo Stoakes
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-28 16:42 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Tue, Jan 28, 2025 at 04:07:00PM +0000, Lorenzo Stoakes wrote:
> On Tue, Jan 28, 2025 at 04:45:01PM +0100, Vlastimil Babka wrote:
> > On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > > --- a/mm/vma.h
> > > +++ b/mm/vma.h
> > > @@ -67,6 +67,16 @@ enum vma_merge_flags {
> > > * at the gap.
> > > */
> > > VMG_FLAG_JUST_EXPAND = 1 << 0,
> > > + /*
> > > + * Internal flag used during the merge operation to indicate we will
> > > + * remove vmg->middle.
> > > + */
> > > + __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
> > > + /*
> > > + * Internal flag used during the merge operation to indicate we will
> > > + * remove vmg->next.
> > > + */
> > > + __VMG_FLAG_REMOVE_NEXT = 1 << 2,
> > > };
> >
> > Hm this is actually kinda weird? It's an enum, but the values of it are
> > defined as different bits. And then struct vma_merge_struct has a "enum
> > vma_merge_flags merge_flags;" but we don't store to it a single "enum
> > vma_merge_flags" value defined above, but a combination of those. Is that
> > even legal to do in C?
>
> Yes it's legal to do. And we already did it. And other parts of the kernel do
> it.
>
> I get that it breaks a switch (enum val) { case ... } statement but we don't do
> that.
>
> >
> > AFAIK the more common pattern is enum that has normal incremental values
> > that are used for the shifts.
> >
> > But I don't think we need all of this at all here? Just have bitfields in
> > struct vma_merge_struct?
> >
> > bool just_expand : 1;
> > bool remove_middle : 1;
>
> I find that ugly, and it necessitates the addition of a new field for every new
> flag.
>
> It also prevents any masking stuff going forward and clutters everything.
>
> It also makes the interface confusing, because now you have users having to
> know there's a field that lets you do X rather than just a simple flags field
> that can encapsulate all state.
>
> And some of those fields are now internal...
>
> If you were to insist we have to change this, then I'd prefer a set of defines,
> but then it'd be a question of whether we typedef something for that or
> just pass an unsigned long.
>
> I prefer having the type safety of the enum even if it is pedantically 'not
> correct'.
>
> C doesn't give you many sane choices for this. I am doing my part to make rust
> more of a thing in mm which will help on this front ;)
>
Actually, looking more closely, this is not a common pattern, and the weirdness
you point out is confusing.
The alternative of a bare flags field sucks badly, so while I dislike the
aesthetics of the bitfields, the fact you can't mask, and the fact it's no
longer a clean 'state parameter', it's probably marginally better overall
than the alternatives.
C doesn't help you here very much... some other languages have a concept of
a 'flags enum' or at least a specifically-typed value that can be used this
way.
We can add comments to make this less sucky like:
/* Flags which callers can use to modify merge behaviour */
bool just_expand :1;
/* Internal flags set during merge process */
bool __blahdy_blah :1;
TL;DR - will convert to bitfields, you're right I'm wrong, beer + pizzas in
Prague soon! ;)
> > ...
> >
> > > /*
> > > diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
> > > index 3c0572120e94..8cce67237d86 100644
> > > --- a/tools/testing/vma/vma.c
> > > +++ b/tools/testing/vma/vma.c
> > > @@ -154,6 +154,9 @@ static void vmg_set_range(struct vma_merge_struct *vmg, unsigned long start,
> > > vmg->end = end;
> > > vmg->pgoff = pgoff;
> > > vmg->flags = flags;
> > > +
> > > + vmg->merge_flags = VMG_FLAG_DEFAULT;
> > > + vmg->target = NULL;
> > > }
> > >
> > > /*
> >
>
* Re: [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge()
2025-01-27 15:50 ` [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge() Lorenzo Stoakes
@ 2025-01-29 14:13 ` Vlastimil Babka
2025-01-29 14:19 ` Lorenzo Stoakes
0 siblings, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-29 14:13 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> Introduce internal __VMG_FLAG_ADJUST_MIDDLE_START and
> __VMG_FLAG_ADJUST_NEXT_START merge flags, enabling us to indicate to
> commit_merge() that we are performing a merge which either spans only part
> of vmg->middle, or part of vmg->next respectively.
>
> In the former instance, we change the start of vmg->middle to match the
> attributes of vmg->prev, without spanning all of vmg->middle.
>
> This implies that vmg->prev->vm_end and vmg->middle->vm_start are both
> increased to form the new merged VMA (vmg->prev) and the new subsequent VMA
> (vmg->middle).
>
> In the latter case, we change the end of vmg->middle to match the
> attributes of vmg->next, without spanning all of vmg->next.
>
> This implies that vmg->middle->vm_end and vmg->next->vm_start are both
> decreased to form the new merged VMA (vmg->next) and the new prior VMA
> (vmg->middle).
>
> Since we now have a stable set of prev, middle, next VMAs threaded through
> vmg, and these flags tell us what is happening, we can perform
> the calculation in commit_merge() instead.
>
> This allows us to drop the confusing adj_start parameter and instead pass
> semantic information to commit_merge().
>
> In the latter case the -(middle->vm_end - start) calculation becomes
> -(middle->vm_end - vmg->end), however this is correct as vmg->end is set to
> the start parameter.
>
> This is because in this case (rather confusingly), we manipulate
> vmg->middle, but ultimately return vmg->next, whose range will be correctly
> specified. At this point vmg->start, end is the new range for the prior VMA
> rather than the merged one.
>
> This patch has no change in functional behaviour.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/vma.c | 53 ++++++++++++++++++++++++++++++++---------------------
> mm/vma.h | 14 ++++++++++++--
> 2 files changed, 44 insertions(+), 23 deletions(-)
>
> diff --git a/mm/vma.c b/mm/vma.c
> index 955c5ebd5739..e78d65de734b 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -633,29 +633,45 @@ void validate_mm(struct mm_struct *mm)
> *
> * On success, returns the merged VMA. Otherwise returns NULL.
> */
> -static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg,
> - long adj_start)
> +static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
> {
> - struct vma_prepare vp;
> struct vm_area_struct *remove = NULL;
> struct vm_area_struct *remove2 = NULL;
> + unsigned long flags = vmg->merge_flags;
> + struct vma_prepare vp;
> struct vm_area_struct *adjust = NULL;
> + long adj_start;
> + bool merge_target;
> +
> /*
> - * In all cases but that of merge right, shrink next, we write
> - * vmg->target to the maple tree and return this as the merged VMA.
> + * If modifying an existing VMA and we don't remove vmg->middle, then we
> + * shrink the adjacent VMA.
> */
> - bool merge_target = adj_start >= 0;
> + if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
> + adjust = vmg->middle;
> + /* The POSITIVE value by which we offset vmg->middle->vm_start. */
> + adj_start = vmg->end - vmg->middle->vm_start;
> + merge_target = true;
> + } else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
> + adjust = vmg->next;
> + /* The NEGATIVE value by which we offset vmg->next->vm_start. */
> + adj_start = -(vmg->middle->vm_end - vmg->end);
Wouldn't it make more sense to use vmg->next->vm_start here, which AFAIU is
the same value as vmg->middle->vm_end, but perhaps a bit more obvious
considering the flag and comment say we're adjusting vmg->next->vm_start?
(somewhat less important if this code is temporary)
> + /*
> + * In all cases but this - merge right, shrink next - we write
> + * vmg->target to the maple tree and return this as the merged VMA.
> + */
> + merge_target = false;
That comment reads like it would make sense to init merge_target as true and
only here change it to false.
> + } else {
> + adjust = NULL;
> + adj_start = 0;
> + merge_target = true;
> + }
I guess all these could be initial assignments and we don't need the else
branch at all.
> - if (vmg->merge_flags & __VMG_FLAG_REMOVE_MIDDLE)
> + if (flags & __VMG_FLAG_REMOVE_MIDDLE)
> remove = vmg->middle;
> if (vmg->merge_flags & __VMG_FLAG_REMOVE_NEXT)
> remove2 = vmg->next;
>
> - if (adj_start > 0)
> - adjust = vmg->middle;
> - else if (adj_start < 0)
> - adjust = vmg->next;
> -
> init_multi_vma_prep(&vp, vmg->target, adjust, remove, remove2);
>
> VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
> @@ -739,7 +755,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> bool left_side = middle && start == middle->vm_start;
> bool right_side = middle && end == middle->vm_end;
> int err = 0;
> - long adj_start = 0;
> bool merge_will_delete_middle, merge_will_delete_next;
> bool merge_left, merge_right, merge_both;
>
> @@ -860,11 +875,8 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> vmg->start = prev->vm_start;
> vmg->pgoff = prev->vm_pgoff;
>
> - /*
> - * We both expand prev and shrink middle.
> - */
> if (!merge_will_delete_middle)
> - adj_start = vmg->end - middle->vm_start;
> + vmg->merge_flags |= __VMG_FLAG_ADJUST_MIDDLE_START;
>
> err = dup_anon_vma(prev, middle, &anon_dup);
> } else { /* merge_right */
> @@ -893,12 +905,11 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> * IMPORTANT: This is the ONLY case where the final
> * merged VMA is NOT vmg->target, but rather vmg->next.
> */
> + vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
> vmg->target = middle;
> vmg->start = middle->vm_start;
> vmg->end = start;
> vmg->pgoff = middle->vm_pgoff;
> -
> - adj_start = -(middle->vm_end - start);
> }
>
> err = dup_anon_vma(next, middle, &anon_dup);
> @@ -912,7 +923,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> if (merge_will_delete_next)
> vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
>
> - res = commit_merge(vmg, adj_start);
> + res = commit_merge(vmg);
> if (!res) {
> if (anon_dup)
> unlink_anon_vmas(anon_dup);
> @@ -1087,7 +1098,7 @@ int vma_expand(struct vma_merge_struct *vmg)
> if (remove_next)
> vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
>
> - if (!commit_merge(vmg, 0))
> + if (!commit_merge(vmg))
> goto nomem;
>
> return 0;
> diff --git a/mm/vma.h b/mm/vma.h
> index ffbfefb9a83d..ddf567359880 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -67,16 +67,26 @@ enum vma_merge_flags {
> * at the gap.
> */
> VMG_FLAG_JUST_EXPAND = 1 << 0,
> + /*
> + * Internal flag indicating the merge increases vmg->middle->vm_start
> + * (and thereby, vmg->prev->vm_end).
> + */
> + __VMG_FLAG_ADJUST_MIDDLE_START = 1 << 1,
> + /*
> + * Internal flag indicating the merge decreases vmg->next->vm_start
> + * (and thereby, vmg->middle->vm_end).
> + */
> + __VMG_FLAG_ADJUST_NEXT_START = 1 << 2,
> /*
> * Internal flag used during the merge operation to indicate we will
> * remove vmg->middle.
> */
> - __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
> + __VMG_FLAG_REMOVE_MIDDLE = 1 << 3,
> /*
> > > * Internal flag used during the merge operation to indicate we will
> * remove vmg->next.
> */
> - __VMG_FLAG_REMOVE_NEXT = 1 << 2,
> + __VMG_FLAG_REMOVE_NEXT = 1 << 4,
> };
>
> /*
^ permalink raw reply [flat|nested] 26+ messages in thread
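To make the sign convention from the patch above concrete, here is a small
userspace sketch (struct and values hypothetical, not kernel code) of the two
adj_start computations that commit_merge() now performs itself:

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant vm_area_struct fields. */
struct range {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Merge left, shrink middle (__VMG_FLAG_ADJUST_MIDDLE_START): the
 * POSITIVE value by which middle->vm_start is offset.
 */
static long adj_start_middle(unsigned long vmg_end, const struct range *middle)
{
	return (long)(vmg_end - middle->vm_start);
}

/*
 * Merge right, shrink next (__VMG_FLAG_ADJUST_NEXT_START): the NEGATIVE
 * value by which next->vm_start is offset. Since mergeable VMAs are
 * adjacent, middle->vm_end == next->vm_start, so computing from either
 * gives the same result (the point of Vlastimil's question above).
 */
static long adj_start_next(unsigned long vmg_end, const struct range *middle)
{
	return -(long)(middle->vm_end - vmg_end);
}
```

With middle spanning [0x1000, 0x3000) and vmg->end == 0x2000, the first case
yields +0x1000 and the second -0x1000; applying either offset to the
respective vm_start lands on vmg->end.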
* Re: [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge()
2025-01-29 14:13 ` Vlastimil Babka
@ 2025-01-29 14:19 ` Lorenzo Stoakes
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-29 14:19 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Wed, Jan 29, 2025 at 03:13:06PM +0100, Vlastimil Babka wrote:
> On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > Introduce internal __VMG_FLAG_ADJUST_MIDDLE_START and
> > __VMG_FLAG_ADJUST_NEXT_START merge flags, enabling us to indicate to
> > commit_merge() that we are performing a merge which either spans only part
> > of vmg->middle, or part of vmg->next respectively.
> >
> > In the former instance, we change the start of vmg->middle to match the
> > attributes of vmg->prev, without spanning all of vmg->middle.
> >
> > This implies that vmg->prev->vm_end and vmg->middle->vm_start are both
> > increased to form the new merged VMA (vmg->prev) and the new subsequent VMA
> > (vmg->middle).
> >
> > In the latter case, we change the end of vmg->middle to match the
> > attributes of vmg->next, without spanning all of vmg->next.
> >
> > This implies that vmg->middle->vm_end and vmg->next->vm_start are both
> > decreased to form the new merged VMA (vmg->next) and the new prior VMA
> > (vmg->middle).
> >
> > Since we now have a stable set of prev, middle, next VMAs threaded through
> > vmg, and these flags tell us what is happening, we can perform
> > the calculation in commit_merge() instead.
> >
> > This allows us to drop the confusing adj_start parameter and instead pass
> > semantic information to commit_merge().
> >
> > In the latter case the -(middle->vm_end - start) calculation becomes
> > -(middle->vm_end - vmg->end), however this is correct as vmg->end is set to
> > the start parameter.
> >
> > This is because in this case (rather confusingly), we manipulate
> > vmg->middle, but ultimately return vmg->next, whose range will be correctly
> > specified. At this point vmg->start, end is the new range for the prior VMA
> > rather than the merged one.
> >
> > This patch has no change in functional behaviour.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks!
>
> > ---
> > mm/vma.c | 53 ++++++++++++++++++++++++++++++++---------------------
> > mm/vma.h | 14 ++++++++++++--
> > 2 files changed, 44 insertions(+), 23 deletions(-)
> >
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 955c5ebd5739..e78d65de734b 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -633,29 +633,45 @@ void validate_mm(struct mm_struct *mm)
> > *
> > * On success, returns the merged VMA. Otherwise returns NULL.
> > */
> > -static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg,
> > - long adj_start)
> > +static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
> > {
> > - struct vma_prepare vp;
> > struct vm_area_struct *remove = NULL;
> > struct vm_area_struct *remove2 = NULL;
> > + unsigned long flags = vmg->merge_flags;
> > + struct vma_prepare vp;
> > struct vm_area_struct *adjust = NULL;
> > + long adj_start;
> > + bool merge_target;
> > +
> > /*
> > - * In all cases but that of merge right, shrink next, we write
> > - * vmg->target to the maple tree and return this as the merged VMA.
> > + * If modifying an existing VMA and we don't remove vmg->middle, then we
> > + * shrink the adjacent VMA.
> > */
> > - bool merge_target = adj_start >= 0;
> > + if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
> > + adjust = vmg->middle;
> > + /* The POSITIVE value by which we offset vmg->middle->vm_start. */
> > + adj_start = vmg->end - vmg->middle->vm_start;
> > + merge_target = true;
> > + } else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
> > + adjust = vmg->next;
> > + /* The NEGATIVE value by which we offset vmg->next->vm_start. */
> > + adj_start = -(vmg->middle->vm_end - vmg->end);
>
> Wouldn't it make more sense to use vmg->next->vm_start here, which AFAIU is
> the same value as vmg->middle->vm_end, but perhaps a bit more obvious
> considering the flag and comment say we're adjusting vmg->next->vm_start?
> (somewhat less important if this code is temporary)
Maybe, but the code is temporary.
>
> > + /*
> > + * In all cases but this - merge right, shrink next - we write
> > + * vmg->target to the maple tree and return this as the merged VMA.
> > + */
> > + merge_target = false;
>
> That comment reads like it would make sense to init merge_target as true and
> only here change it to false.
I'd rather explicitly assign the value in each branch, just so you don't have
to figure out that there's no assignment and the default value is being used.
But also... maybe, but the code is temporary ;)
>
> > + } else {
> > + adjust = NULL;
> > + adj_start = 0;
> > + merge_target = true;
> > + }
>
> I guess all these could be initial assignments and we don't need the else
> branch at all.
Again, I feel it's simpler to have 3 branches where values are set according to
branch conditions, so you don't have to figure out which variable ends up with
which value by cross-referencing default values and assignments.
But again I also will say (you know what's coming...) maybe, but the code is
temporary ;)
>
> > - if (vmg->merge_flags & __VMG_FLAG_REMOVE_MIDDLE)
> > + if (flags & __VMG_FLAG_REMOVE_MIDDLE)
> > remove = vmg->middle;
> > if (vmg->merge_flags & __VMG_FLAG_REMOVE_NEXT)
> > remove2 = vmg->next;
> >
> > - if (adj_start > 0)
> > - adjust = vmg->middle;
> > - else if (adj_start < 0)
> > - adjust = vmg->next;
> > -
> > init_multi_vma_prep(&vp, vmg->target, adjust, remove, remove2);
> >
> > VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
> > @@ -739,7 +755,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> > bool left_side = middle && start == middle->vm_start;
> > bool right_side = middle && end == middle->vm_end;
> > int err = 0;
> > - long adj_start = 0;
> > bool merge_will_delete_middle, merge_will_delete_next;
> > bool merge_left, merge_right, merge_both;
> >
> > @@ -860,11 +875,8 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> > vmg->start = prev->vm_start;
> > vmg->pgoff = prev->vm_pgoff;
> >
> > - /*
> > - * We both expand prev and shrink middle.
> > - */
> > if (!merge_will_delete_middle)
> > - adj_start = vmg->end - middle->vm_start;
> > + vmg->merge_flags |= __VMG_FLAG_ADJUST_MIDDLE_START;
> >
> > err = dup_anon_vma(prev, middle, &anon_dup);
> > } else { /* merge_right */
> > @@ -893,12 +905,11 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> > * IMPORTANT: This is the ONLY case where the final
> > * merged VMA is NOT vmg->target, but rather vmg->next.
> > */
> > + vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
> > vmg->target = middle;
> > vmg->start = middle->vm_start;
> > vmg->end = start;
> > vmg->pgoff = middle->vm_pgoff;
> > -
> > - adj_start = -(middle->vm_end - start);
> > }
> >
> > err = dup_anon_vma(next, middle, &anon_dup);
> > @@ -912,7 +923,7 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> > if (merge_will_delete_next)
> > vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
> >
> > - res = commit_merge(vmg, adj_start);
> > + res = commit_merge(vmg);
> > if (!res) {
> > if (anon_dup)
> > unlink_anon_vmas(anon_dup);
> > @@ -1087,7 +1098,7 @@ int vma_expand(struct vma_merge_struct *vmg)
> > if (remove_next)
> > vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
> >
> > - if (!commit_merge(vmg, 0))
> > + if (!commit_merge(vmg))
> > goto nomem;
> >
> > return 0;
> > diff --git a/mm/vma.h b/mm/vma.h
> > index ffbfefb9a83d..ddf567359880 100644
> > --- a/mm/vma.h
> > +++ b/mm/vma.h
> > @@ -67,16 +67,26 @@ enum vma_merge_flags {
> > * at the gap.
> > */
> > VMG_FLAG_JUST_EXPAND = 1 << 0,
> > + /*
> > + * Internal flag indicating the merge increases vmg->middle->vm_start
> > + * (and thereby, vmg->prev->vm_end).
> > + */
> > + __VMG_FLAG_ADJUST_MIDDLE_START = 1 << 1,
> > + /*
> > + * Internal flag indicating the merge decreases vmg->next->vm_start
> > + * (and thereby, vmg->middle->vm_end).
> > + */
> > + __VMG_FLAG_ADJUST_NEXT_START = 1 << 2,
> > /*
> > * Internal flag used during the merge operation to indicate we will
> > * remove vmg->middle.
> > */
> > - __VMG_FLAG_REMOVE_MIDDLE = 1 << 1,
> > + __VMG_FLAG_REMOVE_MIDDLE = 1 << 3,
> > /*
> > * Internal flag used during the merge operation to indicate we will
> > * remove vmg->next.
> > */
> > - __VMG_FLAG_REMOVE_NEXT = 1 << 2,
> > + __VMG_FLAG_REMOVE_NEXT = 1 << 4,
> > };
> >
> > /*
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge()
2025-01-27 15:50 ` [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge() Lorenzo Stoakes
@ 2025-01-29 14:45 ` Vlastimil Babka
2025-01-29 14:51 ` Lorenzo Stoakes
2025-01-29 15:19 ` Vlastimil Babka
1 sibling, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-29 14:45 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> It is confusing for vmg->target to sometimes be the target merged VMA and
> in one case not.
>
> Fix this by having commit_merge() use its awareness of the
> __VMG_FLAG_ADJUST_NEXT_START case to know that it is manipulating a
> separate vma, abstracted in the 'vma' local variable.
>
> Place removal and adjust VMA determination logic into
> init_multi_vma_prep(), as the flags give us enough information to do so,
> and since this is the function that sets up the vma_prepare struct it makes
> sense to do so here.
>
> Doing this significantly simplifies commit_merge(), allowing us to
> eliminate the 'merge_target' handling, initialise the VMA iterator in a
> more sensible place and simply return vmg->target consistently.
>
> This also allows us to simplify setting vmg->target in
> vma_merge_existing_range() since we are then left only with two cases -
> merge left (or both) where the target is vmg->prev or merge right in which
> the target is vmg->next.
>
> This makes it easy for somebody reading the code to know what VMA will
> actually be the one returned and merged into and removes a great deal of
> the confusing 'adjust' nonsense.
>
> This patch has no change in functional behaviour.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> @@ -906,7 +911,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> * merged VMA is NOT vmg->target, but rather vmg->next.
> */
Is this comment now also obsolete?
> vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
> - vmg->target = middle;
> vmg->start = middle->vm_start;
> vmg->end = start;
> vmg->pgoff = middle->vm_pgoff;
> diff --git a/mm/vma.h b/mm/vma.h
> index ddf567359880..5be43e2bba3f 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -113,11 +113,7 @@ struct vma_merge_struct {
> struct vm_area_struct *prev;
> struct vm_area_struct *middle;
> struct vm_area_struct *next;
> - /*
> - * This is the VMA we ultimately target to become the merged VMA, except
> - * for the one exception of merge right, shrink next (for details of
> - * this scenario see vma_merge_existing_range()).
> - */
> + /* This is the VMA we ultimately target to become the merged VMA. */
> struct vm_area_struct *target;
> /*
> * Initially, the start, end, pgoff fields are provided by the caller
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge()
2025-01-29 14:45 ` Vlastimil Babka
@ 2025-01-29 14:51 ` Lorenzo Stoakes
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-29 14:51 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Wed, Jan 29, 2025 at 03:45:28PM +0100, Vlastimil Babka wrote:
> On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > It is confusing for vmg->target to sometimes be the target merged VMA and
> > in one case not.
> >
> > Fix this by having commit_merge() use its awareness of the
> > __VMG_FLAG_ADJUST_NEXT_START case to know that it is manipulating a
> > separate vma, abstracted in the 'vma' local variable.
> >
> > Place removal and adjust VMA determination logic into
> > init_multi_vma_prep(), as the flags give us enough information to do so,
> > and since this is the function that sets up the vma_prepare struct it makes
> > sense to do so here.
> >
> > Doing this significantly simplifies commit_merge(), allowing us to
> > eliminate the 'merge_target' handling, initialise the VMA iterator in a
> > more sensible place and simply return vmg->target consistently.
> >
> > This also allows us to simplify setting vmg->target in
> > vma_merge_existing_range() since we are then left only with two cases -
> > merge left (or both) where the target is vmg->prev or merge right in which
> > the target is vmg->next.
> >
> > This makes it easy for somebody reading the code to know what VMA will
> > actually be the one returned and merged into and removes a great deal of
> > the confusing 'adjust' nonsense.
> >
> > This patch has no change in functional behaviour.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks!
>
>
> > @@ -906,7 +911,6 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
> > * merged VMA is NOT vmg->target, but rather vmg->next.
> > */
>
> Is this comment now also obsolete?
>
It will be in 5/5 where it gets deleted.
> > vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
> > - vmg->target = middle;
> > vmg->start = middle->vm_start;
> > vmg->end = start;
> > vmg->pgoff = middle->vm_pgoff;
> > diff --git a/mm/vma.h b/mm/vma.h
> > index ddf567359880..5be43e2bba3f 100644
> > --- a/mm/vma.h
> > +++ b/mm/vma.h
> > @@ -113,11 +113,7 @@ struct vma_merge_struct {
> > struct vm_area_struct *prev;
> > struct vm_area_struct *middle;
> > struct vm_area_struct *next;
> > - /*
> > - * This is the VMA we ultimately target to become the merged VMA, except
> > - * for the one exception of merge right, shrink next (for details of
> > - * this scenario see vma_merge_existing_range()).
> > - */
> > + /* This is the VMA we ultimately target to become the merged VMA. */
> > struct vm_area_struct *target;
> > /*
> > * Initially, the start, end, pgoff fields are provided by the caller
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge()
2025-01-27 15:50 ` [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge() Lorenzo Stoakes
2025-01-29 14:45 ` Vlastimil Babka
@ 2025-01-29 15:19 ` Vlastimil Babka
2025-01-29 15:30 ` Lorenzo Stoakes
1 sibling, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-29 15:19 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> + if (flags & __VMG_FLAG_REMOVE_MIDDLE) {
> + *remove = vmg->middle;
> + remove = &vp->remove2;
> + }
> + if (flags & __VMG_FLAG_REMOVE_NEXT)
> + *remove = vmg->next;
> +
> + if (flags & __VMG_FLAG_ADJUST_MIDDLE_START)
> + adjust = vmg->middle;
> + else if (flags & __VMG_FLAG_ADJUST_NEXT_START)
> + adjust = vmg->next;
> + else
> + adjust = NULL;
> +
> + vp->adj_next = adjust;
Realized this has kinda made it more obvious that vp->adj_next is a misnomer?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge()
2025-01-29 15:19 ` Vlastimil Babka
@ 2025-01-29 15:30 ` Lorenzo Stoakes
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-29 15:30 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Wed, Jan 29, 2025 at 04:19:50PM +0100, Vlastimil Babka wrote:
> On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > + if (flags & __VMG_FLAG_REMOVE_MIDDLE) {
> > + *remove = vmg->middle;
> > + remove = &vp->remove2;
> > + }
> > + if (flags & __VMG_FLAG_REMOVE_NEXT)
> > + *remove = vmg->next;
> > +
> > + if (flags & __VMG_FLAG_ADJUST_MIDDLE_START)
> > + adjust = vmg->middle;
> > + else if (flags & __VMG_FLAG_ADJUST_NEXT_START)
> > + adjust = vmg->next;
> > + else
> > + adjust = NULL;
> > +
> > + vp->adj_next = adjust;
>
> Realized this has kinda made it more obvious that vp->adj_next is a misnomer?
Well yes and no.
It is badly named, as it sounds like adj_start, which is an offset, but it is
equivalent to 'adjust' elsewhere.
But it _is_ the 'next' VMA from the one which we are otherwise manipulating, so
'adjusted next' is sort of a vaguely reasonable-ish name.
But at the same time, it's all horrible and maybe this whole
vma_prepare()/complete() will be the target of my next round of
churn^D^D^D^D^Drefactorings...
I also don't like the various 'special casing' of these situations in
vma_complete(), and it feels like - maybe we can use or adapt the vmg structure
for split as well and put the vma_prepare()/vma_complete() stuff there too...
Anyway. One thing at a time.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
` (3 preceding siblings ...)
2025-01-28 7:54 ` kernel test robot
@ 2025-01-29 15:42 ` Vlastimil Babka
2025-01-29 16:19 ` Lorenzo Stoakes
4 siblings, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2025-01-29 15:42 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On 1/27/25 16:50, Lorenzo Stoakes wrote:
> The adj_start calculation has been a constant source of confusion in the
> VMA merge code.
>
> There are two cases to consider, one where we adjust the start of the
> vmg->middle VMA (i.e. the __VMG_FLAG_ADJUST_MIDDLE_START merge flag is
> set), in which case adj_start is calculated as:
>
> (1) adj_start = vmg->end - vmg->middle->vm_start
>
> And the case where we adjust the start of the vmg->next VMA (i.e. the
> __VMG_FLAG_ADJUST_NEXT_START merge flag is set), in which case adj_start is
> calculated as:
>
> (2) adj_start = -(vmg->middle->vm_end - vmg->end)
>
> We apply (1) thusly:
>
> vmg->middle->vm_start =
> vmg->middle->vm_start + vmg->end - vmg->middle->vm_start
>
> Which simplifies to:
>
> vmg->middle->vm_start = vmg->end
>
> Similarly, we apply (2) as:
>
> vmg->next->vm_start =
> vmg->next->vm_start + -(vmg->middle->vm_end - vmg->end)
>
> Noting that for these VMAs to be mergeable vmg->middle->vm_end ==
> vmg->next->vm_start and so this simplifies to:
>
> vmg->next->vm_start =
> vmg->next->vm_start + -(vmg->next->vm_start - vmg->end)
>
> Which simplifies to:
>
> vmg->next->vm_start = vmg->end
>
> Therefore in each case, we simply need to adjust the start of the VMA to
> vmg->end (!) and can do away with this adj_start calculation. The only
> caveat is that we must ensure we update the vm_pgoff field correctly.
>
> We therefore abstract this entire calculation to a new function
> vmg_adjust_set_range() which performs this calculation and sets the
> adjusted VMA's new range using the general vma_set_range() function.
>
> We also must update vma_adjust_trans_huge() which expects the
> now-abstracted adj_start parameter. It turns out this is wholly
> unnecessary.
>
> In vma_adjust_trans_huge() the relevant code is:
>
> if (adjust_next > 0) {
> struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end);
> unsigned long nstart = next->vm_start;
> nstart += adjust_next;
> split_huge_pmd_if_needed(next, nstart);
> }
>
> The only case where this is relevant is when __VMG_FLAG_ADJUST_MIDDLE_START
> is specified (in which case adj_next would have been positive), i.e. the
> one in which the vma specified is vmg->prev and thus the sought 'next' VMA
> would be vmg->middle.
>
> We can therefore eliminate the find_vma() invocation altogether and simply
> provide the vmg->middle VMA in this instance, or NULL otherwise.
>
> Again we have an adj_next offset calculation:
>
> next->vm_start + vmg->end - vmg->middle->vm_start
>
> Where next == vmg->middle this simplifies to vmg->end as previously
> demonstrated.
>
> Therefore nstart is equal to vmg->end, which is already passed to
> vma_adjust_trans_huge() via the 'end' parameter and so this code (rather
> delightfully) simplifies to:
>
> if (next)
> split_huge_pmd_if_needed(next, end);
>
> With these changes in place, it becomes silly for commit_merge() to return
> vmg->target, as it is always the same and threaded through vmg, so we
> finally change commit_merge() to return an error value once again.
>
> This patch has no change in functional behaviour.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Yeah this makes the preparations worth it. Nice!
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> +/*
> + * Actually perform the VMA merge operation.
> + *
> + * On success, returns the merged VMA. Otherwise returns NULL.
Needs updating?
> + */
> +static int commit_merge(struct vma_merge_struct *vmg)
> +{
> + struct vm_area_struct *vma;
> + struct vma_prepare vp;
> + bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
>
> - vma_iter_config(vmg->vmi, vmg->next->vm_start + adj_start,
> - vmg->next->vm_end);
> + if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
> + /* In this case we manipulate middle and return next. */
Also we don't return next anymore? At least not here.
vma_merge_existing_range() does, but here it's rather "the target is next"?
> + vma = vmg->middle;
> + vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
> } else {
^ permalink raw reply [flat|nested] 26+ messages in thread
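The algebra in the patch 5 commit message above can also be checked
numerically with a throwaway userspace sketch (names and values hypothetical):
applying either adj_start offset always lands the adjusted vm_start on
vmg->end.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant vm_area_struct fields. */
struct range {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Case (1): adjust middle's start by vmg->end - middle->vm_start. */
static unsigned long adjusted_middle_start(const struct range *middle,
					   unsigned long vmg_end)
{
	long adj_start = (long)(vmg_end - middle->vm_start);

	return middle->vm_start + adj_start;	/* simplifies to vmg_end */
}

/*
 * Case (2): adjust next's start by -(middle->vm_end - vmg->end).
 * Adjacency of mergeable VMAs gives middle->vm_end == next->vm_start,
 * which is what makes the terms cancel.
 */
static unsigned long adjusted_next_start(const struct range *middle,
					 const struct range *next,
					 unsigned long vmg_end)
{
	long adj_start = -(long)(middle->vm_end - vmg_end);

	return next->vm_start + adj_start;	/* simplifies to vmg_end */
}
```

Both helpers reduce to returning vmg_end, which is why the adj_start
calculation can be dropped entirely in favour of vma_set_range() with
vmg->end (plus the vm_pgoff fixup the commit message notes).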
* Re: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
2025-01-29 15:42 ` Vlastimil Babka
@ 2025-01-29 16:19 ` Lorenzo Stoakes
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2025-01-29 16:19 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R . Howlett, Jann Horn, linux-mm, linux-kernel
On Wed, Jan 29, 2025 at 04:42:06PM +0100, Vlastimil Babka wrote:
> On 1/27/25 16:50, Lorenzo Stoakes wrote:
> > The adj_start calculation has been a constant source of confusion in the
> > VMA merge code.
> >
> > There are two cases to consider, one where we adjust the start of the
> > vmg->middle VMA (i.e. the __VMG_FLAG_ADJUST_MIDDLE_START merge flag is
> > set), in which case adj_start is calculated as:
> >
> > (1) adj_start = vmg->end - vmg->middle->vm_start
> >
> > And the case where we adjust the start of the vmg->next VMA (i.e. the
> > __VMG_FLAG_ADJUST_NEXT_START merge flag is set), in which case adj_start is
> > calculated as:
> >
> > (2) adj_start = -(vmg->middle->vm_end - vmg->end)
> >
> > We apply (1) thusly:
> >
> > vmg->middle->vm_start =
> > vmg->middle->vm_start + vmg->end - vmg->middle->vm_start
> >
> > Which simplifies to:
> >
> > vmg->middle->vm_start = vmg->end
> >
> > Similarly, we apply (2) as:
> >
> > vmg->next->vm_start =
> > vmg->next->vm_start + -(vmg->middle->vm_end - vmg->end)
> >
> > Noting that for these VMAs to be mergeable vmg->middle->vm_end ==
> > vmg->next->vm_start and so this simplifies to:
> >
> > vmg->next->vm_start =
> > vmg->next->vm_start + -(vmg->next->vm_start - vmg->end)
> >
> > Which simplifies to:
> >
> > vmg->next->vm_start = vmg->end
> >
> > Therefore in each case, we simply need to adjust the start of the VMA to
> > vmg->end (!) and can do away with this adj_start calculation. The only
> > caveat is that we must ensure we update the vm_pgoff field correctly.
> >
> > We therefore abstract this entire calculation to a new function
> > vmg_adjust_set_range() which performs this calculation and sets the
> > adjusted VMA's new range using the general vma_set_range() function.
> >
> > We also must update vma_adjust_trans_huge() which expects the
> > now-abstracted adj_start parameter. It turns out this is wholly
> > unnecessary.
> >
> > In vma_adjust_trans_huge() the relevant code is:
> >
> > if (adjust_next > 0) {
> > struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end);
> > unsigned long nstart = next->vm_start;
> > nstart += adjust_next;
> > split_huge_pmd_if_needed(next, nstart);
> > }
> >
> > The only case where this is relevant is when __VMG_FLAG_ADJUST_MIDDLE_START
> > is specified (in which case adjust_next would have been positive), i.e. the
> > one in which the VMA specified is vmg->prev, and thus the sought 'next' VMA
> > would be vmg->middle.
> >
> > We can therefore eliminate the find_vma() invocation altogether and simply
> > provide the vmg->middle VMA in this instance, or NULL otherwise.
> >
> > Again we have an adjust_next offset calculation:
> >
> > next->vm_start + vmg->end - vmg->middle->vm_start
> >
> > Where next == vmg->middle this simplifies to vmg->end as previously
> > demonstrated.
> >
> > Therefore nstart is equal to vmg->end, which is already passed to
> > vma_adjust_trans_huge() via the 'end' parameter and so this code (rather
> > delightfully) simplifies to:
> >
> > if (next)
> > split_huge_pmd_if_needed(next, end);
> >
> > With these changes in place, it becomes silly for commit_merge() to return
> > vmg->target, as it is always the same and threaded through vmg, so we
> > finally change commit_merge() to return an error value once again.
> >
> > This patch has no change in functional behaviour.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> Yeah this makes the preparations worth it. Nice!
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks!
>
> > +/*
> > + * Actually perform the VMA merge operation.
> > + *
> > + * On success, returns the merged VMA. Otherwise returns NULL.
>
> Needs updating?
Ah yeah you're right, will fix!
>
> > + */
> > +static int commit_merge(struct vma_merge_struct *vmg)
> > +{
> > + struct vm_area_struct *vma;
> > + struct vma_prepare vp;
> > + bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
> >
> > - vma_iter_config(vmg->vmi, vmg->next->vm_start + adj_start,
> > - vmg->next->vm_end);
> > + if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
> > + /* In this case we manipulate middle and return next. */
>
> Also we don't return next anymore? At least not here.
> vma_merge_existing_range() does, but here it's rather "the target is next"?
Yeah we should reword as you say, will fix!
>
> > + vma = vmg->middle;
> > + vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
> > } else {
Thread overview: 26+ messages
2025-01-27 15:50 [PATCH 0/5] mm: further simplify VMA merge operation Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 1/5] mm: simplify vma merge structure and expand comments Lorenzo Stoakes
2025-01-28 11:38 ` Vlastimil Babka
2025-01-28 14:05 ` Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 2/5] mm: further refactor commit_merge() Lorenzo Stoakes
2025-01-28 15:05 ` Vlastimil Babka
2025-01-28 15:45 ` Vlastimil Babka
2025-01-28 16:07 ` Lorenzo Stoakes
2025-01-28 16:42 ` Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 3/5] mm: eliminate adj_start parameter from commit_merge() Lorenzo Stoakes
2025-01-29 14:13 ` Vlastimil Babka
2025-01-29 14:19 ` Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 4/5] mm: make vmg->target consistent and further simplify commit_merge() Lorenzo Stoakes
2025-01-29 14:45 ` Vlastimil Babka
2025-01-29 14:51 ` Lorenzo Stoakes
2025-01-29 15:19 ` Vlastimil Babka
2025-01-29 15:30 ` Lorenzo Stoakes
2025-01-27 15:50 ` [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation Lorenzo Stoakes
2025-01-27 18:50 ` kernel test robot
2025-01-27 18:56 ` Lorenzo Stoakes
2025-01-27 19:58 ` Lorenzo Stoakes
2025-01-27 20:13 ` kernel test robot
2025-01-27 20:35 ` kernel test robot
2025-01-28 7:54 ` kernel test robot
2025-01-29 15:42 ` Vlastimil Babka
2025-01-29 16:19 ` Lorenzo Stoakes