linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Sidhartha Kumar <sidhartha.kumar@oracle.com>
To: linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	liam.howlett@oracle.com, willy@infradead.org
Subject: Re: [PATCH v4 0/6] Track node vacancy to reduce worst case allocation counts
Date: Wed, 9 Apr 2025 13:51:29 -0400	[thread overview]
Message-ID: <d25b3d0d-ebc3-4cf8-bf87-7d42dd23dc36@oracle.com> (raw)
In-Reply-To: <20250407184102.2155415-1-sidhartha.kumar@oracle.com>

On 4/7/25 2:40 PM, Sidhartha Kumar wrote:
> v3[3] -> v4:
>    - fix bug reported by Vasily Gorbik by fixing condition for
>      mast_overflow() and add test case for this condition. This fix
>      was also tested on s390 by Vasily.
> 
>    - cleanup new_height variable usage in mas_spanning_rebalance()
>      and add an additional test to make sure mt_height() is correct
>      after a rebalance causes a tree's height to decrease per Liam.
> 
Hi Andrew,

Could the following fixup patch be applied to the end of this series? If 
you'd prefer a new version of the series I can do that as well.

Thanks,
Sid


commit 1be83a7dccbef0a5973a3723de5a4460270d90c7 (HEAD)
Author: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Date:   Wed Apr 9 17:44:51 2025 +0000

     maple_tree: vacant_height series fixup

     l_mas.depth is no longer overloaded to track the height in
     mas_spanning_rebalance() so we can remove its initialization.

     Fix comments that refer to variables that track height as depth.

     Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 37dc9525dff6..9ef129038224 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -463,8 +463,8 @@ struct ma_wr_state {
         void __rcu **slots;             /* mas->node->slots pointer */
         void *entry;                    /* The entry to write */
         void *content;                  /* The existing entry that is 
being overwritten */
-       unsigned char vacant_height;    /* Depth of lowest node with 
free space */
-       unsigned char sufficient_height;/* Depth of lowest node with min 
sufficiency + 1 nodes */
+       unsigned char vacant_height;    /* Height of lowest node with 
free space */
+       unsigned char sufficient_height;/* Height of lowest node with 
min sufficiency + 1 nodes */
  };

  #define mas_lock(mas)           spin_lock(&((mas)->tree->ma_lock))
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 8425728e3c5a..affe979bd14d 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2850,8 +2850,6 @@ static void mas_spanning_rebalance(struct ma_state 
*mas,
             unlikely(mast->bn->b_end <= mt_min_slots[mast->bn->type]))
                 mast_spanning_rebalance(mast);

-       l_mas.depth = 0;
-
         /*
          * Each level of the tree is examined and balanced, pushing 
data to the left or
          * right, or rebalancing against left or right nodes is 
employed to avoid

> v2[2] -> v3:
>    - add r-b to patches 1,4, and 6
>    
>    - update function parameter comments in patch 2
> 
>    - remove line that sets mas->depth in patch 2
> 
>    - remove redundant code for checking for a spanning write in patch 3
> 
>    - rewrite commit message of patch 5 for additonal context and clarity
> 
> 
> v1[1] -> v2:
>    - fix comment for vacant_height which refers to depth per Wei
> 
>    - add a patch to reorder switch case statements in mas_prealloc_calc and
>      mas_wr_store_entry
> 
>    - use sufficient height in spanning stores
> 
>    - modify patch 2 to use a counter to track ascending the tree rather
>      than overloading mas->depth to have this function.
> 
> ================ overview ========================
> Currently, the maple tree preallocates the worst case number of nodes for
> given store type by taking into account the whole height of the tree. This
> comes from a worst case scenario of every node in the tree being full and
> having to propagate node allocation upwards until we reach the root of the
> tree. This can be optimized if there are vacancies in nodes that are at a
> lower depth than the root node. This series implements tracking the level
> at which there is a vacant node so we only need to allocate until this
> level is reached, rather than always using the full height of the tree.
> The ma_wr_state struct is modified to add a field which keeps track of the
> vacant height and is updated during walks of the tree. This value is then
> read in mas_prealloc_calc() when we decide how many nodes to allocate.
> 
> For rebalancing and spanning stores, we also need to track the lowest
> height at which a node has 1 more entry than the minimum sufficient number
> of entries. This is because rebalancing can cause a parent node to become
> insufficient which results in further node allocations. In this case, we
> need to use the sufficient height as the worst case rather than the vacant
> height.
> 
> patch 1-2: preparatory patches
> patch 3: implement vacant height tracking + update the tests
> patch 4: support vacant height tracking for rebalancing writes
> patch 5: implement sufficient height tracking
> patch 6: reorder switch case statements
> 
> ================ results =========================
> Bpftrace was used to profile the allocation path for requesting new maple
> nodes while running stress-ng mmap 120s. The histograms below represent
> requests to kmem_cache_alloc_bulk() and show the count argument. This
> represnts how many maple nodes the caller is requesting in
> kmem_cache_alloc_bulk()
> 
> command: stress-ng --mmap 4 --timeout 120
> 
> mm-unstable
> 
> @bulk_alloc_req:
> [3, 4)                 4 |                                                    |
> [4, 5)             54170 |@                                                   |
> [5, 6)                 0 |                                                    |
> [6, 7)            893057 |@@@@@@@@@@@@@@@@@@@@                                |
> [7, 8)                 4 |                                                    |
> [8, 9)           2230287 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [9, 10)            55811 |@                                                   |
> [10, 11)           77834 |@                                                   |
> [11, 12)               0 |                                                    |
> [12, 13)         1368684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                     |
> [13, 14)               0 |                                                    |
> [14, 15)               0 |                                                    |
> [15, 16)          367197 |@@@@@@@@                                            |
> 
> 
> @maple_node_total: 46,630,160
> @total_vmas: 46184591
> 
> mm-unstable + this series
> 
> @bulk_alloc_req:
> [2, 3)               198 |                                                    |
> [3, 4)                 4 |                                                    |
> [4, 5)                43 |                                                    |
> [5, 6)                 0 |                                                    |
> [6, 7)           1069503 |@@@@@@@@@@@@@@@@@@@@@                               |
> [7, 8)                 4 |                                                    |
> [8, 9)           2597268 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [9, 10)           472191 |@@@@@@@@@                                           |
> [10, 11)          191904 |@@@                                                 |
> [11, 12)               0 |                                                    |
> [12, 13)          247316 |@@@@                                                |
> [13, 14)               0 |                                                    |
> [14, 15)               0 |                                                    |
> [15, 16)           98769 |@                                                   |
> 
> 
> @maple_node_total: 37,813,856
> @total_vmas: 43493287
> 
> This represents a ~19% reduction in the number of bulk maple nodes allocated.
> 
> For more reproducible results, a historgram of the return value of
> mas_prealloc_calc() is displayed while running the maple_tree_tests
> whcih have a deterministic store pattern
> 
> mas_prealloc_calc() return value mm-unstable
> 1   :                                                    (12068)
> 3   :                                                    (11836)
> 5   : *****                                              (271192)
> 7   : ************************************************** (2329329)
> 9   : ***********                                        (534186)
> 10  :                                                    (435)
> 11  : ***************                                    (704306)
> 13  : ********                                           (409781)
> 
> mas_prealloc_calc() return value mm-unstable + this series
> 1   :                                                    (12070)
> 3   : ************************************************** (3548777)
> 5   : ********                                           (633458)
> 7   :                                                    (65081)
> 9   :                                                    (11224)
> 10  :                                                    (341)
> 11  :                                                    (2973)
> 13  :                                                    (68)
> 
> do_mmap latency was also measured for regressions:
> command: stress-ng --mmap 4 --timeout 120
> 
> mm-unstable:
> avg = 7162 nsecs, total: 16101821292 nsecs, count: 2248034
> 
> mm-unstable + this series:
> avg = 6689 nsecs, total: 15135391764 nsecs, count: 2262726
> 
> 
> stress-ng --mmap4 --timeout 120
> 
> with vacant_height:
> stress-ng: info:  [257]                   21526312 Maple Tree Read                0.176 M/sec
> stress-ng: info:  [257]                  339979348 Maple Tree Write               2.774 M/sec
> 
> without vacant_height:
> stress-ng: info:  [8228]                   20968900 Maple Tree Read                0.171 M/sec
> stress-ng: info:  [8228]                  312214648 Maple Tree Write               2.547 M/sec
> 
> This respresents an increase of ~3% read throughput and ~9% increase in write throughput.
> 
> [1]: https://lore.kernel.org/lkml/20241114170524.64391-1-sidhartha.kumar@oracle.com/T/
> [2]: https://lore.kernel.org/lkml/20250221163610.578409-1-sidhartha.kumar@oracle.com/
> [3]: https://lore.kernel.org/lkml/20250227204823.758784-1-sidhartha.kumar@oracle.com/
> Sidhartha Kumar (6):
>    maple_tree: convert mas_prealloc_calc() to take in a maple write state
>    maple_tree: use height and depth consistently
>    maple_tree: use vacant nodes to reduce worst case allocations
>    maple_tree: break on convergence in mas_spanning_rebalance()
>    maple_tree: add sufficient height
>    maple_tree: reorder mas->store_type case statements
> 
>   include/linux/maple_tree.h       |   4 +
>   lib/maple_tree.c                 | 189 ++++++++++++++++++-------------
>   tools/testing/radix-tree/maple.c | 126 +++++++++++++++++++--
>   3 files changed, 233 insertions(+), 86 deletions(-)
> 



  parent reply	other threads:[~2025-04-09 17:51 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-07 18:40 Sidhartha Kumar
2025-04-07 18:40 ` [PATCH v4 1/6] maple_tree: convert mas_prealloc_calc() to take in a maple write state Sidhartha Kumar
2025-04-07 18:40 ` [PATCH v4 2/6] maple_tree: use height and depth consistently Sidhartha Kumar
2025-04-08 18:02   ` Liam R. Howlett
2025-04-09 17:37   ` Sidhartha Kumar
2025-04-07 18:40 ` [PATCH v4 3/6] maple_tree: use vacant nodes to reduce worst case allocations Sidhartha Kumar
2025-04-08 18:08   ` Liam R. Howlett
2025-04-07 18:41 ` [PATCH v4 4/6] maple_tree: break on convergence in mas_spanning_rebalance() Sidhartha Kumar
2025-04-07 18:41 ` [PATCH v4 5/6] maple_tree: add sufficient height Sidhartha Kumar
2025-04-07 18:41 ` [PATCH v4 6/6] maple_tree: reorder mas->store_type case statements Sidhartha Kumar
2025-04-09 17:51 ` Sidhartha Kumar [this message]
2025-04-10  0:29   ` [PATCH v4 0/6] Track node vacancy to reduce worst case allocation counts Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d25b3d0d-ebc3-4cf8-bf87-7d42dd23dc36@oracle.com \
    --to=sidhartha.kumar@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=liam.howlett@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=maple-tree@lists.infradead.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox