From: David Rientjes <rientjes@google.com>
To: "Jay Patel" <jaypatel@linux.ibm.com>,
"Brian “Binder” Makin" <merimus@google.com>
Cc: linux-mm@kvack.org, cl@linux.com, penberg@kernel.org,
iamjoonsoo.kim@lge.com,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com,
piyushs@linux.ibm.com
Subject: Re: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage
Date: Sun, 2 Jul 2023 17:13:19 -0700 (PDT)
Message-ID: <b87d8eee-2ce5-b7d5-97f8-a5d80eed3c44@google.com>
In-Reply-To: <20230628095740.589893-1-jaypatel@linux.ibm.com>

Thanks very much for looking at this, Jay!

My colleague, Binder, has also been looking at opportunities to optimize
memory usage when using SLUB. We're preparing to deprecate SLAB
internally and shift toward SLUB since SLAB is scheduled for removal after
the next LTS kernel.

Binder, do you have an evaluation with this patch similar to what Jay did?

Also, tangentially: we are looking at other opportunities for reduction in
memory overhead when using SLUB. If you or anybody else are interested in
being involved in a working group with this shared goal, please let me
know. We could brainstorm, collaborate, and share data.

Thanks again!

On Wed, 28 Jun 2023, Jay Patel wrote:
> In the previous version [1], we were able to reduce slub memory
> wastage, but total memory usage also increased. To solve this
> problem, I have modified the patch as follows:
>
> 1) If min_objects * object_size > PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER,
> then it will return PAGE_ALLOC_COSTLY_ORDER.
> 2) Similarly, if min_objects * object_size <= PAGE_SIZE, then it will
> return slub_min_order.
> 3) Additionally, I changed slub_max_order to 2. There is no specific
> reason for using the value 2, but it provided the best results
> without any noticeable performance impact.
>
> [1]
> https://lore.kernel.org/linux-mm/20230612085535.275206-1-jaypatel@linux.ibm.com/
>
> I have conducted tests on systems with 160 CPUs and 16 CPUs using 4K
> and 64K page sizes. The tests showed that the patch successfully
> reduces both total slab memory and slab memory wastage without any
> noticeable performance degradation in the hackbench test.
>
> Test Results are as follows:
> 1) On 160 CPUs with 4K Page size
>
> +----------------+----------------+----------------+
> | Total wastage in slub memory |
> +----------------+----------------+----------------+
> | | After Boot | After Hackbench|
> | Normal | 2090 Kb | 3204 Kb |
> | With Patch | 1825 Kb | 3088 Kb |
> | Wastage reduce | ~12% | ~4% |
> +----------------+----------------+----------------+
>
> +-----------------+----------------+----------------+
> | Total slub memory |
> +-----------------+----------------+----------------+
> | | After Boot | After Hackbench|
> | Normal | 500572 | 713568 |
> | With Patch | 482036 | 688312 |
> | Memory reduce | ~4% | ~3% |
> +-----------------+----------------+----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+----------+-----------+
> | | Normal |With Patch| |
> +-------+-----+----------+----------+-----------+
> | Amean | 1 | 1.3237 | 1.2737 | ( 3.78%) |
> | Amean | 4 | 1.5923 | 1.6023 | ( -0.63%) |
> | Amean | 7 | 2.3727 | 2.4260 | ( -2.25%) |
> | Amean | 12 | 3.9813 | 4.1290 | ( -3.71%) |
> | Amean | 21 | 6.9680 | 7.0630 | ( -1.36%) |
> | Amean | 30 | 10.1480 | 10.2170 | ( -0.68%) |
> | Amean | 48 | 16.7793 | 16.8780 | ( -0.59%) |
> | Amean | 79 | 28.9537 | 28.8187 | ( 0.47%) |
> | Amean | 110 | 39.5507 | 40.0157 | ( -1.18%) |
> | Amean | 141 | 51.5670 | 51.8200 | ( -0.49%) |
> | Amean | 172 | 62.8710 | 63.2540 | ( -0.61%) |
> | Amean | 203 | 74.6417 | 75.2520 | ( -0.82%) |
> | Amean | 234 | 86.0853 | 86.5653 | ( -0.56%) |
> | Amean | 265 | 97.9203 | 98.4617 | ( -0.55%) |
> | Amean | 296 | 108.6243 | 109.8770 | ( -1.15%) |
> +-------+-----+----------+----------+-----------+
>
> 2) On 160 CPUs with 64K Page size
> +-----------------+----------------+----------------+
> | Total wastage in slub memory |
> +-----------------+----------------+----------------+
> | | After Boot |After Hackbench |
> | Normal | 919 Kb | 1880 Kb |
> | With Patch | 807 Kb | 1684 Kb |
> | Wastage reduce | ~12% | ~10% |
> +-----------------+----------------+----------------+
>
> +-----------------+----------------+----------------+
> | Total slub memory |
> +-----------------+----------------+----------------+
> | | After Boot | After Hackbench|
> | Normal | 1862592 | 3023744 |
> | With Patch | 1644416 | 2675776 |
> | Memory reduce | ~12% | ~11% |
> +-----------------+----------------+----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+----------+-----------+
> | | Normal |With Patch| |
> +-------+-----+----------+----------+-----------+
> | Amean | 1 | 1.2547 | 1.2677 | ( -1.04%) |
> | Amean | 4 | 1.5523 | 1.5783 | ( -1.67%) |
> | Amean | 7 | 2.4157 | 2.3883 | ( 1.13%) |
> | Amean | 12 | 3.9807 | 3.9793 | ( 0.03%) |
> | Amean | 21 | 6.9687 | 6.9703 | ( -0.02%) |
> | Amean | 30 | 10.1403 | 10.1297 | ( 0.11%) |
> | Amean | 48 | 16.7477 | 16.6893 | ( 0.35%) |
> | Amean | 79 | 27.9510 | 28.0463 | ( -0.34%) |
> | Amean | 110 | 39.6833 | 39.5687 | ( 0.29%) |
> | Amean | 141 | 51.5673 | 51.4477 | ( 0.23%) |
> | Amean | 172 | 62.9643 | 63.1647 | ( -0.32%) |
> | Amean | 203 | 74.6220 | 73.7900 | ( 1.11%) |
> | Amean | 234 | 85.1783 | 85.3420 | ( -0.19%) |
> | Amean | 265 | 96.6627 | 96.7903 | ( -0.13%) |
> | Amean | 296 | 108.2543 | 108.2253 | ( 0.03%) |
> +-------+-----+----------+----------+-----------+
>
> 3) On 16 CPUs with 4K Page size
> +-----------------+----------------+------------------+
> | Total wastage in slub memory |
> +-----------------+----------------+------------------+
> | | After Boot | After Hackbench |
> | Normal | 491 Kb | 727 Kb |
> | With Patch | 483 Kb | 670 Kb |
> | Wastage reduce | ~1% | ~8% |
> +-----------------+----------------+------------------+
>
> +-----------------+----------------+----------------+
> | Total slub memory |
> +-----------------+----------------+----------------+
> | | After Boot | After Hackbench|
> | Normal | 105340 | 153116 |
> | With Patch | 103620 | 147412 |
> | Memory reduce | ~1.6% | ~4% |
> +-----------------+----------------+----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+----------+-----------+
> | | Normal |With Patch| |
> +-------+-----+----------+----------+-----------+
> | Amean | 1 | 1.0963 | 1.1070 | ( -0.97%) |
> | Amean | 4 | 3.7963 | 3.7957 | ( 0.02%) |
> | Amean | 7 | 6.5947 | 6.6017 | ( -0.11%) |
> | Amean | 12 | 11.1993 | 11.1730 | ( 0.24%) |
> | Amean | 21 | 19.4097 | 19.3647 | ( 0.23%) |
> | Amean | 30 | 27.7023 | 27.6040 | ( 0.35%) |
> | Amean | 48 | 44.1287 | 43.9630 | ( 0.38%) |
> | Amean | 64 | 58.8147 | 58.5753 | ( 0.41%) |
> +-------+-----+----------+----------+-----------+
>
> 4) On 16 CPUs with 64K Page size
> +----------------+----------------+----------------+
> | Total wastage in slub memory |
> +----------------+----------------+----------------+
> | | After Boot | After Hackbench|
> | Normal | 194 Kb | 349 Kb |
> | With Patch | 191 Kb | 344 Kb |
> | Wastage reduce | ~1% | ~1% |
> +----------------+----------------+----------------+
>
> +-----------------+----------------+----------------+
> | Total slub memory |
> +-----------------+----------------+----------------+
> | | After Boot | After Hackbench|
> | Normal | 330304 | 472960 |
> | With Patch | 319808 | 458944 |
> | Memory reduce | ~3% | ~3% |
> +-----------------+----------------+----------------+
>
> hackbench-process-sockets
> +-------+-----+----------+----------+---------+
> | | Normal |With Patch| |
> +-------+----+----------+----------+----------+
> | Amean | 1 | 1.9030 | 1.8967 | ( 0.33%) |
> | Amean | 4 | 7.2117 | 7.1283 | ( 1.16%) |
> | Amean | 7 | 12.5247 | 12.3460 | ( 1.43%) |
> | Amean | 12 | 21.7157 | 21.4753 | ( 1.11%) |
> | Amean | 21 | 38.2693 | 37.6670 | ( 1.57%) |
> | Amean | 30 | 54.5930 | 53.8657 | ( 1.33%) |
> | Amean | 48 | 87.6700 | 86.3690 | ( 1.48%) |
> | Amean | 64 | 117.1227 | 115.4893 | ( 1.39%) |
> +-------+----+----------+----------+----------+
>
> Signed-off-by: Jay Patel <jaypatel@linux.ibm.com>
> ---
> mm/slub.c | 52 +++++++++++++++++++++++++---------------------------
> 1 file changed, 25 insertions(+), 27 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index c87628cd8a9a..0a1090c528da 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4058,7 +4058,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk);
>   */
>  static unsigned int slub_min_order;
>  static unsigned int slub_max_order =
> -	IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : PAGE_ALLOC_COSTLY_ORDER;
> +	IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : 2;
>  static unsigned int slub_min_objects;
>
>  /*
> @@ -4087,11 +4087,10 @@ static unsigned int slub_min_objects;
>   * the smallest order which will fit the object.
>   */
>  static inline unsigned int calc_slab_order(unsigned int size,
> -		unsigned int min_objects, unsigned int max_order,
> -		unsigned int fract_leftover)
> +		unsigned int min_objects, unsigned int max_order)
>  {
>  	unsigned int min_order = slub_min_order;
> -	unsigned int order;
> +	unsigned int order, min_wastage = size, min_wastage_order = MAX_ORDER+1;
>
>  	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>  		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> @@ -4104,11 +4103,17 @@ static inline unsigned int calc_slab_order(unsigned int size,
>
>  		rem = slab_size % size;
>
> -		if (rem <= slab_size / fract_leftover)
> -			break;
> +		if (rem < min_wastage) {
> +			min_wastage = rem;
> +			min_wastage_order = order;
> +		}
>  	}
>
> -	return order;
> +	if (min_wastage_order <= slub_max_order)
> +		return min_wastage_order;
> +	else
> +		return order;
> +
>  }
>
>  static inline int calculate_order(unsigned int size)
> @@ -4142,35 +4147,28 @@ static inline int calculate_order(unsigned int size)
>  		nr_cpus = nr_cpu_ids;
>  		min_objects = 4 * (fls(nr_cpus) + 1);
>  	}
> +
> +	if ((min_objects * size) > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> +		return PAGE_ALLOC_COSTLY_ORDER;
> +
> +	if ((min_objects * size) <= PAGE_SIZE)
> +		return slub_min_order;
> +
>  	max_objects = order_objects(slub_max_order, size);
>  	min_objects = min(min_objects, max_objects);
>
> -	while (min_objects > 1) {
> -		unsigned int fraction;
> -
> -		fraction = 16;
> -		while (fraction >= 4) {
> -			order = calc_slab_order(size, min_objects,
> -					slub_max_order, fraction);
> -			if (order <= slub_max_order)
> -				return order;
> -			fraction /= 2;
> -		}
> +	while (min_objects >= 1) {
> +		order = calc_slab_order(size, min_objects,
> +				slub_max_order);
> +		if (order <= slub_max_order)
> +			return order;
>  		min_objects--;
>  	}
>
> -	/*
> -	 * We were unable to place multiple objects in a slab. Now
> -	 * lets see if we can place a single object there.
> -	 */
> -	order = calc_slab_order(size, 1, slub_max_order, 1);
> -	if (order <= slub_max_order)
> -		return order;
> -
>  	/*
>  	 * Doh this slab cannot be placed using slub_max_order.
>  	 */
> -	order = calc_slab_order(size, 1, MAX_ORDER, 1);
> +	order = calc_slab_order(size, 1, MAX_ORDER);
>  	if (order <= MAX_ORDER)
>  		return order;
>  	return -ENOSYS;
> --
> 2.39.1
>
>
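
For anyone who wants to experiment with the selection logic outside the
kernel, the approach described in the quoted patch can be approximated in
userspace C. This is only a rough sketch under stated assumptions, not the
kernel code: every sk_* name is hypothetical, a 4K page size and
PAGE_ALLOC_COSTLY_ORDER of 3 are assumed, the MAX_OBJS_PER_PAGE check is
left out, and the inner helper simply returns the lowest-remainder order in
range (or max_order + 1 if there is none).

```c
/* Assumed constants for a 4K-page system; illustrative only. */
#define SK_PAGE_SIZE    4096u
#define SK_COSTLY_ORDER 3u   /* stands in for PAGE_ALLOC_COSTLY_ORDER */
#define SK_MAX_ORDER    10u  /* stands in for MAX_ORDER */

static const unsigned int sk_min_order = 0; /* stands in for slub_min_order */
static const unsigned int sk_max_order = 2; /* value proposed for slub_max_order */

/* 1-based position of the highest set bit, 0 for x == 0 (like fls()). */
static unsigned int sk_fls(unsigned int x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Smallest order whose slab (page size << order) can hold `bytes`. */
static unsigned int sk_get_order(unsigned int bytes)
{
	unsigned int order = 0;

	while ((SK_PAGE_SIZE << order) < bytes)
		order++;
	return order;
}

/*
 * Scan every candidate order up to max_order and keep the one with the
 * smallest leftover. Returns max_order + 1 when no order is in range.
 */
static unsigned int sk_calc_slab_order(unsigned int size,
		unsigned int min_objects, unsigned int max_order)
{
	unsigned int start = sk_get_order(min_objects * size);
	unsigned int best_waste = size; /* any real remainder is < size */
	unsigned int best_order = max_order + 1;
	unsigned int order;

	if (start < sk_min_order)
		start = sk_min_order;

	for (order = start; order <= max_order; order++) {
		unsigned int rem = (SK_PAGE_SIZE << order) % size;

		if (rem < best_waste) {
			best_waste = rem;
			best_order = order;
		}
	}
	return best_order;
}

/* Order chosen for objects of `size` bytes on a machine with `nr_cpus`. */
static int sk_calculate_order(unsigned int size, unsigned int nr_cpus)
{
	unsigned int min_objects = 4 * (sk_fls(nr_cpus) + 1);
	unsigned int max_objects, order;

	/* Early returns mirroring points 1) and 2) of the description. */
	if (min_objects * size > (SK_PAGE_SIZE << SK_COSTLY_ORDER))
		return (int)SK_COSTLY_ORDER;
	if (min_objects * size <= SK_PAGE_SIZE)
		return (int)sk_min_order;

	max_objects = (SK_PAGE_SIZE << sk_max_order) / size;
	if (min_objects > max_objects)
		min_objects = max_objects;

	while (min_objects >= 1) {
		order = sk_calc_slab_order(size, min_objects, sk_max_order);
		if (order <= sk_max_order)
			return (int)order;
		min_objects--;
	}

	/* Fall back to any order that can hold a single object. */
	order = sk_calc_slab_order(size, 1, SK_MAX_ORDER);
	if (order <= SK_MAX_ORDER)
		return (int)order;
	return -1; /* the quoted diff returns -ENOSYS here */
}
```

With these assumed constants, sk_calculate_order(700, 16) starts from
min_objects = 24, clamps it to the 23 objects that fit in an order-2 slab,
and returns order 2 with 284 bytes left over. Note that the first early
return, as in the quoted diff, does not check that a single object actually
fits in an order-3 slab.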