Date: Fri, 12 Oct 2018 14:24:57 -0700 (PDT)
From: David Rientjes
To: Christoph Lameter, Pekka Enberg, Joonsoo Kim, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [patch] mm, slab: avoid high-order slab pages when it does not reduce waste

The slab allocator has a heuristic that checks whether the internal
fragmentation is satisfactory and, if not, increases cachep->gfporder to
try to improve it.

If the amount of waste is the same at higher cachep->gfporder values,
there is no significant benefit to allocating higher-order memory.  There
will be fewer calls to the page allocator, but each call will require
taking zone->lock and finding the best-fit page in the per-zone free
areas.

Instead, it is better to allocate order-0 memory if possible so that pages
can be returned from the per-cpu pagesets (pcp).

There are two reasons to prefer this over allocating high-order memory:

 - allocating from the pcp lists does not require a per-zone lock, and

 - this reduces stranding of MIGRATE_UNMOVABLE pageblocks on pcp lists
   that increases slab fragmentation across a zone.

We are particularly interested in the second point: it eliminates cases
where all other pages on a pageblock are movable (or free), and falling
back to pageblocks of other migratetypes from the per-zone free areas
causes high-order slab memory to be allocated from them rather than from
free MIGRATE_UNMOVABLE pages on the pcp.

Signed-off-by: David Rientjes
---
 mm/slab.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/mm/slab.c b/mm/slab.c
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1748,6 +1748,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
         for (gfporder = 0; gfporder <= KMALLOC_MAX_ORDER; gfporder++) {
                 unsigned int num;
                 size_t remainder;
+                int order;

                 num = cache_estimate(gfporder, size, flags, &remainder);
                 if (!num)
@@ -1803,6 +1804,20 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
                  */
                 if (left_over * 8 <= (PAGE_SIZE << gfporder))
                         break;
+
+                /*
+                 * If a higher gfporder would not reduce internal fragmentation,
+                 * no need to continue.  The preference is to keep gfporder as
+                 * small as possible so slab allocations can be served from
+                 * MIGRATE_UNMOVABLE pcp lists to avoid stranding.
+                 */
+                for (order = gfporder + 1; order <= slab_max_order; order++) {
+                        cache_estimate(order, size, flags, &remainder);
+                        if (remainder < left_over)
+                                break;
+                }
+                if (order > slab_max_order)
+                        break;
         }
         return left_over;
 }
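
As an aside, the waste comparison can be sanity-checked outside the kernel.
Below is a minimal userspace sketch, not part of the patch, that mimics only
the simplest case of cache_estimate() (objects packed into a slab of
PAGE_SIZE << order bytes, ignoring any management overhead); the 4KB
PAGE_SIZE and the two object sizes are assumptions chosen to show both
outcomes, one where a higher order reduces waste and one where it does not.

/*
 * Userspace sketch, not kernel code: estimate how many objects of a given
 * size fit into a slab of (PAGE_SIZE << order) bytes and how many bytes
 * are left over, then compare the leftover at higher orders against the
 * order-0 baseline the way the patch compares remainder to left_over.
 */
#include <stdio.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL        /* assumed page size for illustration */

/* Rough analogue of cache_estimate(): objects per slab and wasted bytes. */
static size_t estimate(unsigned int order, size_t size, size_t *remainder)
{
        size_t slab_size = PAGE_SIZE << order;
        size_t num = slab_size / size;

        *remainder = slab_size - num * size;
        return num;
}

int main(void)
{
        const size_t sizes[] = { 1500, 2732 };  /* assumed object sizes */
        unsigned int i, order;

        for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
                size_t left_over;

                /* Waste at order 0 is the baseline being compared against. */
                estimate(0, sizes[i], &left_over);
                printf("size %zu: order 0 wastes %zu bytes\n",
                       sizes[i], left_over);

                for (order = 1; order <= 3; order++) {
                        size_t remainder;

                        estimate(order, sizes[i], &remainder);
                        printf("  order %u wastes %zu bytes%s\n",
                               order, remainder,
                               remainder < left_over ? " (improvement)" : "");
                }
        }
        return 0;
}

With these assumed sizes, a 1500-byte object wastes 1096 bytes at order 0
but only 692 bytes at order 1, so a higher order genuinely reduces waste;
a 2732-byte object wastes 1364 bytes at order 0 and never does better up
through order 3, which is the kind of case where the new check keeps
gfporder small so allocations can stay on the order-0 pcp lists.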