From: Roman Gushchin
Subject: [PATCH] mm: slowly shrink slabs with a relatively small number of objects
Date: Fri, 31 Aug 2018 13:34:50 -0700
Message-ID: <20180831203450.2536-1-guro@fb.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, Roman Gushchin,
    Josef Bacik, Johannes Weiner, Rik van Riel, Andrew Morton

Commit 9092c71bb724 ("mm: use sc->priority for slab shrink targets")
changed the way the target slab pressure is calculated and made it
priority-based:

    delta = freeable >> priority;
    delta *= 4;
    do_div(delta, shrinker->seeks);

The problem is that at the default priority (which is 12) no pressure
is applied at all if the number of potentially reclaimable objects is
less than 4096.

This wouldn't be a big deal if these objects weren't pinning the
corresponding dying memory cgroups. 4096 dentries/inodes/radix tree
nodes/... is a reasonable number, but 4096 dying cgroups is not.

If there are no big spikes in memory pressure, and new memory cgroups
are created and destroyed periodically, this causes the number of dying
cgroups to grow steadily, producing a slow-ish and hard-to-detect
memory "leak". It's not a real leak, as the memory can eventually be
reclaimed, but in practice that never happens. I've seen hosts with a
steadily climbing number of dying cgroups that showed no sign of
decline for months, even though the hosts were loaded with a production
workload.

This is an obvious waste of memory, and to prevent it, let's apply a
minimal pressure even on small shrinker lists. E.g. if there are
freeable objects, let's scan at least min(freeable, scan_batch)
objects.

This fix significantly improves the chance of a dying cgroup being
reclaimed, and together with some previous patches stops the steady
growth of the number of dying cgroups on some of our hosts.

Signed-off-by: Roman Gushchin
Cc: Josef Bacik
Cc: Johannes Weiner
Cc: Rik van Riel
Cc: Andrew Morton
---
 mm/vmscan.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index fa2c150ab7b9..c910cf6bf606 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -476,6 +476,10 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	delta = freeable >> priority;
 	delta *= 4;
 	do_div(delta, shrinker->seeks);
+
+	if (delta == 0 && freeable > 0)
+		delta = min(freeable, batch_size);
+
 	total_scan += delta;
 	if (total_scan < 0) {
 		pr_err("shrink_slab: %pF negative objects to delete nr=%ld\n",
-- 
2.17.1
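
P.S. Not part of the patch: below is a small userspace sketch of the delta
arithmetic above, to show which freeable counts get zero pressure at the
default priority of 12 and what the minimal-pressure clause changes. The
constants (DEFAULT_SEEKS = 2, SHRINK_BATCH = 128) mirror the kernel defaults
used by most shrinkers; a shrinker with a different seeks or batch size
would scale differently.

/*
 * Standalone model of the delta computation in do_shrink_slab().
 * Build with: cc -o shrink_delta shrink_delta.c
 */
#include <stdio.h>
#include <stdint.h>

#define DEFAULT_SEEKS	2	/* assumed shrinker->seeks */
#define SHRINK_BATCH	128	/* assumed batch_size */

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

static uint64_t shrink_delta(uint64_t freeable, int priority, int patched)
{
	uint64_t delta;

	delta = freeable >> priority;	/* 0 whenever freeable < 4096 at prio 12 */
	delta *= 4;
	delta /= DEFAULT_SEEKS;		/* stands in for do_div(delta, shrinker->seeks) */

	/* the minimal pressure added by this patch */
	if (patched && delta == 0 && freeable > 0)
		delta = min_u64(freeable, SHRINK_BATCH);

	return delta;
}

int main(void)
{
	uint64_t freeable[] = { 100, 1000, 4095, 4096, 100000 };
	int i;

	for (i = 0; i < 5; i++)
		printf("freeable=%6llu  before=%4llu  after=%4llu\n",
		       (unsigned long long)freeable[i],
		       (unsigned long long)shrink_delta(freeable[i], 12, 0),
		       (unsigned long long)shrink_delta(freeable[i], 12, 1));
	return 0;
}

For the first three rows the unpatched code computes delta == 0, so those
objects are never scanned at the default priority; with the patch they get
min(freeable, batch_size) instead, while larger lists are unaffected.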