From: zangchunxin@bytedance.com
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chunxin Zang, Muchun Song
Subject: [PATCH v2] mm/vmscan: fix infinite loop in drop_slab_node
Date: Wed, 9 Sep 2020 23:20:47 +0800
Message-Id: <20200909152047.27905-1-zangchunxin@bytedance.com>

From: Chunxin Zang

On our servers there are about 10k memcgs on one machine, and they use
memory very frequently. When I trigger drop_caches, the process loops
forever in drop_slab_node. There are two reasons:

1. We have too many memcgs. Even if only one object is freed in each
   memcg, the sum of freed objects is greater than 10.

2. One pass over all memcgs takes a long time, so by the time it
   completes, the memcgs visited first have many objects to free again.
   On the next pass, the freed count is greater than 10 once more.

We can see the stuck processes through 'ps':

  root:~# ps -aux | grep drop
  root  357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
  root 1771385 ... R Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches
  root 1986319 ... R 18:56      117:27 echo 3 > /proc/sys/vm/drop_caches
  root 2002148 ... R Aug24     5720:39 echo 3 > /proc/sys/vm/drop_caches
  root 2564666 ... R 18:59      113:58 echo 3 > /proc/sys/vm/drop_caches
  root 2639347 ... R Sep03     2383:39 echo 3 > /proc/sys/vm/drop_caches
  root 3904747 ... R 03:35      993:31 echo 3 > /proc/sys/vm/drop_caches
  root 4016780 ... R Aug21     7882:18 echo 3 > /proc/sys/vm/drop_caches

Use bpftrace to follow the 'freed' value in drop_slab_node:

  root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }'
  Attaching 1 probe...
  ^B^C

  @ret:
  [64, 128)            1 |                                                    |
  [128, 256)          28 |                                                    |
  [256, 512)         107 |@                                                   |
  [512, 1K)          298 |@@@                                                 |
  [1K, 2K)           613 |@@@@@@@                                             |
  [2K, 4K)          4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
  [4K, 8K)           442 |@@@@@                                               |
  [8K, 16K)          299 |@@@                                                 |
  [16K, 32K)         100 |@                                                   |
  [32K, 64K)         139 |@                                                   |
  [64K, 128K)         56 |                                                    |
  [128K, 256K)        26 |                                                    |
  [256K, 512K)         2 |                                                    |

Every sampled value of 'freed' is well above 10, so the loop never
terminates on its own. In the loop, we can check whether a fatal signal
is pending for the current task (fatal_signal_pending()); if so, we
should return instead of starting another pass, so that the process can
be killed.

Signed-off-by: Chunxin Zang
Signed-off-by: Muchun Song
---
	changelogs in v2:
	1) Break out of the loop by checking for a pending fatal signal.

 mm/vmscan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b6d84326bdf2..c3ed8b45d264 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
 	do {
 		struct mem_cgroup *memcg = NULL;
 
+		if (fatal_signal_pending(current))
+			return;
+
 		freed = 0;
 		memcg = mem_cgroup_iter(NULL, NULL, NULL);
 		do {
-- 
2.11.0
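
For readers who want to see the termination problem in isolation, here is
a small userspace model of the loop. It is only a sketch, not kernel
code: the sim_* names, the fixed per_memcg yield, and the max_passes
safety cap are all hypothetical and exist only so the example can run
and stop. The threshold of 10 follows the description above, and
sim_fatal_signal_pending() merely stands in for
fatal_signal_pending(current).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the task running "echo 3 > drop_caches". */
struct sim_task {
	unsigned long passes;     /* completed passes over all memcgs */
	unsigned long kill_after; /* pass count at which SIGKILL "arrives" */
};

/* Stand-in for fatal_signal_pending(current); an assumption of this sketch. */
static bool sim_fatal_signal_pending(const struct sim_task *t)
{
	return t->passes >= t->kill_after;
}

/*
 * Model of the outer loop: keep making passes while a pass frees more
 * than 10 objects. Each memcg is assumed to yield per_memcg objects on
 * every pass. max_passes is a demo-only cap so the unpatched variant
 * still returns. Returns the number of passes made.
 */
unsigned long sim_drop_slab_node(struct sim_task *t, size_t nr_memcgs,
				 unsigned long per_memcg, bool check_signal,
				 unsigned long max_passes)
{
	unsigned long freed;

	do {
		/* The patched behaviour: give up once the task is killed. */
		if (check_signal && sim_fatal_signal_pending(t))
			return t->passes;

		freed = 0;
		for (size_t i = 0; i < nr_memcgs; i++)
			freed += per_memcg;
		t->passes++;
	} while (freed > 10 && t->passes < max_passes);

	return t->passes;
}
```

With 10000 memcgs that each yield a single object, every pass frees
10000 objects, so the unpatched variant (check_signal = false) only
stops at the artificial cap, while the patched variant returns as soon
as the simulated signal is pending. With 5 memcgs the loop ends after
one pass in either variant.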