From: zangchunxin@bytedance.com
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chunxin Zang , Muchun Song
Subject: [PATCH v3] mm/vmscan: add a fatal signals check in drop_slab_node
Date: Tue, 15 Sep 2020 19:40:01 +0800
Message-Id: <20200915114001.79950-1-zangchunxin@bytedance.com>

From: Chunxin Zang

On our server, there are about 10k memcgs on one machine, and they use
memory very frequently. We have observed that drop_caches can take a
considerable amount of time and cannot be stopped.

There are two reasons:
1. Something is constantly generating more objects to reclaim while
   drop_caches runs, so 'freed' is always greater than 10.
2. The process has no chance to process signals.

We can get the following info through 'ps':

root:~# ps -aux | grep drop
root  357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
root 1771385 ... R Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches
root 1986319 ... R 18:56      117:27 echo 3 > /proc/sys/vm/drop_caches
root 2002148 ... R Aug24     5720:39 echo 3 > /proc/sys/vm/drop_caches
root 2564666 ... R 18:59      113:58 echo 3 > /proc/sys/vm/drop_caches
root 2639347 ... R Sep03     2383:39 echo 3 > /proc/sys/vm/drop_caches
root 3904747 ... R 03:35      993:31 echo 3 > /proc/sys/vm/drop_caches
root 4016780 ... R Aug21     7882:18 echo 3 > /proc/sys/vm/drop_caches

Using bpftrace to follow the 'freed' value in drop_slab_node:

root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }'
Attaching 1 probe...
^B^C

@ret:
[64, 128)               1 |                                                    |
[128, 256)             28 |                                                    |
[256, 512)            107 |@                                                   |
[512, 1K)             298 |@@@                                                 |
[1K, 2K)              613 |@@@@@@@                                             |
[2K, 4K)             4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)              442 |@@@@@                                               |
[8K, 16K)             299 |@@@                                                 |
[16K, 32K)            100 |@                                                   |
[32K, 64K)            139 |@                                                   |
[64K, 128K)            56 |                                                    |
[128K, 256K)           26 |                                                    |
[256K, 512K)            2 |                                                    |

We need a way to stop the process.

Signed-off-by: Chunxin Zang
Signed-off-by: Muchun Song
---
	changelogs in v3:
	1) update the description of the patch.
	   v2 named: mm/vmscan: fix infinite loop in drop_slab_node

	changelogs in v2:
	1) break the loop by checking for fatal signals.

 mm/vmscan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b6d84326bdf2..6b2b5d420510 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
 	do {
 		struct mem_cgroup *memcg = NULL;
 
+		if (signal_pending(current))
+			return;
+
 		freed = 0;
 		memcg = mem_cgroup_iter(NULL, NULL, NULL);
 		do {
-- 
2.11.0