From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 14 Sep 2020 17:17:36 +0200
From: Michal Hocko
To: Chunxin Zang
Cc: Andrew Morton, Linux Memory Management List, LKML, Muchun Song
Subject: Re: [External] Re: [PATCH v2] mm/vmscan: fix infinite loop in drop_slab_node
Message-ID: <20200914151736.GA16999@dhcp22.suse.cz>
References: <20200909152047.27905-1-zangchunxin@bytedance.com> <20200914093032.GG16999@dhcp22.suse.cz> <20200914134713.GS16999@dhcp22.suse.cz>

On Mon 14-09-20 23:02:15, Chunxin Zang wrote:
> On Mon, Sep 14, 2020 at 9:47 PM Michal Hocko wrote:
>
> > On Mon 14-09-20 21:25:59, Chunxin Zang wrote:
> > > On Mon, Sep 14, 2020 at 5:30 PM Michal Hocko wrote:
> > > >
> > > > The subject is misleading because this patch doesn't fix an infinite
> > > > loop, right? It just allows the userspace to interrupt the operation.
> > > >
> > >
> > > Yes, so we are making a separate patch following Vlastimil's
> > > recommendation: use a doubling of the threshold to end the loop.
> >
> > That still means the changelog needs an update.
> >
> The patch is already merged in the linux-next branch. Can I still update
> the changelog now?

Yes. Andrew will refresh it. He doesn't maintain a git tree, so nothing
prevents rewriting the patch.

> This is my first patch, please forgive me :)

No worries. 
The mm patch workflow is rather different from others.

> > > On Thu, Sep 10, 2020 at 1:59 AM Vlastimil Babka wrote:
> > > > > From: Chunxin Zang
> > > >
> > > > ...
> > > > - IMHO it's still worth to bail out in your scenario even without a
> > > >   signal, e.g. by the doubling of threshold. But it can be a separate
> > > >   patch. Thanks!
> > > > ...
> > > >
> > > > On Wed 09-09-20 23:20:47, zangchunxin@bytedance.com wrote:
> > > > > From: Chunxin Zang
> > > > >
> > > > > On our server, there are about 10k memcgs on one machine. They use
> > > > > memory very frequently. When I trigger drop caches, the process
> > > > > loops forever in drop_slab_node.
> > > >
> > > > Is this really an infinite loop, or does it just take a lot of time
> > > > to process all the metadata in that setup? If this is really an
> > > > infinite loop then we should look at it. My current understanding is
> > > > that the operation would finish at some point; it just takes
> > > > painfully long to get there.
> > > >
> > >
> > > Yes, it's really an infinite loop. Every pass takes a lot of time, and
> > > during that time the memcgs allocate and free memory, so on the next
> > > pass the total of 'freed' is always bigger than 10.
> >
> > I am still not sure I follow. Do you mean that there is somebody
> > constantly generating more objects to reclaim?
> >
> Yes, this is my meaning. :)
>
> > Maybe we are just not agreeing on the definition of an infinite loop,
> > but in my book that means that the final condition can never be met.
> > While someone busily adding new objects might indeed cause drop caches
> > to loop for a long time, this is to be expected from that interface, as
> > it is supposed to drop all of the cache, and the cache can grow during
> > the operation.
> >
> Because I have 10k memcgs, all of them heavy users of memory. 
> During each loop, there are always more than 10 reclaimable objects
> being generated, so the condition is never met.

10k or any other number of memcgs shouldn't really make much of a
difference, except for the time the scan adds. Fundamentally we are
talking about freed objects, and whether they sit on the global or the
per-memcg lists should result in similar behavior.

> The drop cache process has no chance to exit the loop.
> Although the purpose of the 'drop cache' interface is to release all
> caches, we still need a way to terminate it, e.g. in this case, where
> the process took too long to run.

Yes, this is perfectly understandable. Having a bail-out on a fatal
signal is a completely reasonable thing to do. I am mostly confused by
your infinite loop claims and what the relation of this patch to them is.
I would propose this wording instead:

"
We have observed that drop_caches can take a considerable amount of
time (). Especially when there are many memcgs involved, because they
add additional overhead. It is quite unfortunate that the operation
cannot currently be interrupted by a signal. Add a check for fatal
signals into the main loop so that userspace can trigger an early
bailout.
"

or something along those lines.

> root 357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches

-- 
Michal Hocko
SUSE Labs