From: Hillf Danton <hdanton@sina.com>
To: Xing Zhengjun
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, ying.huang@intel.com, tim.c.chen@linux.intel.com, Shakeel Butt, Michal Hocko, yuzhao@google.com, wfg@mail.ustc.edu.cn
Subject: Re: [RFC] mm/vmscan.c: avoid possible long latency caused by too_many_isolated()
Date: Thu, 22 Apr 2021 18:23:25 +0800
Message-Id: <20210422102325.1332-1-hdanton@sina.com>
In-Reply-To: <7b7a1c09-3d16-e199-15d2-ccea906d4a66@linux.intel.com>
References: <20210416023536.168632-1-zhengjun.xing@linux.intel.com>

Hi Zhengjun

On Thu, 22 Apr 2021 16:36:19 +0800 Zhengjun Xing wrote:
> In the system with very few file pages (nr_active_file +
> nr_inactive_file < 100), it is easy to reproduce "nr_isolated_file >
> nr_inactive_file", then too_many_isolated return true,
> shrink_inactive_list enter "msleep(100)", the long latency will happen.

We should skip reclaiming page cache in this case.

>
> The test case to reproduce it is very simple: allocate many huge
> pages(near the DRAM size), then do free, repeat the same operation many
> times.
> In the test case, the system with very few file pages (nr_active_file +
> nr_inactive_file < 100), I have dumpped the numbers of
> active/inactive/isolated file pages during the whole test(see in the
> attachments) , in shrink_inactive_list "too_many_isolated" is very easy
> to return true, then enter "msleep(100)",in "too_many_isolated"
> sc->gfp_mask is 0x342cca ("_GFP_IO" and "__GFP_FS" is masked) , it is
> also very easy to enter “inactive >>=3”, then “isolated > inactive” will
> be true.
>
> So I have a proposal to set a threshold number for the total file pages
> to ignore the system with very few file pages, and then bypass the 100ms
> sleep.
> It is hard to set a perfect number for the threshold, so I just give an
> example of "256" for it.

Another option is to take the nap only the second time an LRU hits tmi
(too many isolated), so that some allocators in your case are served
without the 100ms delay.

+++ x/mm/vmscan.c
@@ -118,6 +118,9 @@ struct scan_control {
 	/* The file pages on the current node are dangerously low */
 	unsigned int file_is_tiny:1;
 
+	unsigned int file_tmi:1; /* too many isolated */
+	unsigned int anon_tmi:1;
+
 	/* Allocation order */
 	s8 order;
 
@@ -1905,6 +1908,21 @@ static int current_may_throttle(void)
 		bdi_write_congested(current->backing_dev_info);
 }
 
+static void update_sc_tmi(struct scan_control *sc, bool file, int set)
+{
+	if (file)
+		sc->file_tmi = set;
+	else
+		sc->anon_tmi = set;
+}
+static bool is_sc_tmi(struct scan_control *sc, bool file)
+{
+	if (file)
+		return sc->file_tmi != 0;
+	else
+		return sc->anon_tmi != 0;
+}
+
 /*
  * shrink_inactive_list() is a helper for shrink_node().  It returns the number
  * of reclaimed pages
@@ -1927,6 +1945,11 @@ shrink_inactive_list(unsigned long nr_to
 		if (stalled)
 			return 0;
 
+		if (!is_sc_tmi(sc, file)) {
+			update_sc_tmi(sc, file, 1);
+			return 0;
+		}
+
 		/* wait a bit for the reclaimer. */
 		msleep(100);
 		stalled = true;
@@ -1936,6 +1959,9 @@ shrink_inactive_list(unsigned long nr_to
 		return SWAP_CLUSTER_MAX;
 	}
 
+	if (is_sc_tmi(sc, file))
+		update_sc_tmi(sc, file, 0);
+
 	lru_add_drain();
 
 	spin_lock_irq(&lruvec->lru_lock);
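
To make the intent of the diff easier to follow, here is a rough user-space
sketch of the behaviour it is aiming for. It is not kernel code and was not
run against a kernel tree. struct sc_model, tmi_model(), throttle_model(),
tmi_demo() and the outcome enum are hypothetical names made up for this
illustration. The tmi check is reduced to the shift-and-compare step described
in the report, with the gfp_mask test replaced by a plain bool, and the page
counts in tmi_demo() are illustrative rather than taken from the attachments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two bits the patch adds to scan_control. */
struct sc_model {
	unsigned int file_tmi:1;	/* too many isolated, file LRU */
	unsigned int anon_tmi:1;	/* too many isolated, anon LRU */
};

/* What one modelled call into shrink_inactive_list() ends up doing. */
enum outcome { RECLAIM, BAIL_NO_SLEEP, SLEEP_THEN_BAIL };

/*
 * Simplified too_many_isolated(): callers that may do both IO and FS
 * work have the inactive count shifted right by 3 before the compare.
 */
bool tmi_model(unsigned long inactive, unsigned long isolated, bool io_and_fs)
{
	if (io_and_fs)
		inactive >>= 3;
	return isolated > inactive;
}

/*
 * Throttle decision with the patch applied, assuming the tmi condition
 * does not change while the call is in progress.
 */
enum outcome throttle_model(struct sc_model *sc, bool file, bool tmi)
{
	bool seen = file ? sc->file_tmi : sc->anon_tmi;

	if (tmi) {
		if (!seen) {
			/* First hit: remember it and return without sleeping. */
			if (file)
				sc->file_tmi = 1;
			else
				sc->anon_tmi = 1;
			return BAIL_NO_SLEEP;
		}
		/* Second hit in a row: this is where msleep(100) would run. */
		return SLEEP_THEN_BAIL;
	}

	/* Not tmi: clear the bit so the next hit counts as a first one. */
	if (file)
		sc->file_tmi = 0;
	else
		sc->anon_tmi = 0;
	return RECLAIM;
}

/* Walk through the two-step behaviour with illustrative page counts. */
void tmi_demo(void)
{
	struct sc_model sc = { 0 };

	/* 40 inactive pages become 40 >> 3 = 5, so 6 isolated is "too many". */
	assert(tmi_model(40, 6, true));
	assert(!tmi_model(40, 6, false));

	assert(throttle_model(&sc, true, true) == BAIL_NO_SLEEP);
	assert(throttle_model(&sc, true, true) == SLEEP_THEN_BAIL);
	assert(throttle_model(&sc, true, false) == RECLAIM);
	assert(throttle_model(&sc, true, true) == BAIL_NO_SLEEP);
}
```

The point of the model is that the bit lives in scan_control, so the first
tmi hit on a given LRU type costs nothing, and the 100ms sleep is paid only
if the same reclaim context hits tmi again before a pass gets through.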