From: Hillf Danton
To: Pavel Machek
Cc: Vlastimil Babka, kernel list, Andrew Morton, mhocko@suse.cz, Hillf Danton, linux mm
Subject: Re: 5.7-rc0: kswapd eats cpu during a disk test?!
Date: Sat, 13 Jun 2020 12:47:38 +0800
Message-Id: <20200613044738.11764-1-hdanton@sina.com>
In-Reply-To: <20200612230552.GA3593@amd>
References: <20200531103431.GA28429@amd> <20200612224532.GA24103@amd>

On Sat, 13 Jun 2020 01:05:52 +0200 Pavel Machek wrote:
> > > +CC linux-mm
> > >
> > > On 5/31/20 12:34 PM, Pavel Machek wrote:
> > > > Hi!
> > > >
> > > > This is simple cat /dev/sda > /dev/zero... on thinkpad x60 (x86-32),
> > > > with spinning rust.
> > > >
> > > >   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> > > >  1000 root      20   0       0      0      0 R  53.3  0.0  57:34.93 kswapd0
> > > > 27897 root      20   0    6976    580    536 R  44.5  0.0   1:44.53 cat
> > > >
> > > > It keeps both CPUs busy... and I don't think that's right.
> > >
> > > Does an older kernel behave differently here?
> >
> > Let me try on x220 (x86-64, first):
> >
> >   737 root      20   0    5404    744    680 R  31.2  0.0   0:09.98 cat
> >  1024 root      20   0       0      0      0 S  21.4  0.0 165:22.68 kswapd0
> >
> > That was with ssd, result with spinning rust is similar:
> >
> >   859 root      20   0    5404    740    672 D  21.1  0.0   0:03.33 cat
> >  1024 root      20   0       0      0      0 R  11.8  0.0 165:33.07 kswapd0
> >
> > 5.7-rc1+ kernel.
> >
> > Performance of spinning rust is down, too, on x60:
> >
> > pavel@amd:~/misc/hw/hdd1t$ sudo ddrescue --force /dev/sda1 /dev/null
> > GNU ddrescue 1.19
> > Press Ctrl-C to interrupt
> > rescued:     2147 MB,  errsize:       0 B,  current rate:    3080 kB/s
> >    ipos:     2147 MB,   errors:       0,    average rate:    5382 kB/s
> >    opos:     2147 MB, run time:    6.65 m,  successful read:
> > 0 s ago
> > Finished
> > pavel@amd:~/misc/hw/hdd1t$ uname -a
> > Linux amd 5.7.0-next-20200611+ #123 SMP PREEMPT Thu Jun 11
> > 15:41:22 CEST 2020 i686 GNU/Linux
> >
> > And there's something clearly wrong here:
> >
> >   966 root      20   0       0      0      0 R  94.4  0.0   8:18.82 kswapd0
> > 23933 root      20   0    4612   1112   1028 D  80.6  0.0   0:26.40 ddrescue
> >
>
> Same x60 under older kernel:
>
> pavel@amd:/data/fast/pavel$ sudo ddrescue --force /dev/sda4 /dev/null
> GNU ddrescue 1.19
> Press Ctrl-C to interrupt
> rescued:     6593 MB,  errsize:       0 B,  current rate:   60424 kB/s
>    ipos:     6593 MB,   errors:       0,    average rate:   95563 kB/s
>
>  3539 root      20   0    4616   1136   1048 D  21.4  0.0   0:15.63 ddrescue
>   865 root      20   0       0      0      0 S   6.9  0.0   0:04.91 kswapd0
>
> Linux amd 4.6.0+ #172 SMP Sun Aug 14 11:25:34 CEST 2016 i686 GNU/Linux
>
> These are more reasonable numbers.

Treat referenced & active pages as reclaim cost.

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2020,6 +2020,7 @@ static void shrink_active_list(unsigned
 	struct page *page;
 	unsigned nr_deactivate, nr_activate;
 	unsigned nr_rotated = 0;
+	unsigned nr_refered = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 
@@ -2070,7 +2071,8 @@ static void shrink_active_list(unsigned
 				nr_rotated += hpage_nr_pages(page);
 				list_add(&page->lru, &l_active);
 				continue;
-			}
+			} else if (!file)
+				nr_refered++;
 		}
 
 		ClearPageActive(page);	/* we are de-activating */
@@ -2098,6 +2100,14 @@ static void shrink_active_list(unsigned
 	free_unref_page_list(&l_active);
 	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
 			nr_deactivate, nr_rotated, sc->priority, file);
+	if (file)
+		sc->file_cost += nr_rotated;
+	else
+		/*
+		 * add cost to avoid swapin in the near future which incurs IO
+		 * on top of reclaim
+		 */
+		sc->anon_cost += nr_refered;
 }
 
 unsigned long reclaim_pages(struct list_head *page_list)
@@ -2311,11 +2321,13 @@ static void get_scan_count(struct lruvec
 	file_cost = total_cost + sc->file_cost;
 	total_cost = anon_cost + file_cost;
 
-	ap = swappiness * (total_cost + 1);
-	ap /= anon_cost + 1;
-
-	fp = (200 - swappiness) * (total_cost + 1);
-	fp /= file_cost + 1;
+	ap = swappiness * total_cost;
+	if (anon_cost)
+		ap /= anon_cost;
+
+	fp = (200 - swappiness) * total_cost;
+	if (file_cost)
+		fp /= file_cost;
 
 	fraction[0] = ap;
 	fraction[1] = fp;
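
To make the get_scan_count() hunk easier to compare, here is a minimal userspace sketch of the two ways of computing the anon/file scan weights. It is not kernel code: struct scan_fraction, fraction_old() and fraction_new() are hypothetical names used only for illustration, and anon_cost and file_cost are assumed to be the already-summed values that feed the ap/fp arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two scan weights; not a kernel type. */
struct scan_fraction {
	uint64_t ap;	/* weight for anonymous pages */
	uint64_t fp;	/* weight for file pages */
};

/* Arithmetic removed by the hunk: +1 on both sides avoids dividing by zero. */
static struct scan_fraction fraction_old(uint64_t swappiness,
					 uint64_t anon_cost, uint64_t file_cost)
{
	uint64_t total_cost = anon_cost + file_cost;
	struct scan_fraction f;

	assert(swappiness <= 200);
	f.ap = swappiness * (total_cost + 1);
	f.ap /= anon_cost + 1;
	f.fp = (200 - swappiness) * (total_cost + 1);
	f.fp /= file_cost + 1;
	return f;
}

/* Arithmetic added by the hunk: divide only when the cost is non-zero. */
static struct scan_fraction fraction_new(uint64_t swappiness,
					 uint64_t anon_cost, uint64_t file_cost)
{
	uint64_t total_cost = anon_cost + file_cost;
	struct scan_fraction f;

	assert(swappiness <= 200);
	f.ap = swappiness * total_cost;
	if (anon_cost)
		f.ap /= anon_cost;
	f.fp = (200 - swappiness) * total_cost;
	if (file_cost)
		f.fp /= file_cost;
	return f;
}
```

With swappiness 60, anon_cost 100 and file_cost 300, fraction_old() yields ap = 238, fp = 186 and fraction_new() yields ap = 240, fp = 186, so the two agree closely once the costs are large. They differ when both costs are zero: the old form gives 60 and 140, while the new form gives 0 for both weights in this sketch.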