Date: Thu, 25 May 2023 08:05:26 +0100
From: Aaron Tomlin
To: Marcelo Tosatti
Cc: Christoph Lameter, Frederic Weisbecker, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Russell King, Huacai Chen, Heiko Carstens, x86@kernel.org, Vlastimil Babka, Michal Hocko, Aaron Tomlin
Subject: Re: [PATCH v8 01/13] vmstat: allow_direct_reclaim should use zone_page_state_snapshot
Message-ID: <20230525070526.5uhmh6zku5gzxyny@atomlin.usersys.com>
References: <20230515180015.016409657@redhat.com> <20230515180138.442505633@redhat.com>
In-Reply-To: <20230515180138.442505633@redhat.com>

On Mon, May 15, 2023 at 03:00:16PM -0300, Marcelo Tosatti wrote:
> A customer provided evidence indicating that a process
> was stalled in direct reclaim:
>
> - The process was trapped in throttle_direct_reclaim().
>   The function wait_event_killable() was called to wait for the condition
>   allow_direct_reclaim(pgdat) for the current node to become true.
>   allow_direct_reclaim(pgdat) examined the number of free pages
>   on the node via zone_page_state(), which just returns the value in
>   zone->vm_stat[NR_FREE_PAGES].
>
> - On node #1, zone->vm_stat[NR_FREE_PAGES] was 0.
>   However, the freelist on this node was not empty.
>
> - This inconsistency in the vmstat value was caused by percpu vmstat on
>   nohz_full CPUs. Every increment/decrement of vmstat is first performed
>   on the percpu vmstat counter; the pooled diffs are then accumulated
>   into the zone's vmstat counter in a timely manner. However, on nohz_full
>   CPUs (in this customer's system, 48 of 52 CPUs) these pooled
>   diffs were not accumulated once a CPU had no events on it and
>   went to sleep indefinitely.
>   I checked percpu vmstat and found a total of 69 counts not yet
>   accumulated into the zone's vmstat counter.
>
> - In this situation, kswapd did not help the trapped process.
>   In pgdat_balanced(), zone_watermark_ok_safe() examined the number
>   of free pages on the node via zone_page_state_snapshot(), which
>   checks pending counts in percpu vmstat.
>   Therefore kswapd correctly saw that there were 69 free pages.
>   Since zone->_watermark = {8, 20, 32}, kswapd did not run because
>   69 was greater than the high watermark of 32.
>
> Change allow_direct_reclaim to use zone_page_state_snapshot, which
> allows a more precise version of the vmstat counters to be used.
>
> allow_direct_reclaim will only be called from try_to_free_pages,
> which is not a hot path.
>
> Suggested-by: Michal Hocko
> Signed-off-by: Marcelo Tosatti
>
> ---
>
> Index: linux-vmstat-remote/mm/vmscan.c
> ===================================================================
> --- linux-vmstat-remote.orig/mm/vmscan.c
> +++ linux-vmstat-remote/mm/vmscan.c
> @@ -6886,7 +6886,7 @@ static bool allow_direct_reclaim(pg_data
>  			continue;
>  
>  		pfmemalloc_reserve += min_wmark_pages(zone);
> -		free_pages += zone_page_state(zone, NR_FREE_PAGES);
> +		free_pages += zone_page_state_snapshot(zone, NR_FREE_PAGES);
>  	}
>  
>  	/* If there are no reserves (unexpected config) then do not throttle */
>
>

Reviewed-by: Aaron Tomlin

-- 
Aaron Tomlin
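
For readers following the thread, here is a rough userspace sketch of the difference between the two reads discussed in the quoted patch. It is only a simplified model, not the kernel code: `struct model_zone`, `MODEL_NR_CPUS` and the `model_*` functions are hypothetical names invented for illustration. It assumes the snapshot variant adds each CPU's pending delta to the global counter and clamps a negative sum to zero, and it ignores the threshold-based folding the kernel does.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Simplified stand-in for a zone's NR_FREE_PAGES accounting (assumption). */
struct model_zone {
	long vm_stat_free;           /* models zone->vm_stat[NR_FREE_PAGES] */
	int cpu_diff[MODEL_NR_CPUS]; /* per-CPU deltas not yet folded in */
};

/* Updates land in the per-CPU delta first, as the changelog describes. */
static void model_mod_free_pages(struct model_zone *z, int cpu, int delta)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	z->cpu_diff[cpu] += delta;
}

/* Cheap read: the global counter only, so pending deltas are invisible. */
static unsigned long model_zone_page_state(const struct model_zone *z)
{
	long x = z->vm_stat_free;

	return x < 0 ? 0 : (unsigned long)x;
}

/* Precise read: global counter plus every CPU's pending delta. */
static unsigned long model_zone_page_state_snapshot(const struct model_zone *z)
{
	long x = z->vm_stat_free;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		x += z->cpu_diff[cpu];

	return x < 0 ? 0 : (unsigned long)x;
}
```

With a global counter of 0 and pending per-CPU deltas of 30, 20 and 19, the cheap read in this model reports 0 free pages while the snapshot read reports 69, which mirrors the mismatch between allow_direct_reclaim() and kswapd described in the changelog.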