Date: Tue, 14 Nov 2023 09:20:09 +0100
From: Michal Hocko
To: Marcelo Tosatti
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Vlastimil Babka, Andrew Morton, David Hildenbrand, Peter Xu
Subject: Re: [patch 0/2] mm: too_many_isolated can stall due to out of sync VM counters
In-Reply-To: <20231113233420.446465795@redhat.com>

On Mon 13-11-23 20:34:20, Marcelo Tosatti wrote:
> A customer reported seeing processes hung at too_many_isolated,
> while analysis indicated that the problem occurred due to out
> of sync per-CPU stats (see below).
>
> Fix is to use node_page_state_snapshot to avoid the out of stale values.
>
> 2136 static unsigned long
> 2137 shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
> 2138                      struct scan_control *sc, enum lru_list lru)
> 2139 {
>  :
> 2145         bool file = is_file_lru(lru);
>  :
> 2147         struct pglist_data *pgdat = lruvec_pgdat(lruvec);
>  :
> 2150         while (unlikely(too_many_isolated(pgdat, file, sc))) {
> 2151                 if (stalled)
> 2152                         return 0;
> 2153
> 2154                 /* wait a bit for the reclaimer. */
> 2155                 msleep(100);   <--- some processes were sleeping here, with pending SIGKILL.
> 2156                 stalled = true;
> 2157
> 2158                 /* We are about to die and free our memory. Return now. */
> 2159                 if (fatal_signal_pending(current))
> 2160                         return SWAP_CLUSTER_MAX;
> 2161         }
>
> msleep() must be called only when there are too many isolated pages:

What do you mean here?

> 2019 static int too_many_isolated(struct pglist_data *pgdat, int file,
> 2020                 struct scan_control *sc)
> 2021 {
>  :
> 2030         if (file) {
> 2031                 inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
> 2032                 isolated = node_page_state(pgdat, NR_ISOLATED_FILE);
> 2033         } else {
>  :
> 2046         return isolated > inactive;
>
> The return value was true since:
>
> crash> p ((struct pglist_data *) 0xffff00817fffe580)->vm_stat[NR_INACTIVE_FILE]
> $8 = {
>   counter = 1
> }
> crash> p ((struct pglist_data *) 0xffff00817fffe580)->vm_stat[NR_ISOLATED_FILE]
> $9 = {
>   counter = 2
>
> while per_cpu stats had:
>
> crash> p ((struct pglist_data *) 0xffff00817fffe580)->per_cpu_nodestats
> $85 = (struct per_cpu_nodestat *) 0xffff8000118832e0
> crash> p/x 0xffff8000118832e0 + __per_cpu_offset[42]
> $86 = 0xffff00917fcc32e0
> crash> p ((struct per_cpu_nodestat *) 0xffff00917fcc32e0)->vm_node_stat_diff[NR_ISOLATED_FILE]
> $87 = -1 '\377'
>
> crash> p/x 0xffff8000118832e0 + __per_cpu_offset[44]
> $89 = 0xffff00917fe032e0
> crash> p ((struct per_cpu_nodestat *) 0xffff00917fe032e0)->vm_node_stat_diff[NR_ISOLATED_FILE]
> $91 = -1 '\377'

This doesn't really tell much. How far out of sync are they really,
cumulatively over all CPUs?

> It seems that processes were trapped in direct reclaim/compaction loop
> because these nodes had few free pages lower than watermark min.
>
> crash> kmem -z | grep -A 3 Normal
> :
> NODE: 4 ZONE: 1 ADDR: ffff00817fffec40 NAME: "Normal"
> SIZE: 8454144 PRESENT: 98304 MIN/LOW/HIGH: 68/166/264
> VM_STAT:
> NR_FREE_PAGES: 68
> --
> NODE: 5 ZONE: 1 ADDR: ffff00897fffec40 NAME: "Normal"
> SIZE: 118784 MIN/LOW/HIGH: 82/200/318
> VM_STAT:
> NR_FREE_PAGES: 45
> --
> NODE: 6 ZONE: 1 ADDR: ffff00917fffec40 NAME: "Normal"
> SIZE: 118784 MIN/LOW/HIGH: 82/200/318
> VM_STAT:
> NR_FREE_PAGES: 53
> --
> NODE: 7 ZONE: 1 ADDR: ffff00997fbbec40 NAME: "Normal"
> SIZE: 118784 MIN/LOW/HIGH: 82/200/318
> VM_STAT:
> NR_FREE_PAGES: 52

How have you concluded that too_many_isolated is at the root of this
issue? With a very low NR_FREE_PAGES and many contending allocations
the system could easily be stuck in reclaim. What are the other reclaim
characteristics? Is the direct reclaim successful?
-- 
Michal Hocko
SUSE Labs
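
To make the question about cumulative drift concrete, here is a minimal
userspace sketch of a counter that keeps a global value plus per-CPU
deltas not yet folded into it. It is not kernel code and is not taken
from the patch: struct model_counter, MODEL_NR_CPUS and every model_*
function are hypothetical names, and only the final "isolated > inactive"
comparison from the quoted too_many_isolated() is modelled (the elided
lines are not).

```c
#include <assert.h>

#define MODEL_NR_CPUS 64 /* hypothetical CPU count for this model */

/* Hypothetical stand-in for one node counter. */
struct model_counter {
	long global;                     /* models vm_stat[item].counter */
	signed char diff[MODEL_NR_CPUS]; /* models per-CPU vm_node_stat_diff[item] */
};

/* Record a pending per-CPU delta that has not been folded yet. */
static void model_set_diff(struct model_counter *c, int cpu, signed char d)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	c->diff[cpu] = d;
}

/* Cumulative drift: the sum of all pending per-CPU deltas. */
static long model_drift(const struct model_counter *c)
{
	long drift = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		drift += c->diff[cpu];
	return drift;
}

/* Cheap read: global value only; negative totals are clamped to zero in this model. */
static long model_read(const struct model_counter *c)
{
	return c->global < 0 ? 0 : c->global;
}

/* Snapshot read: global value plus the drift, clamped the same way. */
static long model_read_snapshot(const struct model_counter *c)
{
	long sum = c->global + model_drift(c);

	return sum < 0 ? 0 : sum;
}

/* Only the final comparison shown in the quoted function. */
static int model_too_many_isolated(long isolated, long inactive)
{
	return isolated > inactive;
}
```

Fed with the values visible in the quoted crash output (isolated global 2
with a delta of -1 on CPUs 42 and 44, inactive global 1), and assuming
every delta not shown in the dump is zero, the cheap reads give 2 > 1
while the snapshot reads give 0 > 1. That assumption is exactly what the
dump does not establish, so the sketch only shows what a full per-CPU sum
would need to look like to support the analysis.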