Date: Tue, 23 Feb 2021 23:24:23 -0800 (PST)
From: Hugh Dickins
To: Roman Gushchin
cc: Andrew Morton, Hugh Dickins, Johannes Weiner, Michal Hocko,
    Vlastimil Babka, linux-mm@kvack.org, kernel-team@fb.com,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings
In-Reply-To: <20200806182555.d7a7fc9853b5a239ffe9f846@linux-foundation.org>
References: <20200714173920.3319063-1-guro@fb.com>
    <20200730162348.GA679955@carbon.dhcp.thefacebook.com>
    <20200801011821.GA859734@carbon.dhcp.thefacebook.com>
    <20200804004012.GA1049259@carbon.dhcp.thefacebook.com>
    <20200806233804.GB1217906@carbon.dhcp.thefacebook.com>
    <20200806182555.d7a7fc9853b5a239ffe9f846@linux-foundation.org>

On Thu, 6 Aug 2020, Andrew Morton wrote:
> On Thu, 6 Aug 2020 16:38:04 -0700 Roman Gushchin wrote:

August, yikes, I thought it was much more recent.

> 
> > it seems that Hugh and me haven't reached a consensus here.
> > Can, you, please, not merge this patch into 5.9, so we would have
> > more time to find a solution, acceptable for all?
> 
> No probs.  I already had a big red asterisk on it ;)

I've a suspicion that Andrew might be tiring of his big red asterisk,
and wanting to unload

mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch

into 5.12.  I would prefer not, and reiterate my Nack: but no great harm
will befall the cosmos if he overrules that, and it does go through to
5.12 - I'll just want to revert it again later.

And I do think a more straightforward way of suppressing those warnings
would be just to delete the code that issues them, rather than brushing
them under a carpet of overtuning.

I've been running mmotm with the patch below (shown as sign of good
faith, and for you to try, but not ready to go yet) for a few months
now - overriding your max_drift, restoring nr_writeback and friends to
the same checking, fixing the obvious reason why nr_zone_write_pending
and nr_writeback are seen negative occasionally (interrupt interrupting
to decrement those stats before they have even been incremented).

Two big BUTs (if not asterisks): since adding that patch, I have usually
forgotten all about it, so forgotten to run the script that echoes
/proc/sys/vm/stat_refresh at odd intervals while under load: so have
less data than I'd intended by now.
And secondly (and I've just checked again this evening) I do still see
nr_zone_write_pending and nr_writeback occasionally caught negative
while under load.  So, there's something more at play, perhaps the
predicted Gushchin Effect (but wouldn't they go together if so? I've
only seen them separately), or maybe something else, I don't know.

Those are the only stats I've seen caught negative, but I don't have
CMA configured at all.  You mention nr_free_cma as the only(?) other
stat you've seen negative, that of course I won't see, but looking at
the source I now notice that NR_FREE_CMA_PAGES is incremented and
decremented according to page migratetype...

... internally we have another stat that's incremented and decremented
according to page migratetype, and that one has been seen negative too:
isn't page migratetype something that usually stays the same, but
sometimes the migratetype of the page's block can change, even while
some pages of it are allocated?  Not a stable basis for maintaining
stats, though won't matter much if they are only for display.

vmstat_refresh could just exempt nr_zone_write_pending, nr_writeback
and nr_free_cma from warnings, if we cannot find a fix to them: but I
see no reason to suppress warnings on all the other vmstats.

The patch I've been testing with:

--- mmotm/mm/page-writeback.c	2021-02-14 14:32:24.000000000 -0800
+++ hughd/mm/page-writeback.c	2021-02-20 18:01:11.264162616 -0800
@@ -2769,6 +2769,13 @@ int __test_set_page_writeback(struct pag
 	int ret, access_ret;
 
 	lock_page_memcg(page);
+	/*
+	 * Increment counts in advance, so that they will not go negative
+	 * if test_clear_page_writeback() comes in to decrement them.
+	 */
+	inc_lruvec_page_state(page, NR_WRITEBACK);
+	inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+
 	if (mapping && mapping_use_writeback_tags(mapping)) {
 		XA_STATE(xas, &mapping->i_pages, page_index(page));
 		struct inode *inode = mapping->host;
@@ -2804,9 +2811,14 @@ int __test_set_page_writeback(struct pag
 	} else {
 		ret = TestSetPageWriteback(page);
 	}
-	if (!ret) {
-		inc_lruvec_page_state(page, NR_WRITEBACK);
-		inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+
+	if (WARN_ON_ONCE(ret)) {
+		/*
+		 * Correct counts in retrospect, if PageWriteback was already
+		 * set; but does any filesystem ever allow this to happen?
+		 */
+		dec_lruvec_page_state(page, NR_WRITEBACK);
+		dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
 	}
 	unlock_page_memcg(page);
 	access_ret = arch_make_page_accessible(page);
--- mmotm/mm/vmstat.c	2021-02-20 17:59:44.838171232 -0800
+++ hughd/mm/vmstat.c	2021-02-20 18:01:11.272162661 -0800
@@ -1865,7 +1865,7 @@ int vmstat_refresh(struct ctl_table *tab
 
 	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
 		val = atomic_long_read(&vm_zone_stat[i]);
-		if (val < -max_drift) {
+		if (val < 0) {
 			pr_warn("%s: %s %ld\n",
 				__func__, zone_stat_name(i), val);
 			err = -EINVAL;
@@ -1874,13 +1874,21 @@ int vmstat_refresh(struct ctl_table *tab
 #ifdef CONFIG_NUMA
 	for (i = 0; i < NR_VM_NUMA_STAT_ITEMS; i++) {
 		val = atomic_long_read(&vm_numa_stat[i]);
-		if (val < -max_drift) {
+		if (val < 0) {
 			pr_warn("%s: %s %ld\n",
 				__func__, numa_stat_name(i), val);
 			err = -EINVAL;
 		}
 	}
 #endif
+	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+		val = atomic_long_read(&vm_node_stat[i]);
+		if (val < 0) {
+			pr_warn("%s: %s %ld\n",
+				__func__, node_stat_name(i), val);
+			err = -EINVAL;
+		}
+	}
 	if (err)
 		return err;
 	if (write)
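
For readers who want to see the ordering problem in isolation: below is
a small user-space sketch, illustration only, not kernel code and not
part of the patch above, with invented names (per_cpu_delta,
sum_counters, writeback_completion).  The "interrupt" decrements before
the submission path has done its increment, so anything summing the
per-cpu deltas inside that window catches the stat negative - the same
window the page-writeback.c hunk closes by doing the increments up
front.

/* race_sketch.c: user-space illustration only, not kernel code.
 * Build with: cc -pthread race_sketch.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NCPU 2

/* stand-ins for the per-cpu vmstat deltas */
static atomic_long per_cpu_delta[NCPU];

/* stand-in for the summing a stat_refresh-style reader does */
static long sum_counters(void)
{
	long total = 0;
	for (int cpu = 0; cpu < NCPU; cpu++)
		total += atomic_load(&per_cpu_delta[cpu]);
	return total;
}

/* the "interrupt": writeback completion decrementing on another cpu */
static void *writeback_completion(void *arg)
{
	(void)arg;
	atomic_fetch_sub(&per_cpu_delta[1], 1);
	return NULL;
}

int main(void)
{
	pthread_t irq;

	/*
	 * CPU 0 has set PageWriteback but not yet incremented the stat:
	 * the completion "interrupt" gets in first on CPU 1.
	 */
	pthread_create(&irq, NULL, writeback_completion, NULL);
	pthread_join(irq, NULL);

	/* a reader summing inside the window sees -1 */
	printf("inside the window: %ld\n", sum_counters());

	/* the submission path finally does its increment on CPU 0 */
	atomic_fetch_add(&per_cpu_delta[0], 1);

	/* afterwards everything balances out to 0 again */
	printf("after the increment: %ld\n", sum_counters());
	return 0;
}

Run as written, it prints -1 for the sum taken inside the window and 0
once the matching increment lands: nothing is ever lost, the total is
only transiently negative, which is exactly what the warnings trip on.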