Date: Thu, 4 Feb 2021 14:29:57 -0500
From: Johannes Weiner
To: Roman Gushchin
Cc: Andrew Morton, Tejun Heo, Michal Hocko, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/7] mm: memcontrol: fix cpuhotplug statistics flushing
References: <20210202184746.119084-1-hannes@cmpxchg.org>
 <20210202184746.119084-2-hannes@cmpxchg.org>
 <20210202230747.GA1812008@carbon.dhcp.thefacebook.com>
 <20210203022853.GG1812008@carbon.dhcp.thefacebook.com>
In-Reply-To: <20210203022853.GG1812008@carbon.dhcp.thefacebook.com>

On Tue, Feb 02, 2021 at 06:28:53PM -0800, Roman Gushchin wrote:
> On Tue, Feb 02, 2021 at 03:07:47PM -0800, Roman Gushchin wrote:
> > On Tue, Feb 02, 2021 at 01:47:40PM -0500, Johannes Weiner wrote:
> > > The memcg hotunplug callback erroneously flushes counts on the local
> > > CPU, not the counts of the CPU going away; those counts will be lost.
> > >
> > > Flush the CPU that is actually going away.
> > >
> > > Also simplify the code a bit by using mod_memcg_state() and
> > > count_memcg_events() instead of open-coding the upward flush - this is
> > > comparable to how vmstat.c handles hotunplug flushing.
> >
> > To the whole series: it's really nice to have an accurate stats at
> > non-leaf levels. Just as an illustration: if there are 32 CPUs and
> > 1000 sub-cgroups (which is an absolutely realistic number, because
> > often there are many dying generations of each cgroup), the error
> > margin is 3.9GB.
> > It makes all numbers pretty much random and all
> > possible tests extremely flaky.
>
> Btw, I was just looking into kmem kselftests failures/flakiness,
> which is caused by exactly this problem: without waiting for the
> finish of dying cgroups reclaim, we can't make any reliable assumptions
> about what to expect from memcg stats.

Good point about the selftests. I gave them a shot, and indeed this
series makes test_kmem work again:

vanilla:
ok 1 test_kmem_basic
memory.current = 8810496
slab + anon + file + kernel_stack = 17074568
slab = 6101384
anon = 946176
file = 0
kernel_stack = 10027008
not ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
ok 6 test_percpu_basic

patched:
ok 1 test_kmem_basic
ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
ok 6 test_percpu_basic

It even passes with a reduced margin in the patched kernel, since the
percpu drift - which this test already tried to account for - is now
only on the page_counter side (whereas memory.stat is always precise).

I'm going to include that data in the v2 changelog, as well as a patch
to update test_kmem.c to the more stringent error tolerances.

> So looking forward to have this patchset merged!

Thanks
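
For reference, the 3.9GB figure quoted above and the mismatch in the
vanilla test output can be reproduced with a bit of arithmetic. This is
only a rough sketch: it assumes 4 KiB pages and a per-CPU batch of 32
pages before a cached delta gets flushed upward, neither of which is
stated in this thread, and worst_case_drift() is a made-up helper name,
not anything from the kernel or the selftests.

```python
# Assumptions (not stated in the thread): 4 KiB pages and a per-CPU
# batch of 32 pages that may sit unflushed for each cgroup.
PAGE_SIZE = 4096
BATCH_PAGES = 32


def worst_case_drift(nr_cpus, nr_cgroups):
    """Upper bound, in bytes, on unflushed per-CPU deltas under one parent."""
    return nr_cpus * nr_cgroups * BATCH_PAGES * PAGE_SIZE


# Roman's illustration: 32 CPUs and 1000 sub-cgroups.
drift = worst_case_drift(32, 1000)
print(f"worst-case drift: {drift} bytes (~{drift / 2**30:.1f} GiB)")

# Figures copied from the vanilla test_kmem_memcg_deletion output.
memory_current = 8810496
parts = {
    "slab": 6101384,
    "anon": 946176,
    "file": 0,
    "kernel_stack": 10027008,
}
total = sum(parts.values())
gap = total - memory_current
print(f"sum of memory.stat items: {total}")
print(f"gap to memory.current:    {gap} bytes")
```

Under those assumptions the bound comes out at 4194304000 bytes, which
matches the 3.9GB quoted above, and the vanilla run shows the memory.stat
items adding up to roughly 8 MB more than memory.current.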