From: Yosry Ahmed
Date: Mon, 17 Oct 2022 14:30:41 -0700
Subject: Re: [RFC] memcg rstat flushing optimization
To: Michal Koutný
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Michal Hocko, Shakeel Butt, Roman Gushchin, Andrew Morton, Linux-MM, Cgroups, Greg Thelen
In-Reply-To: <20221017185238.GA7699@blackbody.suse.cz>

On Mon, Oct 17, 2022 at 11:52 AM Michal Koutný wrote:
>
> Hello.
>
> On Tue, Oct 04, 2022 at 06:17:40PM -0700, Yosry Ahmed wrote:
> > Sorry for the long email :)
>
> (I'll get to other parts sometime in the future. Sorry for my latency :)
>
> > We have recently ran into a hard lockup on a machine with hundreds of
> > CPUs and thousands of memcgs during an rstat flush.
> > [...]
>
> I only respond with some remarks to this particular case.
>
> > As you can imagine, with a sufficiently large number of
> > memcgs and cpus, a call to mem_cgroup_flush_stats() might be slow, or
> > in an extreme case like the one we ran into, cause a hard lockup
> > (despite periodically flushing every 4 seconds).
>
> Is this your modification from the upstream value of FLUSH_TIME (that's
> every 2 s)?

It's actually once every 2s like upstream; I got confused by
flush_next_time multiplying the flush interval by 2.

> In the mailthread, you also mention >10s for hard-lockups. That sounds
> scary (even with the once per 4 seconds) since with large enough update
> tree (and update activity) periodic flush couldn't keep up.
> Also, it seems to be kind of bad feedback, the longer a (periodic) flush
> takes, the lower is the frequency of them and the more updates may
> accumulate. I.e. one spike in update activity can get the system into
> a spiral of long flushes that won't recover once the activity doesn't
> drop much more.

Yeah, it is scary and shouldn't be likely to happen, but it did :(
We can keep coming up with mitigations to try and make it less likely,
but I was hoping we can find something more fundamental, like keeping
track of what we really need to flush or avoiding all flushing in
non-sleepable contexts if possible.

> (2nd point should have been about some memcg_check_events() optimization
> or THRESHOLDS_EVENTS_TARGET justifying delayed flush but I've found none
> to be applicable.
> Just noting that v2 fortunately doesn't have the threshold
> notifications.)

I think even without that, we can still run into the same problem in
other non-sleepable flushing contexts.

> Regards,
> Michal
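
For readers following the FLUSH_TIME exchange above, the relation between the two intervals can be sketched in userspace. This is a simplified model, not kernel code: it assumes the periodic worker is requeued FLUSH_TIME (2 s) after each flush, and that each flush moves flush_next_time out by twice that interval, which is what delayed flushers check. Times are plain seconds rather than jiffies, and the FlushModel class and its method names are made up for illustration.

```python
from dataclasses import dataclass

# Assumed upstream value at the time of the thread: periodic flush every 2 s.
FLUSH_TIME = 2


@dataclass
class FlushModel:
    """Hypothetical userspace stand-in for the memcg flush bookkeeping."""

    flush_next_time: int = 0  # deadline checked by delayed flushers (seconds)
    next_periodic: int = 0    # when the periodic worker is due next (seconds)

    def flush(self, now: int) -> None:
        # A flush moves the deadline out by twice the periodic interval,
        # while the periodic worker is requeued one interval ahead.
        self.flush_next_time = now + 2 * FLUSH_TIME
        self.next_periodic = now + FLUSH_TIME

    def delayed_flush_needed(self, now: int) -> bool:
        # Delayed flushers act only once the deadline has passed.
        return now > self.flush_next_time


model = FlushModel()
model.flush(now=10)
print("next periodic flush:", model.next_periodic)
print("delayed-flush deadline:", model.flush_next_time)
print("delayed flush needed at 13s:", model.delayed_flush_needed(13))
print("delayed flush needed at 15s:", model.delayed_flush_needed(15))
```

In this model the 4-second figure only shows up as the delayed-flush deadline; the periodic flusher itself still runs on the 2-second interval.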
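
The feedback concern raised above (longer flushes lead to less frequent flushes and more accumulated updates) can be illustrated with a toy simulation. Everything in it is an assumption made for the sketch rather than something measured in this thread: updates arrive at a constant rate, flush cost is linear in the number of pending updates, and the next flush starts a fixed interval after the previous one finishes. Real rstat flush cost depends on how many (cgroup, CPU) pairs were updated, not on the raw update count, so the model likely overstates the growth. The function name and parameters are hypothetical.

```python
def simulate_flushes(update_rate, cost_per_update, interval, rounds):
    """Return the flush duration (seconds) of each round in a toy model.

    Assumptions (not from the thread): constant update rate, flush cost
    linear in pending updates, and the next flush scheduled 'interval'
    seconds after the previous flush finishes.
    """
    durations = []
    last_duration = 0.0
    for _ in range(rounds):
        # Updates pile up while waiting and while the previous flush ran.
        pending = update_rate * (interval + last_duration)
        last_duration = cost_per_update * pending
        durations.append(last_duration)
    return durations


# The product update_rate * cost_per_update acts as a load factor:
# below 1 the flush durations settle, at or above 1 they keep growing.
settling = simulate_flushes(update_rate=1000, cost_per_update=0.0005,
                            interval=2.0, rounds=8)
spiraling = simulate_flushes(update_rate=1000, cost_per_update=0.0012,
                             interval=2.0, rounds=8)
print("load 0.5:", [round(d, 2) for d in settling])
print("load 1.2:", [round(d, 2) for d in spiraling])
```

With a load factor below 1 the durations approach a fixed point; once it reaches 1 each flush takes longer than the last, which is the spiral described in the quoted text.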