From: Shakeel Butt
Date: Mon, 13 Sep 2021 09:41:06 -0700
In-Reply-To: <20210912132914.GA56674@shbuild999.sh.intel.com>
Subject: Re: [memcg] 45208c9105: aim7.jobs-per-min -14.0% regression
To: Feng Tang
Cc: Hillf Danton, LKML, Xing Zhengjun, Linux MM

On Sun, Sep 12, 2021 at 6:29 AM Feng Tang wrote:
>
> On Sun, Sep 12, 2021 at 07:17:56PM +0800, Hillf Danton wrote:
> [...]
> > > +// if (!(__this_cpu_inc_return(stats_flush_threshold) % MEMCG_CHARGE_BATCH))
> > > +       if (!(__this_cpu_inc_return(stats_flush_threshold) % 128))
> > >                 queue_work(system_unbound_wq, &stats_flush_work);
> > >  }
> >
> > Hi Feng,
> >
> > Would you please check if it helps fix the regression to avoid queuing a
> > queued work by adding and checking an atomic counter.
>
> Hi Hillf,
>
> I just tested your patch, and it didn't recover the regression, but
> just reduced it from -14% to around -13%, similar to the patch
> increasing the batch charge number.
>

Thanks Hillf for taking a look and Feng for running the test. This
shows that parallel calls to queue_work() are not the issue (there is
already a test-and-set at the start of queue_work()); rather, the actual
work done by queue_work() is costly for this code path.

I wrote a simple anon page fault C program (no huge pages), profiled
it, and queue_work() shows up as a significant cost:

-   51.00%  do_anonymous_page
   + 16.68% alloc_pages_vma
     11.61% _raw_spin_lock
   + 10.26% mem_cgroup_charge
   - 5.25% lru_cache_add_inactive_or_unevictable
      - 4.48% __pagevec_lru_add
         - 3.71% __pagevec_lru_add_fn
            - 1.74% __mod_lruvec_state
               - 1.60% __mod_memcg_lruvec_state
                  - 1.35% queue_work_on
                     - __queue_work
                        - 0.93% wake_up_process
                           - try_to_wake_up
                              - 0.82% ttwu_queue
                                   0.61% ttwu_do_activate
   - 2.97% page_add_new_anon_rmap
      - 2.68% __mod_lruvec_page_state
         - 2.48% __mod_memcg_lruvec_state
            - 1.67% queue_work_on
               - 1.53% __queue_work
                  - 1.25% wake_up_process
                     - try_to_wake_up
                        - 0.94% ttwu_queue
                           + 0.70% ttwu_do_activate
              0.61% cgroup_rstat_updated
     2.10% rcu_read_unlock_strict
     1.40% cgroup_throttle_swaprate

However, when I switch the batch size to 128, the queue_work() overhead
goes away.

Feng, do you see any change with a batch size of 128 for the aim7
benchmark?