Date: Wed, 13 Sep 2023 11:52:17 -0400
From: Johannes Weiner <hannes@cmpxchg.org>
To: Vern Hao
Cc: mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com,
	akpm@linux-foundation.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Xin Hao
Subject: Re: [PATCH v2] mm: memcg: add THP swap out info for anonymous reclaim
Message-ID: <20230913155217.GC45543@cmpxchg.org>
References: <20230912021727.61601-1-vernhao@tencent.com>
In-Reply-To: <20230912021727.61601-1-vernhao@tencent.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
On Tue, Sep 12, 2023 at 10:17:25AM +0800, Vern Hao wrote:
> From: Xin Hao
> 
> At present, we support per-memcg reclaim strategy, however we do not
> know the number of transparent huge pages being reclaimed, as we know
> the transparent huge pages need to be splited before reclaim them, and
> they will bring some performance bottleneck effect. for example, when
> two memcg (A & B) are doing reclaim for anonymous pages at same time,
> and 'A' memcg is reclaiming a large number of transparent huge pages, we
> can better analyze that the performance bottleneck will be caused by 'A'
> memcg. therefore, in order to better analyze such problems, there add
> THP swap out info for per-memcg.
> 
> Signed-off-by: Xin Hao
> ---
> v1 -> v2
>  - Do some fix as Johannes Weiner suggestion.
> v1:
> https://lore.kernel.org/linux-mm/20230911160824.GB103342@cmpxchg.org/T/
> 
>  mm/memcontrol.c | 2 ++
>  mm/page_io.c    | 4 +++-
>  mm/vmscan.c     | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ecc07b47e813..32d50db9ea0d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -752,6 +752,8 @@ static const unsigned int memcg_vm_event_stat[] = {
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	THP_FAULT_ALLOC,
>  	THP_COLLAPSE_ALLOC,
> +	THP_SWPOUT,
> +	THP_SWPOUT_FALLBACK,

Can you please add documentation to
Documentation/admin-guide/cgroup-v2.rst? Sorry, I missed this in v1.
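[Editor's note: for orientation, the requested cgroup-v2.rst additions would presumably follow the pattern of the existing memory.stat event entries such as thp_fault_alloc. The wording below is an illustrative sketch, not text from the patch:]

```rst
	  thp_swpout (npn)
		Number of transparent hugepages which were swapped out in one
		piece without splitting.

	  thp_swpout_fallback (npn)
		Number of transparent hugepages which were split before being
		swapped out, usually because contiguous swap space could not
		be allocated for the huge page.
```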
> @@ -208,8 +208,10 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
>  static inline void count_swpout_vm_event(struct folio *folio)
>  {
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -	if (unlikely(folio_test_pmd_mappable(folio)))
> +	if (unlikely(folio_test_pmd_mappable(folio))) {
> +		count_memcg_folio_events(folio, THP_SWPOUT, 1);
>  		count_vm_event(THP_SWPOUT);
> +	}
>  #endif
>  	count_vm_events(PSWPOUT, folio_nr_pages(folio));
>  }

Looking through the callers, they seem mostly fine except this one:

static void sio_write_complete(struct kiocb *iocb, long ret)
{
	struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb);
	struct page *page = sio->bvec[0].bv_page;
	int p;

	if (ret != sio->len) {
		/*
		 * In the case of swap-over-nfs, this can be a
		 * temporary failure if the system has limited
		 * memory for allocating transmit buffers.
		 * Mark the page dirty and avoid
		 * folio_rotate_reclaimable but rate-limit the
		 * messages but do not flag PageError like
		 * the normal direct-to-bio case as it could
		 * be temporary.
		 */
		pr_err_ratelimited("Write error %ld on dio swapfile (%llu)\n",
				   ret, page_file_offset(page));
		for (p = 0; p < sio->pages; p++) {
			page = sio->bvec[p].bv_page;
			set_page_dirty(page);
			ClearPageReclaim(page);
		}
	} else {
		for (p = 0; p < sio->pages; p++)
			count_swpout_vm_event(page_folio(sio->bvec[p].bv_page));

This is called at the end of IO where the page isn't locked anymore.
Since it's not locked, page->memcg is not stable and might get freed
(charge moving is deprecated but still possible).

The fix is simple, though. Every other IO path bumps THP_SWPOUT before
starting the IO while the page is still locked. We don't really care
if we get SWPOUT events even for failed IOs.
So we can just adjust this caller to fit the others, and count while
still locked:

diff --git a/mm/page_io.c b/mm/page_io.c
index fe4c21af23f2..7925e19aeedd 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -278,9 +278,6 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
 			set_page_dirty(page);
 			ClearPageReclaim(page);
 		}
-	} else {
-		for (p = 0; p < sio->pages; p++)
-			count_swpout_vm_event(page_folio(sio->bvec[p].bv_page));
 	}
 
 	for (p = 0; p < sio->pages; p++)
@@ -296,6 +293,7 @@ static void swap_writepage_fs(struct page *page, struct writeback_control *wbc)
 	struct file *swap_file = sis->swap_file;
 	loff_t pos = page_file_offset(page);
 
+	count_swpout_vm_event(page_folio(sio->bvec[p].bv_page));
 	set_page_writeback(page);
 	unlock_page(page);
 	if (wbc->swap_plug)