Date: Tue, 6 Dec 2022 23:10:49 +0000
Message-ID: <20221206231049.g35ltbxbk54izrie@google.com>
Subject: Re: Low TCP throughput due to vmpressure with swap enabled
From: Shakeel Butt
To: Johannes Weiner
Cc: Eric Dumazet, Ivan Babrou, Linux MM, Linux Kernel Network Developers,
 linux-kernel, Michal Hocko, Roman Gushchin, Muchun Song, Andrew Morton,
 "David S. Miller", Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski,
 Paolo Abeni, cgroups@vger.kernel.org, kernel-team

On Tue, Dec 06, 2022 at 09:51:01PM +0100, Johannes Weiner wrote:
> On Tue, Dec 06, 2022 at 08:13:50PM +0100, Eric Dumazet wrote:
> > On Tue, Dec 6, 2022 at 8:00 PM Johannes Weiner wrote:
> > > @@ -1701,10 +1701,10 @@ void mem_cgroup_sk_alloc(struct sock *sk);
> > >  void mem_cgroup_sk_free(struct sock *sk);
> > >  static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
> > >  {
> > > -	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
> > > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->socket_pressure)
> >
> > && READ_ONCE(memcg->socket_pressure))
> >
> > >  		return true;
> > >  	do {
> > > -		if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
> > > +		if (memcg->socket_pressure)
> >
> > if (READ_ONCE(...))
>
> Good point, I'll add those.
>
> > > @@ -7195,10 +7194,10 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
> > >  		struct page_counter *fail;
> > >
> > >  		if (page_counter_try_charge(&memcg->tcpmem, nr_pages, &fail)) {
> > > -			memcg->tcpmem_pressure = 0;
> >
> > Orthogonal to your patch, but:
> >
> > Maybe avoid touching this cache line too often and use READ/WRITE_ONCE() ?
> >
> > if (READ_ONCE(memcg->socket_pressure))
> >     WRITE_ONCE(memcg->socket_pressure, false);
>
> Ah, that's a good idea.
>
> I think it'll be fine in the failure case, since that's associated
> with OOM and total performance breakdown anyway.
>
> But certainly, in the common case of the charge succeeding, we should
> not keep hammering false into that variable over and over.
>
> How about the delta below? I also flipped the branches around to keep
> the common path at the first indentation level, hopefully making that
> a bit clearer too.
>
> Thanks for taking a look, Eric!
>

I still think we should not set a persistent socket-pressure state on an
unsuccessful charge that only gets reset on a successful charge. I think
the better approach would be to limit the pressure state by a time
window, same as today, but set it on the charge path.
Something like below:

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d3c8203cab6c..7bd88d443c42 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -287,7 +287,6 @@ struct mem_cgroup {
 
 	/* Legacy tcp memory accounting */
 	bool			tcpmem_active;
-	int			tcpmem_pressure;
 
 #ifdef CONFIG_MEMCG_KMEM
 	int kmemcg_id;
@@ -1712,8 +1711,6 @@ void mem_cgroup_sk_alloc(struct sock *sk);
 void mem_cgroup_sk_free(struct sock *sk);
 static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 {
-	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
-		return true;
 	do {
 		if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
 			return true;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 48c44229cf47..290444bcab84 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5286,7 +5286,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 	vmpressure_init(&memcg->vmpressure);
 	INIT_LIST_HEAD(&memcg->event_list);
 	spin_lock_init(&memcg->event_list_lock);
-	memcg->socket_pressure = jiffies;
 #ifdef CONFIG_MEMCG_KMEM
 	memcg->kmemcg_id = -1;
 	INIT_LIST_HEAD(&memcg->objcg_list);
@@ -7252,10 +7251,12 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 		struct page_counter *fail;
 
 		if (page_counter_try_charge(&memcg->tcpmem, nr_pages, &fail)) {
-			memcg->tcpmem_pressure = 0;
+			if (READ_ONCE(memcg->socket_pressure))
+				WRITE_ONCE(memcg->socket_pressure, 0);
 			return true;
 		}
-		memcg->tcpmem_pressure = 1;
+		if (READ_ONCE(memcg->socket_pressure) < jiffies + HZ)
+			WRITE_ONCE(memcg->socket_pressure, jiffies + HZ);
 		if (gfp_mask & __GFP_NOFAIL) {
 			page_counter_charge(&memcg->tcpmem, nr_pages);
 			return true;
@@ -7263,12 +7264,21 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 		return false;
 	}
 
-	if (try_charge(memcg, gfp_mask, nr_pages) == 0) {
-		mod_memcg_state(memcg, MEMCG_SOCK, nr_pages);
-		return true;
+	if (try_charge(memcg, gfp_mask & ~__GFP_NOFAIL, nr_pages) < 0) {
+		if (READ_ONCE(memcg->socket_pressure) < jiffies + HZ)
+			WRITE_ONCE(memcg->socket_pressure, jiffies + HZ);
+		if (gfp_mask & __GFP_NOFAIL) {
+			try_charge(memcg, gfp_mask, nr_pages);
+			goto out;
+		}
+		return false;
 	}
 
-	return false;
+	if (READ_ONCE(memcg->socket_pressure))
+		WRITE_ONCE(memcg->socket_pressure, 0);
+out:
+	mod_memcg_state(memcg, MEMCG_SOCK, nr_pages);
+	return true;
 }
 
 /**
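
For illustration only, the time-window idea in the patch above can be
modeled in user space. This is a minimal sketch, not kernel code:
`struct pressure_model`, `MODEL_HZ`, `tick_before()` and the three
helpers are hypothetical names. It makes two assumptions that differ
from the diff: comparisons go through a wraparound-safe helper in the
style of the kernel's time_before(), and a successful charge closes the
window by storing the current tick instead of 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick rate for this model; the kernel's HZ is a build-time constant. */
#define MODEL_HZ 1000u

/* Wraparound-safe "a is before b", the same idea as the kernel's time_before(). */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical stand-in for the single struct mem_cgroup field used here. */
struct pressure_model {
	uint32_t socket_pressure;	/* tick at which the pressure window ends */
};

/* Failed charge: open the window, or extend it, to one second from now. */
static void charge_failed(struct pressure_model *m, uint32_t now)
{
	uint32_t deadline = now + MODEL_HZ;

	/* Write only when the stored deadline is earlier than the new one. */
	if (tick_before(m->socket_pressure, deadline))
		m->socket_pressure = deadline;
}

/* Successful charge: close the window, writing only if it is still open. */
static void charge_succeeded(struct pressure_model *m, uint32_t now)
{
	if (tick_before(now, m->socket_pressure))
		m->socket_pressure = now;
}

/* Reader side: pressure is reported only while the window is open. */
static bool under_pressure(const struct pressure_model *m, uint32_t now)
{
	return tick_before(now, m->socket_pressure);
}
```

Calling charge_failed() at tick 100 makes under_pressure() return true
up to tick 1099 and false from tick 1100 on, unless another failure
extends the window or charge_succeeded() closes it first. Both writers
check before storing, so a steady stream of successful charges does not
keep dirtying the field.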