From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04E0EC4332F for ; Thu, 13 Oct 2022 04:36:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 57C006B0071; Thu, 13 Oct 2022 00:36:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 52B8A6B0073; Thu, 13 Oct 2022 00:36:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3CBB46B0074; Thu, 13 Oct 2022 00:36:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 23AE46B0071 for ; Thu, 13 Oct 2022 00:36:48 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id DE1341206D5 for ; Thu, 13 Oct 2022 04:36:47 +0000 (UTC) X-FDA: 80014665654.21.32C0ACD Received: from mail-yw1-f175.google.com (mail-yw1-f175.google.com [209.85.128.175]) by imf28.hostedemail.com (Postfix) with ESMTP id 78BE1C0025 for ; Thu, 13 Oct 2022 04:36:47 +0000 (UTC) Received: by mail-yw1-f175.google.com with SMTP id 00721157ae682-3573ed7cc15so7518827b3.1 for ; Wed, 12 Oct 2022 21:36:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=MYZC/ZZhrwfqeSKJS7iWNQa4xs9BkuJYpmdkhZqDO3E=; b=bD3BNclyP9ltZIweT1HDKFuXx6eb7hOiLH9JVviEPYPhB0sOYPnOgSU66gdF5d75+3 TlT4HyKkoVhHJB9h5halO2XpqS8djMcIXY4706siHzr/PLKnv112E/jabXR193uN3VcJ Q4R2PquNN4OCA4stpjgToiAixsItpkw4Wf/g8xxe6JpEyWsj43y9KYMVQ7VVW1MKEkWD MW2SYxDHdUMKQZv2bqCPYKIN999RCky76DYbu1YcIVUDt+yDL3z5STwfNhL9Z+AfZMeT aMVhDpXfyfPVhRgip9XUGbp9hhNLyveS8nWaOZwTZWsFXiNOKv9T/2UxcbCCZzKIE1Mm UQmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MYZC/ZZhrwfqeSKJS7iWNQa4xs9BkuJYpmdkhZqDO3E=; b=kntyhBSvbPTdjflDgVDMO71dzIfzDj1o5pzeqFeuLrMNI+oZUnSneXorWDoDdSmQm0 ODGBceANFPUtVs2qGDCCAbldfE59gt8iEwXWViZB5ZJGonW6BFKv3+n+Zim7WGGiD1yV aLmcJ39cAMmb0QO8sLL4mSHfT92YCy4LxteN907YIQOwcnCyha7xE/gClm7bKpQ20mrN gXca5RkxUdT4MTYzk31tSFya671CvaCrKcnJk74md38Q4YScMGj7avgGboxLTFYPr4tu RFx3ZZFo54aPV6YugtL9ysFbe2lvbO7SA1Y+0PmrCD/LjSxzx/8H1SNWmzGhN21EhKSd mBWA== X-Gm-Message-State: ACrzQf2Q05gXw1Yb5dvVRGwHyxwLJvjkQwQJRgYeT76V+GtMt+F1KF1z eG2ANsIB0sgBE0CSMDJMFkwciUbHJZI9Wi/I0P+1FQ== X-Google-Smtp-Source: AMsMyM4JwARt2gsfW3ADYilWDn1NFUBx3KE68DKqvT23RO3GiX4CqATZEILY6t09XMVmV61nBv6lrhFaSEPDN4pinJk= X-Received: by 2002:a81:892:0:b0:355:a4c8:f310 with SMTP id 140-20020a810892000000b00355a4c8f310mr29362922ywi.486.1665635806604; Wed, 12 Oct 2022 21:36:46 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Shakeel Butt Date: Wed, 12 Oct 2022 21:36:34 -0700 Message-ID: Subject: Re: UDP rx packet loss in a cgroup with a memory limit To: =?UTF-8?Q?Gra=C5=BEvydas_Ignotas?= Cc: Wei Wang , Johannes Weiner , Eric Dumazet , netdev , Michal Hocko , Roman Gushchin , Linux MM , Cgroups Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1665635807; a=rsa-sha256; cv=none; b=CvWt6jTksl8WYq0D8X2xXZocAsosMZaCKyAY3MmRqr07zvEGORhP3WQbNNYmoHWqqwEDNF IGTB9y1+F9foA1xXe1QU1FYqPN/mgWJ7nciHGkGxiWWYOhil/7mWofDnfey6Drp+NkkWj1 RI9YFE1Qug8ejrXDNylz0hjdXvgkm9c= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=bD3BNcly; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf28.hostedemail.com: domain of shakeelb@google.com designates 209.85.128.175 as permitted sender) smtp.mailfrom=shakeelb@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1665635807; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=MYZC/ZZhrwfqeSKJS7iWNQa4xs9BkuJYpmdkhZqDO3E=; b=S4PlApsPLLkvhKto3cBoSlWcxSzyzwPnc/v++SG0sM4TCzYveIZr/ow2Ry8uUguKbQ6HhJ 2vOJlNLLRuFfp9lclAFo0A46PimYMDucRV8NKHARnbwzC0UXwj9wBF431+IdxuBBML5an9 9s9jMvy218q+jKSH/4h31y5OBSQjp3o= X-Stat-Signature: mphpijrttk6hn87qxo1quhdfw7yiio4n X-Rspamd-Queue-Id: 78BE1C0025 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=bD3BNcly; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf28.hostedemail.com: domain of shakeelb@google.com designates 209.85.128.175 as permitted sender) smtp.mailfrom=shakeelb@google.com X-Rspam-User: X-Rspamd-Server: rspam03 X-HE-Tag: 1665635807-299316 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Aug 17, 2022 at 1:12 PM Gra=C5=BEvydas Ignotas = wrote: > > On Wed, Aug 17, 2022 at 9:16 PM Wei Wang wrote: > > > > On Wed, Aug 17, 2022 at 10:37 AM Shakeel Butt wro= te: > > > > > > + Eric and netdev > > > > > > On Wed, Aug 17, 2022 at 10:13 AM Johannes Weiner = wrote: > > > > > > > > This is most likely a regression caused by this patch: > > > > > > > > commit 4b1327be9fe57443295ae86fe0fcf24a18469e9f > > > > Author: Wei Wang > > > > Date: Tue Aug 17 12:40:03 2021 -0700 > > > > > > > > net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem() > > > > > > > > Add gfp_t mask as an input parameter to mem_cgroup_charge_skmem= (), > > > > to give more control to the networking stack and enable it to c= hange > > > > memcg charging behavior. In the future, the networking stack ma= y decide > > > > to avoid oom-kills when fallbacks are more appropriate. > > > > > > > > One behavior change in mem_cgroup_charge_skmem() by this patch = is to > > > > avoid force charging by default and let the caller decide when = and if > > > > force charging is needed through the presence or absence of > > > > __GFP_NOFAIL. > > > > > > > > Signed-off-by: Wei Wang > > > > Reviewed-by: Shakeel Butt > > > > Signed-off-by: David S. Miller > > > > > > > > We never used to fail these allocations. Cgroups don't have a > > > > kswapd-style watermark reclaimer, so the network relied on > > > > force-charging and leaving reclaim to allocations that can block. > > > > Now it seems network packets could just fail indefinitely. > > > > > > > > The changelog is a bit terse given how drastic the behavior change > > > > is. Wei, Shakeel, can you fill in why this was changed? Can we reve= rt > > > > this for the time being? > > > > > > Does reverting the patch fix the issue? However I don't think it will= . > > > > > > Please note that we still have the force charging as before this > > > patch. Previously when mem_cgroup_charge_skmem() force charges, it > > > returns false and __sk_mem_raise_allocated takes suppress_allocation > > > code path. Based on some heuristics, it may allow it or it may > > > uncharge and return failure. > > > > The force charging logic in __sk_mem_raise_allocated only gets > > considered on tx path for STREAM socket. So it probably does not take > > effect on UDP path. And, that logic is NOT being altered in the above > > patch. > > So specifically for UDP receive path, what happens in > > __sk_mem_raise_allocated() BEFORE the above patch is: > > - mem_cgroup_charge_skmem() gets called: > > - try_charge() with GFP_NOWAIT gets called and failed > > - try_charge() with __GFP_NOFAIL > > - return false > > - goto suppress_allocation: > > - mem_cgroup_uncharge_skmem() gets called > > - return 0 (which means failure) > > > > AFTER the above patch, what happens in __sk_mem_raise_allocated() is: > > - mem_cgroup_charge_skmem() gets called: > > - try_charge() with GFP_NOWAIT gets called and failed > > - return false > > - goto suppress_allocation: > > - We no longer calls mem_cgroup_uncharge_skmem() > > - return 0 > > > > So I agree with Shakeel, that this change shouldn't alter the behavior > > of the above call path in such a situation. > > But do let us know if reverting this change has any effect on your test= . > > The problem is still there (the kernel wasn't compiling after revert, > had to adjust another seemingly unrelated callsite). It's hard to tell > if it's better or worse since it happens so randomly. > Hello everyone, we have a better understanding why the patch pointed out by Johannes might have exposed this issue. See https://lore.kernel.org/all/20221013041833.rhifxw4gqwk4ofi2@google.com/. To summarize, the old code was depending on a subtle interaction of force-charge and percpu charge caches which this patch removed. The fix I am proposing is for the network stack to be explicit of its need (i.e. use GFP_ATOMIC) instead of depending on a subtle behavior.