From: Johannes Weiner
To: Shakeel Butt
Cc: Gražvydas Ignotas, Wei Wang, Eric Dumazet, netdev, Michal Hocko, Roman Gushchin, Linux MM, Cgroups
Subject: Re: UDP rx packet loss in a cgroup with a memory limit
Date: Thu, 13 Oct 2022 10:22:10 -0400

On Wed, Oct 12, 2022 at 09:36:34PM -0700, Shakeel Butt wrote:
> On Wed, Aug 17, 2022 at 1:12 PM Gražvydas Ignotas wrote:
> >
> > On Wed, Aug 17, 2022 at 9:16 PM Wei Wang wrote:
> > >
> > > On Wed, Aug 17, 2022 at 10:37 AM Shakeel Butt wrote:
> > > >
> > > > + Eric and netdev
> > > >
> > > > On Wed, Aug 17, 2022 at 10:13 AM Johannes Weiner wrote:
> > > > >
> > > > > This is most likely a regression caused by this patch:
> > > > >
> > > > > commit 4b1327be9fe57443295ae86fe0fcf24a18469e9f
> > > > > Author: Wei Wang
> > > > > Date:   Tue Aug 17 12:40:03 2021 -0700
> > > > >
> > > > >     net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()
> > > > >
> > > > >     Add gfp_t mask as an input parameter to mem_cgroup_charge_skmem(),
> > > > >     to give more control to the networking stack and enable it to change
> > > > >     memcg charging behavior. In the future, the networking stack may decide
> > > > >     to avoid oom-kills when fallbacks are more appropriate.
> > > > >
> > > > >     One behavior change in mem_cgroup_charge_skmem() by this patch is to
> > > > >     avoid force charging by default and let the caller decide when and if
> > > > >     force charging is needed through the presence or absence of
> > > > >     __GFP_NOFAIL.
> > > > >
> > > > >     Signed-off-by: Wei Wang
> > > > >     Reviewed-by: Shakeel Butt
> > > > >     Signed-off-by: David S. Miller
> > > > >
> > > > > We never used to fail these allocations. Cgroups don't have a
> > > > > kswapd-style watermark reclaimer, so the network relied on
> > > > > force-charging and leaving reclaim to allocations that can block.
> > > > > Now it seems network packets could just fail indefinitely.
> > > > >
> > > > > The changelog is a bit terse given how drastic the behavior change
> > > > > is. Wei, Shakeel, can you fill in why this was changed? Can we revert
> > > > > this for the time being?
> > > >
> > > > Does reverting the patch fix the issue? However I don't think it will.
> > > >
> > > > Please note that we still have the force charging as before this
> > > > patch. Previously when mem_cgroup_charge_skmem() force charges, it
> > > > returns false and __sk_mem_raise_allocated takes the suppress_allocation
> > > > code path. Based on some heuristics, it may allow it or it may
> > > > uncharge and return failure.
> > >
> > > The force charging logic in __sk_mem_raise_allocated only gets
> > > considered on the tx path for STREAM sockets. So it probably does not take
> > > effect on the UDP path. And, that logic is NOT being altered in the above
> > > patch.
> > > So specifically for the UDP receive path, what happens in
> > > __sk_mem_raise_allocated() BEFORE the above patch is:
> > > - mem_cgroup_charge_skmem() gets called:
> > >     - try_charge() with GFP_NOWAIT gets called and fails
> > >     - try_charge() with __GFP_NOFAIL
> > >     - return false
> > > - goto suppress_allocation:
> > >     - mem_cgroup_uncharge_skmem() gets called
> > > - return 0 (which means failure)
> > >
> > > AFTER the above patch, what happens in __sk_mem_raise_allocated() is:
> > > - mem_cgroup_charge_skmem() gets called:
> > >     - try_charge() with GFP_NOWAIT gets called and fails
> > >     - return false
> > > - goto suppress_allocation:
> > >     - We no longer call mem_cgroup_uncharge_skmem()
> > > - return 0
> > >
> > > So I agree with Shakeel that this change shouldn't alter the behavior
> > > of the above call path in such a situation.
> > > But do let us know if reverting this change has any effect on your test.
> >
> > The problem is still there (the kernel wasn't compiling after revert,
> > had to adjust another seemingly unrelated callsite). It's hard to tell
> > if it's better or worse since it happens so randomly.
> >
>
> Hello everyone, we have a better understanding of why the patch pointed
> out by Johannes might have exposed this issue. See
> https://lore.kernel.org/all/20221013041833.rhifxw4gqwk4ofi2@google.com/.

Wow, that's super subtle! Nice sleuthing.

> To summarize, the old code was depending on a subtle interaction of
> force-charge and percpu charge caches which this patch removed. The
> fix I am proposing is for the network stack to be explicit about its need
> (i.e. use GFP_ATOMIC) instead of depending on a subtle behavior.

That sounds good to me.
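
For readers following along, the BEFORE/AFTER call paths Wei lists for the UDP receive path can be sketched as a small userspace toy model. This is not kernel code: every toy_* name and the flat usage/limit counter are assumptions made for illustration, and the model deliberately leaves out the percpu charge caches that Shakeel's summary identifies as the source of the real difference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of the quoted call paths. NOT kernel code:
 * all toy_* names and this flat counter are hypothetical. */
struct toy_memcg {
	long usage;	/* pages currently charged */
	long limit;	/* hard limit, in pages */
};

/* Models a GFP_NOWAIT-style charge: fails instead of exceeding the limit. */
static bool toy_try_charge(struct toy_memcg *cg, long nr_pages)
{
	if (cg->usage + nr_pages > cg->limit)
		return false;
	cg->usage += nr_pages;
	return true;
}

/* Models a __GFP_NOFAIL-style charge: always charges, may exceed the limit. */
static void toy_force_charge(struct toy_memcg *cg, long nr_pages)
{
	cg->usage += nr_pages;
}

static void toy_uncharge(struct toy_memcg *cg, long nr_pages)
{
	cg->usage -= nr_pages;
}

/* BEFORE the patch, as listed: try, then force charge and return false. */
static bool toy_charge_skmem_before(struct toy_memcg *cg, long nr_pages)
{
	if (toy_try_charge(cg, nr_pages))
		return true;
	toy_force_charge(cg, nr_pages);
	return false;
}

/* AFTER the patch, as listed: try once, no force charge by default. */
static bool toy_charge_skmem_after(struct toy_memcg *cg, long nr_pages)
{
	return toy_try_charge(cg, nr_pages);
}

/* UDP rx path BEFORE: returns 1 on success, 0 on failure. */
static int toy_raise_allocated_before(struct toy_memcg *cg, long nr_pages)
{
	assert(nr_pages > 0);
	if (toy_charge_skmem_before(cg, nr_pages))
		return 1;
	/* suppress_allocation: undo the force charge, then fail */
	toy_uncharge(cg, nr_pages);
	return 0;
}

/* UDP rx path AFTER: returns 1 on success, 0 on failure. */
static int toy_raise_allocated_after(struct toy_memcg *cg, long nr_pages)
{
	assert(nr_pages > 0);
	if (toy_charge_skmem_after(cg, nr_pages))
		return 1;
	/* suppress_allocation: nothing was charged, so nothing to undo */
	return 0;
}
```

In this simplified model both variants fail the same way for a cgroup at its limit and leave usage unchanged, which matches the argument that the listed call path alone does not explain the regression. Per the summary above, the actual difference came from the force-charge interacting with the percpu charge caches, which this sketch does not attempt to model.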