Date: Wed, 6 Nov 2019 16:22:04 -0800
From: Johannes Weiner
To: Roman Gushchin
Cc: linux-mm@kvack.org, Andrew Morton, Michal Hocko,
    linux-kernel@vger.kernel.org, kernel-team@fb.com,
    stable@vger.kernel.org, Tejun Heo
Subject: Re: [PATCH 1/2] mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
Message-ID: <20191107002204.GA96548@cmpxchg.org>
In-Reply-To: <20191106225131.3543616-1-guro@fb.com>

On Wed, Nov 06, 2019 at 02:51:30PM -0800, Roman Gushchin wrote:
> We've encountered a rcu stall in get_mem_cgroup_from_mm():
>
>  rcu: INFO: rcu_sched self-detected stall on CPU
>  rcu: 33-....: (21000 ticks this GP) idle=6c6/1/0x4000000000000002 softirq=35441/35441 fqs=5017
>  (t=21031 jiffies g=324821 q=95837) NMI backtrace for cpu 33
>  <...>
>  RIP: 0010:get_mem_cgroup_from_mm+0x2f/0x90
>  <...>
>  __memcg_kmem_charge+0x55/0x140
>  __alloc_pages_nodemask+0x267/0x320
>  pipe_write+0x1ad/0x400
>  new_sync_write+0x127/0x1c0
>  __kernel_write+0x4f/0xf0
>  dump_emit+0x91/0xc0
>  writenote+0xa0/0xc0
>  elf_core_dump+0x11af/0x1430
>  do_coredump+0xc65/0xee0
>  ? unix_stream_sendmsg+0x37d/0x3b0
>  get_signal+0x132/0x7c0
>  do_signal+0x36/0x640
>  ? recalc_sigpending+0x17/0x50
>  exit_to_usermode_loop+0x61/0xd0
>  do_syscall_64+0xd4/0x100
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> The problem is caused by an exiting task which is associated with
> an offline memcg. We're iterating over and over in the
> do {} while (!css_tryget_online()) loop, but obviously the memcg won't
> become online and the exiting task won't be migrated to a live memcg.
>
> Let's fix it by switching from css_tryget_online() to css_tryget().
>
> As css_tryget_online() cannot guarantee that the memcg won't go
> offline, the check is usually useless, except some rare cases
> when for example it determines if something should be presented
> to a user.
>
> A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
> css_tryget() instead of css_tryget_online() in task_get_css()").
>
> Signed-off-by: Roman Gushchin
> Cc: stable@vger.kernel.org
> Cc: Tejun Heo

Acked-by: Johannes Weiner

The bug aside, it doesn't matter whether the cgroup is online for the
callers. It used to matter when offlining needed to evacuate all
charges from the memcg, and so needed to prevent new ones from showing
up, but we don't care now.
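
For anyone following along, the difference between the two loops can be
sketched in plain userspace C. This is only a toy model, not kernel code:
struct model_css, the model_*() helpers, and the max_tries cap are all
hypothetical names made up for illustration. The real loop in
get_mem_cgroup_from_mm() has no cap and re-reads the memcg under RCU on
every pass, which the model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. This is NOT the kernel's
 * struct cgroup_subsys_state; all names here are hypothetical.
 */
struct model_css {
	int refcnt;	/* 0 means the css has already been released */
	bool online;	/* cleared once the cgroup has been offlined */
};

/* Models css_tryget(): succeeds as long as a reference is still held. */
static bool model_css_tryget(struct model_css *css)
{
	if (css->refcnt == 0)
		return false;
	css->refcnt++;
	return true;
}

/* Models css_tryget_online(): also refuses a css that is offline. */
static bool model_css_tryget_online(struct model_css *css)
{
	if (!css->online)
		return false;
	return model_css_tryget(css);
}

/* Drops one reference taken by a successful tryget. */
static void model_css_put(struct model_css *css)
{
	assert(css->refcnt > 0);
	css->refcnt--;
}

/*
 * Models the do {} while (!tryget()) retry loop. The cap exists only
 * so the model terminates; without it the offline case spins forever.
 * Returns the number of attempts used, or -1 if the cap was reached.
 */
static int model_get_memcg(struct model_css *css,
			   bool (*tryget)(struct model_css *),
			   int max_tries)
{
	int tries = 0;

	do {
		if (++tries > max_tries)
			return -1;
	} while (!tryget(css));

	return tries;
}
```

With a css that is offline but still referenced (refcnt = 1,
online = false), model_get_memcg() with model_css_tryget_online gives
up at the cap, while the same call with model_css_tryget succeeds on
the first attempt and leaves a reference to drop with model_css_put().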