Date: Tue, 14 Jan 2025 14:23:22 -0500
From: Johannes Weiner
To: Michal Hocko
Cc: Rik van Riel, Yosry Ahmed, Balbir Singh, Roman Gushchin,
 Shakeel Butt, Muchun Song, Andrew Morton, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com,
 Nhat Pham
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap
Message-ID: <20250114192322.GB1115056@cmpxchg.org>
References: <20241212115754.38f798b3@fangorn>
 <20241212183012.GB1026@cmpxchg.org>
 <20250114160955.GA1115056@cmpxchg.org>
 <193d98b0d5d2b14da1b96953fcb5d91b2a35bf21.camel@surriel.com>

On Tue, Jan 14, 2025 at
07:13:07PM +0100, Michal Hocko wrote:
> Anyway, have you tried to reproduce with
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7b3503d12aaf..9c30c442e3b0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1627,7 +1627,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	 * A few threads which were not waiting at mutex_lock_killable() can
>  	 * fail to bail out. Therefore, check again after holding oom_lock.
>  	 */
> -	ret = task_is_dying() || out_of_memory(&oc);
> +	ret = out_of_memory(&oc);
>  
>  unlock:
>  	mutex_unlock(&oom_lock);
>
> proposed by Johannes earlier? This should help to trigger the oom reaper
> to free up some memory.

Yes, I was wondering about that too. If the OOM reaper can be our
reliable way of forward progress, we don't need any reserve or
headroom beyond memory.max.

IIRC it can fail if somebody is holding mmap_sem for writing. The exit
path at some point takes that, but also around the time it frees up
all its memory voluntarily, so that should be fine. Are you aware of
other scenarios where it can fail?

What if everything has been swapped out already and there is nothing
to reap? IOW, only unreclaimable/kernel memory remaining in the group.

It still seems to me that allowing the OOM victim (and only the OOM
victim) to bypass memory.max is the only guarantee to progress.

I'm not really concerned about side effects. Any runaway allocation in
the exit path (like the vmalloc one you referenced before) is a much
bigger concern for exceeding the physical OOM reserves in the page
allocator. What's a containment failure for cgroups would be a memory
deadlock at the system level. It's a class of kernel bug that needs
fixing, not something we can really work around in the cgroup code.
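
For illustration only, the bypass argued for above can be modeled in a
few lines of userspace C. This is a toy sketch, not kernel code:
struct toy_group, struct toy_task and toy_try_charge() are hypothetical
names invented here, and the real charge path (reclaim, retries, the
OOM killer itself) is left out entirely. It only shows the rule "the
OOM victim, and only the OOM victim, may exceed the limit".

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names exist in the kernel. */
struct toy_group {
	size_t usage;	/* pages currently charged to the group */
	size_t max;	/* hard limit, standing in for memory.max */
};

struct toy_task {
	bool oom_victim;	/* true once the task was picked for OOM kill */
};

/*
 * Try to charge nr_pages to the group on behalf of task.
 *
 * A charge that fits under max always succeeds. A charge that would
 * exceed max succeeds only for an OOM victim, so that its exit path
 * cannot stall on the limit. Everyone else is refused.
 */
static bool toy_try_charge(struct toy_group *grp, const struct toy_task *task,
			   size_t nr_pages)
{
	if (grp->usage + nr_pages <= grp->max || task->oom_victim) {
		grp->usage += nr_pages;
		return true;
	}
	return false;
}
```

With a group already at its limit, a charge from an ordinary task is
refused and usage is unchanged, while the same charge from a task
marked as OOM victim goes through and pushes usage above max. The model
deliberately puts no cap on that overshoot; containing runaway
allocations in the exit path is the separate problem discussed above.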