From: Roman Gushchin
To: Michal Hocko
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Alexei Starovoitov, Johannes Weiner, Shakeel Butt, Suren Baghdasaryan, David Rientjes, Josh Don, Chuyi Zhou, cgroups@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org
Subject: Re: [PATCH rfc 10/12] mm: introduce bpf_out_of_memory() bpf kfunc
Date: Wed, 30 Apr 2025 14:53:50 +0000
References: <20250428033617.3797686-1-roman.gushchin@linux.dev> <20250428033617.3797686-11-roman.gushchin@linux.dev>

On Wed, Apr 30, 2025 at 09:27:39AM +0200, Michal Hocko wrote:
> On Tue 29-04-25 21:31:35, Roman Gushchin wrote:
> > On Tue, Apr 29, 2025 at 01:46:07PM +0200, Michal Hocko wrote:
> > > On Mon 28-04-25 03:36:15, Roman Gushchin wrote:
> > > > Introduce bpf_out_of_memory() bpf kfunc, which allows to declare
> > > > an out of memory events and trigger the corresponding kernel OOM
> > > > handling mechanism.
> > > >
> > > > It takes a trusted memcg pointer (or NULL for system-wide OOMs)
> > > > as an argument, as well as the page order.
> > > >
> > > > Only one OOM can be declared and handled in the system at once,
> > > > so if the function is called in parallel to another OOM handling,
> > > > it bails out with -EBUSY.
> > >
> > > This makes sense for the global OOM handler because concurrent handlers
> > > are cooperative. But is this really correct for memcg ooms which could
> > > happen for different hierarchies? Currently we do block on oom_lock in
> > > that case to make sure one oom doesn't starve others. Do we want the
> > > same behavior for custom OOM handlers?
> >
> > It's a good point and I had similar thoughts when I was working on it.
> > But I think it's orthogonal to the customization of the oom handling.
> > Even for the existing oom killer it makes no sense to serialize memcg ooms
> > in independent memcg subtrees. But I'm worried about the dmesg reporting,
> > it can become really messy for 2+ concurrent OOMs.
> >
> > Also, some memory can be shared, so one OOM can eliminate a need for another
> > OOM, even if they look independent.
> >
> > So my conclusion here is to leave things as they are until we'll get signs
> > of real world problems with the (lack of) concurrency between ooms.
>
> How do we learn about that happening though? I do not think we have any
> counters to watch to suspect that some oom handlers cannot run.

The bpf program which declares an OOM can handle this: e.g. retry, wait
and retry, etc. We can also try to mimic the existing behavior and
wait on oom_lock (potentially splitting it into multiple locks to support
concurrent ooms in various memcgs). Do you think it's preferable?
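
To illustrate the "retry" option, here is a rough sketch of a bounded
retry on -EBUSY. It is a plain userspace mock-up, not a real bpf program:
mock_declare_oom() and struct mock_oom_state are hypothetical stand-ins
for bpf_out_of_memory() and the "another OOM is in flight" condition, and
the names and the attempt limit are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for bpf_out_of_memory(): returns -EBUSY while
 * another OOM is being handled and 0 once the OOM has been declared.
 */
struct mock_oom_state {
	int busy_calls_left;	/* calls that still see a concurrent OOM */
	int calls;		/* total number of declare attempts */
};

static int mock_declare_oom(struct mock_oom_state *st)
{
	st->calls++;
	if (st->busy_calls_left > 0) {
		st->busy_calls_left--;
		return -EBUSY;
	}
	return 0;
}

/*
 * Retry a bounded number of times. Returns 0 on success, or the last
 * error (e.g. -EBUSY) if every attempt failed.
 */
static int declare_oom_with_retry(struct mock_oom_state *st, int max_attempts)
{
	int ret = -EBUSY;

	assert(max_attempts > 0);
	for (int i = 0; i < max_attempts; i++) {
		ret = mock_declare_oom(st);
		if (ret != -EBUSY)
			break;
		/* A real program would wait or back off here. */
	}
	return ret;
}
```

With a bounded loop like this the caller still sees -EBUSY when all
attempts fail, so it can at least count or report the handlers that
could not run.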