Date: Tue, 16 Nov 2021 10:39:08 +0100
From: Michal Hocko
To: Mina Almasry
Cc: Theodore Ts'o, Greg Thelen, Shakeel Butt, Andrew Morton, Hugh Dickins, Roman Gushchin, Johannes Weiner, Tejun Heo, Vladimir Davydov, Muchun Song, riel@surriel.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/4] mm/oom: handle remote ooms
References: <20211111234203.1824138-1-almasrymina@google.com> <20211111234203.1824138-3-almasrymina@google.com>

On Tue 16-11-21 10:28:25, Michal Hocko wrote:
> On Mon 15-11-21 16:58:19, Mina Almasry wrote:
[...]
> > To be honest I think this is very workable, as is Shakeel's suggestion
> > of MEMCG_OOM_NO_VICTIM. Since this is an opt-in feature, we can
> > document the behavior and if the userspace doesn't want to get killed
> > they can catch the sigbus and handle it gracefully. If not, the
> > userspace just gets killed if we hit this edge case.
>
> I am not sure about the MEMCG_OOM_NO_VICTIM approach. It sounds really
> hackish to me. I will get back to Shakeel's email as time permits. The
> primary problem I have with this, though, is that the kernel oom killer
> cannot really do anything sensible if the limit is reached and there
> is nothing reclaimable left in this case. The tmpfs backed memory will
> simply stay around and there are no means to recover without userspace
> intervention.

And just a small clarification. Tmpfs is fundamentally problematic from
the OOM handling POV. The nuance here is that the OOM happens in a
different memcg and thus a different resource domain. If you kill a task
in the target memcg then you effectively DoS that workload. If you kill
the allocating task then it is DoSed by anybody allowed to write to that
shmem. All that without a graceful fallback.

I still have a very hard time seeing how that can work reasonably except
for a very special case with a lot of other measures to ensure the
target memcg never hits the hard limit so the OOM simply is not a
problem.

The memory controller has always been used to enforce and balance memory
usage among resource domains and this goes against that principle. I
would be really curious what Johannes thinks about this.
-- 
Michal Hocko
SUSE Labs