Date: Fri, 8 Jul 2022 11:37:31 +0200
From: Michal Hocko
To: Gang Li
Cc: akpm@linux-foundation.org, surenb@google.com, hca@linux.ibm.com,
	gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
	svens@linux.ibm.com, viro@zeniv.linux.org.uk, ebiederm@xmission.com,
	keescook@chromium.org, rostedt@goodmis.org, mingo@redhat.com,
	peterz@infradead.org, acme@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	namhyung@kernel.org, david@redhat.com, imbrenda@linux.ibm.com,
	adobriyan@gmail.com, yang.yang29@zte.com.cn, brauner@kernel.org,
	stephen.s.brennan@oracle.com, zhengqi.arch@bytedance.com,
	haolee.swjtu@gmail.com, xu.xin16@zte.com.cn, Liam.Howlett@oracle.com,
	ohoono.kwon@samsung.com, peterx@redhat.com, arnd@arndb.de,
	shy828301@gmail.com, alex.sierra@amd.com,
	xianting.tian@linux.alibaba.com, willy@infradead.org,
	ccross@google.com, vbabka@suse.cz, sujiaxun@uniontech.com,
	sfr@canb.auug.org.au, vasily.averin@linux.dev, mgorman@suse.de,
	vvghjk1234@gmail.com, tglx@linutronix.de, luto@kernel.org,
	bigeasy@linutronix.de, fenghua.yu@intel.com,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-perf-users@vger.kernel.org
Subject: Re: Re: [PATCH v2 0/5] mm, oom: Introduce per numa node oom for
	CONSTRAINT_{MEMORY_POLICY,CPUSET}
References: <20220708082129.80115-1-ligang.bdlg@bytedance.com>

On Fri 08-07-22 17:25:31, Gang Li wrote:
> Oh apologize. I just realized what you mean.
>
> I should try a "cpuset cgroup oom killer" selecting victim from a
> specific cpuset cgroup.

Yes, that was the idea. Many workloads which really do care about
partitioning the NUMA system tend to use cpusets. In those cases you
have reasonably defined boundaries and the current OOM killer
implementation is not really aware of that.
The OOM selection process could be enhanced/fixed to select victims
from those cpusets, similar to how memcg OOM killer victim selection is
done. There is no additional accounting required for this approach
because the workload is already partitioned on the cgroup level. Maybe
this is not really the best fit for all workloads, but it should be
reasonably simple to implement without intrusive or runtime-visible
changes.

I am not saying per-NUMA accounting is wrong or a bad idea. I would just
like to see a stronger justification for that and also some arguments
why a simpler approach via cpusets is not viable. Does this make sense
to you?

-- 
Michal Hocko
SUSE Labs