Date: Tue, 13 Jun 2023 10:27:32 +0200
From: Michal Hocko
To: Yosry Ahmed
Cc: 程垲涛 Chengkaitao Cheng, tj@kernel.org, lizefan.x@bytedance.com,
    hannes@cmpxchg.org, corbet@lwn.net, roman.gushchin@linux.dev,
    shakeelb@google.com, akpm@linux-foundation.org, brauner@kernel.org,
    muchun.song@linux.dev, viro@zeniv.linux.org.uk,
    zhengqi.arch@bytedance.com, ebiederm@xmission.com,
    Liam.Howlett@oracle.com, chengzhihao1@huawei.com,
    pilgrimtao@gmail.com, haolee.swjtu@gmail.com, yuzhao@google.com,
    willy@infradead.org, vasily.averin@linux.dev, vbabka@suse.cz,
    surenb@google.com, sfr@canb.auug.org.au, mcgrof@kernel.org,
    sujiaxun@uniontech.com, feng.tang@intel.com,
    cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH v3 0/2] memcontrol: support cgroup level OOM protection
References: <66F9BB37-3BE1-4B0F-8DE1-97085AF4BED2@didiglobal.com>

On Sun 04-06-23 01:25:42, Yosry Ahmed wrote:
[...]
> There has been a parallel discussion in the cover letter thread of v4
> [1]. To summarize, at Google, we have been using OOM scores to
> describe different job priorities in a more explicit way -- regardless
> of memory usage. It is strictly priority-based OOM killing. Ties are
> broken based on memory usage.
>
> We understand that something like memory.oom.protect has an advantage
> in the sense that you can skip killing a process if you know that it
> won't free enough memory anyway, but for an environment where multiple
> jobs of different priorities are running, we find it crucial to be
> able to define strict ordering. Some jobs are simply more important
> than others, regardless of their memory usage.

I do remember that discussion. I am not a great fan of simple priority
based interfaces TBH. It sounds like an easy interface but it hits
complications as soon as you try to define proper/sensible hierarchical
semantics. I can see how they might work on leaf memcgs with statically
assigned priorities but that sounds like a very narrow use case IMHO.

I do not think we can afford a plethora of different OOM selection
algorithms implemented in the kernel. Therefore we should really
consider a control interface that is as extensible and as much in line
with the existing interfaces as possible. That is why I am really open
to the oom protection concept, which fits reasonably well with the
reclaim protection scheme. After all, the oom killer is just a very
aggressive method of memory reclaim.

On the other hand, I can see a need for customizable OOM victim
selection functionality. We've been through that discussion on several
other occasions and the best thing we could come up with was to allow
plugging BPF into the victim selection process and to allow bypassing
the system default method. No code has ever materialized from those
discussions though. Maybe this is the time to revive that idea?

> It would be great if we can arrive at an interface that serves this
> use case as well.
>
> Thanks!
>
> [1]https://lore.kernel.org/linux-mm/CAJD7tkaQdSTDX0Q7zvvYrA3Y4TcvLdWKnN3yc8VpfWRpUjcYBw@mail.gmail.com/

-- 
Michal Hocko
SUSE Labs
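
For illustration only, the strict ordering described in the quoted text
(priority decides, memory usage only breaks ties) could be sketched in
plain userspace C roughly as below. This is a sketch under stated
assumptions, not kernel code and not the implementation used at Google:
struct job, its fields and select_victim() are hypothetical names, and
"the lowest priority value is killed first" is an assumed convention.

```c
#include <stddef.h>

/* Hypothetical per-job record; not a kernel structure. */
struct job {
	const char *name;
	int priority;            /* assumption: higher value = more important */
	unsigned long mem_pages; /* current memory usage, in pages */
};

/*
 * Pick the OOM victim from a flat list of jobs: the job with the
 * lowest priority wins; among jobs of equal priority, the one using
 * the most memory is chosen. Returns NULL when the list is empty.
 */
static const struct job *select_victim(const struct job *jobs, size_t n)
{
	const struct job *victim = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct job *j = &jobs[i];

		if (!victim ||
		    j->priority < victim->priority ||
		    (j->priority == victim->priority &&
		     j->mem_pages > victim->mem_pages))
			victim = j;
	}
	return victim;
}
```

Note that the sketch ranks a flat list of jobs. It says nothing about
how priorities would compose across a memcg hierarchy, which is the
complication raised in the reply above.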