Message-ID: <94b7ed1d-9ca8-7d34-a0f4-c46bc995a3d2@shopee.com>
Date: Mon, 25 Sep 2023 20:28:02 +0800
Subject: Re: [PATCH 1/2] memcg, oom: unmark under_oom after the oom killer is done
From: Haifeng Xu
To: Michal Hocko
Cc: hannes@cmpxchg.org, roman.gushchin@linux.dev, shakeelb@google.com, cgroups@vger.kernel.org, linux-mm@kvack.org
References: <20230922070529.362202-1-haifeng.xu@shopee.com> <6b7af68c-2cfb-b789-4239-204be7c8ad7e@shopee.com>

On 2023/9/25 19:38, Michal Hocko wrote:
> On Mon 25-09-23 17:03:05, Haifeng Xu wrote:
>>
>>
>> On 2023/9/25 15:57, Michal Hocko wrote:
>>> On Fri 22-09-23 07:05:28, Haifeng Xu wrote:
>>>> When an application in userland receives an oom notification from the
>>>> kernel and reads the oom_control file, it's confusing that under_oom is 0
>>>> though the oom killer hasn't finished. The reason is that under_oom
>>>> is cleared before invoking mem_cgroup_out_of_memory(), so move the
>>>> unmarking of under_oom to after oom handling completes. Therefore,
>>>> the value of under_oom won't mislead users.
>>>
>>> I do not really remember why we are doing it this way but trying to track
>>> this down shows that we have been doing that since fb2a6fc56be6 ("mm:
>>> memcg: rework and document OOM waiting and wakeup"). So this has been
>>> established behavior for 10 years now. Do we really need to change it
>>> now? The interface is legacy and hopefully no new workloads are
>>> emerging.
>>>
>>> I agree that the placement is surprising but I would rather not change
>>> that unless there is a very good reason for that. Do you have any actual
>>> workload which depends on the ordering? And if yes, how do you deal with
>>> timing when the consumer of the notification just gets woken up after
>>> mem_cgroup_out_of_memory completes?
>>
>> Yes, when the oom event is triggered, we check under_oom every 10 seconds. If it
>> is cleared, then we create a new process with less memory allocation to avoid oom again.
>
> OK, I do understand what you mean and I could have made myself
> more clear previously. Even if the state is cleared _after_
> mem_cgroup_out_of_memory then you won't get what you need I am
> afraid. The memcg stays under OOM until memory is freed (uncharged)
> from that memcg. mem_cgroup_out_of_memory itself doesn't really free
> any memory on its own. It relies on the task to wake up and die or
> oom_reaper to do the work on its behalf. All of that is time dependent.
> under_oom would have to be reimplemented to be cleared when memory is
> uncharged to meet your demands. Something that has never really been
> the semantics.
>

Yes, but at least the memcg has a better chance of getting some memory freed
before we create the new process.

> Btw. is this something new that you are developing on top of v1? And if
> yes, why don't you use v2?
>

Yes. v2 doesn't have the "cgroup.event_control" file.
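
For reference, the polling described above ("we check under_oom every 10 seconds") could look roughly like the sketch below. This is only an illustration and not the actual workload code: the function names, the `max_checks` and `sleep` parameters, and the example path are assumptions made up for this sketch. It relies only on the cgroup v1 memory.oom_control file being a list of "key value" lines such as "under_oom 0".

```python
import time
from pathlib import Path


def parse_oom_control(text):
    """Parse the "key value" lines of a cgroup v1 memory.oom_control file."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" pairs; skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            fields[parts[0]] = int(parts[1])
    return fields


def wait_until_not_under_oom(path, interval=10.0, max_checks=None, sleep=time.sleep):
    """Poll `path` until under_oom reads 0.

    Returns True once under_oom is 0 (or absent), and False if `max_checks`
    reads all saw under_oom set. `sleep` is injectable (hypothetical
    parameter) so the loop can be exercised without real delays.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        fields = parse_oom_control(Path(path).read_text())
        if fields.get("under_oom", 0) == 0:
            return True
        checks += 1
        sleep(interval)
    return False
```

A caller would pass the memory.oom_control path of its own memcg (for example something under /sys/fs/cgroup/memory/, depending on where the v1 hierarchy is mounted) and start the smaller process once the function returns True. As pointed out in the quoted reply, under_oom reading 0 does not by itself guarantee that memory has already been uncharged, so this remains a best-effort check.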