From: Waiman Long
Subject: Re: [RFC PATCH] mm/oom_kill: allow oom kill allocating task for non-global case
To: Shakeel Butt, Waiman Long
Cc: Aaron Tomlin, Linux MM, Andrew Morton, Vlastimil Babka, Michal Hocko, LKML
References: <20210607163103.632681-1-atomlin@redhat.com>
Date: Mon, 7 Jun 2021 16:07:14 -0400

On 6/7/21 3:04 PM, Shakeel Butt wrote:
> On Mon, Jun 7, 2021 at 11:51 AM Waiman Long wrote:
>> On 6/7/21 2:43 PM, Shakeel Butt wrote:
>>> On Mon, Jun 7, 2021 at 9:45 AM Waiman Long wrote:
>>>> On 6/7/21 12:31 PM, Aaron Tomlin wrote:
>>>>> At the present time, in the context of memcg OOM, even when
>>>>> sysctl_oom_kill_allocating_task is enabled or set, the "allocating"
>>>>> task cannot be selected as a target for the OOM killer.
>>>>>
>>>>> This patch removes the restriction entirely.
>>>>>
>>>>> Signed-off-by: Aaron Tomlin
>>>>> ---
>>>>>  mm/oom_kill.c | 6 +++---
>>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>>>>> index eefd3f5fde46..3bae33e2d9c2 100644
>>>>> --- a/mm/oom_kill.c
>>>>> +++ b/mm/oom_kill.c
>>>>> @@ -1089,9 +1089,9 @@ bool out_of_memory(struct oom_control *oc)
>>>>>  		oc->nodemask = NULL;
>>>>>  	check_panic_on_oom(oc);
>>>>>
>>>>> -	if (!is_memcg_oom(oc) && sysctl_oom_kill_allocating_task &&
>>>>> -	    current->mm && !oom_unkillable_task(current) &&
>>>>> -	    oom_cpuset_eligible(current, oc) &&
>>>>> +	if (sysctl_oom_kill_allocating_task && current->mm &&
>>>>> +	    !oom_unkillable_task(current) &&
>>>>> +	    oom_cpuset_eligible(current, oc) &&
>>>>>  	    current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
>>>>>  		get_task_struct(current);
>>>>>  		oc->chosen = current;
>>>> To provide more context for this patch, we are actually seeing this in a
>>>> customer report about an OOM that happened in a container where the
>>>> dominating task used up most of the memory and it happened to be the
>>>> task that triggered the OOM, with the result that no killable process
>>>> could be found.
>>> Why was there no killable process? What about the process allocating
>>> the memory or is this remote memcg charging?
>> It is because the other processes have an oom_score_adj of -1000. So
>> they are non-killable. Anyway, they don't consume that much memory and
>> killing them won't free up that much.
>>
>> The other process that uses most of the memory is the one that triggered
>> the OOM kill in the first place because the memory limit was reached
>> on a new memory allocation. Based on the current logic, this process
>> cannot be killed at all, even if we set oom_kill_allocating_task to 1,
>> if the OOM happens only within the memcg context, not in a global OOM
>> situation.
> I am not really against the patch but I am still not able to
> understand why select_bad_process() was not able to select the current
> process. mem_cgroup_scan_tasks() traverses all the processes in the
> target memcg hierarchy, so why was current skipped?

Yes, you are right. Probably there is some problem with reaping so that
the MMF_OOM_SKIP bit gets set. I don't have the answer yet.

Regards,
Longman
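
P.S. To make the two code paths discussed above easier to compare, here is
a rough user-space sketch. It is not kernel code and not part of the patch:
struct toy_task and all toy_* functions are made-up names, and the victim
scan is a heavy simplification that assumes a task is skipped when it has
oom_score_adj == OOM_SCORE_ADJ_MIN or when its mm is marked MMF_OOM_SKIP.

```c
/*
 * User-space toy model of the checks discussed in this thread.
 * This is NOT kernel code: struct toy_task and every toy_* function
 * are hypothetical stand-ins invented for illustration.
 */
#include <stdbool.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN (-1000) /* same value as the kernel constant */

struct toy_task {
	bool has_mm;          /* stands in for current->mm != NULL */
	bool unkillable;      /* stands in for oom_unkillable_task() */
	bool cpuset_eligible; /* stands in for oom_cpuset_eligible() */
	int oom_score_adj;    /* OOM_SCORE_ADJ_MIN means "never kill" */
	bool oom_skip;        /* stands in for MMF_OOM_SKIP on the mm */
	long pages;           /* crude stand-in for the badness score */
};

/* Conditions shared by the old and the new form of the shortcut. */
bool toy_allocating_task_ok(const struct toy_task *cur)
{
	return cur->has_mm && !cur->unkillable && cur->cpuset_eligible &&
	       cur->oom_score_adj != OOM_SCORE_ADJ_MIN;
}

/* Before the patch: the shortcut is never taken for a memcg OOM. */
bool toy_shortcut_old(bool memcg_oom, bool sysctl_on,
		      const struct toy_task *cur)
{
	return !memcg_oom && sysctl_on && toy_allocating_task_ok(cur);
}

/* After the patch: the !is_memcg_oom() test is dropped. */
bool toy_shortcut_new(bool sysctl_on, const struct toy_task *cur)
{
	return sysctl_on && toy_allocating_task_ok(cur);
}

/*
 * Simplified victim scan: skip tasks that cannot or should not be
 * killed, then pick the largest remaining one. Returns NULL when no
 * task is eligible, which models the "no killable process" outcome.
 */
const struct toy_task *toy_select_victim(const struct toy_task *tasks,
					 size_t n)
{
	const struct toy_task *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct toy_task *t = &tasks[i];

		if (!t->has_mm || t->unkillable || t->oom_skip ||
		    t->oom_score_adj == OOM_SCORE_ADJ_MIN)
			continue;
		if (!best || t->pages > best->pages)
			best = t;
	}
	return best;
}
```

In this model, a large allocating task with a normal oom_score_adj is still
found by the scan during a memcg OOM, which is Shakeel's point. The scan
only comes back empty once that task is also marked as skipped, which is
why I suspect MMF_OOM_SKIP.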