From: Mina Almasry
Date: Fri, 12 Nov 2021 00:12:52 -0800
Subject: Re: [PATCH v3 2/4] mm/oom: handle remote ooms
To: Michal Hocko
Cc: "Theodore Ts'o", Greg Thelen, Shakeel Butt, Andrew Morton,
 Hugh Dickins, Roman Gushchin, Johannes Weiner, Tejun Heo,
 Vladimir Davydov, Muchun Song, riel@surriel.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org
References: <20211111234203.1824138-1-almasrymina@google.com>
 <20211111234203.1824138-3-almasrymina@google.com>

On Thu, Nov 11, 2021 at 11:52 PM Michal Hocko wrote:
>
> On Thu 11-11-21 15:42:01, Mina Almasry wrote:
> > On remote ooms (OOMs due to remote charging), the oom-killer will attempt
> > to find a task to kill in the memcg under oom, if the oom-killer
> > is unable to find one, the oom-killer should simply return ENOMEM to the
> > allocating process.
>
> This really begs for some justification.
>

I think (and I can add this to the commit message in v4) that we have
two reasonable options when the oom-killer gets invoked and finds
nothing to kill: (1) return ENOMEM, (2) kill the allocating task. My
reasoning is that returning ENOMEM allows the application to gracefully
handle the failure to remote charge and continue operation.

For example, in the network service use case that I mentioned in the
RFC proposal, it's beneficial for the network service to get an ENOMEM
and continue to service network requests for other clients running on
the machine, rather than get oom-killed when hitting the remote memcg
limit. But this is not a hard requirement: the network service could
fork a process that does the remote charging, to guard against the
remote charge bringing down the entire process.

> > If we're in pagefault path and we're unable to return ENOMEM to the
> > allocating process, we instead kill the allocating process.
>
> Why do you handle those differently?
>

I think (possibly incorrectly) that it's beneficial to return ENOMEM to
the allocating task rather than killing it. I would love to return
ENOMEM in both of these cases, but I can't return ENOMEM in the fault
path. The behavior I see is that the oom-killer gets invoked over and
over again, fails each time to find something to kill, and the
pagefault never gets handled.

I could, however, kill the allocating task whether it's in the
pagefault path or not; it's not a hard requirement that I return
ENOMEM. If this is what you'd like to see in v4, please let me know,
but I do see some value in allowing some callers to gracefully handle
the ENOMEM.

> > Signed-off-by: Mina Almasry
> >
> > Cc: Michal Hocko
> > Cc: Theodore Ts'o
> > Cc: Greg Thelen
> > Cc: Shakeel Butt
> > Cc: Andrew Morton
> > Cc: Hugh Dickins
> > CC: Roman Gushchin
> > Cc: Johannes Weiner
> > Cc: Hugh Dickins
> > Cc: Tejun Heo
> > Cc: Vladimir Davydov
> > Cc: Muchun Song
> > Cc: riel@surriel.com
> > Cc: linux-mm@kvack.org
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: cgroups@vger.kernel.org
> --
> Michal Hocko
> SUSE Labs
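
To make the behavior discussed above concrete, here is a minimal
userspace sketch of the decision as described in this thread: kill a
victim in the memcg under oom if one is found, otherwise kill the
allocating task in the pagefault path, otherwise return ENOMEM. This is
not the patch itself. Every name below (remote_oom_decide,
remote_oom_event, and so on) is hypothetical and exists only for
illustration; none of them are kernel symbols.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of a single remote-OOM event. */
enum remote_oom_action {
	REMOTE_OOM_KILL_VICTIM,    /* a task in the memcg under oom is killed */
	REMOTE_OOM_KILL_ALLOCATOR, /* no victim, pagefault path: kill the allocating task */
	REMOTE_OOM_RETURN_ENOMEM,  /* no victim, any other path: fail with ENOMEM */
};

/* Hypothetical description of the event being handled. */
struct remote_oom_event {
	bool victim_found; /* the oom-killer found a task to kill in the memcg */
	bool in_pagefault; /* the remote charge came from the pagefault path */
};

/* Pick an action following the order described in the thread. */
static enum remote_oom_action
remote_oom_decide(const struct remote_oom_event *ev)
{
	assert(ev != NULL);

	if (ev->victim_found)
		return REMOTE_OOM_KILL_VICTIM;

	/* ENOMEM cannot be returned from the fault path, so kill instead. */
	if (ev->in_pagefault)
		return REMOTE_OOM_KILL_ALLOCATOR;

	return REMOTE_OOM_RETURN_ENOMEM;
}

/*
 * Modeling assumption: the allocating task only sees an error code in
 * the ENOMEM case. In the two kill cases this model reports 0.
 */
static int remote_oom_errno(enum remote_oom_action action)
{
	return action == REMOTE_OOM_RETURN_ENOMEM ? -ENOMEM : 0;
}
```

In this model, a caller such as the network service mentioned above
would check for -ENOMEM from a non-pagefault charge and drop only the
affected request, while a pagefault with no killable victim never
reaches that error-handling code.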