References: <20201113173448.1863419-1-surenb@google.com> <20201113155539.64e0af5b60ad3145b018ab0d@linux-foundation.org> <20201113170032.7aa56ea273c900f97e6ccbdc@linux-foundation.org>
In-Reply-To: <20201113170032.7aa56ea273c900f97e6ccbdc@linux-foundation.org>
From: Suren Baghdasaryan
Date: Fri, 13 Nov 2020 17:09:37 -0800
Subject: Re: [PATCH 1/1] RFC: add pidfd_send_signal flag to reclaim mm while killing a process
To: Andrew Morton
Cc: Michal Hocko, David Rientjes, Matthew Wilcox, Johannes Weiner, Roman Gushchin, Rik van Riel, Christian Brauner, Oleg Nesterov, Tim Murray, linux-api@vger.kernel.org, linux-mm, LKML, kernel-team, Minchan Kim

On Fri, Nov 13, 2020 at 5:00 PM Andrew Morton wrote:
>
> On Fri, 13 Nov 2020 16:06:25 -0800 Suren Baghdasaryan wrote:
>
> > On Fri, Nov 13, 2020 at 3:55 PM Andrew Morton wrote:
> > >
> > > On Fri, 13 Nov 2020 09:34:48 -0800 Suren Baghdasaryan wrote:
> > >
> > > > When a process is being killed it might be in an uninterruptible sleep
> > > > which leads to an unpredictable delay in its memory reclaim. In low memory
> > > > situations, when it's important to free up memory quickly, such delay is
> > > > problematic. Kernel solves this problem with oom-reaper thread which
> > > > performs memory reclaim even when the victim process is not runnable.
> > > > Userspace currently lacks such mechanisms and the need and potential
> > > > solutions were discussed before (see links below).
> > > > This patch provides a mechanism to perform memory reclaim in the context
> > > > of the process that sends SIGKILL signal. New SYNC_REAP_MM flag for
> > > > pidfd_send_signal syscall can be used only when sending SIGKILL signal
> > > > and will lead to the caller synchronously reclaiming the memory that
> > > > belongs to the victim and can be easily reclaimed.
> > >
> > > hm.
> > >
> > > Seems to me that the ability to reap another process's memory is a
> > > generally useful one, and that it should not be tied to delivering a
> > > signal in this fashion.
> > >
> > > And we do have the new process_madvise(MADV_PAGEOUT). It may need a
> > > few changes and tweaks, but can't that be used to solve this problem?
> >
> > Thank you for the feedback, Andrew. process_madvise(MADV_DONTNEED) was
> > one of the options recently discussed in
> > https://lore.kernel.org/linux-api/CAJuCfpGz1kPM3G1gZH+09Z7aoWKg05QSAMMisJ7H5MdmRrRhNQ@mail.gmail.com
> > . The thread describes some of the issues with that approach but if we
> > limit it to processes with pending SIGKILL only then I think that
> > would be doable.
>
> Why would it be necessary to read /proc/pid/maps? I'd have thought
> that a starting effort would be
>
>         madvise((void *)0, (void *)-1, MADV_PAGEOUT)
>
> (after translation into process_madvise() speak). Which is equivalent
> to the proposed process_madvise(MADV_DONTNEED_MM)?

Yep, this is very similar to option #3 in
https://lore.kernel.org/linux-api/CAJuCfpGz1kPM3G1gZH+09Z7aoWKg05QSAMMisJ7H5MdmRrRhNQ@mail.gmail.com
and I actually have a tested prototype for that. If that's the
preferred method then I can post it quite quickly.

>
> There may be things which trip this up, such as mlocked regions or
> whatever, but we could add another madvise `advice' mode to handle
> this?
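
For readers following the thread, here is a minimal C sketch of the two calls being compared. It is neither the RFC patch nor the prototype mentioned in this reply. The helper names pageout_remote and discarded_page_reads_zero are hypothetical, and the fallback values 440 (syscall number) and 21 (MADV_PAGEOUT) are assumptions that only take effect when building against headers that predate those definitions. The first helper shows that process_madvise() works on explicit address ranges passed as an iovec array. The second shows, on the caller's own memory, what MADV_DONTNEED does to a private anonymous page on Linux: the contents are dropped and later reads see zeroes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed fallbacks, used only if the installed headers lack them. */
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Hypothetical helper: ask the kernel to page out one explicit range in
 * the process referred to by pidfd. Returns the number of bytes advised,
 * or -1 with errno set (for example on a kernel without the syscall).
 */
ssize_t pageout_remote(int pidfd, void *addr, size_t len)
{
    struct iovec range = { .iov_base = addr, .iov_len = len };

    /* The last argument is the flags word, which must be 0. */
    return (ssize_t)syscall(SYS_process_madvise, (long)pidfd, &range,
                            1L, (long)MADV_PAGEOUT, 0L);
}

/*
 * Hypothetical demo on the caller's own memory: dirty one private
 * anonymous page, discard it with MADV_DONTNEED, then read it back.
 * Returns 1 if the page reads as zeroes, 0 if not, -1 on setup failure.
 */
int discarded_page_reads_zero(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int zero;

    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);
    assert(p[0] == 0xAB);   /* the write landed before the discard */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* A fresh zero-filled page is faulted in on this access. */
    zero = (p[0] == 0 && p[len - 1] == 0);
    munmap(p, len);
    return zero;
}
```

A caller of pageout_remote would still need a pidfd (for instance from pidfd_open(2)) and a list of ranges to pass in. Where that list comes from, and whether a single catch-all range could replace it, is the open question in the exchange above.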