From: Daniel Colascione
Date: Wed, 11 Mar 2020 19:03:29 -0700
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
To: Minchan Kim
Cc: Shakeel Butt, Michal Hocko, Dave Hansen, Jann Horn, Linux-MM, kernel list, "Joel Fernandes (Google)"

On Wed, Mar 11, 2020 at 5:18 PM Minchan Kim wrote:
>
> On Wed, Mar 11, 2020 at 04:53:17PM -0700, Shakeel Butt wrote:
> > On Wed, Mar 11, 2020 at 1:45 AM Michal Hocko wrote:
> > >
> > > On Tue 10-03-20 15:48:31, Dave Hansen wrote:
> > > > Maybe instead of just punting on MADV_PAGEOUT for map_count>1 we should
> > > > only let it affect the *local* process. We could still put the page in
> > > > the swap cache, we just wouldn't go do the rmap walk.
> > >
> > > Is it really worth meddling with the reclaim code and special case
> > > MADV_PAGEOUT there? I mean it is quite reasonable to have an initial
> > > implementation that doesn't really touch shared pages because that can
> > > lead to all sorts of hard to debug and unexpected problems. So I would
> > > much rather go with a simple patch to check map count first and see
> > > whether somebody actually cares about those shared pages and go from
> > > there.
> > >
> > > Minchan, do you want to take my diff and turn it into the proper patch
> > > or should I do it?
> > >
> >
> > What about the remote_madvise(MADV_PAGEOUT)? Will your patch disable
> > the pageout from that code path as well for pages with mapcount > 1?
>
> Maybe not, because the process_madvise syscall needs more privilege
> (i.e., PTRACE_MODE_ATTACH_FSCREDS), so I guess it would be more secure.
> So in that case, I want to rely on the LRU chance for shared pages.

I don't want the behavior of an madvise command to change depending on
*how* the command is invoked.
MADV_PAGEOUT should do the same thing regardless. If you want to
allow purging of shared pages as well, please add a new
MADV_PAGEOUT_ALL or something and require a privilege to use it.

> With that, the manager process could give a hint to several processes
> and finally make them page out.

On many different occasions over the past few years, I've found myself
wanting to ask the kernel to do bulk memory management operations. I'd
much rather add *one* facility to allow for multiple-target mm
operations than add one-off special cases for specific callers as they
come up. If we had such a bulk operation, the kernel could inspect the
bulk operation payload, see whether the number of MADV_PAGEOUT
requests for a given page was equal to the share count for that page,
and, if so, evict that page despite it being shared.
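
To make the bulk idea concrete, here is a rough userspace sketch of
the decision rule only. Everything in it is hypothetical: struct
pageout_req, struct page_info, and bulk_pageout_decide() are names made
up for illustration, not existing kernel interfaces, and a real
implementation would work on the rmap rather than on a flat array. It
also assumes each (pid, page) pair appears at most once in the payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical payload entry: "process pid asks to page out this page". */
struct pageout_req {
	int pid;      /* target process (illustrative only) */
	size_t page;  /* index into the pages[] array below */
};

/* Hypothetical stand-in for a page and its share count. */
struct page_info {
	unsigned int mapcount;  /* number of processes mapping the page */
	unsigned int requests;  /* MADV_PAGEOUT requests seen in this batch */
	bool evict;             /* decision for this batch */
};

/*
 * Mark a page for eviction only if the batch contains one request per
 * mapper, i.e. requests == mapcount. Returns the number of pages marked.
 */
size_t bulk_pageout_decide(struct page_info *pages, size_t npages,
			   const struct pageout_req *reqs, size_t nreqs)
{
	size_t marked = 0;

	/* Reset per-batch state. */
	for (size_t i = 0; i < npages; i++) {
		pages[i].requests = 0;
		pages[i].evict = false;
	}

	/* Count how many requests in the payload name each page. */
	for (size_t i = 0; i < nreqs; i++) {
		assert(reqs[i].page < npages);
		pages[reqs[i].page].requests++;
	}

	/* A shared page goes only when every sharer asked for it. */
	for (size_t i = 0; i < npages; i++) {
		if (pages[i].mapcount > 0 &&
		    pages[i].requests == pages[i].mapcount) {
			pages[i].evict = true;
			marked++;
		}
	}
	return marked;
}
```

With this rule a private page (mapcount 1) behaves exactly as it does
under plain MADV_PAGEOUT, and a page shared by two processes is left
alone unless both of them appear in the same batch.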