From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.9 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77FDBC433F5 for ; Fri, 3 Sep 2021 07:25:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0E79660F9C for ; Fri, 3 Sep 2021 07:25:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0E79660F9C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 53B51900002; Fri, 3 Sep 2021 03:25:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4EB4B8D0001; Fri, 3 Sep 2021 03:25:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3B233900002; Fri, 3 Sep 2021 03:25:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0146.hostedemail.com [216.40.44.146]) by kanga.kvack.org (Postfix) with ESMTP id 2C4958D0001 for ; Fri, 3 Sep 2021 03:25:57 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B4A9580AC406 for ; Fri, 3 Sep 2021 07:25:56 +0000 (UTC) X-FDA: 78545427912.01.15B3CF2 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf16.hostedemail.com (Postfix) with ESMTP id 4F9E4F000090 for ; Fri, 3 Sep 2021 07:25:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630653955; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dSs9jhYIc4tuPmP+RqJPsx9TcphJP1kswHl64EqKBk0=; b=RLX/Odg+Mnd/M31bVc+WalWapH7rQDe/LbkrtTVytdqBqQxlI4FRj4JqX87lC+h5CNXeQl p7MSBkNd7s1nYtHeF6oeRYXJ9MGgir2wljKruoSDKUw9x+FlquJLX1WB2nX4Tsg7yTDWvB wmL6xxCTHmfpiE72gd1MxpbX1S/fc6w= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-60-3EBbYq77OY-Za945wODuDQ-1; Fri, 03 Sep 2021 03:25:54 -0400 X-MC-Unique: 3EBbYq77OY-Za945wODuDQ-1 Received: by mail-wm1-f72.google.com with SMTP id m22-20020a7bcb96000000b002f7b840d9dcso1581440wmi.1 for ; Fri, 03 Sep 2021 00:25:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:cc:references:from:organization:subject :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=dSs9jhYIc4tuPmP+RqJPsx9TcphJP1kswHl64EqKBk0=; b=o3+ntKIsVrMNAZfQyFFKcVLO/5VH62/gcEzWyS7mTAuoEnMNoBRgXCYPWFf+kT/XDB l5F59I47pXVFMip2GDnJch2QWWjNMaPKJz6bLT2z/HBHJmC9PTwZvbl5EawkKw+23bB6 Jjf7BnaBgP4ZD28HSI07gapzwBjPbmtvUPdekFwa1lyer6B98ZAb8nHmKZnDDnSCWwSt yTTKic789bvcxE/r0PnV9qJzfx4Dd8piLMs79HYVq9uauIdgTppNXKR8sGz7xWa7pxRZ jGFQdTfMBNxVFPfypkKpg3RMiAQGaewp3zUF9BWjG32Y7Xvot/0Rvvkg1VHm2R4+FaQf GWaw== X-Gm-Message-State: AOAM532c375j3ucb+U77rYHGP2fj20Prh+gPE+WKoSPxapXLPstcqnaC SJqDfOVFo7GY3EO0iiejs7PaEcoGCw+dM0wQ8UlMZOndIxtBYIUWPCDHFJeUlUG5ljrjD3/zefd lokPj0+V+A0s= X-Received: by 2002:a1c:f00a:: with SMTP id a10mr1840513wmb.112.1630653953386; Fri, 03 Sep 2021 00:25:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxCLqP0N/WoWFnWHpg0pkGCyTN+2A8G7u0hibsfoTv8ptu4dUdxFKvMgluO+VB1S+c/WAJ/AA== X-Received: by 2002:a1c:f00a:: with SMTP id a10mr1840493wmb.112.1630653953127; Fri, 03 Sep 2021 00:25:53 -0700 (PDT) Received: from [192.168.3.132] (p4ff23e05.dip0.t-ipconnect.de. [79.242.62.5]) by smtp.gmail.com with ESMTPSA id b12sm4571279wrx.72.2021.09.03.00.25.52 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 03 Sep 2021 00:25:52 -0700 (PDT) To: Peter Xu , linux-mm@kvack.org, Hugh Dickins , Andrew Morton , linux-kernel@vger.kernel.org Cc: Miaohe Lin , Matthew Wilcox , Yang Shi , "Kirill A . Shutemov" , Jerome Glisse , Alistair Popple , Andrea Arcangeli , Mike Rapoport References: <20210902201721.52796-1-peterx@redhat.com> <20210902201836.53605-1-peterx@redhat.com> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v2 5/5] mm: Add ZAP_FLAG_SKIP_SWAP and zap_flags Message-ID: Date: Fri, 3 Sep 2021 09:25:51 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210902201836.53605-1-peterx@redhat.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="RLX/Odg+"; spf=none (imf16.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: eeifb3j36eqsd8e6eopsqxcgaskmxt6b X-Rspamd-Queue-Id: 4F9E4F000090 X-Rspamd-Server: rspam04 X-HE-Tag: 1630653956-587631 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 02.09.21 22:18, Peter Xu wrote: > Firstly, the comment in zap_pte_range() is misleading because it checks= against > details rather than check_mappings, so it's against what the code did. >=20 > Meanwhile, it's confusing too on not explaining why passing in the deta= ils s/on// > pointer would mean to skip all swap entries. New user of zap_details c= ould > very possibly miss this fact if they don't read deep until zap_pte_rang= e() > because there's no comment at zap_details talking about it at all, so s= wap > entries could be errornously skipped without being noticed. s/errornously/erroneously/ >=20 > Actually very recently we introduced unmap_mapping_page() in 22061a1ffa= bd, I > think that should also look into swap entries. Add a comment there. I= OW, this > patch will be a functional change to unmap_mapping_page() but hopefully= in the > right way to do it. >=20 > This partly reverts 3e8715fdc03e ("mm: drop zap_details::check_swap_ent= ries"), > but introduce ZAP_FLAG_SKIP_SWAP flag, which means the opposite of prev= ious > "details" parameter: the caller should explicitly set this to skip swap > entries, otherwise swap entries will always be considered (which should= still > be the major case here). >=20 > Cc: Kirill A. Shutemov > Cc: Hugh Dickins > Signed-off-by: Peter Xu > --- > include/linux/mm.h | 16 ++++++++++++++++ > mm/memory.c | 6 +++--- > 2 files changed, 19 insertions(+), 3 deletions(-) >=20 > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 81e402a5fbc9..a7bcdb2ec956 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -1716,12 +1716,18 @@ static inline bool can_do_mlock(void) { return = false; } > extern int user_shm_lock(size_t, struct ucounts *); > extern void user_shm_unlock(size_t, struct ucounts *); > =20 > +typedef unsigned int __bitwise zap_flags_t; > + > +/* Whether to skip zapping swap entries */ > +#define ZAP_FLAG_SKIP_SWAP ((__force zap_flags_t) BIT(0)) Interestingly, this will also skip fake some swap entries (e.g.,=20 migration entries but not private/exclusive entries). Maybe extend that=20 documentation a bit. ... but, looking into zap_pmd_range(), we don't care about "details"=20 when calling zap_huge_pmd(), which will zap pmd migration entries IIUC=20 ... so it's really unclear to me what the flag (and current behavior)=20 really is and what should be documented. Should we maybe really only=20 care about "real" swap entries? Most probably I'm just missing something important. > + > /* > * Parameter block passed down to zap_pte_range in exceptional cases. > */ > struct zap_details { > struct address_space *zap_mapping; /* Check page->mapping if set */ > struct page *single_page; /* Locked page to be unmapped */ > + zap_flags_t zap_flags; /* Extra flags for zapping */ > }; > =20 > /* > @@ -1737,6 +1743,16 @@ zap_skip_check_mapping(struct zap_details *detai= ls, struct page *page) > return details->zap_mapping !=3D page_rmapping(page); > } > =20 > +/* Return true if skip swap entries, false otherwise */ > +static inline bool > +zap_skip_swap(struct zap_details *details) > +{ > + if (!details) > + return false; > + > + return details->zap_flags & ZAP_FLAG_SKIP_SWAP; > +} > + > struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long= addr, > pte_t pte); > struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned = long addr, > diff --git a/mm/memory.c b/mm/memory.c > index e5ee8399d270..4cb269ca8249 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -1379,8 +1379,7 @@ static unsigned long zap_pte_range(struct mmu_gat= her *tlb, > continue; > } > =20 > - /* If details->check_mapping, we leave swap entries. */ > - if (unlikely(details)) > + if (unlikely(zap_skip_swap(details))) > continue; > =20 > if (!non_swap_entry(entry)) > @@ -3351,6 +3350,7 @@ void unmap_mapping_page(struct page *page) > first_index =3D page->index; > last_index =3D page->index + thp_nr_pages(page) - 1; > =20 > + /* Keep ZAP_FLAG_SKIP_SWAP cleared because we're truncating */ > details.zap_mapping =3D mapping; > details.single_page =3D page; > =20 > @@ -3377,7 +3377,7 @@ void unmap_mapping_pages(struct address_space *ma= pping, pgoff_t start, > pgoff_t nr, bool even_cows) > { > pgoff_t first_index =3D start, last_index =3D start + nr - 1; > - struct zap_details details =3D { }; > + struct zap_details details =3D { .zap_flags =3D ZAP_FLAG_SKIP_SWAP }; > =20 > details.zap_mapping =3D even_cows ? NULL : mapping; > if (last_index < first_index) >=20 I think what would really help is to add a high-level description what=20 unmap_mapping_page() vs. unmap_mapping_pages() really does, and what the=20 expectations/use cases are. The names are just way too similar ... I wonder if it would make sense to split this into two parts a) Introduce ZAP_FLAG_SKIP_SWAP and use it accordingly for existing cases b) Stop setting it for unmap_mapping_page() So we'd have the change in behavior isolated. But not sure if it's worth=20 the trouble, especially if we want to backport the fix ... --=20 Thanks, David / dhildenb