Date: Wed, 31 Aug 2022 12:21:49 -0400
From: Peter Xu
To: David Hildenbrand
Cc: John Hubbard, Jason Gunthorpe, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Andrew Morton, Mel Gorman,
 "Matthew Wilcox (Oracle)", Andrea Arcangeli, Hugh Dickins
Subject: Re: [PATCH v1 2/3] mm/gup: use gup_can_follow_protnone() also in GUP-fast
In-Reply-To: <68b38ac4-c680-b694-21a9-1971396d63b9@redhat.com>

On Tue, Aug 30, 2022 at 09:23:44PM +0200, David Hildenbrand wrote:
> On 30.08.22 21:18, John Hubbard wrote:
> > On 8/30/22 11:53, David Hildenbrand wrote:
> >> Good, I managed to attract the attention of someone who understands that machinery :)
> >>
> >> While validating whether GUP-fast and PageAnonExclusive code work correctly,
> >> I started looking at the whole RCU GUP-fast machinery. I do have a patch to
> >> improve PageAnonExclusive clearing (I think we're missing memory barriers to
> >> make it work as expected in any possible case), but I also stumbled eventually
> >> over a more generic issue that might need memory barriers.
> >>
> >> Any thoughts whether I am missing something or this is actually missing
> >> memory barriers?
> >>
> >
> > It's actually missing memory barriers.
> >
> > In fact, others have had that same thought! [1] :) In that 2019 thread,
> > I recall that this got dismissed because of a focus on the IPI-based
> > aspect of gup fast synchronization (there was some hand waving, perhaps
> > accurate waving, about memory barriers vs. CPU interrupts). But now the
> > RCU (non-IPI) implementation is more widely used than it used to be, the
> > issue is clearer.
> >
> >>
> >> From ce8c941c11d1f60cea87a3e4d941041dc6b79900 Mon Sep 17 00:00:00 2001
> >> From: David Hildenbrand
> >> Date: Mon, 29 Aug 2022 16:57:07 +0200
> >> Subject: [PATCH] mm/gup: update refcount+pincount before testing if the PTE
> >>  changed
> >>
> >> mm/ksm.c:write_protect_page() has to make sure that no unknown
> >> references to a mapped page exist and that no additional ones with write
> >> permissions are possible -- unknown references could have write permissions
> >> and modify the page afterwards.
> >>
> >> Conceptually, mm/ksm.c:write_protect_page() consists of:
> >>   (1) Clear/invalidate PTE
> >>   (2) Check if there are unknown references; back off if so.
> >>   (3) Update PTE (e.g., map it R/O)
> >>
> >> Conceptually, GUP-fast code consists of:
> >>   (1) Read the PTE
> >>   (2) Increment refcount/pincount of the mapped page
> >>   (3) Check if the PTE changed by re-reading it; back off if so.
> >>
> >> To make sure GUP-fast won't be able to grab additional references after
> >> clearing the PTE, but will properly detect the change and back off, we
> >> need a memory barrier between updating the recount/pincount and checking
> >> if it changed.
> >>
> >> try_grab_folio() doesn't necessarily imply a memory barrier, so add an
> >> explicit smp_mb__after_atomic() after the atomic RMW operation to
> >> increment the refcount and pincount.
> >>
> >> ptep_clear_flush() used to clear the PTE and flush the TLB should imply
> >> a memory barrier for flushing the TLB, so don't add another one for now.
> >>
> >> PageAnonExclusive handling requires further care and will be handled
> >> separately.
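
For reference, the pairing described in the two step lists above can be
modeled in userspace. This is only a rough sketch, assuming C11 atomics as
stand-ins for the kernel primitives: struct model_page, model_gup_fast() and
model_write_protect() are made-up names, and
atomic_thread_fence(memory_order_seq_cst) merely approximates
smp_mb__after_atomic() and the barrier that ptep_clear_flush() is expected to
imply.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a mapped page; not a kernel type. */
struct model_page {
	atomic_int refcount;	/* known references plus any GUP references */
	atomic_ulong pte;	/* 0 means "cleared/invalidated" */
};

/* GUP-fast side: (1) read PTE, (2) grab a reference, (3) re-read PTE. */
static bool model_gup_fast(struct model_page *p)
{
	unsigned long pte = atomic_load_explicit(&p->pte, memory_order_relaxed);

	if (!pte)
		return false;

	atomic_fetch_add_explicit(&p->refcount, 1, memory_order_relaxed);
	/* Assumed analog of smp_mb__after_atomic(): order (2) before (3). */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&p->pte, memory_order_relaxed) != pte) {
		/* PTE changed under us: drop the reference and back off. */
		atomic_fetch_sub_explicit(&p->refcount, 1, memory_order_relaxed);
		return false;
	}
	return true;
}

/* KSM side: (1) clear PTE, (2) check for unknown references, (3) update PTE. */
static bool model_write_protect(struct model_page *p, int known_refs,
				unsigned long ro_pte)
{
	unsigned long old;

	assert(known_refs > 0);	/* the caller's own reference is always known */

	old = atomic_exchange_explicit(&p->pte, 0UL, memory_order_relaxed);
	/* Assumed analog of the barrier implied by ptep_clear_flush(). */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&p->refcount, memory_order_relaxed) != known_refs) {
		/* Unknown reference: restore the old PTE and back off. */
		atomic_store_explicit(&p->pte, old, memory_order_relaxed);
		return false;
	}
	atomic_store_explicit(&p->pte, ro_pte, memory_order_relaxed);
	return true;
}
```

Called back to back on one thread, model_gup_fast() followed by
model_write_protect(&p, 1, ...) makes the latter back off because it sees the
extra reference. The interesting case is the concurrent one: with a full fence
between the store and the load on both sides, at least one side has to observe
the other's store, so either GUP-fast sees the cleared PTE or the KSM side
sees the raised refcount.
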
> >>
> >> Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()")
> >> Signed-off-by: David Hildenbrand
> >> ---
> >>  mm/gup.c | 17 +++++++++++++++++
> >>  1 file changed, 17 insertions(+)
> >>
> >> diff --git a/mm/gup.c b/mm/gup.c
> >> index 5abdaf487460..0008b808f484 100644
> >> --- a/mm/gup.c
> >> +++ b/mm/gup.c
> >> @@ -2392,6 +2392,14 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
> >>  			goto pte_unmap;
> >>  		}
> >>
> >> +		/*
> >> +		 * Update refcount/pincount before testing for changed PTE. This
> >> +		 * is required for code like mm/ksm.c:write_protect_page() that
> >> +		 * wants to make sure that a page has no unknown references
> >> +		 * after clearing the PTE.
> >> +		 */
> >> +		smp_mb__after_atomic();
> >> +
> >>  		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
> >>  			gup_put_folio(folio, 1, flags);
> >>  			goto pte_unmap;
> >> @@ -2577,6 +2585,9 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
> >>  	if (!folio)
> >>  		return 0;
> >>
> >> +	/* See gup_pte_range(). */
> >
> > Don't we usually also identify what each mb pairs with, in the comments? That would help.
>
> Yeah, if only I could locate them reliably (as documented ptep_clear_flush()
> should imply one I guess) ... but it will depend on the context.
>
>
> As I now have the attention of two people that understand that machinery,
> here goes the PageAnonExclusive thing I *think* should be correct.
>
> The IPI-based mechanism really did make such synchronization with
> GUP-fast easier ...
>
>
>
> From 8f91ef3555178149ad560b5424a9854b2ceee2d6 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand
> Date: Sat, 27 Aug 2022 10:44:13 +0200
> Subject: [PATCH] mm: rework PageAnonExclusive() interaction with GUP-fast
>
> commit 6c287605fd56 (mm: remember exclusively mapped anonymous pages with
> PG_anon_exclusive) made sure that when PageAnonExclusive() has to be
> cleared during temporary unmapping of a page, that the PTE is
> cleared/invalidated and that the TLB is flushed.
>
> That handling was inspired by an outdated comment in
> mm/ksm.c:write_protect_page(), which similarly required the TLB flush in
> the past to synchronize with GUP-fast. However, ever since general RCU GUP
> fast was introduced in commit 2667f50e8b81 ("mm: introduce a general RCU
> get_user_pages_fast()"), a TLB flush is no longer sufficient and
> required to synchronize with concurrent GUP-fast
>
> Peter pointed out, that TLB flush is not required, and looking into
> details it turns out that he's right. To synchronize with GUP-fast, it's
> sufficient to clear the PTE only: GUP-fast will either detect that the PTE
> changed or that PageAnonExclusive is not set and back off. However, we
> rely on a given memory order and should make sure that that order is
> always respected.
>
> Conceptually, GUP-fast pinning code of anon pages consists of:
>   (1) Read the PTE
>   (2) Pin the mapped page
>   (3) Check if the PTE changed by re-reading it; back off if so.
>   (4) Check if PageAnonExclusive is not set; back off if so.
>
> Conceptually, PageAnonExclusive clearing code consists of:
>   (1) Clear PTE
>   (2) Check if the page is pinned; back off if so.
>   (3) Clear PageAnonExclusive
>   (4) Restore PTE (optional)
>
> As GUP-fast temporarily pins the page before validating whether the PTE
> changed, and PageAnonExclusive clearing code clears the PTE before
> checking if the page is pinned, GUP-fast cannot end up pinning an anon
> page that is not exclusive.
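
The two step lists above can also be checked mechanically. Below is a rough
userspace sketch that enumerates every interleaving of the four GUP-fast
steps with the four clearing steps and counts the runs in which GUP-fast
keeps its pin although PageAnonExclusive got cleared. It assumes that each
step executes atomically and in program order (i.e. what the barriers are
supposed to guarantee) and does not model reordering at all; struct state,
gup_step(), clear_step(), explore() and count_violations() are made-up names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this is kernel code. */
struct state {
	unsigned long pte;	/* 0 means "cleared" */
	int pincount;
	bool anon_exclusive;
	bool check_exclusive;	/* does GUP-fast perform its step (4)? */
	int g_step, c_step;	/* next step of GUP-fast / clearing side */
	unsigned long g_pte, c_pte;
	bool g_done, g_ok, c_done, c_ok;
};

static struct state gup_step(struct state s)
{
	switch (s.g_step++) {
	case 0:	/* (1) Read the PTE */
		s.g_pte = s.pte;
		s.g_done = !s.g_pte;
		break;
	case 1:	/* (2) Pin the mapped page */
		s.pincount++;
		break;
	case 2:	/* (3) Back off if the PTE changed */
		if (s.pte != s.g_pte) {
			s.pincount--;
			s.g_done = true;
		}
		break;
	default: /* (4) Back off if PageAnonExclusive is not set */
		if (s.check_exclusive && !s.anon_exclusive)
			s.pincount--;
		else
			s.g_ok = true;
		s.g_done = true;
		break;
	}
	return s;
}

static struct state clear_step(struct state s)
{
	switch (s.c_step++) {
	case 0:	/* (1) Clear PTE */
		s.c_pte = s.pte;
		s.pte = 0;
		break;
	case 1:	/* (2) Back off (restoring the PTE) if the page is pinned */
		if (s.pincount > 0) {
			s.pte = s.c_pte;
			s.c_done = true;
		}
		break;
	case 2:	/* (3) Clear PageAnonExclusive */
		s.anon_exclusive = false;
		break;
	default: /* (4) Restore PTE to the very same value */
		s.pte = s.c_pte;
		s.c_ok = true;
		s.c_done = true;
		break;
	}
	return s;
}

/* Depth-first walk over all interleavings; returns the number of bad runs. */
static int explore(struct state s, int *runs)
{
	int bad = 0;

	assert(s.pincount >= 0);
	if (s.g_done && s.c_done) {
		(*runs)++;
		/* Bad: pin kept although the page is no longer exclusive. */
		return s.g_ok && s.c_ok;
	}
	if (!s.g_done)
		bad += explore(gup_step(s), runs);
	if (!s.c_done)
		bad += explore(clear_step(s), runs);
	return bad;
}

static int count_violations(bool check_exclusive, int *runs)
{
	struct state s = {
		.pte = 0x1003,
		.anon_exclusive = true,
		.check_exclusive = check_exclusive,
	};

	*runs = 0;
	return explore(s, runs);
}
```

In this model count_violations(true, &runs) finds no bad run, while
count_violations(false, &runs) does: once the clearing side has finished and
restored the same PTE value, a GUP-fast that skips its step (4) re-reads an
unchanged PTE and keeps the pin.
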
>
> One corner case to consider is when we restore the PTE to the same value
> after PageAnonExclusive was cleared, as it can happen in
> mm/ksm.c:write_protect_page(). In that case, GUP-fast might not detect
> a PTE change (because there was none). However, as restoring the PTE
> happens after clearing PageAnonExclusive, GUP-fast would detect that
> PageAnonExclusive was cleared in that case and would properly back off.
>
> Let's document that, avoid the TLB flush where possible and use proper
> explicit memory barriers where required. We shouldn't really care about the
> additional memory barriers here, as we're not on extremely hot paths.
>
> The possible issues due to reordering are of theoretical nature so far,
> but it better be addressed.
>
> Note that we don't need a memory barrier between checking if the page is
> pinned and clearing PageAnonExclusive, because stores are not
> speculated.
>
> Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
> Signed-off-by: David Hildenbrand
> ---
>  include/linux/mm.h   |  9 +++++--
>  include/linux/rmap.h | 58 ++++++++++++++++++++++++++++++++++++++++----
>  mm/huge_memory.c     |  3 +++
>  mm/ksm.c             |  1 +
>  mm/migrate_device.c  | 22 +++++++----------
>  mm/rmap.c            | 11 +++++----
>  6 files changed, 79 insertions(+), 25 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 21f8b27bd9fd..f7e8f4b34fb5 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2975,8 +2975,8 @@ static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
>   * PageAnonExclusive() has to protect against concurrent GUP:
>   * * Ordinary GUP: Using the PT lock
>   * * GUP-fast and fork(): mm->write_protect_seq
> - * * GUP-fast and KSM or temporary unmapping (swap, migration):
> - *   clear/invalidate+flush of the page table entry
> + * * GUP-fast and KSM or temporary unmapping (swap, migration): see
> + *   page_try_share_anon_rmap()
>   *
>   * Must be called with the (sub)page that's actually referenced via the
>   * page table entry, which might not necessarily be the head page for a
> @@ -2997,6 +2997,11 @@ static inline bool gup_must_unshare(unsigned int flags, struct page *page)
>  	 */
>  	if (!PageAnon(page))
>  		return false;
> +
> +	/* See page_try_share_anon_rmap() for GUP-fast details. */
> +	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) && irqs_disabled())
> +		smp_rmb();
> +
>  	/*
>  	 * Note that PageKsm() pages cannot be exclusive, and consequently,
>  	 * cannot get pinned.
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index bf80adca980b..454c159f2aae 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -267,7 +267,7 @@ static inline int page_try_dup_anon_rmap(struct page *page, bool compound,
>   * @page: the exclusive anonymous page to try marking possibly shared
>   *
>   * The caller needs to hold the PT lock and has to have the page table entry
> - * cleared/invalidated+flushed, to properly sync against GUP-fast.
> + * cleared/invalidated.
>   *
>   * This is similar to page_try_dup_anon_rmap(), however, not used during fork()
>   * to duplicate a mapping, but instead to prepare for KSM or temporarily
> @@ -283,12 +283,60 @@ static inline int page_try_share_anon_rmap(struct page *page)
>  {
>  	VM_BUG_ON_PAGE(!PageAnon(page) || !PageAnonExclusive(page), page);
>
> -	/* See page_try_dup_anon_rmap(). */
> -	if (likely(!is_device_private_page(page) &&
> -	    unlikely(page_maybe_dma_pinned(page))))
> -		return -EBUSY;
> +	/* device private pages cannot get pinned via GUP. */
> +	if (unlikely(is_device_private_page(page))) {
> +		ClearPageAnonExclusive(page);
> +		return 0;
> +	}
>
> +	/*
> +	 * We have to make sure that while we clear PageAnonExclusive, that
> +	 * the page is not pinned and that concurrent GUP-fast won't succeed in
> +	 * concurrently pinning the page.
> +	 *
> +	 * Conceptually, GUP-fast pinning code of anon pages consists of:
> +	 * (1) Read the PTE
> +	 * (2) Pin the mapped page
> +	 * (3) Check if the PTE changed by re-reading it; back off if so.
> +	 * (4) Check if PageAnonExclusive is not set; back off if so.
> +	 *
> +	 * Conceptually, PageAnonExclusive clearing code consists of:
> +	 * (1) Clear PTE
> +	 * (2) Check if the page is pinned; back off if so.
> +	 * (3) Clear PageAnonExclusive
> +	 * (4) Restore PTE (optional)
> +	 *
> +	 * In GUP-fast, we have to make sure that (2),(3) and (4) happen in
> +	 * the right order. Memory order between (2) and (3) is handled by
> +	 * GUP-fast, independent of PageAnonExclusive.
> +	 *
> +	 * When clearing PageAnonExclusive(), we have to make sure that (1),
> +	 * (2), (3) and (4) happen in the right order.
> +	 *
> +	 * Note that (4) has to happen after (3) in both cases to handle the
> +	 * corner case whereby the PTE is restored to the original value after
> +	 * clearing PageAnonExclusive and while GUP-fast might not detect the
> +	 * PTE change, it will detect the PageAnonExclusive change.
> +	 *
> +	 * We assume that there might not be a memory barrier after
> +	 * clearing/invalidating the PTE (1) and before restoring the PTE (4),
> +	 * so we use explicit ones here.
> +	 *
> +	 * These memory barriers are paired with memory barriers in GUP-fast
> +	 * code, including gup_must_unshare().
> +	 */
> +
> +	/* Clear/invalidate the PTE before checking for PINs. */
> +	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
> +		smp_mb();

Wondering whether this could be smp_mb__before_atomic().

> +
> +	if (unlikely(page_maybe_dma_pinned(page)))
> +		return -EBUSY;
>  	ClearPageAnonExclusive(page);
> +
> +	/* Clear PageAnonExclusive() before eventually restoring the PTE. */
> +	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
> +		smp_mb__after_atomic();
>  	return 0;
>  }
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e9414ee57c5b..2aef8d76fcf2 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2140,6 +2140,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  		 *
>  		 * In case we cannot clear PageAnonExclusive(), split the PMD
>  		 * only and let try_to_migrate_one() fail later.
> +		 *
> +		 * See page_try_share_anon_rmap(): invalidate PMD first.
>  		 */
>  		anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
>  		if (freeze && anon_exclusive && page_try_share_anon_rmap(page))
> @@ -3177,6 +3179,7 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
>  	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
>  	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
>
> +	/* See page_try_share_anon_rmap(): invalidate PMD first. */
>  	anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
>  	if (anon_exclusive && page_try_share_anon_rmap(page)) {
>  		set_pmd_at(mm, address, pvmw->pmd, pmdval);
> diff --git a/mm/ksm.c b/mm/ksm.c
> index d7526c705081..971cf923c0eb 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -1091,6 +1091,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
>  			goto out_unlock;
>  		}
>
> +		/* See page_try_share_anon_rmap(): clear PTE first. */
>  		if (anon_exclusive && page_try_share_anon_rmap(page)) {
>  			set_pte_at(mm, pvmw.address, pvmw.pte, entry);
>  			goto out_unlock;
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 27fb37d65476..47e955212f15 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -193,20 +193,16 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			bool anon_exclusive;
>  			pte_t swp_pte;
>

flush_cache_page() missing here?

Better to copy Alistair too when posting formally, since this will have a
slight conflict with the other thread.

> +			ptep_get_and_clear(mm, addr, ptep);
> +
> +			/* See page_try_share_anon_rmap(): clear PTE first. */
>  			anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
> -			if (anon_exclusive) {
> -				flush_cache_page(vma, addr, pte_pfn(*ptep));
> -				ptep_clear_flush(vma, addr, ptep);
> -
> -				if (page_try_share_anon_rmap(page)) {
> -					set_pte_at(mm, addr, ptep, pte);
> -					unlock_page(page);
> -					put_page(page);
> -					mpfn = 0;
> -					goto next;
> -				}
> -			} else {
> -				ptep_get_and_clear(mm, addr, ptep);
> +			if (anon_exclusive && page_try_share_anon_rmap(page)) {
> +				set_pte_at(mm, addr, ptep, pte);
> +				unlock_page(page);
> +				put_page(page);
> +				mpfn = 0;
> +				goto next;
>  			}
>
>  			migrate->cpages++;

Thanks,

-- 
Peter Xu