Date: Fri, 26 Aug 2022 17:37:12 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Alistair Popple, linux-mm@kvack.org, akpm@linux-foundation.org,
    Nadav Amit, huang ying, LKML, "Sierra Guiza, Alejandro (Alex)",
    Felix Kuehling, Jason Gunthorpe, John Hubbard, Ralph Campbell,
    Matthew Wilcox, Karol Herbst, Lyude Paul, Ben Skeggs, Logan Gunthorpe,
    paulus@ozlabs.org, linuxppc-dev@lists.ozlabs.org, stable@vger.kernel.org,
    Huang Ying
Subject: Re: [PATCH v3 2/3] mm/migrate_device.c: Copy pte dirty bit to page

On Fri, Aug 26, 2022 at 06:46:02PM +0200, David Hildenbrand wrote:
> On 26.08.22 17:55, Peter Xu wrote:
> > On Fri, Aug 26, 2022 at 04:47:22PM +0200, David Hildenbrand wrote:
> >>> To me anon exclusive only shows this mm exclusively owns this page. I
> >>> didn't quickly figure out why that requires different handling on tlb
> >>> flushs. Did I perhaps miss something?
> >>
> >> GUP-fast is the magic bit, we have to make sure that we won't see new
> >> GUP pins, thus the TLB flush.
> >>
> >> include/linux/mm.h:gup_must_unshare() contains documentation.
> >
> > Hmm.. Shouldn't ptep_get_and_clear() (e.g., xchg() on x86_64) already
> > guarantees that no other process/thread will see this pte anymore
> > afterwards?
>
> You could have a GUP-fast thread that just looked up the PTE and is
> going to pin the page afterwards, after the ptep_get_and_clear()
> returned. You'll have to wait until that thread finished.

IIUC the early TLB flush won't prevent a concurrent fast-gup from happening,
but I think it's safe because fast-gup re-checks the pte after pinning, so
either:

  (1) fast-gup runs before ptep_get_and_clear(), then
      page_try_share_anon_rmap() will fail properly, or,

  (2) fast-gup runs during or after ptep_get_and_clear(), then fast-gup
      will see that the pte is either none or changed, and it'll fail
      the fast-gup itself.

>
> Another user that relies on this interaction between GUP-fast and TLB
> flushing is for example mm/ksm.c:write_protect_page()
>
> There is a comment in there explaining the interaction a bit more detailed.
>
> Maybe we'll be able to handle this differently in the future (maybe once
> this turns out to be an actual performance problem). Unfortunately,
> mm->write_protect_seq isn't easily usable because we'd need have to make
> sure we're the exclusive writer.
>
>
> For now, it's not too complicated. For PTEs:
> * try_to_migrate_one() already uses ptep_clear_flush().
> * try_to_unmap_one() already conditionally used ptep_clear_flush().
> * migrate_vma_collect_pmd() was the one case that didn't use it already
>  (and I wonder why it's different than try_to_migrate_one()).

I'm not sure whether I fully get the point, but one major difference here
is that all the rest handle one page at a time, so a TLB flush alongside
the pte clear sounds reasonable. Even so, try_to_unmap_one() was modified
to use TLB batching, and then I see that anon exclusive made that batching
conditional. I also have a question there on whether we can keep using the
TLB batching even with anon exclusive pages.

In general, I still don't see how a stale TLB could affect anon exclusive
pages when racing with fast-gup, because the only side effect of a stale
TLB is an unwanted page update iiuc; the problem is that fast-gup doesn't
even use the TLB, afaict..

Thanks,

-- 
Peter Xu
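
To make the two cases above concrete, here is a rough userspace sketch of the "pin, then re-check the pte" pattern. It is not kernel code: struct page_sketch, gup_fast_sketch(), pte_get_and_clear_sketch(), try_share_anon_sketch() and clear_pte_hook() are all made-up names, the pte is just an atomic integer, and the race is simulated with a hook on a single thread. It does not model memory ordering, IRQ disabling or the TLB, so it only illustrates the argument and does not settle whether the flush is needed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page: only a pin counter. */
struct page_sketch {
	atomic_int pincount;
};

/* Hypothetical stand-in for one pte slot; 0 means pte_none(). */
static _Atomic uint64_t pte_slot;

/* Models ptep_get_and_clear(): atomically fetch the old pte and zero it. */
static uint64_t pte_get_and_clear_sketch(_Atomic uint64_t *ptep)
{
	return atomic_exchange(ptep, 0);
}

/* Hook used to simulate a concurrent ptep_get_and_clear() on pte_slot. */
static void clear_pte_hook(void)
{
	pte_get_and_clear_sketch(&pte_slot);
}

/*
 * Models the fast-gup pattern: read the pte locklessly, take the pin,
 * then re-read the pte and back off if it changed in between.
 * race_hook (may be NULL) runs between the pin and the re-check.
 */
static bool gup_fast_sketch(_Atomic uint64_t *ptep, struct page_sketch *page,
			    void (*race_hook)(void))
{
	uint64_t pte = atomic_load(ptep);

	if (pte == 0)
		return false;			/* nothing mapped */

	atomic_fetch_add(&page->pincount, 1);	/* take the pin */

	if (race_hook)
		race_hook();			/* simulated racing clearer */

	if (atomic_load(ptep) != pte) {
		/* pte is none or changed: drop the pin, fail the fast path */
		atomic_fetch_sub(&page->pincount, 1);
		return false;
	}
	return true;
}

/*
 * Models the check in case (1): dropping the exclusive marker is refused
 * while the page may be pinned.
 */
static bool try_share_anon_sketch(struct page_sketch *page)
{
	return atomic_load(&page->pincount) == 0;
}
```

In case (1) a caller would run gup_fast_sketch() with a NULL hook, then clear the pte, and try_share_anon_sketch() refuses. In case (2) a caller would pass clear_pte_hook, the re-check sees the pte gone, the pin is dropped, and the share attempt is allowed.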