Date: Tue, 3 Nov 2020 12:03:27 -0500
From: Peter Xu <peterx@redhat.com>
To: "Ahmed S. Darwish"
Cc: Jason Gunthorpe, linux-kernel@vger.kernel.org, Linus Torvalds,
 Andrea Arcangeli, Andrew Morton, "Aneesh Kumar K.V", Christoph Hellwig,
 Hugh Dickins, Jan Kara, Jann Horn, John Hubbard, Kirill Shutemov,
 Kirill Tkhai, Leon Romanovsky, Linux-MM, Michal Hocko, Oleg Nesterov,
 Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner,
 Sebastian Siewior
Subject: Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork
Message-ID: <20201103170327.GJ20600@xz-x1>
References: <0-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>
 <2-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com>
 <20201030225250.GB6357@xz-x1> <20201030235121.GQ2620339@nvidia.com>
 <20201103001712.GB52235@lx-t490>
In-Reply-To: <20201103001712.GB52235@lx-t490>

On Tue, Nov 03, 2020 at 01:17:12AM +0100, Ahmed S. Darwish wrote:
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index c48f8df6e50268..294c2c3c4fe00d 100644
> > > > +++ b/mm/memory.c
> > > > @@ -1171,6 +1171,12 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
> > > >  	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
> > > >  				0, src_vma, src_mm, addr, end);
> > > >  	mmu_notifier_invalidate_range_start(&range);
> > > > +	/*
> > > > +	 * The read side doesn't spin, it goes to the mmap_lock, so the
> > > > +	 * raw version is used to avoid disabling preemption here
> > > > +	 */
> > > > +	mmap_assert_write_locked(src_mm);
> > > > +	raw_write_seqcount_t_begin(&src_mm->write_protect_seq);
> > >
> > > Would raw_write_seqcount_begin() be better here?
> >
> > Hum..
> >
> > I felt no because it had the preempt stuff added into it, however it
> > would work - __seqcount_lock_preemptible() == false for the seqcount_t
> > case (see below)
> >
> > Looking more closely, maybe the right API to pick is
> > write_seqcount_t_begin() and write_seqcount_t_end() ??
>
> No, that's not the right API: it is also internal to seqlock.h.
>
> Please stick with the official exported API: raw_write_seqcount_begin().
>
> It should satisfy your needs, and the raw_*() variant is created exactly
> for contexts wishing to avoid the lockdep checks (e.g. NMI handlers
> cannot invoke lockdep, etc.)

Ahmed, Jason, feel free to correct me - but I feel like what Jason wanted
here is indeed the version that does not require disabling preemption,
i.e. write_seqcount_t_begin() and write_seqcount_t_end(), since they are
preempt-safe as long as the read side does not spin to retry (it takes
the mmap_lock instead). I'm not sure whether there is an exported version
of them without the "*_t_*" naming, though.

Another idea: we could keep using raw_write_seqcount_begin(), but move it
from copy_page_range() into copy_pte_range().
I think that would not affect preemption on normal (non-RT) Linux, since
by the time we reach the pte level we have already disabled preemption
anyway (by taking the pgtable spin lock). But there could be extra
overhead, since we would take the write seqcount much more often (once
per pte table rather than once per fork(), so there may be some
performance impact). It would also mean an extra, real preempt_disable()
for the future RT code if it lands some day (since, iiuc, rt_spin_lock -
which is the lock used here - should not need to disable preemption). So
it still seems better to do it in copy_page_range() as Jason proposed.

Thanks,

-- 
Peter Xu