Date: Wed, 11 Nov 2020 14:59:20 -0500
From: Peter Xu
To: Jason Gunthorpe
Cc: linux-kernel@vger.kernel.org, Linus Torvalds, "Ahmed S. Darwish",
 Andrea Arcangeli, Andrew Morton, "Aneesh Kumar K.V", Christoph Hellwig,
 Hugh Dickins, Jan Kara, Jann Horn, John Hubbard, Kirill Shutemov,
 Kirill Tkhai, Leon Romanovsky, Linux-MM, Michal Hocko, Oleg Nesterov
Subject: Re: [PATCH v4 2/2] mm: prevent gup_fast from racing with COW during fork
Message-ID: <20201111195920.GO26342@xz-x1>
References: <0-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com>
 <2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com>
In-Reply-To: <2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com>

On Tue, Nov 10, 2020 at 07:44:09PM -0400, Jason Gunthorpe wrote:
> Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
> fork() for ptes") pages under a FOLL_PIN will not be write protected
> during COW for fork. This means that pages returned from
> pin_user_pages(FOLL_WRITE) should not become write protected while the pin
> is active.
>
> However, there is a small race where get_user_pages_fast(FOLL_PIN) can
> establish a FOLL_PIN at the same time copy_present_page() is write
> protecting it:
>
>         CPU 0                             CPU 1
>    get_user_pages_fast()
>     internal_get_user_pages_fast()
>                                        copy_page_range()
>                                          pte_alloc_map_lock()
>                                            copy_present_page()
>                                              atomic_read(has_pinned) == 0
>                                              page_maybe_dma_pinned() == false
>      atomic_set(has_pinned, 1);
>      gup_pgd_range()
>       gup_pte_range()
>         pte_t pte = gup_get_pte(ptep)
>         pte_access_permitted(pte)
>         try_grab_compound_head()
>                                              pte = pte_wrprotect(pte)
>                                              set_pte_at();
>                                          pte_unmap_unlock()
>       // GUP now returns with a write protected page
>
> The first attempt to resolve this by using the write protect caused
> problems (and was missing a barrier), see commit f3c64eda3e50 ("mm: avoid
> early COW write protect games during fork()")
>
> Instead wrap copy_p4d_range() with the write side of a seqcount and check
> the read side around gup_pgd_range(). If there is a collision then
> get_user_pages_fast() fails and falls back to slow GUP.
>
> Slow GUP is safe against this race because copy_page_range() is only
> called while holding the exclusive side of the mmap_lock on the src
> mm_struct.
>
> Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()")
> Suggested-by: Linus Torvalds
> Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com
> Reviewed-by: John Hubbard
> Reviewed-by: Jan Kara
> Signed-off-by: Jason Gunthorpe

Reviewed-by: Peter Xu

-- 
Peter Xu