From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89DB0C433E0 for ; Wed, 3 Feb 2021 21:15:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DD4D164F68 for ; Wed, 3 Feb 2021 21:15:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DD4D164F68 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4574B6B0005; Wed, 3 Feb 2021 16:15:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3E1396B006C; Wed, 3 Feb 2021 16:15:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2CEDF6B006E; Wed, 3 Feb 2021 16:15:26 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0201.hostedemail.com [216.40.44.201]) by kanga.kvack.org (Postfix) with ESMTP id 144B36B0005 for ; Wed, 3 Feb 2021 16:15:26 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CD1EE180AD806 for ; Wed, 3 Feb 2021 21:15:25 +0000 (UTC) X-FDA: 77778212610.12.knife09_550309c275d6 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin12.hostedemail.com (Postfix) with ESMTP id A76F418012029 for ; Wed, 3 Feb 2021 21:15:25 +0000 (UTC) X-HE-Tag: knife09_550309c275d6 X-Filterd-Recvd-Size: 7329 Received: from mail-lf1-f50.google.com (mail-lf1-f50.google.com [209.85.167.50]) by imf04.hostedemail.com (Postfix) with ESMTP for ; Wed, 3 Feb 2021 21:15:25 +0000 (UTC) Received: by mail-lf1-f50.google.com with SMTP id q12so1206226lfo.12 for ; Wed, 03 Feb 2021 13:15:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=i+u6BTILoUVaAAqMvPP05q1+sYeBPgAq4q8dfomuEGM=; b=WEU9PqqZm2T0n4ETvzj8mQa1byAVBWQqIYe3UmQcn18OJkbNKbZTIyyJA1vcHeB9Qh exvPK/JEtrSWN7eF8haKy1SWYJ64eE/yz8Kr6mQ/5HqBIf2LUWRyeyuaTGlpm3LtIdfi QdkMo2BqRJIdo5qA25F+mrnvHdLId1tZ6k2cY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=i+u6BTILoUVaAAqMvPP05q1+sYeBPgAq4q8dfomuEGM=; b=KmRgvZ5hKmRrfO2zu857Otrh127zMw4z8N2YTn0CwvzXfBC4ipWwmO9nDPcXRKq1YY HWJQ7XaFgZvSelVu+UGRnBJf4gqX7H+HfQtitijPkXDr0J5jzt61nCxjQfoDGgIfNkOn CI85R3vote7ksdo+7h5nooZAkzc2vkW9fOe0/+QkxEg+LmFGnjZOpUetFVcs/lvBbelR NiD6TK9ZCImmQS6I1WZ4f6avxPk7KvT3fh4dGSwKvQw/5JZltA5gxap+Jy18Up/9hsQq ImXY/nyaE5fonYDOOS6GYBuEvs0mMqovx4LwjoncGu40ONVgf90vC/FlR3emjp4an/lG UMdA== X-Gm-Message-State: AOAM5300CCi0lanuRiJyEpUaMWKkgJNes7/6pDt0eXiAj6wRojG4TS2h V3wRtp6fyfsDUW8xMtfZwCUBFVDwKYatVg== X-Google-Smtp-Source: ABdhPJw8QX0sqGLzn/3VwDFR6lOO0y1biBkxHexMICuB1q4T2mPLAN9IG44CMGXuaEZxbEdp90tZJw== X-Received: by 2002:ac2:4d18:: with SMTP id r24mr2776713lfi.316.1612386922407; Wed, 03 Feb 2021 13:15:22 -0800 (PST) Received: from mail-lf1-f47.google.com (mail-lf1-f47.google.com. [209.85.167.47]) by smtp.gmail.com with ESMTPSA id k1sm364387ljj.105.2021.02.03.13.15.20 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 03 Feb 2021 13:15:20 -0800 (PST) Received: by mail-lf1-f47.google.com with SMTP id m22so1275082lfg.5 for ; Wed, 03 Feb 2021 13:15:20 -0800 (PST) X-Received: by 2002:ac2:5502:: with SMTP id j2mr2651150lfk.421.1612386919848; Wed, 03 Feb 2021 13:15:19 -0800 (PST) MIME-Version: 1.0 References: <20210203210832.113685-1-peterx@redhat.com> <20210203210832.113685-5-peterx@redhat.com> In-Reply-To: <20210203210832.113685-5-peterx@redhat.com> From: Linus Torvalds Date: Wed, 3 Feb 2021 13:15:03 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH 4/4] hugetlb: Do early cow when page pinned on src mm To: Peter Xu Cc: Linux Kernel Mailing List , Linux-MM , Wei Zhang , Matthew Wilcox , Jason Gunthorpe , Gal Pressman , Christoph Hellwig , Andrea Arcangeli , Jan Kara , Kirill Shutemov , David Gibson , Mike Rapoport , Mike Kravetz , Kirill Tkhai , Jann Horn , Andrew Morton Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Feb 3, 2021 at 1:08 PM Peter Xu wrote: > > This is the last missing piece of the COW-during-fork effort when there're > pinned pages found. One can reference 70e806e4e645 ("mm: Do early cow for > pinned pages during fork() for ptes", 2020-09-27) for more information, since > we do similar things here rather than pte this time, but just for hugetlb. No issues with the code itself, but.. Comments are good, but the comments inside this block of code actually makes the code *much* harder to read, because now the actual logic is much more spread out and you can't see what it does so well. > + if (unlikely(page_needs_cow_for_dma(vma, ptepage))) { > + /* This is very possibly a pinned huge page */ > + if (!prealloc) { > + /* > + * Preallocate the huge page without > + * tons of locks since we could sleep. > + * Note: we can't use any reservation > + * because the page will be exclusively > + * owned by the child later. > + */ > + put_page(ptepage); > + spin_unlock(src_ptl); > + spin_unlock(dst_ptl); > + prealloc = alloc_huge_page(vma, addr, 0); > + if (!prealloc) { > + /* > + * hugetlb_cow() seems to be > + * more careful here than us. > + * However for fork() we could > + * be strict not only because > + * no one should be referencing > + * the child mm yet, but also > + * if resources are rare we'd > + * better simply fail the > + * fork() even earlier. > + */ > + ret = -ENOMEM; > + break; > + } > + goto again; > + } > + /* > + * We have page preallocated so that we can do > + * the copy right now. > + */ > + hugetlb_copy_page(vma, dst_pte, addr, ptepage, > + prealloc); > + put_page(ptepage); > + spin_unlock(src_ptl); > + spin_unlock(dst_ptl); > + prealloc = NULL; > + continue; > + } Can you move the comment above the code? And I _think_ the prealloc conditional could be split up to a helper function (which would help more), but maybe there are too many variables for that to be practical. Linus