Date: Thu, 23 Sep 2021 20:56:33 -0700 (PDT)
From: Hugh Dickins
To: Peter Xu
cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrea Arcangeli, Liam Howlett, Hugh Dickins, Mike Rapoport,
    Yang Shi, David Hildenbrand, "Kirill A . Shutemov", Jerome Glisse,
    Alistair Popple, Miaohe Lin, Matthew Wilcox, Axel Rasmussen
Subject: Re: [PATCH v4 1/4] mm/shmem: Unconditionally set pte dirty in mfill_atomic_install_pte
In-Reply-To: <20210915181456.10739-2-peterx@redhat.com>
Message-ID: <49fddb9a-4a52-1df-8b7c-dde2a89330bf@google.com>
References: <20210915181456.10739-1-peterx@redhat.com> <20210915181456.10739-2-peterx@redhat.com>

On Wed, 15 Sep 2021, Peter Xu wrote:

> It was conditionally done previously, as there's one shmem special case that we
> use SetPageDirty() instead.  However that's not necessary and it should be
> easier and cleaner to do it unconditionally in mfill_atomic_install_pte().
>
> The most recent discussion about this is here, where Hugh explained the history
> of SetPageDirty() and why it's possible that it's not required at all:
>
> https://lore.kernel.org/lkml/alpine.LSU.2.11.2104121657050.1097@eggly.anvils/
>
> Currently mfill_atomic_install_pte() has three callers:
>
>     1. shmem_mfill_atomic_pte
>     2. mcopy_atomic_pte
>     3. mcontinue_atomic_pte
>
> After the change: case (1) should have its SetPageDirty replaced by the dirty
> bit on pte (so we unify them together, finally), case (2) should have no
> functional change at all as it has page_in_cache==false, case (3) may add a
> dirty bit to the pte.  However since case (3) is UFFDIO_CONTINUE for shmem,
> it's merely 100% sure the page is dirty after all because UFFDIO_CONTINUE
> normally requires another process to modify the page cache and kick the faulted
> thread, so should not make a real difference either.
>
> This should make it much easier to follow on which case will set dirty for
> uffd, as we'll simply set it all now for all uffd related ioctls.  Meanwhile,
> no special handling of SetPageDirty() if there's no need.
>
> Cc: Hugh Dickins
> Cc: Axel Rasmussen
> Cc: Andrea Arcangeli
> Reviewed-by: Axel Rasmussen
> Signed-off-by: Peter Xu

I'm not going to NAK this, but you and I have different ideas of
"very nice cleanups".  Generally, you appear (understandably) to be
trying to offload pieces of work from your larger series, but often
I don't see the sense of them, here in isolation anyway.

Is this a safe transformation of the existing code?
Yes, I believe so (at least until someone adds some PTESAN checker which
looks to see if any ptes are dirty in vmas to which user never had write
access).  But it took quite a lot of lawyering to arrive at that conclusion.

Is this a cleanup?  No, it's a dirtyup.

shmem_mfill_atomic_pte() does SetPageDirty (before unlocking page) because
that's where the page contents are made dirty.  You could criticise it for
doing SetPageDirty even in the zeropage case: yes, we've been lazy there;
but that's a different argument.

If someone is faulting this page into a read-only vma, it's surprising to
make the pte dirty there.  What would be most correct would be to keep the
SetPageDirty in shmem_mfill_atomic_pte() (with or without zeropage
optimization), and probably SetPageDirty in some other places in
mm/userfaultfd.c (I didn't look where) when the page is filled with
supplied data, and mfill_atomic_install_pte() only do that pte_mkdirty()
when it's serving a FAULT_FLAG_WRITE (a rough sketch of that shape is
appended below, after the quoted patch).

I haven't looked again (I have a pile of mails to respond to!), but when I
looked before I think I found that the vmf flags are not available to the
userfaultfd ioctler.  If so, then it would be more appropriate to just
leave the mkdirty to the hardware on return from fault (except - and again
I cannot spend time researching this - perhaps I'm too x86-centric, and
there are other architectures on which the software *must* do the mkdirty
fixup to avoid refaulting forever - though probably userfaultfd state
would itself prevent that).

But you seem to think that doing the dirtying in an unnatural place helps
somehow; and for all I know, that may be so in your larger series, though
this change certainly raises suspicions of that.

I'm sorry to be so discouraging, but you have asked for my opinion, and
here at last you have it.  Not a NAK, but no enthusiasm at all.

Hugh

> ---
>  mm/shmem.c       | 1 -
>  mm/userfaultfd.c | 3 +--
>  2 files changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 88742953532c..96ccf6e941aa 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2424,7 +2424,6 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
>  	shmem_recalc_inode(inode);
>  	spin_unlock_irq(&info->lock);
>
> -	SetPageDirty(page);
>  	unlock_page(page);
>  	return 0;
>  out_delete_from_cache:
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 7a9008415534..caf6dfff2a60 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -69,10 +69,9 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  	pgoff_t offset, max_off;
>
>  	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
> +	_dst_pte = pte_mkdirty(_dst_pte);
>  	if (page_in_cache && !vm_shared)
>  		writable = false;
> -	if (writable || !page_in_cache)
> -		_dst_pte = pte_mkdirty(_dst_pte);
>  	if (writable) {
>  		if (wp_copy)
>  			_dst_pte = pte_mkuffd_wp(_dst_pte);
> --
> 2.31.1
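
P.S. For concreteness, here is the rough, untested shape I mean above.
It assumes the write intent could somehow be plumbed down into
mfill_atomic_install_pte() as a hypothetical "serving_write_fault"
argument (which, as noted, the ioctl path does not have today), with
shmem_mfill_atomic_pte() keeping its SetPageDirty(page) before
unlock_page(page):

	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
	if (page_in_cache && !vm_shared)
		writable = false;
	/*
	 * Sketch only: dirty the pte only when actually serving a write
	 * fault; otherwise leave the dirtying to hardware on return from
	 * fault (or to a later software fixup on architectures that need
	 * it), and rely on the callers' SetPageDirty for pages that were
	 * filled with supplied data.
	 */
	if (writable && serving_write_fault)	/* hypothetical flag */
		_dst_pte = pte_mkdirty(_dst_pte);

That is just the shape, of course; I have not tried to write the plumbing
that would pass the flag down.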