From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <94d33a07-c59a-4315-9c64-8b4d959ca1f4@arm.com>
Date: Tue, 23 Jan 2024 13:42:29 +0000
Subject: Re: [PATCH v1 10/11] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
From: Ryan Roberts
To: David Hildenbrand, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Andrew Morton, Matthew Wilcox, Russell King,
 Catalin Marinas, Will Deacon, Dinh Nguyen, Michael Ellerman,
 Nicholas Piggin, Christophe Leroy, "Aneesh Kumar K.V", "Naveen N. Rao",
 Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexander Gordeev,
 Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
 Sven Schnelle, "David S. Miller", linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, sparclinux@vger.kernel.org
References: <20240122194200.381241-1-david@redhat.com>
 <20240122194200.381241-11-david@redhat.com>
 <59592b50-fe89-4b32-8490-2e6c296f972f@arm.com>
 <76740e33-9b52-4e23-b407-8ae38bac15ec@redhat.com>
In-Reply-To: <76740e33-9b52-4e23-b407-8ae38bac15ec@redhat.com>
Content-Type: text/plain; charset=UTF-8

On 23/01/2024 13:06, David Hildenbrand wrote:
> On 23.01.24 13:25, Ryan Roberts wrote:
>> On 22/01/2024 19:41, David Hildenbrand wrote:
>>> Let's ignore these bits: they are irrelevant for fork, and will likely
>>> be irrelevant for upcoming users such as page unmapping.
>>>
>>> Signed-off-by: David Hildenbrand
>>> ---
>>>   mm/memory.c | 10 ++++++++--
>>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index f563aec85b2a8..341b2be845b6e 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -953,24 +953,30 @@ static __always_inline void __copy_present_ptes(struct
>>> vm_area_struct *dst_vma,
>>>       set_ptes(dst_vma->vm_mm, addr, dst_pte, pte, nr);
>>>   }
>>>   +static inline pte_t __pte_batch_clear_ignored(pte_t pte)
>>> +{
>>> +    return pte_clear_soft_dirty(pte_mkclean(pte_mkold(pte)));
>>> +}
>>> +
>>>   /*
>>>    * Detect a PTE batch: consecutive (present) PTEs that map consecutive
>>>    * pages of the same folio.
>>>    *
>>>    * All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN.
>>
>> nit: last char should be a comma (,) not a full stop (.)
>>
>>> + * the accessed bit, dirty bit and soft-dirty bit.
>>>    */
>>>   static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
>>>           pte_t *start_ptep, pte_t pte, int max_nr)
>>>   {
>>>       unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio);
>>>       const pte_t *end_ptep = start_ptep + max_nr;
>>> -    pte_t expected_pte = pte_next_pfn(pte);
>>> +    pte_t expected_pte = __pte_batch_clear_ignored(pte_next_pfn(pte));
>>>       pte_t *ptep = start_ptep + 1;
>>>         VM_WARN_ON_FOLIO(!pte_present(pte), folio);
>>>         while (ptep != end_ptep) {
>>> -        pte = ptep_get(ptep);
>>> +        pte = __pte_batch_clear_ignored(ptep_get(ptep));
>>>             if (!pte_same(pte, expected_pte))
>>>               break;
>>
>> I think you'll lose dirty information in the child for private mappings? If the
>> first pte in a batch is clean, but a subsequent page is dirty, you will end up
>> setting all the pages in the batch as clean in the child. Previous behavior
>> would preserve dirty bit for private mappings.
>>
>> In my version (v3) that did arbitrary batching, I had some fun and games
>> tracking dirty, write and uffd_wp:
>> https://lore.kernel.org/linux-arm-kernel/20231204105440.61448-2-ryan.roberts@arm.com/
>>
>> Also, I think you will currently either set soft dirty on all or none of the
>> pages in the batch, depending on the value of the first. I previously convinced
>> myself that the state was unimportant so always cleared it in the child to
>> provide consistency.
>
> Good points regarding dirty and soft-dirty. I wanted to avoid passing flags to
> folio_pte_batch(), but maybe that's just what we need to not change behavior.

I think you could not bother with the enforce_uffd_wp - just always enforce
uffd-wp. So that's one simplification vs mine. Then you just need an any_dirty
flag following the same pattern as your any_writable. Then just set dirty on the
whole batch in the child if any were dirty in the parent.
Although now I'm wondering if there is a race here... What happens if a page in
the parent becomes dirty after you have checked it but before you write protect
it? Isn't that already a problem with the current non-batched version? Why do we
even need to preserve dirty in the child for private mappings?
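
To make the any_dirty suggestion above concrete, here is a toy userspace sketch of
it. It is not the kernel's folio_pte_batch(): every toy_* name and the bit layout
are invented for illustration, the folio end PFN bound and uffd-wp are left out,
the write bit must match across the batch (there is no any_writable tracking), and
the child PTE assumes a private COW mapping that gets write-protected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE layout (hypothetical, not any real architecture's). */
typedef uint64_t toy_pte_t;

#define TOY_PTE_PRESENT    (1ULL << 0)
#define TOY_PTE_WRITE      (1ULL << 1)
#define TOY_PTE_ACCESSED   (1ULL << 2)
#define TOY_PTE_DIRTY      (1ULL << 3)
#define TOY_PTE_SOFT_DIRTY (1ULL << 4)
#define TOY_PFN_SHIFT      12

static toy_pte_t toy_mk_pte(uint64_t pfn, uint64_t flags)
{
	return (pfn << TOY_PFN_SHIFT) | flags;
}

static toy_pte_t toy_pte_next_pfn(toy_pte_t pte)
{
	return pte + (1ULL << TOY_PFN_SHIFT);
}

/* Mask out the bits that batch detection should not compare. */
static toy_pte_t toy_clear_ignored(toy_pte_t pte)
{
	return pte & ~(TOY_PTE_ACCESSED | TOY_PTE_DIRTY | TOY_PTE_SOFT_DIRTY);
}

/*
 * Count consecutive PTEs that map consecutive PFNs and agree on all
 * non-ignored bits. *any_dirty reports whether any PTE in the batch
 * had the dirty bit set, so the caller does not lose that state.
 */
static int toy_pte_batch(const toy_pte_t *ptes, int max_nr, bool *any_dirty)
{
	toy_pte_t expected;
	int nr = 1;

	assert(max_nr >= 1 && (ptes[0] & TOY_PTE_PRESENT));

	*any_dirty = (ptes[0] & TOY_PTE_DIRTY) != 0;
	expected = toy_clear_ignored(toy_pte_next_pfn(ptes[0]));

	while (nr < max_nr) {
		if (toy_clear_ignored(ptes[nr]) != expected)
			break;
		if (ptes[nr] & TOY_PTE_DIRTY)
			*any_dirty = true;
		expected = toy_pte_next_pfn(expected);
		nr++;
	}
	return nr;
}

/*
 * Child PTE for entry i of the batch: write-protected, old, soft-dirty
 * cleared, and dirty if any parent PTE in the batch was dirty.
 */
static toy_pte_t toy_child_pte(toy_pte_t first, int i, bool any_dirty)
{
	toy_pte_t pte = toy_clear_ignored(first) & ~TOY_PTE_WRITE;

	pte += (uint64_t)i << TOY_PFN_SHIFT;
	if (any_dirty)
		pte |= TOY_PTE_DIRTY;
	return pte;
}
```

For a run of three writable PTEs at PFNs 100..102 where only the second one is
dirty, toy_pte_batch() returns 3 with any_dirty set, so toy_child_pte() marks
every child entry dirty. That over-marks clean pages but never drops dirty state,
which is the trade-off the suggestion accepts.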