Message-ID: <4ad140b5-1d5b-2486-0893-7886a9cdfd76@redhat.com>
Date: Wed, 20 Jul 2022 21:55:35 +0200
To: Peter Xu
Cc: Nadav Amit, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
References: <20220718120212.3180-1-namit@vmware.com> <20220718120212.3180-2-namit@vmware.com> <017facf0-7ef8-3faf-138d-3013a20b37db@redhat.com> <2b4393ce-95c9-dd3e-8495-058a139e771e@redhat.com> <69022bad-d6f1-d830-224d-eb8e5c90d5c7@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [RFC PATCH 01/14] userfaultfd: set dirty and young on writeprotect

On 20.07.22 21:48, Peter Xu wrote:
> On Wed, Jul 20, 2022 at 09:33:35PM +0200, David Hildenbrand wrote:
>> On 20.07.22 21:15, Peter Xu wrote:
>>> On Wed, Jul 20, 2022 at 05:10:37PM +0200, David Hildenbrand wrote:
>>>> For pagecache pages it may as well be *plain wrong* to bypass the write
>>>> fault handler and simply mark pages dirty+map them writable.
>>>
>>> Could you elaborate?
>>
>> Write-fault handling for some filesystems (that even require this
>> "slow path") is a bit special.
>>
>> For example, do_shared_fault() might have to call page_mkwrite().
>>
>> AFAIK file systems use that for lazy allocation of disk blocks.
>> If you simply go ahead and map a !dirty pagecache page writable
>> and mark it dirty, it will not trigger page_mkwrite() and you might
>> end up corrupting data.
>>
>> That's why the old change_pte_range() code never touched
>> anything if the pte wasn't already dirty.
>
> I don't think that pte_dirty() check was for the pagecache code. For any fs
> that has page_mkwrite() defined, it'll already have vma_wants_writenotify()
> return 1, so we'll never try to add write bit, hence we'll never even try
> to check pte_dirty().
>

I might be too tired, but the whole reason we had this magic in place
before my commit was only for the pagecache.

With vma_wants_writenotify()=0 you can directly map the pages writable
and don't have to do these advanced checks here. In a writable
MAP_SHARED VMA you'll already have pte_write().

We only get !pte_write() in case we have vma_wants_writenotify()=1 ...

  try_change_writable = vma_wants_writenotify(vma, vma->vm_page_prot);

and that's the code that checked the dirty bit after all to decide --
amongst other things -- if we can simply map it writable without going
via the write fault handler and triggering do_shared_fault().

See the crazy/ugly FOLL_FORCE code in GUP that similarly checks the
dirty bit.

But yeah, it's all confusing so I might just be wrong regarding
pagecache pages.

--
Thanks,

David / dhildenb
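
For illustration only, and not kernel code: a minimal userspace C sketch
of the decision described above, assuming a writable MAP_SHARED VMA and
ignoring the "other things" the real check looked at. The function and
parameter names are hypothetical stand-ins for vma_wants_writenotify()
and pte_dirty().

```c
#include <stdbool.h>

/*
 * Userspace model only; nothing here is a kernel API.
 *
 * wants_writenotify: models vma_wants_writenotify() returning 1.
 * pte_is_dirty:      models pte_dirty() on the existing PTE.
 *
 * Assumes a writable MAP_SHARED VMA. Any further conditions the real
 * code applied are deliberately left out of this sketch.
 */
static bool model_can_map_writable_without_fault(bool wants_writenotify,
                                                 bool pte_is_dirty)
{
    /* No write notification wanted: the PTE can simply be writable. */
    if (!wants_writenotify)
        return true;

    /*
     * Write notification wanted: only an already-dirty PTE may skip
     * the write fault; a clean one has to go through the fault path.
     */
    return pte_is_dirty;
}
```

The (true, false) case is the one under discussion: a clean pagecache
page in a write-notify VMA has to take the write fault so that
page_mkwrite() gets a chance to run.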