From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9101f70c-0c0a-845b-4ab7-82edf71c7bac@redhat.com>
Date: Thu, 28 Sep 2023 19:15:13 +0200
From: David Hildenbrand
Organization: Red Hat
To: Suren Baghdasaryan
Cc: Jann Horn, akpm@linux-foundation.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, shuah@kernel.org, aarcange@redhat.com,
 lokeshgidra@google.com, peterx@redhat.com, hughd@google.com,
 mhocko@suse.com, axelrasmussen@google.com, rppt@kernel.org,
 willy@infradead.org, Liam.Howlett@oracle.com, zhangpeng362@huawei.com,
 bgeffon@google.com, kaleshsingh@google.com, ngeoffray@google.com,
 jdduke@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 kernel-team@android.com
References: <20230923013148.1390521-1-surenb@google.com>
 <20230923013148.1390521-3-surenb@google.com>
 <03f95e90-82bd-6ee2-7c0d-d4dc5d3e15ee@redhat.com>
Subject: Re: [PATCH v2 2/3] userfaultfd: UFFDIO_REMAP uABI

On 27.09.23 20:25, Suren Baghdasaryan wrote:
>>
>> I have some cleanups pending for page_move_anon_rmap(), that moves the
>> SetPageAnonExclusive hunk out. Here we should be using
>> page_move_anon_rmap() [or rather, folio_move_anon_rmap() after my cleanups]
>>
>> I'll send them out soonish.
>
> Should I keep this as is in my next version until you post the
> cleanups?
> I can add a TODO comment to convert it to
> folio_move_anon_rmap() once it's ready.

You should just be able to use page_move_anon_rmap() and whatever gets
in first cleans it up :)

>
>>
>>>> +       WRITE_ONCE(src_folio->index, linear_page_index(dst_vma,
>>>> +                                                      dst_addr));
>> +
>>>> +       orig_src_pte = ptep_clear_flush(src_vma, src_addr, src_pte);
>>>> +       orig_dst_pte = mk_pte(&src_folio->page, dst_vma->vm_page_prot);
>>>> +       orig_dst_pte = maybe_mkwrite(pte_mkdirty(orig_dst_pte),
>>>> +                                    dst_vma);
>>>
>>> I think there's still a theoretical issue here that you could fix by
>>> checking for the AnonExclusive flag, similar to the huge page case.
>>>
>>> Consider the following scenario:
>>>
>>> 1. process P1 does a write fault in a private anonymous VMA, creating
>>> and mapping a new anonymous page A1
>>> 2. process P1 forks and creates two children P2 and P3. afterwards, A1
>>> is mapped in P1, P2 and P3 as a COW page, with mapcount 3.
>>> 3. process P1 removes its mapping of A1, dropping its mapcount to 2.
>>> 4. process P2 uses vmsplice() to grab a reference to A1 with get_user_pages()
>>> 5. process P2 removes its mapping of A1, dropping its mapcount to 1.
>>>
>>> If at this point P3 does a write fault on its mapping of A1, it will
>>> still trigger copy-on-write thanks to the AnonExclusive mechanism; and
>>> this is necessary to avoid P3 mapping A1 as writable and writing data
>>> into it that will become visible to P2, if P2 and P3 are in different
>>> security contexts.
>>>
>>> But if P3 instead moves its mapping of A1 to another address with
>>> remap_anon_pte() which only does a page mapcount check, the
>>> maybe_mkwrite() will directly make the mapping writable, circumventing
>>> the AnonExclusive mechanism.
>>>
>>
>> Yes, can_change_pte_writable() contains the exact logic when we can turn
>> something easily writable even if it wasn't writable before. which
>> includes that PageAnonExclusive is set.
>> (but with uffd-wp or softdirty
>> tracking, there is more to consider)
>
> For uffd_remap can_change_pte_writable() would fail if VM_WRITE is not
> set, but we want remapping to work for RO memory as well. Are you

In a VMA without VM_WRITE you certainly wouldn't want to make PTEs
writable :) That's why that function just does a sanity check that it is
not called in a strange context. So one would only call it if VM_WRITE
is set.

> saying that a PageAnonExclusive() check alone would not be enough
> here?

There are some interesting questions to ask here:

1) What happens if the old VMA has VM_SOFTDIRTY set but the new one does
   not? You most probably have to mark the PTE softdirty and not make it
   writable.

2) VM_UFFD_WP requires similar care I assume? Peter might know.

-- 
Cheers,

David / dhildenb
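
To make the difference between a mapcount-only check and the stricter check discussed above concrete, here is a minimal userspace sketch. It is not kernel code: struct model_vma, struct model_pte, mapcount_only_check() and may_map_writable() are hypothetical names invented for illustration, and the soft-dirty and uffd-wp fields are simplified stand-ins for the tracking state mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are NOT kernel definitions. */
struct model_vma {
	bool vm_write;            /* stands in for VM_WRITE on the destination VMA */
	bool softdirty_tracking;  /* assumption: VMA still needs write faults for soft-dirty */
	bool uffd_wp;             /* assumption: VMA is registered for uffd-wp */
};

struct model_pte {
	int  mapcount;            /* mapcount of the mapped anonymous page */
	bool anon_exclusive;      /* stands in for PageAnonExclusive() */
	bool soft_dirty;          /* PTE already marked soft-dirty */
	bool uffd_wp;             /* PTE carries the uffd-wp marker */
};

/* The weaker check criticized in the thread: looks at the mapcount only. */
static bool mapcount_only_check(const struct model_pte *pte)
{
	return pte->mapcount == 1;
}

/* Stricter sketch: every condition must hold before the PTE may be writable. */
static bool may_map_writable(const struct model_vma *dst,
			     const struct model_pte *pte)
{
	if (!dst->vm_write)
		return false;     /* never make PTEs writable without VM_WRITE */
	if (dst->softdirty_tracking && !pte->soft_dirty)
		return false;     /* a write fault is still needed for soft-dirty */
	if (dst->uffd_wp && pte->uffd_wp)
		return false;     /* a write fault is still needed for uffd-wp */
	return pte->anon_exclusive;
}

/* State of P3's mapping of A1 after step 5 of the quoted scenario. */
static void check_fork_vmsplice_scenario(void)
{
	struct model_vma dst = { .vm_write = true };
	struct model_pte a1 = { .mapcount = 1, .anon_exclusive = false };

	assert(mapcount_only_check(&a1));     /* the weak check would allow write */
	assert(!may_map_writable(&dst, &a1)); /* the exclusive check refuses */
}
```

Calling check_fork_vmsplice_scenario() exercises the quoted five-step scenario: the page has mapcount 1 but is not exclusive, so only the stricter check keeps the moved mapping read-only. How the real code should treat soft-dirty and uffd-wp is exactly the open question raised above, so those two branches are assumptions rather than an answer.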