Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect
From: Nadav Amit
Date: Tue, 22 Dec 2020 01:38:27 -0800
To: Andy Lutomirski
Cc: Linus Torvalds, Peter Xu, Yu Zhao, Andrea Arcangeli, linux-mm, lkml,
    Pavel Emelyanov, Mike Kravetz, Mike Rapoport, stable, Minchan Kim,
    Will Deacon, Peter Zijlstra
Message-Id: <757EEA98-4935-4283-849B-22CBFC352C45@gmail.com>

> On Dec 21, 2020, at 7:19 PM, Andy Lutomirski wrote:
>
> On Mon, Dec 21, 2020 at 3:22 PM Linus Torvalds wrote:
>> On Mon, Dec 21, 2020 at 2:30 PM Peter Xu wrote:
>>> AFAIU mprotect() is the only one who modifies the pte using the mmap write
>>> lock. NUMA balancing is also using read mmap lock when changing pte
>>> protections, while my understanding is mprotect() used write lock only because
>>> it manipulates the address space itself (aka. vma layout) rather than modifying
>>> the ptes, so it needs to.
>>
>> So it's ok to change the pte holding only the PTE lock, if it's a
>> *one*way* conversion.
>>
>> That doesn't break the "re-check the PTE contents" model (which
>> predates _all_ of the rest: NUMA, userfaultfd, everything - it's
>> pretty much the original model for our page table operations, and goes
>> back to the dark ages even before SMP and the existence of a page
>> table lock).
>>
>> So for example, a COW will always create a different pte (not just
>> because the page number itself changes - you could imagine a page
>> getting re-used and changing back - but because it's always a RO->RW
>> transition).
>>
>> So two COW operations cannot "undo" each other and fool us into
>> thinking nothing changed.
>>
>> Anything that changes RW->RO - like fork(), for example - needs to
>> take the mmap_lock.
>
> Ugh, this is unpleasantly complicated. I will admit that any API that
> takes an address and more-or-less-blindly marks it RO makes me quite
> nervous even assuming all the relevant locks are held. At least
> userfaultfd refuses to operate on VM_SHARED VMAs, but we have another
> instance of this (with mmap_sem held for write) in x86:
> mark_screen_rdonly(). Dare I ask how broken this is? We could likely
> get away with deleting it entirely.

If you only look at the function in isolation, it seems broken. It should
have flushed the TLB before releasing the mmap_lock. After the
mmap_write_unlock() and before the actual flush, a #PF on another thread can
happen, and a similar scenario to the one that is mentioned in this thread
(copying while a stale PTE in the TLBs is not-writeprotected) might happen.

Having said that, I do not know this code and the context in which this
function is called, so I do not know whether there are other mitigating
factors.

Funny, I had a deja-vu and indeed you have already raised (other) TLB issues
with mark_screen_rdonly() 3 years ago. At the time you said "I'd like to
delete it." [1]

[1] https://lore.kernel.org/patchwork/patch/782486/#976151
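
For illustration only, the ordering concern above (write-protecting a PTE
but flushing the TLB only after the mmap_lock is released) can be sketched
with a toy, single-threaded model. This is not kernel code: ToyMM,
write_protect(), stale_write_possible() and the "racer" callback are
hypothetical names made up for this sketch, the lock is represented only by
comments, and the model collapses the scenario to a single stale write
rather than the full copy-while-stale case discussed in the thread.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyMM:
    """Toy model (assumption): one page, one PTE, one remote CPU's TLB entry."""
    pte_writable: bool = True
    tlb_writable: Optional[bool] = None  # None means no translation is cached

    def flush_tlb(self) -> None:
        # Drop the remote CPU's cached translation.
        self.tlb_writable = None

    def remote_write(self) -> bool:
        """Remote CPU attempts a store; True means it succeeded without a fault."""
        if self.tlb_writable is None:
            # TLB miss: walk the page table and cache what the PTE says.
            self.tlb_writable = self.pte_writable
        return self.tlb_writable


def write_protect(mm: ToyMM, flush_before_unlock: bool,
                  racer: Callable[[ToyMM], bool]) -> bool:
    # (mmap_lock would be taken for write here)
    mm.pte_writable = False
    if flush_before_unlock:
        mm.flush_tlb()
    # (mmap_lock would be released here; another thread may run now)
    raced = racer(mm)
    if not flush_before_unlock:
        mm.flush_tlb()  # too late for the racer that already ran
    return raced


def stale_write_possible(flush_before_unlock: bool) -> bool:
    mm = ToyMM()
    mm.remote_write()  # warm the remote TLB with a writable entry
    return write_protect(mm, flush_before_unlock, ToyMM.remote_write)
```

In this model, stale_write_possible(False) reports that the store still goes
through a stale writable TLB entry although the PTE is already read-only,
while stale_write_possible(True) reports a fault, which is the behaviour the
flush-before-unlock ordering is meant to guarantee.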