From: Jann Horn
Date: Thu, 27 Jul 2023 21:20:33 +0200
Subject: Re: [syzbot] [mm?] WARNING: suspicious RCU usage in mas_walk (2)
To: Matthew Wilcox
Cc: Suren Baghdasaryan, "Liam R. Howlett", akpm@linux-foundation.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    syzkaller-bugs@googlegroups.com, syzbot
References: <000000000000607ff905ffc8e477@google.com>
    <0000000000000aeb7f06015e5cbd@google.com>
    <20230727164757.e2di75xjybxncohn@revolver>

On Thu, Jul 27, 2023 at 8:17 PM Matthew Wilcox wrote:
>
> On Thu, Jul 27, 2023 at 07:59:33PM +0200, Jann Horn wrote:
> > On Thu, Jul 27, 2023 at 7:22 PM Suren Baghdasaryan wrote:
> > > Hmm. lock_vma_under_rcu() specifically checks for vma->anon_vma==NULL
> > > condition (see [1]) to avoid going into find_mergeable_anon_vma() (a
> > > check inside anon_vma_prepare() should prevent that). So, it should
> > > fall back to mmap_lock'ing.
> >
> > This syzkaller report applies to a tree with Willy's in-progress patch
> > series, where lock_vma_under_rcu() only checks for vma->anon_vma if
> > vma_is_anonymous() is true - it permits private non-anonymous VMAs
> > (which require an anon_vma for handling write faults) even if they
> > don't have an anon_vma.
> >
> > The commit bisected by syzkaller
> > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=a52f58b34afe095ebc5823684eb264404dad6f7b)
> > removes the vma_is_anonymous() check in handle_pte_fault(), so it lets
> > us reach do_wp_page() with a non-anonymous private VMA without
> > anon_vma, even though that requires allocation of an anon_vma.
> >
> > So I think this is pretty clearly an issue with Willy's in-progress
> > patch series that syzkaller blamed correctly.
>
> Agreed.  What do we think the right solution is?
>
> Option 1:
>
> +++ b/mm/memory.c
> @@ -3197,6 +3197,12 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>         struct mmu_notifier_range range;
>         int ret;
>
> +       if (!vma->anon_vma) {
> +               // check if there are other things to undo here
> +               vma_end_read(vmf->vma);
> +               return VM_FAULT_RETRY;
> +       }
> +
>         delayacct_wpcopy_start();
>
> Option 2:
>
> @@ -5581,7 +5587,8 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>                 goto inval;
>
>         /* find_mergeable_anon_vma uses adjacent vmas which are not locked */
> -       if (vma_is_anonymous(vma) && !vma->anon_vma)
> +       if ((vma_is_anonymous(vma) ||
> +            vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) && !vma->anon_vma)
>                 goto inval;
>
> The problem with option 2 is that we don't know whether this is a write
> fault or not, so we'll handle read faults on private file
> mappings under the mmap_lock UNTIL somebody writes to the mapping, which
> might be never.  That seems like a bad idea.
>
> We could pass FAULT_FLAG_WRITE into lock_vma_under_rcu(), but that also
> seems like a bad idea.  I dunno.  Three bad ideas.  Anyone think of a
> good one?

One kinda straightforward option would be to pass the vmf (or NULL if
it's not in fault context) to anon_vma_prepare(), teach it to bail if
it runs under the VMA lock rather than the mmap lock, and propagate a
VM_FAULT_RETRY all the way up? It can already fail due to OOM, so the
bailout paths exist, though you'd have to work a bit to plumb the right
error code up.

And if you're feeling adventurous, you could try to build a way to
opportunistically upgrade from vma lock to mmap lock, to avoid having
to bail out all the way back up and then dive back in when that
happens. Something that does mmap_read_trylock(); on failure, bail out
with VM_FAULT_RETRY; on success, drop the VMA lock and change
vmf->flags to note the changed locking context.
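
To make the upgrade idea a bit more concrete, here is a rough userspace
model in C. It is only a sketch of the control flow and not kernel code:
every model_* name, the struct layouts, and the under_vma_lock field
(standing in for a bit in vmf->flags) are made up for illustration, and
the locks are plain booleans and counters rather than real locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: every model_* name is hypothetical, not kernel API. */
enum model_fault { MODEL_FAULT_DONE, MODEL_FAULT_RETRY };

struct model_mm {
	bool write_locked;	/* a writer holds the modeled mmap lock */
	int readers;		/* number of modeled mmap read-lock holders */
};

struct model_vma {
	struct model_mm *mm;
	bool vma_read_locked;	/* modeled per-VMA read lock */
	bool has_anon_vma;
};

struct model_vmf {
	struct model_vma *vma;
	bool under_vma_lock;	/* stands in for a bit in vmf->flags */
};

/* Non-blocking read-lock attempt: fails while a writer holds the lock. */
static bool model_mmap_read_trylock(struct model_mm *mm)
{
	if (mm->write_locked)
		return false;
	mm->readers++;
	return true;
}

/* Try to trade the VMA lock for the mmap read lock without sleeping. */
static enum model_fault model_upgrade_lock(struct model_vmf *vmf)
{
	assert(vmf != NULL && vmf->vma != NULL);

	if (!vmf->under_vma_lock)
		return MODEL_FAULT_DONE;	/* already under the mmap lock */

	/* Contended: change nothing, the caller unwinds and retries. */
	if (!model_mmap_read_trylock(vmf->vma->mm))
		return MODEL_FAULT_RETRY;

	/* Success: drop the VMA lock and record the new locking context. */
	vmf->vma->vma_read_locked = false;
	vmf->under_vma_lock = false;
	return MODEL_FAULT_DONE;
}

/* Modeled anon_vma setup: only proceeds once the mmap lock is held. */
static enum model_fault model_anon_vma_prepare(struct model_vmf *vmf)
{
	if (vmf->vma->has_anon_vma)
		return MODEL_FAULT_DONE;

	if (model_upgrade_lock(vmf) == MODEL_FAULT_RETRY)
		return MODEL_FAULT_RETRY;

	vmf->vma->has_anon_vma = true;
	return MODEL_FAULT_DONE;
}
```

In this model a fault that starts under the VMA lock either finishes
under the mmap read lock, with the flag cleared so later code knows
which lock to release, or returns the retry code with nothing changed.
What else a real implementation would have to undo before retrying is
left open here.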