From: Roberto Sassu
To: Lorenzo Stoakes
Cc: akpm@linux-foundation.org, Liam.Howlett@oracle.com, vbabka@suse.cz,
    jannh@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    ebpqwerty472123@gmail.com, paul@paul-moore.com, zohar@linux.ibm.com,
    dmitry.kasatkin@gmail.com, eric.snowberg@oracle.com, jmorris@namei.org,
    serge@hallyn.com, linux-integrity@vger.kernel.org,
    linux-security-module@vger.kernel.org, bpf@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, "Kirill A. Shutemov",
    stable@vger.kernel.org,
    syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com, Roberto Sassu
Subject: Re: [RFC][PATCH] mm: Split locks in remap_file_pages()
Date: Fri, 18 Oct 2024 17:50:19 +0200
In-Reply-To: <784c68fa023e99c53cd07265f0524e386815b443.camel@huaweicloud.com>
References: <20241018144710.3800385-1-roberto.sassu@huaweicloud.com>
    <784c68fa023e99c53cd07265f0524e386815b443.camel@huaweicloud.com>

On Fri, 2024-10-18 at 17:45 +0200, Roberto Sassu wrote:
> On Fri, 2024-10-18 at 16:42 +0100, Lorenzo Stoakes wrote:
> > On Fri, Oct 18, 2024 at 04:47:10PM +0200, Roberto Sassu wrote:
> > > From: "Kirill A. Shutemov"
> > >
> > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in
> > > remap_file_pages()") fixed a security issue, it added an LSM check when
> > > trying to remap file pages, so that LSMs have the opportunity to evaluate
> > > such action like for other memory operations such as mmap() and mprotect().
> > >
> > > However, that commit called security_mmap_file() inside the mmap_lock lock,
> > > while the other calls do it before taking the lock, after commit
> > > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem").
> > >
> > > This caused lock inversion issue with IMA which was taking the mmap_lock
> > > and i_mutex lock in the opposite way when the remap_file_pages() system
> > > call was called.
> > >
> > > Solve the issue by splitting the critical region in remap_file_pages() in
> > > two regions: the first takes a read lock of mmap_lock and retrieves the VMA
> > > and the file associated, and calculate the 'prot' and 'flags' variable; the
> > > second takes a write lock on mmap_lock, checks that the VMA flags and the
> > > VMA file descriptor are the same as the ones obtained in the first critical
> > > region (otherwise the system call fails), and calls do_mmap().
> > >
> > > In between, after releasing the read lock and taking the write lock, call
> > > security_mmap_file(), and solve the lock inversion issue.
> >
> > Great description!
> >
> > >
> > > Cc: stable@vger.kernel.org
> > > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()")
> > > Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
> > > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/
> > > Reviewed-by: Roberto Sassu (Calculate prot and flags earlier)
> > > Signed-off-by: Kirill A. Shutemov
> >
> > Other than some nits below:
> >
> > Reviewed-by: Lorenzo Stoakes
> >
> > I think you're definitely good to un-RFC here.
>
> Perfect, will do. Thank you!

I'm just going to change a bit the commit title:

mm: Split critical region in remap_file_pages() and invoke LSMs in between

Roberto

> Roberto
>
> > > ---
> > >  mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++---------------
> > >  1 file changed, 45 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 9c0fb43064b5..762944427e03 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > >  	unsigned long populate = 0;
> > >  	unsigned long ret = -EINVAL;
> > >  	struct file *file;
> > > +	vm_flags_t vm_flags;
> > >
> > >  	pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n",
> > >  		     current->comm, current->pid);
> > > @@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > >  	if (pgoff + (size >> PAGE_SHIFT) < pgoff)
> > >  		return ret;
> > >
> > > -	if (mmap_write_lock_killable(mm))
> > > +	if (mmap_read_lock_killable(mm))
> > > +		return -EINTR;
> >
> > I'm kinda verbose generally, but I'd love a comment like:
> >
> > 	/*
> > 	 * Look up VMA under read lock first so we can perform the security
> > 	 * without holding locks (which can be problematic). We reacquire a
> > 	 * write lock later and check nothing changed underneath us.
> > 	 */
> >
> > > +
> > > +	vma = vma_lookup(mm, start);
> > > +
> > > +	if (!vma || !(vma->vm_flags & VM_SHARED)) {
> > > +		mmap_read_unlock(mm);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> > > +	prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> > > +	prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> > > +
> > > +	flags &= MAP_NONBLOCK;
> > > +	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > > +	if (vma->vm_flags & VM_LOCKED)
> > > +		flags |= MAP_LOCKED;
> > > +
> > > +	/* Save vm_flags used to calculate prot and flags, and recheck later. */
> > > +	vm_flags = vma->vm_flags;
> > > +	file = get_file(vma->vm_file);
> > > +
> > > +	mmap_read_unlock(mm);
> > > +
> >
> > Maybe worth adding a comment to explain why you're doing this without the
> > lock so somebody looking at this later can understand the dance?
> >
> > > +	ret = security_mmap_file(file, prot, flags);
> > > +	if (ret) {
> > > +		fput(file);
> > > +		return ret;
> > > +	}
> > > +
> > > +	ret = -EINVAL;
> > > +
> >
> > Again, being verbose, I'd put something here like:
> >
> > 	/* OK security check passed, take write lock + let it rip */
> >
> > > +	if (mmap_write_lock_killable(mm)) {
> > > +		fput(file);
> > >  		return -EINTR;
> > > +	}
> > >
> > >  	vma = vma_lookup(mm, start);
> > >
> > > -	if (!vma || !(vma->vm_flags & VM_SHARED))
> > > +	if (!vma)
> > > +		goto out;
> > > +
> >
> > I'd also add something like:
> >
> > 	/* Make sure things didn't change under us. */
> >
> > > +	if (vma->vm_flags != vm_flags)
> > > +		goto out;
> > > +
> >
> > And drop this newline to group them together (super nitty I know, sorry!)
> >
> > > +	if (vma->vm_file != file)
> > >  		goto out;
> > >
> > >  	if (start + size > vma->vm_end) {
> > > @@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > >  			goto out;
> > >  	}
> > >
> > > -	prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
> > > -	prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0;
> > > -	prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0;
> > > -
> > > -	flags &= MAP_NONBLOCK;
> > > -	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > > -	if (vma->vm_flags & VM_LOCKED)
> > > -		flags |= MAP_LOCKED;
> > > -
> > > -	file = get_file(vma->vm_file);
> > > -	ret = security_mmap_file(vma->vm_file, prot, flags);
> > > -	if (ret)
> > > -		goto out_fput;
> > >  	ret = do_mmap(vma->vm_file, start, size,
> > >  			prot, flags, 0, pgoff, &populate, NULL);
> > > -out_fput:
> > > -	fput(file);
> > >  out:
> > >  	mmap_write_unlock(mm);
> > > +	fput(file);
> > >  	if (populate)
> > >  		mm_populate(ret, populate);
> > >  	if (!IS_ERR_VALUE(ret))
> > > --
> > > 2.34.1
> > >
> >
> > These are just nits, this looks good to me!
>
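
The split-critical-region scheme from the quoted commit message (snapshot
under the read lock, call the hook with no lock held, retake the lock as
writer and recheck) can be modelled in plain userspace C. What follows is only
a sketch under stated assumptions: struct toy_mm, toy_lock(), toy_unlock(),
toy_remap() and the hook_* callbacks are hypothetical names invented for
illustration, the "lock" is a single-threaded state tracker rather than a real
rwsem, and none of this is kernel code or part of the patch.

```c
#include <assert.h>
#include <errno.h>

/* Toy, single-threaded stand-in for mmap_lock: it only records how it is held. */
enum lock_state { LOCK_FREE, LOCK_READ, LOCK_WRITE };

/* Hypothetical stand-in for an mm with a single VMA; not a kernel structure. */
struct toy_mm {
	enum lock_state lock;
	unsigned long vm_flags;	/* models vma->vm_flags */
	int file_id;		/* models the identity of vma->vm_file */
	int remaps;		/* counts successful "do_mmap()" calls */
};

/* Hypothetical security hook; it may take other locks, so it must run unlocked. */
typedef int (*hook_fn)(struct toy_mm *mm, int file_id, unsigned long vm_flags);

static void toy_lock(struct toy_mm *mm, enum lock_state how)
{
	assert(mm->lock == LOCK_FREE);
	mm->lock = how;
}

static void toy_unlock(struct toy_mm *mm)
{
	assert(mm->lock != LOCK_FREE);
	mm->lock = LOCK_FREE;
}

int toy_remap(struct toy_mm *mm, hook_fn hook)
{
	unsigned long vm_flags;
	int file_id, ret;

	/* Region 1: snapshot what the hook needs while holding the read lock. */
	toy_lock(mm, LOCK_READ);
	vm_flags = mm->vm_flags;
	file_id = mm->file_id;
	toy_unlock(mm);

	/* In between: run the hook with no lock held. */
	ret = hook(mm, file_id, vm_flags);
	if (ret)
		return ret;

	/* Region 2: retake the lock as writer and check nothing changed. */
	toy_lock(mm, LOCK_WRITE);
	if (mm->vm_flags != vm_flags || mm->file_id != file_id) {
		ret = -EINVAL;
	} else {
		mm->remaps++;	/* stands in for the do_mmap() call */
		ret = 0;
	}
	toy_unlock(mm);
	return ret;
}

/* Example hook: allows the operation, but complains if a lock is still held. */
int hook_allow(struct toy_mm *mm, int file_id, unsigned long vm_flags)
{
	(void)file_id;
	(void)vm_flags;
	return mm->lock == LOCK_FREE ? 0 : -EDEADLK;
}

/* Example hook: policy refuses the operation. */
int hook_deny(struct toy_mm *mm, int file_id, unsigned long vm_flags)
{
	(void)mm;
	(void)file_id;
	(void)vm_flags;
	return -EACCES;
}

/* Example hook: simulates another thread changing the VMA in the unlocked window. */
int hook_racing_change(struct toy_mm *mm, int file_id, unsigned long vm_flags)
{
	(void)file_id;
	(void)vm_flags;
	mm->vm_flags ^= 0x2;
	return 0;
}
```

With a struct toy_mm initialised to LOCK_FREE, toy_remap(&mm, hook_allow)
succeeds, toy_remap(&mm, hook_deny) propagates the hook's error without ever
taking the write lock, and toy_remap(&mm, hook_racing_change) fails with
-EINVAL, mirroring how the patched syscall bails out when vm_flags or vm_file
differ on the second lookup.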