From: Jeff Xu
Date: Wed, 18 Oct 2023 10:14:02 -0700
Subject: Re: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)
To: Linus Torvalds
Cc: jeffxu@chromium.org, akpm@linux-foundation.org, keescook@chromium.org,
 jannh@google.com, sroettger@google.com, willy@infradead.org,
 gregkh@linuxfoundation.org, jorgelo@chromium.org, groeck@chromium.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-mm@kvack.org, surenb@google.com, alex.sierra@amd.com,
 apopple@nvidia.com, aneesh.kumar@linux.ibm.com, axelrasmussen@google.com,
 ben@decadent.org.uk, catalin.marinas@arm.com, david@redhat.com,
 dwmw@amazon.co.uk, ying.huang@intel.com, hughd@google.com,
 joey.gouly@arm.com, corbet@lwn.net, wangkefeng.wang@huawei.com,
 Liam.Howlett@oracle.com, lstoakes@gmail.com, mawupeng1@huawei.com,
 linmiaohe@huawei.com, namit@vmware.com, peterx@redhat.com,
 peterz@infradead.org, ryan.roberts@arm.com, shr@devkernel.io,
 vbabka@suse.cz, xiujianfeng@huawei.com, yu.ma@intel.com,
 zhangpeng362@huawei.com, dave.hansen@intel.com, luto@kernel.org,
 linux-hardening@vger.kernel.org
References: <20231017090815.1067790-1-jeffxu@chromium.org>
 <20231017090815.1067790-6-jeffxu@chromium.org>

On Wed, Oct 18, 2023 at 8:08 AM Jeff Xu wrote:
>
> On Tue, Oct 17, 2023 at 9:54 AM Linus Torvalds
> wrote:
> >
> > On Tue, 17 Oct 2023 at 02:08, wrote:
> > >
> > > Of all the call paths that call into do_vmi_munmap(),
> > > this is the only place where checkSeals = MM_SEAL_MUNMAP.
> > > The rest has checkSeals = 0.
> >
> > Why?
> >
> > None of this makes sense.
> >
> > So you say "we can't munmap in this *one* place, but all others ignore
> > the sealing".
> >
> I apologize: previously I described what this code does, but not the
> reasoning.
>
> In our threat model, as Stephen Röttger pointed out in [1], and I quote:
>
> V8 exploits typically follow a similar pattern: an initial bug leads
> to memory corruption but often the initial corruption is limited and
> the attacker has to find a way to arbitrarily read/write in the whole
> address space.
>
> The memory corruption is in the user space process, e.g. Chrome.
> Attackers will try to modify the permissions of the memory by calling
> mprotect, or by calling munmap and then mmap on the same address with
> different permissions, etc.
>
> Sealing blocks mprotect/munmap/mremap/mmap calls from the user space
> process, e.g. Chrome.
>
> When handling those 4 syscalls, we need to check the seal
> (can_modify_mm). This requires locking the VMA
> (mmap_write_lock_killable) and should ideally happen after validating
> the syscall input. The reasonable place for can_modify_mm() is in
> utility functions such as do_mmap(), do_vmi_munmap(), etc.
>
> However, there is no guarantee that do_mmap() and do_vmi_munmap() are
> only reachable from the mprotect/munmap/mremap/mmap syscall entry
> points (SYSCALL_DEFINE_XX). In theory, the kernel can call those in
> other scenarios, and some of them can be perfectly legit. Those other
> scenarios are not covered by our threat model at this time. Therefore,
> we need a flag, passed from the SYSCALL_DEFINE_XX entry down to
> can_modify_mm(), to differentiate those other scenarios.
>
> Now, back to the code: it does some optimization, i.e. it doesn't pass
> the flag from SYSCALL_DEFINE_XX in all cases.
> If SYSCALL_DEFINE_XX calls do_a, and do_a has only one caller, I set
> the flag in do_a instead of in SYSCALL_DEFINE_XX. Doing this reduces
> the size of the patchset, but it also makes the code less readable
> indeed. I could remove this optimization in V3. I welcome suggestions
> to improve readability on this.
>
> When handling mprotect/munmap/mremap/mmap, once the code has passed
> can_modify_mm(), the memory area is known not to be sealed. If the
> code then goes on to call other utility functions, we don't need to
> check the seal again. This is the case for mremap(): the seals of the
> src address and the dest address (when applicable) are checked first,
> so when the code later calls do_vmi_munmap(), it no longer needs to
> check the seal again.
>
> [1] https://v8.dev/blog/control-flow-integrity
>
> -Jeff

There is also an alternative approach:

For every place that calls do_vmi_munmap(), find out which cases
should legitimately ignore the sealing flag, set an ignore_seal flag
there, and pass it down into do_vmi_munmap(). All remaining cases use
the default behavior, which is to check the seal.

With this approach, any future API is automatically covered by
sealing, because it gets the default.

The risk is that if I miss a case that requires ignore_seal, there
will be a bug.

Also, if a driver calls the utility functions to unmap memory, the
seal will be checked as well. (Drivers are not in our threat model,
but Chrome probably doesn't mind.)

Which of these two approaches is better? I would appreciate direction
on this.

Thanks!
-Jeff
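
For illustration, the two options can be sketched in user-space C. This
is only a toy model under stated assumptions, not the kernel code or the
patch itself: struct toy_vma, toy_can_modify(), toy_munmap_opt_in() and
toy_munmap_opt_out() are hypothetical names, the value of MM_SEAL_MUNMAP
is made up, and a plain -1 stands in for whatever error the kernel would
return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Seal bit named after the patch; the numeric value here is made up. */
#define MM_SEAL_MUNMAP 0x1UL

/* Hypothetical stand-in for a VMA: an address range plus its seal bits. */
struct toy_vma {
	unsigned long start;
	unsigned long end;	/* exclusive */
	unsigned long seals;
};

/*
 * Toy analogue of can_modify_mm(): returns false if any VMA overlapping
 * [start, end) carries one of the seals requested in check_seals.
 * With check_seals == 0 nothing is checked, so the call always passes.
 */
static bool toy_can_modify(const struct toy_vma *vmas, size_t n,
			   unsigned long start, unsigned long end,
			   unsigned long check_seals)
{
	for (size_t i = 0; i < n; i++) {
		bool overlaps = vmas[i].start < end && start < vmas[i].end;

		if (overlaps && (vmas[i].seals & check_seals))
			return false;
	}
	return true;
}

/*
 * Option 1 (opt in): the caller says which seals to check. Only the
 * syscall entry path passes MM_SEAL_MUNMAP; every other caller passes 0.
 */
static int toy_munmap_opt_in(const struct toy_vma *vmas, size_t n,
			     unsigned long start, unsigned long len,
			     unsigned long check_seals)
{
	if (!toy_can_modify(vmas, n, start, start + len, check_seals))
		return -1;	/* sealed: refuse */
	return 0;		/* the real unmap would happen here */
}

/*
 * Option 2 (opt out): the seal is checked by default; only callers that
 * legitimately need to bypass it pass ignore_seal = true.
 */
static int toy_munmap_opt_out(const struct toy_vma *vmas, size_t n,
			      unsigned long start, unsigned long len,
			      bool ignore_seal)
{
	unsigned long check_seals = ignore_seal ? 0 : MM_SEAL_MUNMAP;

	if (!toy_can_modify(vmas, n, start, start + len, check_seals))
		return -1;	/* sealed: refuse */
	return 0;		/* the real unmap would happen here */
}
```

In this model the only difference is what a caller gets when it does
nothing special: with option 1 a new caller that forgets the flag
silently skips the seal check, while with option 2 a new caller that
forgets the flag is refused on sealed memory.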