From: Jeff Xu
Date: Fri, 13 Sep 2024 16:00:00 -0700
Subject: Re: [PATCH v3 4/5] selftests/mseal: add more tests for mmap
To: Pedro Falcato
Cc: Lorenzo Stoakes, akpm@linux-foundation.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
 linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org,
 willy@infradead.org, broonie@kernel.org, vbabka@suse.cz,
 Liam.Howlett@oracle.com, rientjes@google.com, keescook@chromium.org
References: <20240830180237.1220027-1-jeffxu@chromium.org>
 <20240830180237.1220027-5-jeffxu@chromium.org>
 <4944ce41-9fe1-4e22-8967-f6bd7eafae3f@lucifer.local>

Hi Pedro

On Sun, Sep 8, 2024 at 2:35 PM Pedro Falcato wrote:
>
> I agree with most of the points. Sitting down here to write unofficial
> guidelines for mseal behavior.
>
> mseal should seal regions and mark them immutable, which means their protection
> and contents (ish?)
> (not _only_ backing mapping, but also contents in general
> (see madvise below)) should not be changed throughout the lifetime of the address space.
>
> For the general syscall interface, this means:
> 1) mprotect and munmap need to be blocked on mseal regions.
>  1a) munmap _cannot_ tolerate partial failure, per POSIX.
>  1b) mmap MAP_FIXED counts as an unmap operation and also needs to be blocked and return -EPERM.
>
> 2) Most madvise calls are allowed, except for destructive operations on
> read-only anonymous _pages_ (MADV_DONTNEED is destructive for anon, but we also don't care
> about blocking these ops if we can do it manually with e.g memset)
>  2a) The current implementation only blocks discard on anonymous _regions_, which is slightly
>      different. We probably do want to block these on MAP_PRIVATE file mappings, as to block
>      stuff like madvise MADV_DONTNEED on program rodata.
>  2b) We take into account pkeys when doing the permission checks.
>
> 3) mremap is not allowed as we'd change the "contents" of the old region.
>  3a) Should mremap expansion be allowed? aka only block moving and shrinking, but allow expansion.
>      We already informally allow expansion if e.g mmapping after it + mseal.
>
> 4) mlock and msync are allowed.
>
> 5) mseal is blocked.

mseal is not blocked: sealing memory that is already sealed is a no-op.
This is described in mseal.rst [1].

[1] https://github.com/torvalds/linux/blob/master/Documentation/userspace-api/mseal.rst

>
> 6) Other miscellaneous syscalls (mbind, etc) that do not change contents in any way, are allowed.
>  6a) This obviously means PTEs can change as long as the contents don't. Swapping is also ok.
>
> 7) FOLL_FORCE (kernel-internal speak, more commonly seen as ptrace and /proc/self/mem from userspace)
> should be disallowed (?)
>  7a) This currently does not happen, and seems like a large hole? But disallowing this
>      would probably severely break ptrace if the ELF sealing plans come to fruition.
>

Jann Horn pointed out FOLL_FORCE during the RFC [2], and this is in mseal.rst
too. In short, FOLL_FORCE is not covered by mseal.
On ChromeOS, FOLL_FORCE is disabled. Adrian Ratiu recently upstreamed
support for that [3].

[2] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com/
[3] https://lore.kernel.org/lkml/20240802080225.89408-1-adrian.ratiu@collabora.com/

-Jeff

> When we say "disallowed", we usually (apart from munmap) allow for partial failure. This
> means getting an -EPERM while part of the call succeeded, if we e.g mprotect a region consisting
> of [NORMAL VMA][SEALED VMA]. We do not want to test for this, because we do not want to paint ourselves
> into a corner - this is strictly "undefined behavior". The msealed regions themselves
> will never be touched in such cases. (we do however want to test munmap operation atomicity, but this is
> also kind of a munmap-related test, and might not really be something we really want in the mseal tests)
>
> Kernel-internal wise: The VMA and PTE modifications resulting from the above operations are blocked.
> Sealed VMAs allow splitting and merging; there was contention about the splitting issue, but it truly
> does not make sense to block operations unless they affect a VMA entirely, and that would also force
> VMA merging to be ABI ("vma_merge isn't merging these two regions and now my madvise works/doesn't work :(").
>
>
> Do I have everything right? Am I missing anything?
>
> --
> Pedro
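
For reference, the syscall-level points in this thread (mprotect, munmap and
mmap MAP_FIXED fail with -EPERM on a sealed region, while a second mseal on
the same range is a no-op) could be checked with something like the sketch
below. It is not taken from the selftests. sys_mseal() and
check_mseal_semantics() are hypothetical names used only for illustration,
and the fallback syscall number 462 is an assumption for builds whose headers
do not define __NR_mseal yet.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_mseal
#define __NR_mseal 462	/* assumption: number used where headers lack it */
#endif

/* Hypothetical raw wrapper; libc may not provide mseal() yet. */
static int sys_mseal(void *addr, size_t len)
{
	return (int)syscall(__NR_mseal, addr, len, 0UL);
}

/*
 * Returns 0 if a sealed mapping behaves as described in this thread,
 * 1 if mseal is not available here, -1 on any mismatch.
 */
static int check_mseal_semantics(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	void *p, *q;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	if (sys_mseal(p, len) != 0) {
		/* Older kernel or unsupported configuration: nothing to check. */
		munmap(p, len);
		return 1;
	}

	/* Sealing an already sealed range is a no-op and succeeds. */
	if (sys_mseal(p, len) != 0)
		return -1;

	/* mprotect on a sealed range is rejected. */
	errno = 0;
	if (mprotect(p, len, PROT_READ) != -1 || errno != EPERM)
		return -1;

	/* munmap on a sealed range is rejected. */
	errno = 0;
	if (munmap(p, len) != -1 || errno != EPERM)
		return -1;

	/* mmap MAP_FIXED over a sealed range counts as an unmap. */
	errno = 0;
	q = mmap(p, len, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (q != MAP_FAILED || errno != EPERM)
		return -1;

	/* The sealed page stays mapped until the process exits. */
	return 0;
}
```

A caller can treat a return value of 1 as "skip" rather than "fail", since
the first mseal call fails on kernels that do not implement the syscall.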