From: Jeff Xu
Date: Wed, 21 Jun 2023 09:08:22 -0700
Subject: Re: inconsistence in mprotect_fixup mlock_fixup madvise_update_vma
To: Mike Rapoport
Cc: "Liam R. Howlett", Peter Xu, linux-mm@kvack.org,
 linux-hardening@vger.kernel.org, zhangpeng.00@bytedance.com,
 akpm@linux-foundation.org, koct9i@gmail.com, david@redhat.com,
 ak@linux.intel.com, hughd@google.com, emunson@akamai.com,
 rppt@linux.ibm.com, aarcange@redhat.com, linux-kernel@vger.kernel.org,
 Lorenzo Stoakes
References: <20230614011814.sz2l6z6wbaubabk2@revolver>
 <20230614125731.GY52412@kernel.org>
 <20230621055551.GE52412@kernel.org>
In-Reply-To: <20230621055551.GE52412@kernel.org>

On Tue, Jun 20, 2023 at 10:56 PM Mike Rapoport wrote:
>
> On Tue, Jun 20, 2023 at 03:29:34PM -0700, Jeff Xu wrote:
> > On Wed, Jun 14, 2023 at 5:58 AM Mike Rapoport wrote:
> > >
> > > On Tue, Jun 13, 2023 at 09:18:14PM -0400, Liam R. Howlett wrote:
> > > > * Jeff Xu [230613 17:29]:
> > > > > Hello Peter,
> > > > >
> > > > > Thanks for responding.
> > > > >
> > > > > On Tue, Jun 13, 2023 at 1:16 PM Peter Xu wrote:
> > > > > >
> > > > > > Hi, Jeff,
> > > > > >
> > > > > > On Tue, Jun 13, 2023 at 08:26:26AM -0700, Jeff Xu wrote:
> > > > > > > + more ppl to the list.
> > > > > > >
> > > > > > > On Mon, Jun 12, 2023 at 6:04 PM Jeff Xu wrote:
> > > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > There seems to be inconsistency in different VMA fixup
> > > > > > > > implementations, for example:
> > > > > > > > mlock_fixup will skip VMA that is hugetlb, etc, but those checks do
> > > > > > > > not exist in mprotect_fixup and madvise_update_vma. Wouldn't this be a
> > > > > > > > problem? the merge/split skipped by mlock_fixup, might get acted on in
> > > > > > > > the madvise/mprotect case.
> > > > > > > >
> > > > > > > > mlock_fixup currently check for
> > > > > > > > if (newflags == oldflags ||
> > > >
> > > > newflags == oldflags, then we don't need to do anything here, it's
> > > > already at the desired mlock.  mprotect does this, madvise does this..
> > > > probably.. it's ugly.
> > > >
> > > > > > > > (oldflags & VM_SPECIAL) ||
> > > >
> > > > It's special, merging will fail always.  I don't know about splitting,
> > > > but I guess we don't want to alter the mlock state on special mappings.
> > > >
> > > > > > > > is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) ||
> > > > > > > > vma_is_dax(vma) || vma_is_secretmem(vma))
> > > > > >
> > > > > > The special handling you mentioned in mlock_fixup mostly makes sense to me.
> > > > > >
> > > > > > E.g., I think we can just ignore mlock a hugetlb page if it won't be
> > > > > > swapped anyway.
> > > > > >
> > > > > > Do you encounter any issue with above?
> > > > > >
> > > > > > > > Should there be a common function to handle VMA merge/split?
> > > > > >
> > > > > > IMHO vma_merge() and split_vma() are the "common functions".
> > > > > > Copy Lorenzo as I think he has plan to look into the interface to
> > > > > > make it even easier to use.
> > > > > >
> > > > > The mprotect_fixup doesn't have the same check as mlock_fixup. When
> > > > > userspace calls mlock(), two VMAs might not merge or split because of
> > > > > the vma_is_secretmem check. However, when user space calls mprotect() with
> > > > > the same address range, it will merge/split.  If mlock() is doing the
> > > > > right thing to merge/split the VMAs, then mprotect() is not?
> > > >
> > > > It looks like secretmem is mlock'ed to begin with so they don't want it
> > > > to be touched.  So, I think they will be treated differently and I think
> > > > it is correct.
> > >
> > > Right, they don't :)
> > >
> > > secretmem VMAs are always mlocked, they cannot be munlocked and there is no
> > > point trying to mlock them again.
> > >
> > > The mprotect for secretmem is Ok though, so e.g. if we (unlikely) have two
> > > adjacent secretmem VMAs in a range passed to mprotect, it's fine to merge
> > > them.
> > >
> >
> > I'm thinking/brainstorming below, assuming:
> > Address range 1: 0x5000 to 0x6000 (regular mmap)
> > Address range 2: 0x6000 to 0x7000 (allocated to secretmem)
> > Address range 3: 0x7000 to 0x8000 (regular mmap)
> >
> > User space call: mlock(0x5000,0x3000)
> > range 1 and 2 won't merge.
> > range 2 and 3 could merge, when mlock_fixup checks current vma
> > (range 3), it is not secretmem, so it will merge with prev vma.
>
> But 2 and 3 have different vm_file, they won't merge.
>
> > user space call: mprotect(0x5000,0x3000)
> > range 1 2 3 could merge, all three can have the same flags.
> > Note: vma_is_secretmem() isn't checked in mprotect_fixup, same for
> > vma_is_dax and get_gate_vma, those are not reflected in
> > vma->vm_flags
> >
> > Once 1 and 2 are merged, maybe user space is able to use
> > munlock(0x5000,0x3000)
> > to unlock range 1 to 3, this will include 2, right?
> > (haven't used the code to prove it)
>
> But 1 and 2 won't merge because their vm_file's are different.
>
Is it possible for the two to be set up with the same vm_file?

> > I'm using secretmem as an example here, having 3 different _fixup
> > implementations seems to be error prone to me.
>
> The actual decision whether to merge VMAs is taken in vma_merge rather than
> by the _fixup functions. So while the checks around vma_merge might be
> different in these functions, it does not mean it's possible to wrongly
> merge VMA unless there is a bug in vma_merge. So in the end it boils down
> to a single core implementation, don't you agree?
>
I agree that vma_merge should also check, but that doesn't seem to be
the case. I looked in vma_merge for checks on secretmem,
get_gate_vma(current->mm) and vma_is_dax(), and did not find them.

Ideally, I think the skip/go decisions should live inside the
vma_merge()/split_vma() functions, not in the _fixup() callers.

> > Thanks
> > -Jeff
>
> --
> Sincerely yours,
> Mike.
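
For illustration only, here is a toy userspace model of the three-range
example discussed in this thread. It is not kernel code: the toy_* names,
the two flag bits and the struct layout are all made up for this sketch.
The merge test only mirrors the points made in the thread (ranges must be
adjacent, flags must match, vm_file must match, special mappings never
merge), and the hugetlb, gate and DAX checks are left out for brevity.

```c
/* Toy userspace model, NOT kernel code. Every toy_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_VM_LOCKED  0x1UL  /* stands in for VM_LOCKED */
#define TOY_VM_SPECIAL 0x2UL  /* stands in for the VM_SPECIAL mask */

struct toy_vma {
    unsigned long start, end;  /* covers [start, end) */
    unsigned long flags;       /* stands in for vma->vm_flags */
    const void *file;          /* stands in for vma->vm_file, NULL = anonymous */
    bool is_secretmem;         /* stands in for vma_is_secretmem(vma) */
};

/* Caller-side filter, modeled on the mlock_fixup condition quoted in the
 * thread: nothing to do if the flags already match, or if the mapping is
 * special or secretmem. */
bool toy_mlock_should_skip(const struct toy_vma *vma, unsigned long newflags)
{
    return newflags == vma->flags ||
           (vma->flags & TOY_VM_SPECIAL) ||
           vma->is_secretmem;
}

/* Core merge test: adjacent, same flags, same backing file, not special. */
bool toy_can_merge(const struct toy_vma *prev, const struct toy_vma *next)
{
    return prev->end == next->start &&
           prev->flags == next->flags &&
           prev->file == next->file &&
           !(prev->flags & TOY_VM_SPECIAL);
}

/* Replays mlock(0x5000, 0x3000) over the three ranges from the thread. */
void toy_demo(void)
{
    static const int secretmem_file = 0;  /* its address acts as a fake file */
    struct toy_vma r1 = { 0x5000, 0x6000, 0, NULL, false };
    struct toy_vma r2 = { 0x6000, 0x7000, TOY_VM_LOCKED, &secretmem_file, true };
    struct toy_vma r3 = { 0x7000, 0x8000, 0, NULL, false };

    /* The caller skips the secretmem range and locks the other two. */
    assert(!toy_mlock_should_skip(&r1, r1.flags | TOY_VM_LOCKED));
    assert(toy_mlock_should_skip(&r2, r2.flags | TOY_VM_LOCKED));
    assert(!toy_mlock_should_skip(&r3, r3.flags | TOY_VM_LOCKED));
    r1.flags |= TOY_VM_LOCKED;
    r3.flags |= TOY_VM_LOCKED;

    /* Flags now match everywhere, yet the file check keeps range 2 apart. */
    assert(!toy_can_merge(&r1, &r2));
    assert(!toy_can_merge(&r2, &r3));
}
```

Calling toy_demo() from a small main() runs the scenario. In this model the
skip list in the caller and the file comparison in the merge test are two
separate layers, which is roughly the split being debated: the caller decides
whether to touch a VMA at all, and the merge test decides whether two
neighbours may be joined.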