From: Jann Horn
Date: Tue, 22 Aug 2023 17:30:19 +0200
Subject: Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd
References: <4d31abf5-56c0-9f3d-d12f-c9317936691@google.com>
To: Matthew Wilcox
Cc: Hugh Dickins, Andrew Morton, Mike Kravetz, Mike Rapoport,
    "Kirill A. Shutemov", David Hildenbrand, Suren Baghdasaryan, Qi Zheng,
    Yang Shi, Mel Gorman, Peter Xu, Peter Zijlstra, Will Deacon, Yu Zhao,
    Alistair Popple, Ralph Campbell, Ira Weiny, Steven Price, SeongJae Park,
    Lorenzo Stoakes, Huang Ying, Naoya Horiguchi, Christophe Leroy,
    Zack Rusin, Jason Gunthorpe, Axel Rasmussen, Anshuman Khandual,
    Pasha Tatashin, Miaohe Lin, Minchan Kim, Christoph Hellwig, Song Liu,
    Thomas Hellstrom, Russell King, "David S. Miller", Michael Ellerman,
    "Aneesh Kumar K.V", Heiko Carstens, Christian Borntraeger,
    Claudio Imbrenda, Alexander Gordeev, Gerald Schaefer, Vasily Gorbik,
    Vishal Moola, Vlastimil Babka, Zi Yan, "Zach O'Keefe", Linux ARM,
    sparclinux@vger.kernel.org, linuxppc-dev, linux-s390, kernel list,
    Linux-MM

On Tue, Aug 22, 2023 at 5:23 PM Matthew Wilcox wrote:
> On Tue, Aug 22, 2023 at 04:39:43PM +0200, Jann Horn wrote:
> > > Perhaps something else will want that same behaviour in future (it's
> > > tempting, but difficult to guarantee correctness); for now, it is just
> > > userfaultfd (but by saying "_armed" rather than "_missing", I'm
> > > half-expecting uffd to add more such exceptional modes in future).
> >
> > Hm, yeah, sounds okay. (I guess we'd also run into this if we ever
> > wanted to make it possible to reliably install PTE markers with
> > madvise() or something like that, which might be nice for allowing
> > userspace to create guard pages without unnecessary extra VMAs...)
>
> I don't know what a userspace API for this would look like, but I have
> a dream of creating guard VMAs which only live in the maple tree and
> don't require the allocation of a struct VMA.  Use some magic reserved
> pointer value like XA_ZERO_ENTRY to represent them ... seems more
> robust than putting a PTE marker in the page tables?

Chrome currently uses a lot of VMAs for its heap, which I think are
basically alternating PROT_NONE and PROT_READ|PROT_WRITE anonymous
VMAs. Like this:

[...]
3a10002cf000-3a10002d0000 ---p 00000000 00:00 0
3a10002d0000-3a10002e6000 rw-p 00000000 00:00 0
3a10002e6000-3a10002e8000 ---p 00000000 00:00 0
3a10002e8000-3a10002f2000 rw-p 00000000 00:00 0
3a10002f2000-3a10002f4000 ---p 00000000 00:00 0
3a10002f4000-3a10002fb000 rw-p 00000000 00:00 0
3a10002fb000-3a10002fc000 ---p 00000000 00:00 0
3a10002fc000-3a1000303000 rw-p 00000000 00:00 0
3a1000303000-3a1000304000 ---p 00000000 00:00 0
3a1000304000-3a100031b000 rw-p 00000000 00:00 0
3a100031b000-3a100031c000 ---p 00000000 00:00 0
3a100031c000-3a1000326000 rw-p 00000000 00:00 0
3a1000326000-3a1000328000 ---p 00000000 00:00 0
3a1000328000-3a100033a000 rw-p 00000000 00:00 0
3a100033a000-3a100033c000 ---p 00000000 00:00 0
3a100033c000-3a100038b000 rw-p 00000000 00:00 0
3a100038b000-3a100038c000 ---p 00000000 00:00 0
3a100038c000-3a100039b000 rw-p 00000000 00:00 0
3a100039b000-3a100039c000 ---p 00000000 00:00 0
3a100039c000-3a10003af000 rw-p 00000000 00:00 0
3a10003af000-3a10003b0000 ---p 00000000 00:00 0
3a10003b0000-3a10003e8000 rw-p 00000000 00:00 0
3a10003e8000-3a1000401000 ---p 00000000 00:00 0
3a1000401000-3a1000402000 rw-p 00000000 00:00 0
3a1000402000-3a100040c000 ---p 00000000 00:00 0
3a100040c000-3a100046f000 rw-p 00000000 00:00 0
3a100046f000-3a1000470000 ---p 00000000 00:00 0
3a1000470000-3a100047a000 rw-p 00000000 00:00 0
3a100047a000-3a100047c000 ---p 00000000 00:00 0
3a100047c000-3a1000492000 rw-p 00000000 00:00 0
3a1000492000-3a1000494000 ---p 00000000 00:00 0
3a1000494000-3a10004a2000 rw-p 00000000 00:00 0
3a10004a2000-3a10004a4000 ---p 00000000 00:00 0
3a10004a4000-3a10004b6000 rw-p 00000000 00:00 0
3a10004b6000-3a10004b8000 ---p 00000000 00:00 0
3a10004b8000-3a10004ea000 rw-p 00000000 00:00 0
3a10004ea000-3a10004ec000 ---p 00000000 00:00 0
3a10004ec000-3a10005f4000 rw-p 00000000 00:00 0
3a10005f4000-3a1000601000 ---p 00000000 00:00 0
3a1000601000-3a1000602000 rw-p 00000000 00:00 0
3a1000602000-3a1000604000 ---p 00000000 00:00 0
3a1000604000-3a100062b000 rw-p 00000000 00:00 0
3a100062b000-3a1000801000 ---p 00000000 00:00 0
[...]

I was thinking if you used PTE markers as guards, you could maybe turn
all that into more or less a single VMA?
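
To make the potential saving concrete, here is a rough sketch (not part of
any kernel interface) that parses /proc/<pid>/maps-style lines and counts
how many VMAs would be left if address-contiguous neighbours no longer had
to be split by protection. The helper names parse_maps() and
count_after_merging_guards() are made up for illustration, and the merge
rule is an assumption: it treats "contiguous" as "mergeable" and ignores
everything else the kernel checks before merging VMAs. The sample input is
the first four lines of the listing above.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps-style lines into (start, end, perms) tuples."""
    vmas = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or "-" not in fields[0]:
            continue  # skip blank lines and "[...]" elision markers
        start, end = (int(x, 16) for x in fields[0].split("-"))
        vmas.append((start, end, fields[1]))
    return vmas


def count_after_merging_guards(vmas):
    """Count VMAs left if address-contiguous neighbours were merged.

    Assumption (hypothetical): guard ranges are represented as PTE markers
    inside the surrounding rw-p mapping, so differing permissions no longer
    force a VMA split. Only address contiguity is checked here.
    """
    merged = 0
    prev_end = None
    for start, end, _perms in vmas:
        if start != prev_end:
            merged += 1  # gap in the address space: a new VMA starts
        prev_end = end
    return merged


SAMPLE = """\
3a10002cf000-3a10002d0000 ---p 00000000 00:00 0
3a10002d0000-3a10002e6000 rw-p 00000000 00:00 0
3a10002e6000-3a10002e8000 ---p 00000000 00:00 0
3a10002e8000-3a10002f2000 rw-p 00000000 00:00 0
"""

vmas = parse_maps(SAMPLE)
print(len(vmas), "VMAs today,",
      count_after_merging_guards(vmas), "if guards were PTE markers")
```

Every range in the excerpt starts where the previous one ends, so under
that assumption the whole run collapses to one entry.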