Date: Tue, 22 Aug 2023 17:33:52 +0200
Subject: Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd
From: David Hildenbrand
Organization: Red Hat
To: Jann Horn, Matthew Wilcox
Cc: Hugh Dickins, Andrew Morton, Mike Kravetz, Mike Rapoport,
    "Kirill A. Shutemov", Suren Baghdasaryan, Qi Zheng, Yang Shi,
    Mel Gorman, Peter Xu, Peter Zijlstra, Will Deacon, Yu Zhao,
    Alistair Popple, Ralph Campbell, Ira Weiny, Steven Price,
    SeongJae Park, Lorenzo Stoakes, Huang Ying, Naoya Horiguchi,
    Christophe Leroy, Zack Rusin, Jason Gunthorpe, Axel Rasmussen,
    Anshuman Khandual, Pasha Tatashin, Miaohe Lin, Minchan Kim,
    Christoph Hellwig, Song Liu, Thomas Hellstrom, Russell King,
    "David S. Miller", Michael Ellerman, "Aneesh Kumar K.V",
    Heiko Carstens, Christian Borntraeger, Claudio Imbrenda,
    Alexander Gordeev, Gerald Schaefer, Vasily Gorbik, Vishal Moola,
    Vlastimil Babka, Zi Yan, Zach O'Keefe, Linux ARM,
    sparclinux@vger.kernel.org, linuxppc-dev, linux-s390, kernel list,
    Linux-MM
References: <4d31abf5-56c0-9f3d-d12f-c9317936691@google.com>

On 22.08.23 17:30, Jann Horn wrote:
> On Tue, Aug 22, 2023 at 5:23 PM Matthew Wilcox wrote:
>> On Tue, Aug 22, 2023 at 04:39:43PM +0200, Jann Horn wrote:
>>>> Perhaps something else will want that same behaviour in future (it's
>>>> tempting, but difficult to guarantee correctness); for now, it is just
>>>> userfaultfd (but by saying "_armed" rather than "_missing", I'm
>>>> half-expecting uffd to add more such exceptional modes in future).
>>>
>>> Hm, yeah, sounds okay. (I guess we'd also run into this if we ever
>>> wanted to make it possible to reliably install PTE markers with
>>> madvise() or something like that, which might be nice for allowing
>>> userspace to create guard pages without unnecessary extra VMAs...)
>>
>> I don't know what a userspace API for this would look like, but I have
>> a dream of creating guard VMAs which only live in the maple tree and
>> don't require the allocation of a struct VMA. Use some magic reserved
>> pointer value like XA_ZERO_ENTRY to represent them ... seems more
>> robust than putting a PTE marker in the page tables?
>
> Chrome currently uses a lot of VMAs for its heap, which I think are
> basically alternating PROT_NONE and PROT_READ|PROT_WRITE anonymous
> VMAs. Like this:
>
> [...]
> 3a10002cf000-3a10002d0000 ---p 00000000 00:00 0
> 3a10002d0000-3a10002e6000 rw-p 00000000 00:00 0
> 3a10002e6000-3a10002e8000 ---p 00000000 00:00 0
> 3a10002e8000-3a10002f2000 rw-p 00000000 00:00 0
> 3a10002f2000-3a10002f4000 ---p 00000000 00:00 0
> 3a10002f4000-3a10002fb000 rw-p 00000000 00:00 0
> 3a10002fb000-3a10002fc000 ---p 00000000 00:00 0
> 3a10002fc000-3a1000303000 rw-p 00000000 00:00 0
> 3a1000303000-3a1000304000 ---p 00000000 00:00 0
> 3a1000304000-3a100031b000 rw-p 00000000 00:00 0
> 3a100031b000-3a100031c000 ---p 00000000 00:00 0
> 3a100031c000-3a1000326000 rw-p 00000000 00:00 0
> 3a1000326000-3a1000328000 ---p 00000000 00:00 0
> 3a1000328000-3a100033a000 rw-p 00000000 00:00 0
> 3a100033a000-3a100033c000 ---p 00000000 00:00 0
> 3a100033c000-3a100038b000 rw-p 00000000 00:00 0
> 3a100038b000-3a100038c000 ---p 00000000 00:00 0
> 3a100038c000-3a100039b000 rw-p 00000000 00:00 0
> 3a100039b000-3a100039c000 ---p 00000000 00:00 0
> 3a100039c000-3a10003af000 rw-p 00000000 00:00 0
> 3a10003af000-3a10003b0000 ---p 00000000 00:00 0
> 3a10003b0000-3a10003e8000 rw-p 00000000 00:00 0
> 3a10003e8000-3a1000401000 ---p 00000000 00:00 0
> 3a1000401000-3a1000402000 rw-p 00000000 00:00 0
> 3a1000402000-3a100040c000 ---p 00000000 00:00 0
> 3a100040c000-3a100046f000 rw-p 00000000 00:00 0
> 3a100046f000-3a1000470000 ---p 00000000 00:00 0
> 3a1000470000-3a100047a000 rw-p 00000000 00:00 0
> 3a100047a000-3a100047c000 ---p 00000000 00:00 0
> 3a100047c000-3a1000492000 rw-p 00000000 00:00 0
> 3a1000492000-3a1000494000 ---p 00000000 00:00 0
> 3a1000494000-3a10004a2000 rw-p 00000000 00:00 0
> 3a10004a2000-3a10004a4000 ---p 00000000 00:00 0
> 3a10004a4000-3a10004b6000 rw-p 00000000 00:00 0
> 3a10004b6000-3a10004b8000 ---p 00000000 00:00 0
> 3a10004b8000-3a10004ea000 rw-p 00000000 00:00 0
> 3a10004ea000-3a10004ec000 ---p 00000000 00:00 0
> 3a10004ec000-3a10005f4000 rw-p 00000000 00:00 0
> 3a10005f4000-3a1000601000 ---p 00000000 00:00 0
> 3a1000601000-3a1000602000 rw-p 00000000 00:00 0
> 3a1000602000-3a1000604000 ---p 00000000 00:00 0
> 3a1000604000-3a100062b000 rw-p 00000000 00:00 0
> 3a100062b000-3a1000801000 ---p 00000000 00:00 0
> [...]
>
> I was thinking if you used PTE markers as guards, you could maybe turn
> all that into more or less a single VMA?

I proposed the topic "A proper API for sparse memory mappings" for the
bi-weekly MM meeting on September 20, that would also cover exactly that
use case. :)

-- 
Cheers,

David / dhildenb
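
To put a rough number on the quoted listing, here is a minimal sketch in
Python that parses /proc/<pid>/maps-style lines, counts the VMAs, and
counts how many address-contiguous runs they form (the lower bound on
VMAs if guards within a run did not need VMAs of their own). The names
parse_maps, count_runs and guard_bytes are hypothetical helpers made up
for illustration, and SAMPLE is only the first six lines of the listing.
Whether such a run could really become a single VMA depends on a
userspace API that, per this thread, does not exist yet.

```python
import re
from typing import List, NamedTuple


class Vma(NamedTuple):
    start: int
    end: int
    perms: str


# Matches the leading "start-end perms" fields of a /proc/<pid>/maps line.
_MAPS_RE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s+([rwxps-]{4})\b")


def parse_maps(text: str) -> List[Vma]:
    """Parse maps-style text into a list of (start, end, perms) records."""
    vmas = []
    for line in text.splitlines():
        m = _MAPS_RE.match(line.strip())
        if m:
            vmas.append(Vma(int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return vmas


def count_runs(vmas: List[Vma]) -> int:
    """Count maximal runs of VMAs where each starts where the previous ended."""
    runs = 0
    prev_end = None
    for v in vmas:
        if v.start != prev_end:
            runs += 1
        prev_end = v.end
    return runs


def guard_bytes(vmas: List[Vma]) -> int:
    """Total size of the no-access private ("---p") mappings."""
    return sum(v.end - v.start for v in vmas if v.perms == "---p")


# First six lines of the listing quoted in the message (hypothetical subset).
SAMPLE = """\
3a10002cf000-3a10002d0000 ---p 00000000 00:00 0
3a10002d0000-3a10002e6000 rw-p 00000000 00:00 0
3a10002e6000-3a10002e8000 ---p 00000000 00:00 0
3a10002e8000-3a10002f2000 rw-p 00000000 00:00 0
3a10002f2000-3a10002f4000 ---p 00000000 00:00 0
3a10002f4000-3a10002fb000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    vmas = parse_maps(SAMPLE)
    print(len(vmas), "VMAs in", count_runs(vmas), "contiguous run(s)")
    print("guard bytes:", guard_bytes(vmas))
```

On a live system the same functions could be fed the contents of
/proc/self/maps instead of SAMPLE; the sample is used here only to keep
the sketch self-contained.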