From: James Houghton
Date: Tue, 7 Feb 2023 14:46:04 -0800
Subject: Re: [PATCH 21/46] hugetlb: use struct hugetlb_pte for walk_hugetlb_range
To: Peter Xu
Cc: Mike Kravetz, David Hildenbrand, Muchun Song, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org

> Here is the result: [1] (sorry it took a little while heh). The
> implementation of the "RFC v1" way is pretty horrible[2] (and this
> implementation probably has bugs anyway; it doesn't account for the
> folio_referenced() problem).
>
> Matthew is trying to solve the same problem with THPs right now: [3].
> I haven't figured out how we can apply Matthew's approach to HGM
> right now, but there probably is a way. (If we left the mapcount
> increment bits in the same place, we couldn't just check the
> hstate-level PTE; it would have already been made present.)
>
> We could:
> - use the THP-like way and tolerate ~1 second collapses

Another thought here. We don't necessarily *need* to collapse the page
table mappings in between mmu_notifier_invalidate_range_start() and
mmu_notifier_invalidate_range_end(), as the pfns aren't changing, we
aren't punching any holes, and we aren't changing permission bits. If
we had an MMU notifier that simply informed KVM that we collapsed the
page tables *after* we finished collapsing, then it would be ok for
hugetlb_collapse() to be slow.

If this MMU notifier is something that makes sense, it probably
applies to MADV_COLLAPSE for THPs as well.

> - use the (non-RFC) v1 way and tolerate the migration/smaps differences
> - use the RFC v1 way and tolerate the complicated mapcount accounting
> - flesh out [3] and see if it can be applied to HGM nicely
>
> I'm happy to go with any of these approaches.
>
> [1]: https://pastebin.com/raw/hJzFJHiD
> [2]: https://github.com/48ca/linux/commit/4495f16a09b660aff44b3edcc125aa3a3df85976
> [3]: https://lore.kernel.org/linux-mm/Y+FkV4fBxHlp6FTH@casper.infradead.org/

- James
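
To make the ordering argument above a bit more concrete, here is a toy userspace model in C. It is only a sketch and not kernel code: `primary_pt`, `secondary_mmu`, `slow_collapse()` and `secondary_notify_collapsed()` are hypothetical names made up for illustration, and the "collapse finished" callback is the notifier proposed in this mail, not something the mmu_notifier API is known to provide. The model assumes one 2M hugepage mapped by 512 4K PTEs. Because translation never depends on how far the collapse has progressed, the secondary MMU's cached pfns stay correct throughout, so no invalidation is issued and the collapse can take as long as it likes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTES_PER_HUGE 512	/* assumption: 4K PTEs covering one 2M hugepage */

/* Toy primary (CPU) page tables for one hugepage-sized range. */
struct primary_pt {
	unsigned long base_pfn;
	size_t collapsed_upto;	/* PTEs [0, collapsed_upto) already folded */
};

/* Toy secondary MMU (standing in for KVM), caching pfns at 4K granularity. */
struct secondary_mmu {
	unsigned long pfn[PTES_PER_HUGE];
	bool maps_huge;
	int invalidations;	/* range_start/range_end-style invalidations seen */
};

/* Collapsing changes the mapping level, never the pfn an index resolves to. */
static unsigned long primary_translate(const struct primary_pt *pt, size_t idx)
{
	return pt->base_pfn + idx;
}

static void secondary_populate(struct secondary_mmu *s,
			       const struct primary_pt *pt)
{
	for (size_t i = 0; i < PTES_PER_HUGE; i++)
		s->pfn[i] = primary_translate(pt, i);
	s->maps_huge = false;
	s->invalidations = 0;
}

/* Hypothetical "collapse finished" callback; not an existing notifier. */
static void secondary_notify_collapsed(struct secondary_mmu *s)
{
	s->maps_huge = true;
}

/*
 * Slow collapse with no invalidation around it. Returns true if the
 * secondary MMU's cached translations matched the primary at every step.
 */
static bool slow_collapse(struct primary_pt *pt, struct secondary_mmu *s)
{
	bool coherent = true;

	for (size_t step = 0; step < PTES_PER_HUGE; step++) {
		pt->collapsed_upto = step + 1;
		for (size_t i = 0; i < PTES_PER_HUGE; i++)
			if (s->pfn[i] != primary_translate(pt, i))
				coherent = false;
	}
	/* Only now tell the secondary MMU it may map the range hugely. */
	secondary_notify_collapsed(s);
	return coherent;
}
```

A caller would populate the secondary MMU, run `slow_collapse()`, and check that it returned true with `invalidations` still at zero. The model deliberately leaves out everything that would force a real invalidation (pfn changes, hole punching, permission changes), which is exactly the set of conditions the paragraph above says do not occur during a collapse.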