From: James Houghton
Date: Thu, 15 Dec 2022 13:08:21 -0500
Subject: Re: [RFC PATCH v2 08/47] hugetlb: add HGM enablement functions
To: Mike Kravetz
Cc: Muchun Song, Peter Xu, David Hildenbrand, David Rientjes,
    Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra,
    Naoya Horiguchi, "Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)",
    Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Thu, Dec 15, 2022 at 12:52 PM Mike Kravetz wrote:
>
> On 12/13/22 10:49, James Houghton wrote:
> > On Mon, Dec 12, 2022 at 7:14 PM Mike Kravetz wrote:
> > >
> > > On 10/21/22 16:36, James Houghton wrote:
> > > > Currently it is possible for all shared VMAs to use HGM, but it must be
> > > > enabled first.
> > > > This is because with HGM, we lose PMD sharing, and page
> > > > table walks require additional synchronization (we need to take the VMA
> > > > lock).
> > >
> > > Not sure yet, but I expect Peter's series will help with locking for
> > > hugetlb specific page table walks.
> >
> > It should make things a little bit cleaner in this series; I'll rebase
> > HGM on top of those patches this week (and hopefully get a v1 out
> > soon).
> >
> > I don't think it's possible to implement MADV_COLLAPSE with RCU alone
> > (as implemented in Peter's series anyway); we still need the VMA lock.
>
> As I continue going through the series, I realize that I am not exactly
> sure what synchronization by the vma lock is required by HGM. As you are
> aware, it was originally designed to protect against someone doing a
> pmd_unshare and effectively removing part of the page table. However,
> since pmd sharing is disabled for vmas with HGM enabled (I think?), then
> it might be a good idea to explicitly say somewhere the reason for using
> the lock.

It synchronizes MADV_COLLAPSE for hugetlb (hugetlb_collapse).
MADV_COLLAPSE will take it for writing and free some page table pages,
and high-granularity walks will generally take it for reading. I'll
make this clear in a comment somewhere and in commit messages.

It might be easier if hugetlb_collapse() had the exact same
synchronization as huge_pmd_unshare, where we not only take the VMA
lock for writing, we also take the i_mmap_rw_sem for writing, so
anywhere where hugetlb_walk() is safe, high-granularity walks are also
safe. I think I should just do that for the sake of simplicity.

- James

> --
> Mike Kravetz