Date: Wed, 14 Jun 2023 18:12:21 -0700 (PDT)
From: David Rientjes
To: Mike Kravetz
cc: linux-mm@kvack.org, David Hildenbrand, James Houghton, John Hubbard,
    Matthew Wilcox, Michal Hocko, Peter Xu, Vlastimil Babka, Zi Yan
Subject: Re: [Invitation] Linux MM Alignment Session on HugeTLB Core MM
    Convergence on Wednesday
In-Reply-To: <20230614230458.GB3559@monkey>
Message-ID: <30c31347-0ee5-9f17-faa3-059facf5f4cc@google.com>
References: <20230614230458.GB3559@monkey>

On Wed, 14 Jun 2023, Mike Kravetz wrote:

> On 06/12/23 18:59, David Rientjes wrote:
> > This week's topic will be a technical brainstorming session on HugeTLB
> > convergence with the core MM.
> > This has been discussed most recently in
> > this thread:
> > https://lore.kernel.org/linux-mm/ZIOEDTUBrBg6tepk@casper.infradead.org/T/
>
> Thank you David for putting this session together! And, thanks to everyone
> who participated.
>
> Following up on linux-mm with most active participants on Cc (sorry if I
> missed someone). If it makes more sense to continue the above thread,
> please move there.
>

Thank *you* for keeping the conversation going. And, yes, thanks to
everybody that participated today and especially Matthew for preparing
content on his thoughts and ideas on convergence opportunities.

> Even though everyone knows that hugetlb is special cased throughout the
> core mm, it came to a head with the proposed introduction of HGM. TBH,
> few people in the core mm community paid much attention to HGM when first
> introduced. A LSF/MM session was then dedicated to the discussion of
> HGM with the outcome being the suggestion to create a new filesystem/driver
> (hugetlb2 if you will) that would satisfy the use cases requiring HGM.
> One thing that was not emphasized at LSF/MM is that there are existing
> hugetlb users experiencing major issues that could be addressed with HGM:
> specifically the issues of memory errors and live migration. That was
> the starting point for recent discussion in the above thread.
>
> I may be wrong, but it appeared the direction of that thread was to
> first try and unify some of the hugetlb and core mm code. Eliminate
> some of the special casing. If hugetlb was less of a special case, then
> perhaps HGM would be more acceptable. That is the impression I (perhaps
> incorrectly) had going into today's session.
>

This matches my understanding as well. The above thread surfaced some
great areas of improvement for hugetlb and the big idea was to surface
those, do some technical brainstorming, and think about both short-term
and long-term goals.
There *are* complexities for some of the convergence opportunities that
were discussed, but all of them would improve (imo) both maintainability
and reliability.

> During today's session, we often discussed what would/could be introduced
> in a hugetlb v2. The idea is that this would be the ideal place for HGM.
> However, people also made the comparisons to cgroup v1 - v2. Such a
> redesign provides the needed 'clean slate' to do things right, but it
> does little for existing users who would be unwilling to quickly move off
> existing hugetlb.
>
> We did spend a good chunk of time on hugetlb/core mm unification and
> removing special casing. In some (most) of these cases, the benefit of
> removing special cases from core mm would result in adding more code to
> hugetlb. For example: proper type'ing so that hugetlb does not treat
> all page table entries as PTEs. Again, I may be wrong but I think
> people were OK with adding more code (and even complexity) to hugetlb
> if it eliminated special casing in the core mm. But, there did not
> seem to be a clear consensus especially with the thought that we may
> need to double hugetlb code to get types right.
>
> Unless I missed something, there was no clear direction at the end of this
> session. I was hoping that we could come up with a plan to address the
> issues facing today's hugetlb users. IMO, there seems to be two options:
> 1) Start work on hugetlb v2 with the intention that customers will need
>    to move to this to address their issues.
> 2) Incorporate functionality like HGM into existing hugetlb.
>

To address existing customer pain for 1GB memory poisoning and post-copy
live migration, yeah, I think these are the only two possible paths
forward.

> My opinion is that adding HGM to existing hugetlb is the only way we
> will be able to address issues for current users in a timely manner.
> The session today (and email thread) point out the ugliness and
> difficulty with hugetlb special casing in the core mm.
> Therefore, adding HGM (or any new code) to hugetlb should not introduce
> new special cases to core mm. I know the latest version of HGM does
> introduce new special cases. I am not sure if those can be reduced or
> eliminated.
> My suggestion for a direction forward would be to add HGM to existing
> hugetlb with no or minimal new special casing. In parallel work could
> begin on hugetlb v2.

I'd agree, and I think the complexities of HGM are largely constrained to
hugetlb. Reducing the special casing could have a concrete path forward
that can be iterated on. That very specific and concrete feedback would
be extremely valuable.