From: James Houghton
Date: Thu, 6 Jun 2024 16:07:27 -0700
Subject: Re: Unifying page table walkers
To: Matthew Wilcox
Cc: Khalid Aziz, Peter Xu, Vishal Moola, Jane Chu, Muchun Song, linux-mm@kvack.org, David Hildenbrand

On Thu, Jun 6, 2024 at 2:21 PM Matthew Wilcox wrote:
>
> On Thu, Jun 06, 2024 at 01:23:08PM -0700, James Houghton wrote:
> > On Thu, Jun 6, 2024 at 1:04 PM Matthew Wilcox wrote:
> > > Right, so we ignore hugetlb_fault() and call into __handle_mm_fault().
> > > Once there, we'll do:
> > >
> > >         vmf.pud = pud_alloc(mm, p4d, address);
> > >         if (pud_none(*vmf.pud) &&
> > >             thp_vma_allowable_order(vma, vm_flags,
> > >                         TVA_IN_PF | TVA_ENFORCE_SYSFS, PUD_ORDER)) {
> > >                 ret = create_huge_pud(&vmf);
> > >
> > > which will call vma->vm_ops->huge_fault(vmf, PUD_ORDER);
> > >
> > > So all we need to do is implement huge_fault in hugetlb_vm_ops.  I
> > > don't think that's the same as creating a hugetlbfs2 because it's just
> > > another entry point.  You can mmap() the same file both ways and it's
> > > all cache coherent.
> >
> > That makes a lot of sense. FWIW, this sounds good to me (though I'm
> > curious what Peter thinks :)).
> >
> > But I think you'll need to be careful to ensure that, for now anyway,
> > huge_fault() is always called with the exact same ptep/pmdp/pudp that
> > hugetlb_walk() would have returned (ignoring sharing). If you allow
> > PMD mapping of what would otherwise be PUD-mapped hugetlb pages right
> > now, you'll break the vmemmap optimization (and probably other
> > things).
>
> Why is that?  This sounds like you know something I don't ;-)
> Is it the mapcount issue?

Yeah, that's what I was thinking about. But I guess whether or not you
are compatible with the vmemmap optimization depends on what
mapcounting scheme you use. If you just use the THP one, you're going
to end up incrementing _mapcount on the subpages, and that won't work.

I'm not immediately thinking of other things that would break... need
to think some more.