From: James Houghton
Date: Thu, 6 Jun 2024 12:30:44 -0700
Subject: Re: Unifying page table walkers
To: Matthew Wilcox
Cc: Khalid Aziz, Peter Xu, Vishal Moola, Jane Chu, Muchun Song, linux-mm@kvack.org

On Thu, Jun 6, 2024 at 11:29 AM Matthew Wilcox wrote:
>
> One of the things we discussed at LSFMM was unifying the hugetlb and
> THP page table walkers. I've been looking into it some more recently;
> I've found a problem and I think a solution.
>
> The reason we have a separate hugetlb_entry from pmd_entry and pud_entry
> is that it has a different locking context. It is called with the
> hugetlb_vma_lock held for read (nb: this is not the same as the vma
> lock; see walk_hugetlb_range()). Why do we need this? Because of page
> table sharing.
>
> In a completely separate discussion, I was talking with Khalid about
> mshare() support for hugetlbfs, and I suggested that we permit hugetlbfs
> pages to be mapped by a VMA which does not have the VM_HUGETLB flag set.
> If we do that, the page tables would not be permitted to be shared with
> other users of that hugetlbfs file. But we want to eliminate support
> for that anyway, so that's more of a feature than a bug.
>
> Once we don't use the VM_HUGETLB flag on these VMAs, that opens the
> door to the other features we want, like mapping individual pages from
> a hugetlb folio. And we can use the regular page table walkers for
> these VMAs.
>
> Is this a reasonable path forward, or have I overlooked something?

Hi Matthew,

Today the VM_HUGETLB flag tells the fault handler to call into
hugetlb_fault() (there are many other special cases, but this one is
probably the most important). How should faults on VMAs without
VM_HUGETLB that map HugeTLB folios be handled?

If you handle faults with the main mm fault handler without getting
rid of hugetlb_fault(), I think you're basically implementing a
second, more tmpfs-like hugetlbfs... right?

I don't really have anything against this approach, but I think the
decision was to reduce the number of special cases as much as we can
first before attempting to rewrite hugetlbfs.

Or maybe I've got something wrong and what you're asking doesn't
logically end up at a hugetlbfs v2.
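
To make the dispatch question concrete, here is a rough userspace
model in Python. It is not kernel code. The Vma class, handle_fault()
and the returned strings are hypothetical names made up for this
sketch, and the VM_HUGETLB bit value is arbitrary. It only shows which
path a fault would take depending on whether the flag is set.

```python
from dataclasses import dataclass

# Illustrative bit only; the real flag value lives in the kernel headers.
VM_HUGETLB = 1 << 0


@dataclass
class Vma:
    """Toy stand-in for a VMA: its flags and whether it maps a HugeTLB folio."""
    flags: int
    maps_hugetlb_folio: bool


def handle_fault(vma: Vma) -> str:
    """Model of the dispatch: VM_HUGETLB selects the hugetlb-specific path."""
    if vma.flags & VM_HUGETLB:
        return "hugetlb_fault"
    # Everything else reaches the generic handler, including the proposed
    # VMAs that map HugeTLB folios without VM_HUGETLB set.
    return "generic_fault"


classic = Vma(flags=VM_HUGETLB, maps_hugetlb_folio=True)
proposed = Vma(flags=0, maps_hugetlb_folio=True)

print(handle_fault(classic))
print(handle_fault(proposed))
```

The second case is the one in question: the generic path would have
to cope with HugeTLB folios on its own, which is why it starts to look
like a second hugetlbfs implementation.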