From: "Zach O'Keefe"
Date: Thu, 17 Aug 2023 11:13:36 -0700
Subject: Re: [EXTERNAL] [PATCH] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"
To: Matthew Wilcox
Cc: Saurabh Singh Sengar, Dan Williams, "linux-mm@kvack.org", Yang Shi, "linux-kernel@vger.kernel.org"
References: <20230812210053.2325091-1-zokeefe@google.com>

On Thu, Aug 17, 2023 at 5:18 AM Matthew Wilcox wrote:
>
> On Wed, Aug 16, 2023 at 02:31:06PM -0700, Zach O'Keefe wrote:
> > On Mon, Aug 14, 2023 at 7:24 PM Matthew Wilcox wrote:
> > > So if we find a large folio that is PMD mappable, and there's nothing
> > > at vmf->pmd, we install a PMD-sized mapping at that spot.
> > > If that fails, we install the preallocated PTE table at vmf->pmd and
> > > continue trying to set one or more PTEs to satisfy this page fault.
> >
> > Aha! I see. I did not expect ->fault() to have this logic, as I had
> > incorrectly thought (aka assumed) the pmd vs pte-mapping logic split
> > at create_huge_pmd(); i.e. do_huge_pmd_anonymous_page(), or
> > ->huge_fault(), or fallback to pte-mapping. It seems very weird to me
> > that hugepage_vma_check() "artificially" says "no" to file and shmem
> > along the fault path, so they can go and do their own thing in
> > ->fault().
>
> Wow, hugepage_vma_check() is a very complicated function. I'm glad I
> ignored it!

Ya it's a tangly area. Far better now though, than before Yang
centralized everything. But yes, now I need to figure out what to do
with it..

> > IIUC then, there is a bug in smaps THPeligible code when
> > CONFIG_READ_ONLY_THP_FOR_FS is not set. Not obvious, but apparently
> > this config is (according to its Kconfig desc) khugepaged-only, so it
> > should be fine for it to be disabled, yet allow
> > do_sync_mmap_readahead() to install a pmd for file-backed memory.
> > hugepage_vma_check() will need to be patched to fix this.
>
> I guess so ...

The easiest and most satisfying way to handle this -- and I think we
talked about this before -- is relaxing that complicated
file_thp_enabled() check when the file's mapping supports large
folios. I think that makes sense to me, though I don't know all the
details fs-side. Will we need any hook to give fs the chance to update
any internal state on collapse?

> > But I have a larger question for you: should we care about
> > /sys/kernel/mm/transparent_hugepage/enabled for file-fault? We
> > currently don't. Seems weird that we can transparently get a hugepage
> > when THP="never". Also, if THP="always", we might as well skip the
> > VM_HUGEPAGE check, and try the final pmd install (and save khugepaged
> > the trouble of attempting it later).
>
> I deliberately ignored the humungous complexity of the THP options.
> They're overgrown and make my brain hurt. [..]

Same

> [..] Instead, large folios are
> adaptive; they observe the behaviour of the user program and choose based
> on history what to do. This is far superior to having a sysadmin tell
> us what to do!

I had written a bunch on this, but I arrived at the conclusion that
(a) pmd-mapping here is ~ a free win, and (b) I'm not the best person
to argue for these knobs, given MADV_COLLAPSE ignores them entirely :P

..But (sorry) what about MMF_DISABLE_THP?
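
As a side note for anyone trying to reproduce the THPeligible
discrepancy discussed above: the two userspace-visible pieces (the
global mode in /sys/kernel/mm/transparent_hugepage/enabled and the
per-VMA THPeligible field in /proc/<pid>/smaps) can be read with a
short script. This is only a sketch. The helper names
selected_thp_mode() and thp_eligible_by_vma() are made up for
illustration, and it assumes the sysfs file marks the active mode with
square brackets (e.g. "always [madvise] never") and that smaps prints
one "THPeligible:" line per VMA. It reports what the kernel claims,
not whether a pmd was actually installed.

```python
import re
from pathlib import Path

# Global THP mode knob referenced in the thread.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")

# A VMA header line in smaps starts with "<start>-<end> " in hex.
_VMA_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")


def selected_thp_mode(text):
    """Return the bracketed choice, e.g. 'madvise' from 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def thp_eligible_by_vma(smaps_text):
    """Map each VMA header line to True/False from its THPeligible field."""
    result = {}
    current = None
    for line in smaps_text.splitlines():
        if _VMA_HEADER.match(line):
            current = line.strip()
        elif line.startswith("THPeligible:") and current is not None:
            result[current] = line.split(":", 1)[1].strip() == "1"
    return result


if __name__ == "__main__":
    # Both files may be absent (non-Linux, or THP not configured).
    if THP_ENABLED.exists():
        print("THP mode:", selected_thp_mode(THP_ENABLED.read_text()))
    smaps = Path("/proc/self/smaps")
    if smaps.exists():
        for vma, eligible in thp_eligible_by_vma(smaps.read_text()).items():
            print(int(eligible), vma)
```

Comparing the THPeligible value of a file-backed VMA against the
AnonHugePages/FilePmdMapped counters in the same smaps entry is one
way to look for the mismatch described above, though which counters
are present depends on the kernel version.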