From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B196DEE4996 for ; Mon, 21 Aug 2023 23:34:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2BC47940016; Mon, 21 Aug 2023 19:34:47 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 26D4A94000B; Mon, 21 Aug 2023 19:34:47 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 13482940016; Mon, 21 Aug 2023 19:34:47 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id F20C694000B for ; Mon, 21 Aug 2023 19:34:46 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id C6AAA160132 for ; Mon, 21 Aug 2023 23:34:46 +0000 (UTC) X-FDA: 81149718972.29.DA3F096 Received: from mail-lf1-f48.google.com (mail-lf1-f48.google.com [209.85.167.48]) by imf20.hostedemail.com (Postfix) with ESMTP id DF64B1C0006 for ; Mon, 21 Aug 2023 23:34:44 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=4PUHlW3x; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of zokeefe@google.com designates 209.85.167.48 as permitted sender) smtp.mailfrom=zokeefe@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1692660885; a=rsa-sha256; cv=none; b=NizxlmBc/GeKIx7QkcAVvqIjS33dDYafW50lOn3Ic6WGowDCoqWDLA74CpZjMXw/SJ6qF7 4yOtPEObQoqM8VID2t6VIhSlQOYTk4sX/RwNEb6IUTav9kGdeU68ZjDW8saAJzn1nFuIHL 8LXTWe7x8TA84SdMsrIsAanbqZEqGMY= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=4PUHlW3x; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of zokeefe@google.com designates 209.85.167.48 as permitted sender) smtp.mailfrom=zokeefe@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1692660885; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=A54V70or4HVq8F9rxCPaBDJeQam0nY34rK8TRrDq9Pw=; b=ChwJt+DK32eRm1xsZivDjKtuHtVono9DWmmS3CEIujofa92JtcFl6o0oMjUUWp2K6dyj2Y kV5YZEeJuqunocjzpkQ6f7xu4IAtJgk5GkNtQksOlXj4AR9C7meYSp7cJnRZCZeP2Ie3tQ Q001Zq33UnV4BUUM+2UWoyVJMdZh9uI= Received: by mail-lf1-f48.google.com with SMTP id 2adb3069b0e04-4f58444a410so1979e87.0 for ; Mon, 21 Aug 2023 16:34:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1692660883; x=1693265683; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=A54V70or4HVq8F9rxCPaBDJeQam0nY34rK8TRrDq9Pw=; b=4PUHlW3xatjmFic7aEt4R9ivlLm+IXmw0tJ95HLAXe28jXbDSDyn3Z1AboBh0KLPry BuuGYYbPsOhIOsZ3nQ1EF+kMHDVwpUlWnNpBV7IG1wYm9530TNFOk6XeaTM+/7Zq/V1y Joa1Lt6lSkrj9BLsx5RWmAEm54lrJHSD2ccUyxiJI5bFT0ADZgUR8UWWbBvsXL3Q6Pdn 31yYxqfyBw4wb+kRBCBsGGjCgrXcCQicIgFijBWKgEBOPYAnZ4kbtrzEKS3rXF1c9D11 Qn35ooYRMOAEIQ1reiDZRgabEI43zFBtJahAT8d7tbGVkC1uaJQmmWb54q/BAgUHbPdp IJ3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692660883; x=1693265683; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A54V70or4HVq8F9rxCPaBDJeQam0nY34rK8TRrDq9Pw=; b=D1dwWHMl4puPRScL24+3pvnHhvEDw7yGlYcfZ2z007FnNOGG0wVdAfvVy1/isFZMGv YhC79D3xByTcDsqtPsBq/NmKgxYvxejF+BVRptkadFm6RTRnZrI1G4IL7jJp5pqpyRue nyNaq5PKJqPPXx/0T7sOi8XgFlXuMAROxRTL2wN2QewJLdvOpIqDcqLNNLjZbaw0QEjW 8wB6Fn4RnlslEJxeJ7gfQQlQ+QFq3ycANLqXs5hlZg9YMs0Iq9K90EVy5YP9A32VRqfT nojiRzS7fwXujY7TbGCJ1OxxHvT4JyiPqyR3EU+LbdbX64l9eWatdya/JOJ3yrX7UoEU nM0w== X-Gm-Message-State: AOJu0Yxs4ZH6l9hV3N9Q8nOND0ct6xf9QrdkxIAFzJewF8bad4A2mUAB VLevjkv6QE+hIKEGZlPsTgVCK0CBklql0nF96jayAw== X-Google-Smtp-Source: AGHT+IGQQnHeUTd5n/FizVssz2JTYq3qNAJxVfz6QI34YYlEswenP5KVAycXosoQp1EpVf/+ANie5NhxjXQypOjHX1w= X-Received: by 2002:ac2:5195:0:b0:4ff:d0c0:5d72 with SMTP id u21-20020ac25195000000b004ffd0c05d72mr27733lfi.0.1692660882966; Mon, 21 Aug 2023 16:34:42 -0700 (PDT) MIME-Version: 1.0 References: <20230818211533.2523697-1-zokeefe@google.com> In-Reply-To: From: "Zach O'Keefe" Date: Mon, 21 Aug 2023 16:34:05 -0700 Message-ID: Subject: Re: [PATCH v2 1/2] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()" To: Yang Shi Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Saurabh Singh Sengar Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: DF64B1C0006 X-Stat-Signature: u5sc3ptw66muw9notzaz7nfkizoyk7gx X-HE-Tag: 1692660884-349629 X-HE-Meta: U2FsdGVkX18PJlD8qDHPojHiEsKecS/SYmPmKyFsM937+WVzwxxMNSqDbbeWO27QkU+J2PnKd/wMLJFJbvL28XgVsm/KaT320Acw4Gxnt51T8RXAtvnIrWaVYKCIhy8XwuOqMtota1PZri9NPOCzs9QpKp/AAr8bhwFL545iF7jtc6zVX/PiR6e76YvoTUXQbbQGW62DkwsqtmOZUShblJPI50OGsDq44LwOp6rzMJKRtupA1ah8XMkd4uSzR1ynFxd2z6s8hWXg8sFq2K46atTcd26rf+OnT7wJTqxvKj4u51WZnBiy0ATSbMmAwfubdB0dhrxcbjJXqNU7woXe561tt+cs7QrYD6o0QxLCY34OHKiC7Quqv1/E34udgSfasdfNcklsfuqwtqjjYeb7hoN+5FDeOfcAGi1Xx1Xt/WqbWna9P3hyXYgaeLJ1iegzxbMoefuzK1JcLaTuHtgfJeJqlSTB90qPyMD8jObyppgsFl3z7PPv2TmJk1RXRadOSyFXYBM7KrU0baoj61+ZUR6IEGqdq2fiwTuhwGxFCfrIZ/Byqq7CVLacNGqosf3eChplMVEQSkLPOueNJhpMSYimiuWaThnxTFdYlBpswyrdaFHjoFHsSgaOAhgyP767mkQXFXw/8H3coUYkDLoWxqtDjt98iqq6o/OO7R2w22hKMJI2fmMq5DTnhRck3E861N1xyJ/QtngxTdGg0DV4NM2Zb46RQWUKZN2chjER3hzBE4+1fBDrZWJuGKgMpttQTs15Z1ZlvbXdIr78FiCW+dR85MuNyLHFng8WRHoj54Jdluu8j/epGuPNMkaKQfWM8sXsv7JaZBCiZjSFBIpfmIn5VTP7s6ScdmCVfS3KknZ2xiYcQPeWDemAuWIXwqvE0r1afZRbfjzsCKZr7kAjYuuUuz3I0XcMfvnc1A5RIalq5qDn/KEqLE7nBq7TOGTCv1c6v/CLnbhdUSDQN2d dPu9QWMV f6Ssyneo2T1Kh0JHvmCE3f4czpD5Kq2dt91gQh7nHc37xFiRTKt+CSagP9J3OtTgz+Z62Zt1tdXgFv+XcTBIoLuwkL4EfXaAvXDi+7pqRAFUMJb1OXdXVki2XXRwDZAoU3ykmBFNToU2BUdyh5lUz83vWe6Xw8bC4uk8tz39WMGzayUyKm87v4WsHU3sCcE8OAFKhOvSvkOurP14DBrgpB/rzEu92Yh5rhhu9BbfQUkxpif7hU9py8e2binDcOj+yhpu2CobREqBFzKlcERRAhowSOdySnLAdJn6Px01YuIKMli+Cxg5Kc9ZrJzGvPhho1+YqBrZNw2pERX2WrtUqNjUCS9M87ELIIJO+rHdlvicnml6gIBAabf7f44ni01woeoBAHL93EQFwrP4ZSetlB8OfEQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Aug 21, 2023 at 3:53=E2=80=AFPM Yang Shi wrot= e: > > On Fri, Aug 18, 2023 at 2:15=E2=80=AFPM Zach O'Keefe = wrote: > > > > The 6.0 commits: > > > > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") > > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > > > merged "can we have THPs in this VMA?" logic that was previously done > > separately by fault-path, khugepaged, and smaps "THPeligible" checks. > > > > During the process, the semantics of the fault path check changed in tw= o > > ways: > > > > 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). > > 2) We no longer checked if non-anonymous memory had a vm_ops->huge_faul= t > > handler that could satisfy the fault. Previously, this check had be= en > > done in create_huge_pud() and create_huge_pmd() routines, but after > > the changes, we never reach those routines. > > > > During the review of the above commits, it was determined that in-tree > > users weren't affected by the change; most notably, since the only rele= vant > > user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is > > explicitly approved early in approval logic. However, there is at leas= t > > one occurrence where an out-of-tree driver that used > > VM_HUGEPAGE|VM_MIXEDMAP with a vm_ops->huge_fault handler, was broken. > > > > Remove the VM_NO_KHUGEPAGED check when not in collapse path and give > > any ->huge_fault handler a chance to handle the fault. Note that we > > don't validate the file mode or mapping alignment, which is consistent > > with the behavior before the aforementioned commits. > > > > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") > > Reported-by: Saurabh Singh Sengar > > Signed-off-by: Zach O'Keefe > > Cc: Yang Shi > > --- > > Changed from v1[1]: > > - [Saurabhi] Allow ->huge_fault handler to handle fault, if it = exists > > > > [1] https://lore.kernel.org/linux-mm/CAAa6QmQw+F=3Do6htOn=3D6ADD6mwvMO= =3DOw_67f3ifBv3GpXx9Xg_g@mail.gmail.com/ > > Thanks, Zach. The patch looks correct to me. You can add > Reviewed-by:Yang Shi . Hey Yang, thanks for taking the time to review .. and .. welcome back :) Sorry to do this to you, but while responding to you on another thread I realized an issue below: > A further comment below... > > > > > --- > > mm/huge_memory.c | 17 ++++++++++------- > > 1 file changed, 10 insertions(+), 7 deletions(-) > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > index eb3678360b97..cd379b2c077b 100644 > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -96,11 +96,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma,= unsigned long vm_flags, > > return in_pf; > > > > /* > > - * Special VMA and hugetlb VMA. > > + * khugepaged special VMA and hugetlb VMA. > > * Must be checked after dax since some dax mappings may have > > * VM_MIXEDMAP set. > > */ > > - if (vm_flags & VM_NO_KHUGEPAGED) > > + if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED)) > > I'm wondering whether we shall remove VM_MIXEDMAP from > VM_NO_KHUGEPAGED or not if that kind VMAs are huge page applicable for > some usecases. The downside may be some CPU time waste on the > VM_MIXEDMAP area which has PFN instead of struct page, but it should > be ok. Anything else did I miss? Just back from a long vacation, my > brain is still not running in full speed yet. I was thinking about the same thing, and had originally intended to raise that question here -- but thought it better to stick with the immediate issue. Ironically, we've gone off on both a THPeligible tangent and another about faulting file-backed memory. But ya, AFAIU, there is no technical reason why collapse can't act on VM_MIXEDMAP, as long as all the pages it finds are vm_normal() pages. I don't know enough about the possible use cases here though, and whether this is the best memory to be allocating precious hugepages to. You also raise a good point about cpu usage, since there may be a greater chance of failing late in scan due finding a PFN-only mapping. > > return false; > > > > /* > > @@ -128,12 +128,15 @@ bool hugepage_vma_check(struct vm_area_struct *vm= a, unsigned long vm_flags, > > !hugepage_flags_always()))) > > return false; > > > > - /* Only regular file is valid */ > > - if (!in_pf && file_thp_enabled(vma)) > > - return true; > > - > > if (!vma_is_anonymous(vma)) > > - return false; > > + return in_pf ? > > + /* > > + * Trust that ->huge_fault() handlers know > > + * what they are doing in fault path. > > + */ > > + !!vma->vm_ops->huge_fault : > > + /* Only regular file is valid in collapse path = */ > > + file_thp_enabled(vma); This works for fault and collapse paths, but what about smaps? I think we should be doing both checks, and returning "true" if either is true. This also raises the question of how hugepage_vma_check() is set up, and how we've been using "in_pf" and "smaps". Today, these mean, "am I in fault path?" and "am I in smaps path?", whereas I think they ought to be, "should I check fault path, else check collapse path", and "am I in smaps path?". smaps path should then use hugepage_vma_check(in_pf) || hugepage_vma_check(!in_pf). It a depends on how pedantic we want to be about THPeligible, but I've found a few corner cases where the distinction matters. What I think I'll do is send off an embarrassing 3rd revision of this simple patch -- removing Patch 2 that was previously included in v2 -- just so we have a shot of getting the fix for Saurabh into 6.6. We can worry about any other refactorings / fixes later.. Thanks, Zach > > if (vma_is_temporary_stack(vma)) > > return false; > > -- > > 2.42.0.rc1.204.g551eb34607-goog > >