From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yang Shi <shy828301@gmail.com>
Date: Mon, 6 Jun 2022 16:02:57 -0700
Subject: Re: [PATCH v6 08/15] mm/khugepaged: add flag to ignore THP sysfs enabled
References: <20220604004004.954674-1-zokeefe@google.com> <20220604004004.954674-9-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-9-zokeefe@google.com>
To: "Zach O'Keefe" <zokeefe@google.com>
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox, Michal Hocko,
 Pasha Tatashin, Peter Xu, Rongwei Wang, SeongJae Park, Song Liu,
 Vlastimil Babka, Zi Yan, Linux MM <linux-mm@kvack.org>, Andrea Arcangeli,
 Andrew Morton, Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins, Ivan Kokshaysky,
 "James E.J. Bottomley", Jens Axboe, "Kirill A. Shutemov", Matt Turner,
 Max Filippov, Miaohe Lin, Minchan Kim, Patrick Xia, Pavel Begunkov,
 Thomas Bogendoerfer
Content-Type: text/plain; charset="UTF-8"

On Fri, Jun 3, 2022 at 5:40 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> Add an enforce_thp_enabled flag to struct collapse_control that allows a
> context to ignore constraints imposed by
> /sys/kernel/mm/transparent_hugepage/enabled.
>
> This flag is set in the khugepaged collapse context to preserve existing
> khugepaged behavior.
>
> This flag will be used (unset) when introducing the madvise collapse
> context, since the desired THP semantics of MADV_COLLAPSE aren't coupled
> to the sysfs THP settings. Most notably, for the purpose of eventual
> madvise_collapse(2) support, this allows userspace to trigger THP collapse
> on behalf of another process, without adding support to meddle with the
> VMA flags of said process or to change sysfs THP settings.
>
> For now, limit this flag to /sys/kernel/mm/transparent_hugepage/enabled,
> but it can be expanded to include
> /sys/kernel/mm/transparent_hugepage/shmem_enabled later.
>
> Link: https://lore.kernel.org/linux-mm/CAAa6QmQxay1_=Pmt8oCX2-Va18t44FV-Vs-WsQt_6+qBks4nZA@mail.gmail.com/
>
> Signed-off-by: Zach O'Keefe <zokeefe@google.com>

Looks good to me. Reviewed-by: Yang Shi <shy828301@gmail.com>

Just a reminder: I just posted the series
https://lore.kernel.org/linux-mm/20220606214414.736109-1-shy828301@gmail.com/T/#m5dae2dfa4b247f3b3903951dd3a1f0978a927e16,
which changes some logic in hugepage_vma_check(). If your series goes in
after it, you will need some additional tweaks to disregard the sysfs THP
setting.
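As an illustration of the kind of tweak that might be needed (a hypothetical
sketch, not code from either series; the helper name and the extra parameter
are invented), the central check could take the enforcement decision as an
explicit argument instead of relying on VM_HUGEPAGE being OR-ed into the
flags the caller passes in:

/*
 * Hypothetical sketch only: once the sysfs "enabled" decision lives in one
 * helper, a caller such as MADV_COLLAPSE could ask to skip it explicitly.
 */
static bool thp_vma_allowed(struct vm_area_struct *vma, unsigned long vm_flags,
                            bool enforce_sysfs)
{
        /* Explicitly disabled through madvise or prctl. */
        if ((vm_flags & VM_NOHUGEPAGE) ||
            test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
                return false;

        /* Only consult the global sysfs "enabled" knob when asked to. */
        if (enforce_sysfs &&
            !test_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags) &&
            !((vm_flags & VM_HUGEPAGE) &&
              test_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
                       &transparent_hugepage_flags)))
                return false;

        /* The remaining VMA suitability checks would stay unchanged. */
        return true;
}

A caller could then pass cc->enforce_thp_enabled (or its equivalent) straight
through, rather than rewriting the vm_flags it hands to the check.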
> ---
>  mm/khugepaged.c | 34 +++++++++++++++++++++++++++-------
>  1 file changed, 27 insertions(+), 7 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index c3589b3e238d..4ad04f552347 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -94,6 +94,11 @@ struct collapse_control {
>          */
>         bool enforce_page_heuristics;
>
> +       /* Enforce constraints of
> +        * /sys/kernel/mm/transparent_hugepage/enabled
> +        */
> +       bool enforce_thp_enabled;
> +
>         /* Num pages scanned per node */
>         int node_load[MAX_NUMNODES];
>
> @@ -893,10 +898,12 @@ static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
>   */
>
>  static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
> -                                  struct vm_area_struct **vmap)
> +                                  struct vm_area_struct **vmap,
> +                                  struct collapse_control *cc)
>  {
>         struct vm_area_struct *vma;
>         unsigned long hstart, hend;
> +       unsigned long vma_flags;
>
>         if (unlikely(khugepaged_test_exit(mm)))
>                 return SCAN_ANY_PROCESS;
> @@ -909,7 +916,18 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>         hend = vma->vm_end & HPAGE_PMD_MASK;
>         if (address < hstart || address + HPAGE_PMD_SIZE > hend)
>                 return SCAN_ADDRESS_RANGE;
> -       if (!hugepage_vma_check(vma, vma->vm_flags))
> +
> +       /*
> +        * If !cc->enforce_thp_enabled, set VM_HUGEPAGE so that
> +        * hugepage_vma_check() can pass even if
> +        * TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG is set (i.e. "madvise" mode).
> +        * Note that hugepage_vma_check() doesn't enforce that
> +        * TRANSPARENT_HUGEPAGE_FLAG or TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG
> +        * must be set (i.e. "never" mode).
> +        */
> +       vma_flags = cc->enforce_thp_enabled ? vma->vm_flags
> +                                           : vma->vm_flags | VM_HUGEPAGE;
> +       if (!hugepage_vma_check(vma, vma_flags))
>                 return SCAN_VMA_CHECK;
>         /* Anon VMA expected */
>         if (!vma->anon_vma || !vma_is_anonymous(vma))
> @@ -953,7 +971,8 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm,
>  static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>                                         struct vm_area_struct *vma,
>                                         unsigned long haddr, pmd_t *pmd,
> -                                       int referenced)
> +                                       int referenced,
> +                                       struct collapse_control *cc)
>  {
>         int swapped_in = 0;
>         vm_fault_t ret = 0;
> @@ -980,7 +999,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>                 /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
>                 if (ret & VM_FAULT_RETRY) {
>                         mmap_read_lock(mm);
> -                       if (hugepage_vma_revalidate(mm, haddr, &vma)) {
> +                       if (hugepage_vma_revalidate(mm, haddr, &vma, cc)) {
>                                 /* vma is no longer available, don't continue to swapin */
>                                 trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>                                 return false;
> @@ -1047,7 +1066,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>                 goto out_nolock;
>
>         mmap_read_lock(mm);
> -       result = hugepage_vma_revalidate(mm, address, &vma);
> +       result = hugepage_vma_revalidate(mm, address, &vma, cc);
>         if (result) {
>                 mmap_read_unlock(mm);
>                 goto out_nolock;
>         }
> @@ -1066,7 +1085,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>          * Continuing to collapse causes inconsistency.
>          */
>         if (unmapped && !__collapse_huge_page_swapin(mm, vma, address,
> -                                                    pmd, referenced)) {
> +                                                    pmd, referenced, cc)) {
>                 mmap_read_unlock(mm);
>                 goto out_nolock;
>         }
> @@ -1078,7 +1097,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>          * handled by the anon_vma lock + PG_lock.
>          */
>         mmap_write_lock(mm);
> -       result = hugepage_vma_revalidate(mm, address, &vma);
> +       result = hugepage_vma_revalidate(mm, address, &vma, cc);
>         if (result)
>                 goto out_up_write;
>         /* check if the pmd is still valid */
> @@ -2277,6 +2296,7 @@ static int khugepaged(void *none)
>         struct mm_slot *mm_slot;
>         struct collapse_control cc = {
>                 .enforce_page_heuristics = true,
> +               .enforce_thp_enabled = true,
>                 .last_target_node = NUMA_NO_NODE,
>                 /* .gfp set later */
>         };
> --
> 2.36.1.255.ge46751e96f-goog
>
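For completeness: khugepaged() above keeps both enforcement flags set, and
per the commit message a later MADV_COLLAPSE-style caller would be expected
to clear enforce_thp_enabled. A rough sketch of such a caller (the function
name, signature, and collapse helper are hypothetical, not taken from the
series):

/* Illustrative only: how a madvise-collapse entry point might set up
 * collapse_control, by contrast with khugepaged()'s initialization above. */
static int madvise_collapse_range(struct mm_struct *mm, unsigned long start,
                                  unsigned long end)
{
        struct collapse_control cc = {
                .enforce_page_heuristics = false,  /* user asked explicitly */
                .enforce_thp_enabled = false,      /* ignore sysfs "enabled" */
                .last_target_node = NUMA_NO_NODE,
                /* .gfp chosen per allocation, as khugepaged does */
        };
        unsigned long addr;
        int result = SCAN_SUCCEED;

        for (addr = start & HPAGE_PMD_MASK; addr < end; addr += HPAGE_PMD_SIZE) {
                /* The collapse path receives &cc, so hugepage_vma_revalidate()
                 * sees the relaxed flags above. */
                result = collapse_one_pmd(mm, addr, &cc);  /* hypothetical helper */
                if (result != SCAN_SUCCEED)
                        break;
        }
        return result;
}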