Date: Thu, 10 Mar 2022 10:46:22 -0800 (PST)
From: David Rientjes
To: Yang Shi
cc: Zach O'Keefe, Alex Shi, David Hildenbrand, Michal Hocko,
    Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka, Zi Yan,
    Linux MM, Andrea Arcangeli, Andrew Morton, Arnd Bergmann,
    Axel Rasmussen, Chris Kennelly, Chris Zankel, Helge Deller,
    Hugh Dickins, Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
    "Kirill A.
    Shutemov", Matthew Wilcox, Matt Turner, Max Filippov, Miaohe Lin,
    Minchan Kim, Patrick Xia, Pavel Begunkov, Peter Xu,
    Thomas Bogendoerfer
Subject: Re: [RFC PATCH 07/14] mm/khugepaged: add vm_flags_ignore to hugepage_vma_revalidate_pmd_count()
References: <20220308213417.1407042-1-zokeefe@google.com> <20220308213417.1407042-8-zokeefe@google.com>

On Thu, 10 Mar 2022, Yang Shi wrote:

> > This separates "async-hint" vs "sync-explicit" madvise requests.
> > MADV_[NO]HUGEPAGE are hints, and together with thp settings, advise
> > the kernel how to treat memory in the future. The kernel uses
> > VM_[NO]HUGEPAGE to aid with this. MADV_COLLAPSE, as an explicit
> > request, is free to define its own defrag semantics.
> >
> > This would allow flexibility to separately define async vs sync thp
> > policies. For example, highly tuned userspace applications that are
> > sensitive to unexpected latency might want to manage their hugepages
> > utilization themselves, and ask khugepaged to stay away. There is no
> > way in "always" mode to do this without setting VM_NOHUGEPAGE.
>
> I don't quite get why you set THP to always but don't want to
> khugepaged do its job. It may be slow, I think this is why you
> introduce MADV_COLLAPSE, right? But it doesn't mean khugepaged can't
> scan the same area, it just doesn't do any real work and waste some
> cpu cycles. But I guess MADV_COLLAPSE doesn't prevent the PMD/THP from
> being split, right? So khugepaged still plays a role to re-collapse
> the area without calling MADV_COLLAPSE over again and again.
>

My only real concern for MADV_COLLAPSE was when the span being collapsed
includes a mixture of both VM_HUGEPAGE and VM_NOHUGEPAGE. Does this
collapse over the eligible memory or does it fail entirely?

I'd think it was the former: we should respect VM_NOHUGEPAGE and only
collapse eligible memory when doing MADV_COLLAPSE. But then userspace
struggles to know whether a partial collapse happened because of
ineligibility or because we just couldn't allocate a hugepage.

Userspace has the information to figure this out on its own, so given the
use of VM_NOHUGEPAGE for non-MADV_NOHUGEPAGE purposes, I think it makes
sense to simply ignore these vmas as part of the collapse request.
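
To make the "userspace has the information" point concrete, here is a rough sketch of the bookkeeping an application could do on its side. It is a toy model, not kernel code and not part of the patch series: struct region, enum thp_hint, summarize_collapse_span() and the 2 MiB HPAGE_PMD_SIZE are all assumptions made for illustration, and it only models "overlaps a range the application itself marked MADV_NOHUGEPAGE", ignoring every other eligibility rule (vma boundaries, file vs anon, and so on).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD-sized hugepages (e.g. x86-64 with 4 KiB base pages). */
#define HPAGE_PMD_SIZE ((uintptr_t)2 << 20)

/* Hypothetical userspace record of the last THP hint applied to a range. */
enum thp_hint { HINT_NONE, HINT_HUGEPAGE, HINT_NOHUGEPAGE };

struct region {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
	enum thp_hint hint;
};

struct collapse_summary {
	size_t total;		/* PMD-aligned ranges fully inside the span */
	size_t eligible;	/* ranges not touching a NOHUGEPAGE region */
	size_t ineligible;	/* ranges skipped because of NOHUGEPAGE */
};

/* True if [addr, addr + HPAGE_PMD_SIZE) overlaps a NOHUGEPAGE region. */
static bool overlaps_nohugepage(const struct region *r, size_t n,
				uintptr_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (r[i].hint != HINT_NOHUGEPAGE)
			continue;
		if (r[i].start < addr + HPAGE_PMD_SIZE && r[i].end > addr)
			return true;
	}
	return false;
}

/*
 * Walk the PMD-aligned ranges inside [start, end) and split them into
 * those a NOHUGEPAGE-respecting collapse would attempt and those it
 * would skip. Assumes addresses are far enough from UINTPTR_MAX that
 * rounding up does not wrap.
 */
struct collapse_summary summarize_collapse_span(const struct region *r,
						size_t n, uintptr_t start,
						uintptr_t end)
{
	struct collapse_summary s = { 0, 0, 0 };
	uintptr_t addr = (start + HPAGE_PMD_SIZE - 1) & ~(HPAGE_PMD_SIZE - 1);

	for (; addr <= end && end - addr >= HPAGE_PMD_SIZE;
	     addr += HPAGE_PMD_SIZE) {
		s.total++;
		if (overlaps_nohugepage(r, n, addr))
			s.ineligible++;
		else
			s.eligible++;
	}
	return s;
}
```

With this kind of record, a shortfall against `total` that is fully covered by `ineligible` is explained by the application's own MADV_NOHUGEPAGE calls. If fewer than `eligible` ranges end up backed by hugepages, the cause lies elsewhere, for example a failed hugepage allocation. How the application observes the actual result is left out here.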