Date: Fri, 7 Jul 2023 14:57:00 +0100
From: Matthew Wilcox
To: David Hildenbrand
Cc: Ryan Roberts, "Huang, Ying", Andrew Morton, "Kirill A. Shutemov",
	Yin Fengwei, Yu Zhao, Catalin Marinas, Will Deacon,
	Anshuman Khandual, Yang Shi, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 4/5] mm: FLEXIBLE_THP for improved performance
References: <20230703135330.1865927-1-ryan.roberts@arm.com>
	<20230703135330.1865927-5-ryan.roberts@arm.com>
	<87edlkgnfa.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<44e60630-5e9d-c8df-ab79-cb0767de680e@arm.com>
	<524bacd2-4a47-2b8b-6685-c46e31a01631@redhat.com>
In-Reply-To: <524bacd2-4a47-2b8b-6685-c46e31a01631@redhat.com>

On Fri, Jul 07, 2023 at 01:29:02PM +0200, David Hildenbrand wrote:
> On 07.07.23 11:52, Ryan Roberts wrote:
> > On 07/07/2023 09:01, Huang, Ying wrote:
> > > Although we can use smaller page order for FLEXIBLE_THP, it's hard to
> > > avoid internal fragmentation completely. So, I think that finally we
> > > will need to provide a mechanism for the users to opt out, e.g.,
> > > something like "always madvise never" via
> > > /sys/kernel/mm/transparent_hugepage/enabled. I'm not sure whether it's
> > > a good idea to reuse the existing interface of THP.
> >
> > I wouldn't want to tie this to the existing interface, simply because that
> > implies that we would want to follow the "always" and "madvise" advice too; That
> > means that on a thp=madvise system (which is certainly the case for android and
> > other client systems) we would have to disable large anon folios for VMAs that
> > haven't explicitly opted in. That breaks the intention that this should be an
> > invisible performance boost. I think it's important to set the policy for use of
>
> It will never ever be a completely invisible performance boost, just like
> ordinary THP.
>
> Using the exact same existing toggle is the right thing to do. If someone
> specify "never" or "madvise", then do exactly that.
>
> It might make sense to have more modes or additional toggles, but
> "madvise=never" means no memory waste.

I hate the existing mechanisms. They are an abdication of our
responsibility, and an attempt to blame the user (be it the sysadmin
or the programmer) of our code for using it wrongly. We should not
replicate this mistake.

Our code should be auto-tuning. I posted a long, detailed outline here:
https://lore.kernel.org/linux-mm/Y%2FU8bQd15aUO97vS@casper.infradead.org/

> I remember I raised it already in the past, but you *absolutely* have to
> respect the MADV_NOHUGEPAGE flag. There is user space out there (for
> example, userfaultfd) that doesn't want the kernel to populate any
> additional page tables. So if you have to respect that already, then also
> respect MADV_HUGEPAGE, simple.

Possibly having uffd enabled on a VMA should disable using large folios,
I can get behind that. But the notion that userspace knows what it's
doing ... hahaha. Just ignore the madvise flags. Userspace doesn't
know what it's doing.
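
For anyone following along who hasn't looked at the toggle under
discussion: the sysfs file quoted above lists the available modes on one
line and marks the active one with brackets, e.g. "always [madvise] never".
A minimal sketch of reading it from userspace follows. The helper names
active_thp_mode and read_thp_mode are made up for illustration, and this
only reports the system-wide policy; it says nothing about per-VMA
MADV_HUGEPAGE / MADV_NOHUGEPAGE state.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the thread; only present on kernels built with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from e.g. 'always [madvise] never'.

    Hypothetical helper for illustration; not part of any kernel tooling.
    """
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode marked in {text!r}")


def read_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the live setting, or return None if the file cannot be read."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    # Prints the active mode, or None where the sysfs file is unavailable.
    print(read_thp_mode())
```

On a thp=madvise system of the kind Ryan mentions, this would be expected
to report "madvise".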