Date: Tue, 18 Apr 2023 22:55:52 -0400
From: Johannes Weiner
To: "Kirill A. Shutemov"
Cc: linux-mm@kvack.org, Kaiyang Zhao, Mel Gorman, Vlastimil Babka,
    David Rientjes, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [RFC PATCH 03/26] mm: make pageblock_order 2M per default
Message-ID: <20230419025552.GB272256@cmpxchg.org>
References: <20230418191313.268131-1-hannes@cmpxchg.org>
    <20230418191313.268131-4-hannes@cmpxchg.org>
    <20230419000105.matz43p6ihrqmado@box.shutemov.name>
In-Reply-To: <20230419000105.matz43p6ihrqmado@box.shutemov.name>

On Wed, Apr 19, 2023 at 03:01:05AM +0300, Kirill A. Shutemov wrote:
> On Tue, Apr 18, 2023 at 03:12:50PM -0400, Johannes Weiner wrote:
> > pageblock_order can be of various sizes, depending on configuration,
> > but the default is MAX_ORDER-1.
>
> Note that MAX_ORDER got redefined in -mm tree recently.
>
> > Given 4k pages, that comes out to
> > 4M. This is a large chunk for the allocator/reclaim/compaction to try
> > to keep grouped per migratetype. It's also unnecessary as the majority
> > of higher order allocations - THP and slab - are smaller than that.
>
> This seems way to x86-specific.

Hey, that's the machines I have access to ;)

> Other arches have larger THP sizes. I believe 16M is common.
>
> Maybe define it as min(MAX_ORDER, PMD_ORDER)?

Hm, let me play around with larger pageblocks.

The thing that gives me pause is that this seems quite aggressive as a
default block size for the allocator and reclaim/compaction - if you
consider the implications for internal fragmentation and the amount of
ongoing defragmentation work it would require.

IOW, it's not just a function of physical page size supported by the
CPU. It's also a function of overall memory capacity. Independent of
architecture, 2MB seems like a more reasonable step up than 16M.

16M is great for TLB coverage, and in our DCs we're getting a lot of
use out of 1G hugetlb pages as well. The question is if those archs
are willing to pay the cost of serving such page sizes quickly and
reliably during runtime; or if that's something better left to setups
with explicit preallocations and stuff like hugetlb_cma reservations.