From: David Hildenbrand
Organization: Red Hat
To: Ryan Roberts, John Hubbard, Matthew Wilcox, Yang Shi, "Yin, Fengwei", Yu Zhao, Zi Yan, David Rientjes, Andrew Morton, Vlastimil Babka, "Kirill A. Shutemov"
Cc: Linux-MM
Subject: Re: ANON_LARGE_FOLIOS meeting follow-up & refined proposal
Date: Tue, 26 Sep 2023 20:34:23 +0200
References: <4966f496-9f71-460c-b2ab-8661384ce626@arm.com> <4830fb3e-4a35-4842-98f4-9e7baa0e692a@arm.com> <7301771f-d654-4e5a-a197-3a3d8750440c@nvidia.com> <92937776-1e16-47e5-bef9-4c1a04bc98c0@arm.com>
In-Reply-To: <92937776-1e16-47e5-bef9-4c1a04bc98c0@arm.com>

On 25.09.23 10:51, Ryan Roberts wrote:
> On 23/09/2023 01:33, John Hubbard wrote:
>> On 9/22/23 08:48, Ryan Roberts wrote:
>> ...
>>> I never had any feedback on the below; I'm not sure if that means everyone is
>>> happy or that nobody read it??
>>
>> One can never really know: zero or more people read it, and of those, no
>> one hated it enough to send out a quick NAK. So that's a *possible*,
>> lukewarm endorsement of sorts. Success! :)
>
> You really know how to fill a guy with confidence! ;-)
>
>>
>> ...
>>
>>> BUT I've had yet another idea on the controls front, which would enable exposing
>>> this to user space as an extension to transparent_hugepage, while continuing to
>>> support THP as is and also be able to control THP and ALF (anon large folio)
>>
>> The new ALF / ANON_LARGE_FOLIO naming looks good to me. The grep aspect
>> is a nice touch.
>
> Well if we go the route of the newest proposal, then I guess the naming is less
> important, because it all attaches to transparent_hugepage.
>
>>
>> ...
>>
>>> Add 2 controls to sysfs:
>>>
>>> /sys/kernel/mm/transparent_hugepage/anon_orders
>>>    - bitfield where set bits are orders that will be tried during allocation
>>>    - defaults to 1<[...]
>>>    - For now, 1<[...]
>>>    - To enable ALF, set the appropriate lower bits
>>>    - To disable THP, clear 1<[...]
>>>    - (In future we could add an "auto" option too)
>>>
>>> /sys/kernel/mm/transparent_hugepage/anon_always_mask
>>>    - orders in (anon_orders & anon_always_mask) are not subject to madvise
>>>    - so when enabled=madvise, still try (anon_orders & anon_always_mask) orders
>>>      as if enabled=always
>>>    - defaults to 0 (all subject to madvise)
>>>
>>
>> I *think* I like this a lot,
>
> On the weight of this lukewarm endorsement, I'm going to code it up and aim to
> post something for discussion end of this week. ;-)
>
>> although I have some clarifying questions
>> below. It seems to address the key things that have been complicating
>> the discussions: the API is now looking more flexible, and yet still
>> easy to understand and reason about. Nice.
>>
>> A couple of questions about how this works:
>>
>>>
>>> The defaults for those controls give you "legacy THP". But you can modify the
>>> controls to generate policies like this:
>>>
>>>
>>
>> For these tables, a small key or legend would help. I've forgotten already
>> what "S" means, and am also vague about exactly what "THP>ALF>S" behavior
>> means, too.
>
> THP:
> transparent hugepage allocation; specifically PMD sized/aligned/mapped.
>
> ALF:
> anonymous large folio allocation; specifically some order between
> [PMD_ORDER-1, 1]. Always PTE-mapped.

^ and that's exactly not where we wanted to draw the line to be future-proof.

Ideally we'd create something that is future proof, such that it could be
extended to any folio sizes in the future.

-- 
Cheers,

David / dhildenb
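
For readers trying to follow the quoted anon_orders / anon_always_mask
proposal, here is a rough user-space model of how the two bitfields might
combine with the existing "enabled" setting. It is only a sketch, not kernel
code. Several things in it are assumptions and not taken from the thread:
the helper name candidate_orders() is hypothetical, PMD_ORDER is taken to be
9 (4 KiB base pages, 2 MiB PMD), the default of anon_orders is taken to be
the PMD-order bit (the original value was lost in the archive, but the
proposal says the defaults give "legacy THP"), enabled=never is assumed to
disable every order, and orders are assumed to be tried highest first.

```python
PMD_ORDER = 9  # assumption: 4 KiB base pages with 2 MiB PMD mappings


def candidate_orders(enabled, anon_orders, anon_always_mask, vma_madvised):
    """Return the folio orders that would be tried, highest order first.

    enabled          -- "always", "madvise" or "never"
    anon_orders      -- bitfield; bit N set means order N may be tried
    anon_always_mask -- bitfield; orders here are not subject to madvise
    vma_madvised     -- True if the VMA was marked with MADV_HUGEPAGE
    """
    if enabled not in ("always", "madvise", "never"):
        raise ValueError(f"unknown mode: {enabled!r}")

    if enabled == "never":
        mask = 0  # assumption: "never" switches off every order
    elif enabled == "always" or vma_madvised:
        mask = anon_orders
    else:
        # enabled=madvise on a VMA without the hint: only the orders in
        # (anon_orders & anon_always_mask) behave as if enabled=always.
        mask = anon_orders & anon_always_mask

    # Assumption: larger orders are preferred, so list them first.
    return [order
            for order in range(mask.bit_length() - 1, -1, -1)
            if (mask >> order) & 1]


# Assumed defaults: only the PMD order is set and nothing bypasses madvise,
# which behaves like today's THP.
legacy_orders = 1 << PMD_ORDER
print(candidate_orders("madvise", legacy_orders, 0, vma_madvised=True))

# THP only where madvised, but order-4 folios everywhere.
orders = (1 << PMD_ORDER) | (1 << 4) | (1 << 2)
print(candidate_orders("madvise", orders, 1 << 4, vma_madvised=False))
```

With these assumptions, the second call models a policy where PMD-sized THP
stays opt-in via madvise while one smaller order is tried for every anonymous
VMA, which is the kind of mixed policy the proposal is aiming for.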