From: Yu Zhao <yuzhao@google.com>
Date: Tue, 27 Jun 2023 12:33:09 -0600
Subject: Re: [PATCH v1 10/10] mm: Allocate large folios for anonymous memory
To: Ryan Roberts
Cc: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei, David Hildenbrand, Catalin Marinas, Will Deacon, Geert Uytterhoeven, Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin", linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org
References: <20230626171430.3167004-1-ryan.roberts@arm.com> <20230626171430.3167004-11-ryan.roberts@arm.com> <0c98f854-b4e4-9a71-8e0c-1556bc79468c@arm.com>
In-Reply-To: <0c98f854-b4e4-9a71-8e0c-1556bc79468c@arm.com>

On Tue, Jun 27, 2023 at 3:57 AM Ryan Roberts wrote:
>
> On 27/06/2023 04:01, Yu
Zhao wrote:
> > On Mon, Jun 26, 2023 at 11:15 AM Ryan Roberts wrote:
> >>
> >> With all of the enabler patches in place, modify the anonymous memory
> >> write allocation path so that it opportunistically attempts to allocate
> >> a large folio up to `max_anon_folio_order()` size (This value is
> >> ultimately configured by the architecture). This reduces the number of
> >> page faults, reduces the size of (e.g. LRU) lists, and generally
> >> improves performance by batching what were per-page operations into
> >> per-(large)-folio operations.
> >>
> >> If CONFIG_LARGE_ANON_FOLIO is not enabled (the default) then
> >> `max_anon_folio_order()` always returns 0, meaning we get the existing
> >> allocation behaviour.
> >>
> >> Signed-off-by: Ryan Roberts
> >> ---
> >>  mm/memory.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >>  1 file changed, 144 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index a8f7e2b28d7a..d23c44cc5092 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -3161,6 +3161,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma)
> >>                 return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX;
> >>  }
> >>
> >> +/*
> >> + * Returns index of first pte that is not none, or nr if all are none.
> >> + */
> >> +static inline int check_ptes_none(pte_t *pte, int nr)
> >> +{
> >> +       int i;
> >> +
> >> +       for (i = 0; i < nr; i++) {
> >> +               if (!pte_none(ptep_get(pte++)))
> >> +                       return i;
> >> +       }
> >> +
> >> +       return nr;
> >> +}
> >> +
> >> +static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order)
> >
> > As suggested previously in 03/10, we can leave this for later.
>
> I disagree. This is the logic that prevents us from accidentally replacing
> already set PTEs, or wandering out of the VMA bounds etc. How would you catch
> all those corner cases without this?

Again, sorry for not being clear previously: we definitely need to
handle alignments & overlaps.
But the fallback, i.e., "for (; order > 1; order--) {" in
calc_anon_folio_order_alloc(), is not necessary. For now, we just need
something like:

  bool is_order_suitable()
  {
          // check whether it fits properly
  }

Later on, we could add:

  alloc_anon_folio_best_effort()
  {
          for a list of fallback orders
                  is_order_suitable()
  }