Subject: Re: 
alloc_contig_range() with MIGRATE_MOVABLE performance regression since 4.9
To: David Hildenbrand, Michal Hocko
Cc: Vlastimil Babka, Mel Gorman, Minchan Kim, Johannes Weiner, l.stach@pengutronix.de, LKML, Jaewon Kim, Michal Nazarewicz, Joonsoo Kim, Oscar Salvador, "linux-mm@kvack.org"
References: <8919b724-ce5b-a80f-bbea-98b99af97357@redhat.com> <58726a6b-5468-a6b4-7c26-371ef5d71ee2@gmail.com> <9df905cf-cc4f-c739-26cb-c2e5c6e5a234@redhat.com>
From: Florian Fainelli
Date: Thu, 22 Apr 2021 12:31:52 -0700
In-Reply-To: <9df905cf-cc4f-c739-26cb-c2e5c6e5a234@redhat.com>

On 4/22/2021 11:35 AM, David Hildenbrand wrote:
> On 22.04.21 19:50, Florian Fainelli wrote:
>>
>>
>> On 4/22/2021 1:56 AM, David Hildenbrand wrote:
>>> On 22.04.21 09:49, Michal Hocko wrote:
>>>> Cc David and Oscar who are familiar with this code as well.
>>>>
>>>> On Wed 21-04-21 11:36:01, Florian Fainelli wrote:
>>>>> Hi all,
>>>>>
>>>>> I have been trying for the past few days to identify the source of a
>>>>> performance regression that we are seeing with the 5.4 kernel but not
>>>>> with the 4.9 kernel on ARM64. Testing something newer like 5.10 is
>>>>> a bit challenging at the moment but will happen eventually.
>>>>>
>>>>> What we are seeing is a ~3x increase in the time needed for
>>>>> alloc_contig_range() to allocate 1GB in blocks of 2MB pages. The
>>>>> system is idle at the time and there are no other contenders for
>>>>> memory other than the user-space programs already started (DHCP
>>>>> client, shell, etc.).
>>>
>>> Hi,
>>>
>>> If you can easily reproduce it might be worth to just try bisecting;
>>> that could be faster than manually poking around in the code.
>>>
>>> Also, it would be worth having a look at the state of upstream Linux.
>>> Upstream Linux developers tend to not care about minor performance
>>> regressions on oldish kernels.
>>
>> This is a big pain point here and I cannot agree more, but until we
>> bridge that gap, this is not exactly easy to do for me unfortunately and
>> neither is bisection :/
>>
>>>
>>> There has been work on improving exactly the situation you are
>>> describing -- a "fail fast" / "no retry" mode for alloc_contig_range().
>>> Maybe it tackles exactly this issue.
>>>
>>> https://lkml.kernel.org/r/20210121175502.274391-3-minchan@kernel.org
>>>
>>> Minchan is already on cc.
>>
>> This patch does not appear to be helping, in fact, I had locally applied
>> this patch from way back when:
>>
>> https://lkml.org/lkml/2014/5/28/113
>>
>> which would effectively do this unconditionally. Let me see if I can
>> showcase this problem a x86 virtual machine operating in similar
>> conditions to ours.
>
> How exactly are you allocating these 2MiB blocks?
>
> Via CMA->alloc_contig_range() or via alloc_contig_range() directly? I
> assume via CMA.

I am allocating this memory directly via alloc_contig_range(start, end,
MIGRATE_MOVABLE, GFP_KERNEL), looping over 1024MB in 2MB increments.
This is just a synthetic benchmark, though we do have an allocator that
behaves the same way.
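
For readers following along, the loop described above can be modelled in
user space roughly as follows. This is only a sketch under stated
assumptions: 4KB pages, a stub in place of the kernel's
alloc_contig_range() (which cannot run outside the kernel), and
placeholder values for the migratetype and GFP arguments. The names
stub_alloc_contig_range(), alloc_in_chunks() and the SKETCH_* constants
are hypothetical and not taken from the benchmark itself.

```c
#include <assert.h>

/* Assumptions for illustration: 4KB pages, 2MB chunks, 1024MB total. */
#define PAGE_SHIFT   12UL
#define CHUNK_BYTES  (2UL << 20)
#define TOTAL_BYTES  (1024UL << 20)
#define CHUNK_PAGES  (CHUNK_BYTES >> PAGE_SHIFT)

/* Placeholders only; the real values come from kernel headers. */
#define SKETCH_MIGRATE_MOVABLE 1U
#define SKETCH_GFP_MASK        0U

/*
 * Hypothetical stand-in for alloc_contig_range(). It takes the same
 * kind of arguments as the call quoted in the mail (start PFN, end PFN,
 * migratetype, GFP mask) and simply reports success for any non-empty
 * range. It does not migrate or isolate anything.
 */
static int stub_alloc_contig_range(unsigned long start_pfn,
                                   unsigned long end_pfn,
                                   unsigned int migratetype,
                                   unsigned int gfp_mask)
{
    (void)migratetype;
    (void)gfp_mask;
    return end_pfn > start_pfn ? 0 : -1;
}

/*
 * Walk the 1024MB region starting at base_pfn in 2MB steps and return
 * the number of chunks for which the (stubbed) allocation succeeded.
 */
static unsigned long alloc_in_chunks(unsigned long base_pfn,
                                     unsigned int gfp_mask)
{
    unsigned long total_pages = TOTAL_BYTES >> PAGE_SHIFT;
    unsigned long pfn, ok = 0;

    /* The region must divide evenly into chunks. */
    assert(TOTAL_BYTES % CHUNK_BYTES == 0);

    for (pfn = base_pfn; pfn < base_pfn + total_pages; pfn += CHUNK_PAGES) {
        if (stub_alloc_contig_range(pfn, pfn + CHUNK_PAGES,
                                    SKETCH_MIGRATE_MOVABLE, gfp_mask) == 0)
            ok++;
    }
    return ok;
}
```

Under these assumptions each call covers 512 pages and the loop issues
512 calls; a real measurement would time each call inside the kernel
rather than in a user-space model like this one.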
>
> For
>
> https://lkml.kernel.org/r/20210121175502.274391-3-minchan@kernel.org
>
> to do its work you'll have to pass __GFP_NORETRY to
> alloc_contig_range(). This requires CMA adaptions, from where we call
> alloc_contig_range().

Yes, I did modify the alloc_contig_range() caller to pass GFP_KERNEL |
__GFP_NORETRY. I ran more iterations (1000) and the results are not
very conclusive: with __GFP_NORETRY the time per allocation was not
significantly better; in fact it was slightly worse, by 100us, than
without.

My x86 VM with 1GB of DRAM, 512MB of which is in ZONE_MOVABLE, shows
identical numbers for both 4.9 and 5.4, so this must be something
specific to ARM64 and/or the code we added to create a ZONE_MOVABLE on
that architecture, since movablecore does not appear to have any effect
there, unlike on x86.
-- 
Florian