From: Yang Shi
Date: Tue, 7 May 2024 10:17:32 -0700
Subject: Re: [RESEND PATCH] mm: align larger anonymous mappings on THP boundaries
To: Ryan Roberts
Cc: Kefeng Wang, David Hildenbrand, Matthew Wilcox, Yang Shi,
 riel@surriel.com, cl@linux.com, akpm@linux-foundation.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ze Zuo
In-Reply-To: <2b403705-a03c-4cfe-8d95-b38dd83fca52@arm.com>
References: <20231214223423.1133074-1-yang@os.amperecomputing.com>
 <1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com>
 <1dc9a561-55f7-4d65-8b86-8a40fa0e84f9@arm.com>
 <6016c0e9-b567-4205-8368-1f1c76184a28@huawei.com>
 <2c14d9ad-c5a3-4f29-a6eb-633cdf3a5e9e@redhat.com>
 <2b403705-a03c-4cfe-8d95-b38dd83fca52@arm.com>

On Tue, May 7, 2024 at 8:53 AM Ryan Roberts wrote:
>
> On 07/05/2024 14:53, Kefeng Wang wrote:
> >
> >
> > On 2024/5/7 19:13, David Hildenbrand wrote:
> >>
> >>> https://github.com/intel/lmbench/blob/master/src/lat_mem_rd.c#L95
> >>>
> >>>> suggest.
> >>>> If you want to try something semi-randomly; it might be useful to rule
> >>>> out the arm64 contpte feature. I don't see how that would be interacting
> >>>> here if mTHP is disabled (is it?). But its new for 6.9 and arm64 only.
> >>>> Disable with ARM64_CONTPTE (needs EXPERT) at compile time.
> >>> I don't enabled mTHP, so it should be not related about ARM64_CONTPTE,
> >>> but will have a try.
> >
> > After ARM64_CONTPTE disabled, memory read latency is similar with ARM64_CONTPTE
> > enabled(default 6.9-rc7), still larger than align anon reverted.
>
> OK thanks for trying.
>
> Looking at the source for lmbench, its malloc'ing (512M + 8K) up front and using
> that for all sizes. That will presumably be considered "large" by malloc and
> will be allocated using mmap. So with the patch, it will be 2M aligned. Without
> it, it probably won't. I'm still struggling to understand why not aligning it in
> virtual space would make it more performant though...

Yeah, I'm confused too. I just ran the same command on 6.6.13 (without
the THP alignment patch and the mTHP stuff) and on 6.9-rc4 (with the THP
alignment patch and all the mTHP stuff) on my arm64 machine, but I
didn't see such a pattern. The results fluctuate a little: for example,
6.6.13 has better results at 4M/6M/8M, while 6.9-rc4 has better results
at 12M/16M/32M/48M/64M, and the difference can be quite noticeable. But
in any case I didn't see such a regression pattern.

The benchmark is supposed to measure cache and memory latency, so its
result depends strongly on the cache and memory subsystem, for example
the hardware prefetcher.

>
> Is it possible to provide the smaps output for at least that 512M+8K block for
> both cases? It might give a bit of a clue.
>
> Do you have traditional (PMD-sized) THP enabled? If its enabled and unaligned
> then the front of the buffer wouldn't be mapped with THP, but if it is aligned,
> it will. That could affect it.
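
To make the alignment point above concrete, here is a minimal sketch of
the address arithmetic only. It is not taken from the kernel or from
lmbench. It assumes a 2M PMD size (4K base pages), and the function
name pmd_coverage and the addresses used with it are hypothetical. It
splits a buffer into the unaligned head, the part that PMD-sized THP
could cover, and the tail; whether the kernel actually installs huge
pages there depends on the THP settings and on memory availability.

```python
PMD_SIZE = 2 * 1024 * 1024  # assumption: 2M PMD size (4K base pages)


def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def align_down(addr, align):
    """Round addr down to a multiple of align (a power of two)."""
    return addr & ~(align - 1)


def pmd_coverage(start, length, pmd_size=PMD_SIZE):
    """Return (head, huge, tail) byte counts for [start, start + length).

    head: bytes before the first PMD-aligned address
    huge: bytes in whole PMD-sized, PMD-aligned blocks (THP candidates)
    tail: bytes after the last whole block
    """
    end = start + length
    first = align_up(start, pmd_size)
    last = align_down(end, pmd_size)
    if last <= first:
        # No whole aligned block fits inside the range.
        return (length, 0, 0)
    return (first - start, last - first, end - last)
```

For a 512M + 8K buffer, a 2M-aligned start leaves no head at all, while
a start that is off by one 4K page leaves a head of 2M - 4K and one
fewer 2M block in the middle.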
>
> >
> >>
> >> cont-pte can get active if we're just lucky when allocating pages in the right
> >> order, correct Ryan?
> >>
>