Message-ID: <9ce8a0f4-d1af-44ea-87b5-57ebdb3d2910@arm.com>
Date: Tue, 7 May 2024 12:26:51 +0100
Subject: Re: [RESEND PATCH] mm: align larger anonymous mappings on THP boundaries
From: Ryan Roberts
To: David Hildenbrand, Kefeng Wang, Yang Shi
Cc: Matthew Wilcox, Yang Shi, riel@surriel.com, cl@linux.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ze Zuo
References: <20231214223423.1133074-1-yang@os.amperecomputing.com>
 <1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com>
 <1dc9a561-55f7-4d65-8b86-8a40fa0e84f9@arm.com>
 <6016c0e9-b567-4205-8368-1f1c76184a28@huawei.com>
 <2c14d9ad-c5a3-4f29-a6eb-633cdf3a5e9e@redhat.com>
 <4e7ce57f-cad1-44d5-a1d8-4cd47683a358@arm.com>
In-Reply-To: <4e7ce57f-cad1-44d5-a1d8-4cd47683a358@arm.com>

On 07/05/2024 12:14, Ryan Roberts wrote:
> On 07/05/2024 12:13, David Hildenbrand wrote:
>>
>>> https://github.com/intel/lmbench/blob/master/src/lat_mem_rd.c#L95
>>>
>>>> suggest. If you want to try something semi-randomly; it might be useful to rule
>>>> out the arm64 contpte feature. I don't see how that would be interacting here if
>>>> mTHP is disabled (is it?). But its new for 6.9 and arm64 only. Disable with
>>>> ARM64_CONTPTE (needs EXPERT) at compile time.
>>> I don't enabled mTHP, so it should be not related about ARM64_CONTPTE,
>>> but will have a try.
>>
>> cont-pte can get active if we're just lucky when allocating pages in the right
>> order, correct Ryan?
>
> No it shouldn't do; it requires the pages to be in the same folio.
>

That said, if we got lucky in allocating the "right" pages, then we will end up
doing an extra function call and a bit of maths for every 16 PTEs in order to
figure out that the span is not contained by a single folio, before backing out
of an attempt to fold. That would probably be just about measurable.

But the regression doesn't kick in until 96K, which is the step after 64K. I'd
expect to see the regression at 64K too if that was the issue. 64K would match
an L1 cache size (not a cacheline, which is far smaller), so I suspect it could
be something related to the cache?
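
To make the "bit of maths" concrete, here is a toy userspace model of the containment check described above. It is only a sketch, not the actual arm64 contpte code: the function name, the 4K base page size and the example addresses are all assumptions made for illustration.

```python
PAGE_SIZE = 4096                       # assumption: 4K base pages
CONT_PTES = 16                         # PTEs per contiguous block, as in the text
CONT_PTE_SIZE = CONT_PTES * PAGE_SIZE  # 64K with the assumed page size


def span_within_folio(addr, folio_start, folio_nr_pages):
    """Hypothetical helper: True if the aligned 16-PTE span containing
    addr is fully covered by the single folio described by
    (folio_start, folio_nr_pages)."""
    span_start = addr - (addr % CONT_PTE_SIZE)
    span_end = span_start + CONT_PTE_SIZE
    folio_end = folio_start + folio_nr_pages * PAGE_SIZE
    return folio_start <= span_start and span_end <= folio_end


# Order-0 pages that happen to sit in the "right" places still fail the
# check, because each folio covers only one page of the span.
lucky_small_page = span_within_folio(0x10000, 0x10000, 1)

# A 16-page folio aligned to the span passes the check.
aligned_large_folio = span_within_folio(0x1F000, 0x10000, 16)
```

In this model the small-page case does the arithmetic and then backs out, which is the per-16-PTE overhead I mean; only the large-folio case would go on to fold.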