Message-ID: <83bb1b99-81d3-0f32-4bf2-032cb512a1a1@arm.com>
Date: Mon, 24 Jul 2023 12:59:33 +0100
Subject: Re: [PATCH v3 0/4] variable-order, large folios for anonymous memory
To: Andrew Morton, Matthew Wilcox, "Kirill A. Shutemov", Yin Fengwei,
 David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon,
 Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan, Luis Chamberlain
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
References: <20230714160407.4142030-1-ryan.roberts@arm.com>
From: Ryan Roberts
In-Reply-To: <20230714160407.4142030-1-ryan.roberts@arm.com>

On 14/07/2023 17:04, Ryan Roberts wrote:
> Hi All,
>
> This is v3 of a series to implement variable order,
> large folios for anonymous memory. (currently called "FLEXIBLE_THP") The
> objective of this is to improve performance by allocating larger chunks of
> memory during anonymous page faults. See [1] and [2] for background.

A question for anyone that can help; I'm preparing v4 and as part of that am
running the mm selftests, now that I've fixed them up to run reliably for
arm64. This is showing 2 regressions vs the v6.5-rc3 baseline:

1) khugepaged test fails here:

# Run test: collapse_max_ptes_none (khugepaged:anon)
# Maybe collapse with max_ptes_none exceeded.... Fail
# Unexpected huge page

2) split_huge_page_test fails with:

# Still AnonHugePages not split

I *think* (but haven't yet verified) that (1) is due to khugepaged ignoring
non-order-0 folios when looking for candidates to collapse. Now that we have
large anon folios, the memory allocated by the test is in large folios and
therefore does not get collapsed. We understand this issue, and I believe
DavidH's new scheme for determining exclusive vs shared should give us the
tools to solve this.

But (2) is weird. If I run this test on its own immediately after booting, it
passes. If I then run the khugepaged test, then re-run this test, it fails.

The test allocates 4 hugepages, then requests that they are split using the
debugfs interface. Then the test looks at /proc/self/smaps to check that
AnonHugePages is back to 0.

In both the passing and failing cases, the kernel thinks that it has
successfully split the pages; the debug logs in split_huge_pages_pid() confirm
this. In the failing case, I wonder if somehow khugepaged could be immediately
re-collapsing the pages before user space can observe the split? Perhaps the
failed khugepaged test has left khugepaged in an "awake" state and it
immediately pounces?

Thanks,
Ryan
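
For reference, the smaps check described above can be sketched roughly as
follows. This is a minimal illustration in Python, not the selftest's actual
code (the mm selftests are written in C). The helper names anon_huge_kb() and
read_self_smaps() are hypothetical, and summing over every mapping is an
assumption made for simplicity; the real test may inspect only its own mapping.

```python
import re

# Matches smaps lines such as "AnonHugePages:      2048 kB".
_ANON_HUGE_RE = re.compile(r"^AnonHugePages:\s+(\d+)\s+kB")


def anon_huge_kb(smaps_text: str) -> int:
    """Sum the AnonHugePages fields (in kB) over all mappings in smaps text.

    Hypothetical helper: summing across every mapping is an assumption.
    """
    total = 0
    for line in smaps_text.splitlines():
        match = _ANON_HUGE_RE.match(line)
        if match:
            total += int(match.group(1))
    return total


def read_self_smaps(path: str = "/proc/self/smaps") -> str:
    """Read the calling process's smaps file (Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling anon_huge_kb(read_self_smaps()) right after the split request, and
again a short while later, would show whether the value drops to 0 and then
climbs back, which is what the re-collapse theory predicts.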