Message-ID: <1cfae0c0-96a2-4308-9c62-f7a640520242@arm.com>
Date: Wed, 3 Jul 2024 18:37:48 +0100
From: Ryan Roberts
To: "Kirill A. Shutemov", Hugh Dickins, Mel Gorman
Cc: Linux-MM, Catalin Marinas, David Hildenbrand, Matthew Wilcox
Subject: huge zero page confusion

Hi Kirill, Hugh, Mel,

We recently had a problem reported at [1] that due to aarch64 arch requiring that atomic RMW
instructions raise a read fault, followed by a write fault, a huge zero page is
faulted in during the read fault, and the write fault then shatters the huge
zero page, installing small zero pages for every PTE in the PMD region except
the faulting address, which gets a writable private page. A number of ways were
discussed to solve that problem.

But it got me wondering why we have this behaviour in general for the huge zero
page. It seems odd to me. Surely it would be less effort, and more aligned with
the app's expectations, to notice the huge zero page in the PMD, remove it, and
install a THP, as would have been done if pmd_none() was true? Or, if there is a
reason to shatter on write, why not do away with the huge zero page, save some
memory, and just install a PMD's worth of small zero pages on fault?

Perhaps replacing the huge zero page with a THP on write fault would have been
the better behaviour at the time, but changing it now risks a memory bloat
regression in some workloads?

I had a brief discussion with David H starting at [2].

Would appreciate your thoughts!

[1] https://lore.kernel.org/all/20240626191830.3819324-1-yang@os.amperecomputing.com/
[2] https://lore.kernel.org/all/3743d7e1-0b79-4eaf-82d5-d1ca29fe347d@arm.com/

Thanks,
Ryan
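
To put rough numbers on the memory bloat concern, here is a toy accounting
model, not kernel code. It assumes 4 KiB base pages and a 2 MiB PMD, counts only
private data pages (page tables and the shared zero pages themselves are
ignored), and ignores any later collapse by khugepaged. The policy names and the
private_bytes() helper are made up for illustration.

```python
PAGE_SIZE = 4096                       # assumed base page size (4 KiB)
PMD_SIZE = 2 * 1024 * 1024             # assumed PMD coverage (2 MiB)
PTES_PER_PMD = PMD_SIZE // PAGE_SIZE   # 512 PTEs per PMD under these assumptions

# Hypothetical policy names for the three behaviours discussed in this mail.
POLICIES = ("shatter", "small_zero_pages", "thp_on_write")


def private_bytes(policy, written_pages):
    """Private data bytes in one PMD region after a read fault followed by
    write faults to `written_pages` distinct base pages."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    if not 0 <= written_pages <= PTES_PER_PMD:
        raise ValueError("written_pages out of range")
    if written_pages == 0:
        # Only shared zero page(s) are mapped, whatever the policy.
        return 0
    if policy == "thp_on_write":
        # The first write replaces the huge zero page with a full THP.
        return PMD_SIZE
    # "shatter" and "small_zero_pages": each written base page gets its own
    # private copy; every other PTE keeps pointing at the shared zero page.
    return written_pages * PAGE_SIZE


if __name__ == "__main__":
    for policy in POLICIES:
        print(policy, private_bytes(policy, 1))
```

In this model a region that is read and then written in a single base page
costs one page today and a full PMD's worth under the replace-with-THP policy,
which is the kind of workload where a regression would show up. The two
small-page policies cost the same per region; the second only additionally
avoids allocating the huge zero page itself.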