From: Mateusz Guzik
Date: Tue, 15 Apr 2025 18:58:43 +0200
Subject: idea to ponder: handling extra pages on faults of anonymous areas
To: x86@kernel.org, linux-mm

If you have an area not backed by a huge page and fault on it, you only
get the 4K page zeroed even if the given mmapped area is bigger.

I have a promising result from zeroing more pages than that, but don't
have time to evaluate more workloads or code up a proper patch.
Hopefully someone(tm) will be interested enough to pick it up.

Rationale: 4K pages predate the fall of the Soviet Union, and RAM sizes
have gone up by orders of magnitude since then, even on what are
considered low-end systems today. Similarly, memory usage of programs
went up significantly. It is not a stretch to suspect a bigger size
would serve real workloads better.

2MB pages of course are applicable in some capacity, but my testing
shows there are still tons of faults on areas where they are not used.
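
For anyone who wants to see the per-page fault behaviour for themselves, here is a minimal sketch, assuming Linux and Python 3. It maps a private anonymous region, writes one byte per page and reports how many minor faults the process took meanwhile. The function name and the 16 MiB size are made up for illustration, and the count is only approximate: the interpreter takes faults of its own, and THP/mTHP settings can lower the number.

```python
import mmap
import resource


def count_minor_faults(size_bytes: int, stride: int = mmap.PAGESIZE) -> int:
    """Write one byte every `stride` bytes of a fresh private anonymous
    mapping and return the minor faults taken by the process meanwhile."""
    region = mmap.mmap(-1, size_bytes,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        for offset in range(0, size_bytes, stride):
            # The first write to a not-yet-populated page triggers a fault.
            region[offset] = 1
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        return after - before
    finally:
        region.close()


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # 16 MiB, an arbitrary size for illustration
    pages = size // mmap.PAGESIZE
    faults = count_minor_faults(size)
    print(f"{pages} pages touched, {faults} minor faults")
```

With 4K pages and no huge pages in play, the fault count should land close to the number of pages touched. Rerunning it after flipping one of the transparent_hugepage knobs is a cheap way to see the count move.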

Take everyone's favourite workload of compiling stuff: kernel time is
quite big (e.g., > 15%), and a large chunk of it is spent handling page
faults.

The hardware does not provide good granularity (it jumps straight from
4KB to 2MB), so 4KB pages will still have to be used, but the cost of
fault handling can go down by speculatively sorting out more than just
the page which got faulted on. I suspect rolling with 8KB would provide
a good enough improvement while suffering negligible waste in practice.

While testing 8KB would require patching the kernel, I was pointed at
knobs in /sys/kernel/mm/transparent_hugepage which facilitate early
experiments. The smallest available size is 16K, so that's what I used
below for benchmarking.

I conducted a simple experiment building will-it-scale like so:

    taskset --cpu-list 1 hyperfine "gmake -s -j 1 clean all"

stock:
    Time (mean ± σ): 20.707 s ± 0.080 s [User: 17.222 s, System: 3.376 s]

16K pages:
    Time (mean ± σ): 19.471 s ± 0.046 s [User: 16.836 s, System: 2.608 s]

Or, to put it differently, a reliable reduction of roughly 6% in real
time.

Page fault count dropped to less than half, which suggests the majority
of the improvement would show up with a mere 8K instead of 16K.

The 16K thing was tested with:

    echo always > /sys/kernel/mm/transparent_hugepage/hugepages-16kB/enabled

I stress the proposal is not necessarily to use mTHPs here (or whatever
the name); the above was merely employed because it was readily
available. I'm told the use of these might prevent other optimizations
by the kernel -- these are artifacts of the implementation and are not
inherent to the idea.

The proposal is to fill in more than one page on faults on anonymous
areas, regardless of how it is specifically handled. I speculate
handling two pages (aka 8KB of size) will be an overall win and should
not affect anything else (huge page promotions, whatever TLB fuckery
and what have you). Worst case, you get a page you are not going to use.
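
A back-of-the-envelope sketch of the numbers above, assuming the simplest possible model: every fault fills a fixed chunk and the whole region ends up touched. The function names and the 1 MiB region are made up for illustration. Only the timings come from the hyperfine output, and the model ignores alignment, THP promotion and everything else the kernel actually does.

```python
import math

BASE_PAGE = 4 * 1024  # 4 KiB hardware page size


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


def faults_to_populate(region_bytes: int, fill_bytes: int) -> int:
    """Faults needed to populate a fully touched region when each fault
    fills `fill_bytes` (hypothetical model, not kernel behaviour)."""
    return math.ceil(region_bytes / fill_bytes)


def worst_case_waste(fill_bytes: int) -> int:
    """Bytes filled in but never used when only the faulting 4 KiB page
    of a fill is ever touched."""
    return fill_bytes - BASE_PAGE


if __name__ == "__main__":
    # Timings in seconds, copied from the hyperfine runs: stock vs 16K.
    timings = [("real", 20.707, 19.471),
               ("user", 17.222, 16.836),
               ("system", 3.376, 2.608)]
    for label, stock, big in timings:
        print(f"{label}: {relative_reduction(stock, big):.1%} lower")

    region = 1024 * 1024  # 1 MiB region, picked arbitrarily
    for fill in (4096, 8192, 16384):
        print(f"fill={fill}: {faults_to_populate(region, fill)} faults, "
              f"worst-case waste {worst_case_waste(fill)} bytes per fault")
```

In this model 8K fills halve the fault count and 16K fills quarter it, which is the best case. The measured drop to "less than half" with 16K sits between the two, and the worst-case waste with 8K fills is a single 4K page per fault.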

I think a good quality proposal is quite time consuming to produce and
I don't have the cycles. I also can't guarantee the mm overlords will
accept something like that. I can however point out that Google
experimented with 16KB pages for arm64 and got very promising results
(I have no idea if they switched to using them) -- I would start with
prodding those folk.

cheers
-- 
Mateusz Guzik