Date: Fri, 7 Feb 2025 11:35:40 -0800
From: Jörn Engel
To: Lorenzo Stoakes
Cc: Uday Shankar, Muchun Song, Andrew Morton, linux-mm@kvack.org,
    "Liam R. Howlett", Vlastimil Babka, Jann Horn, Oscar Salvador
Subject: Re: [bug report?] unintuitive behavior when mapping over
    hugepage-backed PROT_NONE regions

On Fri, Feb 07, 2025 at 01:12:33PM +0000, Lorenzo Stoakes wrote:
>
> So TL;DR is - aggregate operations failing means any or all of the
> operation failed, you can no longer rely on the mapping state being what
> you expected.

Coming back to the "what should the interface be?" question, I can see
three reasonable answers:
1. Failure should result in no change. We have a bug and will fix it.
2. Failure should result in no change. But fixing things is exceedingly
   hard and we may have to live with current reality for a long time.
3. Failure should result in undefined behavior.

I think you convincingly argue against the first answer. It might still
be useful to also argue against the third answer.

For background, I wrote a somewhat weird memory allocator in 2017,
called "big_allocate". The underlying problem is that your favorite
malloc tends to do a reasonable job for small to medium objects, but
eventually gives up and calls mmap()/munmap() for large objects. With a
heavily threaded process, the combination of mmap_sem and TLB shootdown
via IPI is a big performance-killer. The solution is a specialized
allocator for large objects that replaces the mmap()/munmap() calls.

The original (and still current) design of big_allocate has a mapping
structure somewhat similar to "struct page" in the kernel. It relies on
having a large chunk of virtual memory space that it directly controls,
so that it can have a simple 1:1 mapping between virtual memory and
"struct page".

To get a large chunk of virtual memory space, big_allocate does a
PROT_NONE mmap(). It later does a PROT_READ|PROT_WRITE mmap() over part
of that range to allocate memory, often combined with MAP_HUGETLB, for
obvious performance reasons. (Side note: I wish a PROT_RW shorthand
existed in the headers.)

If memory serves, big_allocate resulted in a 2-3% macrobenchmark
improvement.

Current big_allocate has a number of ugly warts I rather dislike.
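
For readers unfamiliar with the pattern, here is a minimal sketch of the
reserve-then-map scheme described above. It is not big_allocate's code:
the names arena_reserve(), arena_commit(), arena_release() and
arena_index(), and the 2 MiB CHUNK_SIZE, are assumptions made for
illustration. Re-mapping PROT_NONE after a failed mmap() is likewise
only one possible way to cope with the unknown mapping state discussed
in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumed chunk size: one metadata slot per 2 MiB of reserved space. */
#define CHUNK_SIZE ((size_t)2 << 20)

struct arena {
    char  *base;    /* CHUNK_SIZE-aligned start of the usable range */
    size_t nchunks; /* number of CHUNK_SIZE chunks in that range */
};

/* Reserve address space only: PROT_NONE, nothing is readable or writable. */
static int arena_reserve(struct arena *a, size_t nchunks)
{
    /* Over-reserve by one chunk so the base can be rounded up. */
    size_t len = (nchunks + 1) * CHUNK_SIZE;
    void *raw = mmap(NULL, len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (raw == MAP_FAILED)
        return -1;
    uintptr_t aligned = ((uintptr_t)raw + CHUNK_SIZE - 1) &
                        ~((uintptr_t)CHUNK_SIZE - 1);
    a->base = (char *)aligned;
    a->nchunks = nchunks;
    return 0;
}

/* 1:1 mapping from an address inside the arena to its metadata index. */
static size_t arena_index(const struct arena *a, const void *addr)
{
    size_t off = (size_t)((const char *)addr - a->base);
    assert(off < a->nchunks * CHUNK_SIZE);
    return off / CHUNK_SIZE;
}

/* Put chunks [first, first + count) back into the PROT_NONE state. */
static int arena_release(struct arena *a, size_t first, size_t count)
{
    assert(first + count <= a->nchunks);
    void *p = mmap(a->base + first * CHUNK_SIZE, count * CHUNK_SIZE,
                   PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
                   -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}

/*
 * Map usable memory over chunks [first, first + count).
 * extra_flags is 0 or, for example, MAP_HUGETLB.
 */
static void *arena_commit(struct arena *a, size_t first, size_t count,
                          int extra_flags)
{
    assert(first + count <= a->nchunks);
    void *p = mmap(a->base + first * CHUNK_SIZE, count * CHUNK_SIZE,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | extra_flags,
                   -1, 0);
    if (p == MAP_FAILED) {
        /*
         * Assumption: treat the range as being in an unknown state and
         * try to re-establish the reservation (best effort only).
         */
        (void)arena_release(a, first, count);
        return NULL;
    }
    return p;
}
```

A caller would call arena_reserve() once at startup, then
arena_commit() and arena_release() in place of mmap()/munmap() for
large objects, using arena_index() to locate per-chunk metadata. Note
that the recovery in arena_commit() cannot close the window in which
another thread's mmap() might land in the affected range.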
One of those warts is that you now have existing users that rely on
mmap() over existing PROT_NONE mappings working, at least under the
special set of conditions we care about.

I have some plans to rewrite big_allocate with a different design. But
for now we have existing code that may make your life harder than you
wished for.

Jörn

--
Without congressional action or a strong judicial precedent, I would
_strongly_ recommend against anyone trusting their private data to a
company with physical ties to the United States.
-- Ladar Levison