Subject: Re: [PATCH RFC] userfaultfd: introduce UFFDIO_COPY_MODE_YOUNG
From: Nadav Amit
Date: Wed, 15 Jun 2022 22:24:26 -0700
To: Peter Xu
Cc: Mike Rapoport, John Hubbard, David Hildenbrand, Linux MM, Mike Kravetz, Hugh Dickins, Andrew Morton, Axel Rasmussen

On Jun 15, 2022, at 1:56 PM, Peter Xu wrote:

> On Wed, Jun 15, 2022 at 12:42:13PM -0700, Nadav Amit wrote:
>>>> (3) also has the downside of stack-protector that would be added due to
>>>> stack-protector strong, which is not-that-bad, but I hate it.
>>>
>>> Any short (but further) explanations?
>>
>> Sure, but it might not be short.
>>
>> So one time Matthew Wilcox tried to convert some function arguments into a
>> struct that would hold the arguments and pass it as a single argument,
>> and he found out that the binary size actually increased. Since I like to
>> waste my time, I analyzed why.
>>
>> IIRC, there were two main reasons.
>>
>> The first one is that the kernel by default uses the “strong”
>> stack-protector. It means that not all functions would have a stack
>> protector, and actually only a few would. One main reason that you get
>> a stack protector is that you take the address of a local variable. So if
>> you have a local struct that holds the arguments, you get a stack protector,
>> and it does introduce slight overhead (~10ns IIRC). There may be some pragma
>> to prevent the stack protector, but clearly I will be shot if I used it.
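
A minimal sketch of the first point, not taken from any kernel code. The struct copy_args type and the do_copy_*/caller_* functions are hypothetical names made up for illustration. The only difference between the two callers is that one bundles its arguments into a local struct and passes its address, which is a reference to the local frame, i.e. the kind of thing -fstack-protector-strong looks for:

```c
#include <stddef.h>

/* Hypothetical argument bundle, for illustration only. */
struct copy_args {
	unsigned long dst;
	unsigned long src;
	size_t len;
};

/* Callee that takes its arguments separately. */
__attribute__((noinline))
static unsigned long do_copy_plain(unsigned long dst, unsigned long src,
				   size_t len)
{
	return dst + src + len;
}

/* Callee that takes the same arguments bundled behind a pointer. */
__attribute__((noinline))
static unsigned long do_copy_struct(const struct copy_args *args)
{
	return args->dst + args->src + args->len;
}

unsigned long caller_plain(unsigned long dst, unsigned long src, size_t len)
{
	/* No address of a local is taken in this function. */
	return do_copy_plain(dst, src, len);
}

unsigned long caller_struct(unsigned long dst, unsigned long src, size_t len)
{
	/* The address of this local struct is passed to the callee. */
	struct copy_args args = { .dst = dst, .src = src, .len = len };

	return do_copy_struct(&args);
}
```

Building this with something like `gcc -O2 -fstack-protector-strong -S` and comparing the two callers should show whether a canary check was added to caller_struct(); the result may vary with compiler version and inlining decisions.
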
>>
>> The second reason is that the compiler either reloads data from the struct
>> you use to hold the arguments or might spill it to the stack if you try to
>> cache it.
>>
>> Consider the following two scenarios:
>>
>> A. You access an argument multiple times:
>>
>>   local1 = args->arg1;
>>   another_fn(); // Or some store to the heap
>>   local2 = args->arg1;
>>
>>   // You use local1 and local2
>>
>> In this case the compiler would reload args->arg1 from memory, even if there
>> is a register that holds the value. The compiler is concerned that another_fn()
>> might have overwritten args->arg1 or - in the case of a store - that the store
>> overwrote the value. The reload might prevent further optimizations by the
>> compiler.
>>
>> B. You cache the argument locally (aka, you decided to be “smart”):
>>
>>   arg1 = args->arg1;
>>   local1 = arg1;
>>   another_fn();
>>   local2 = arg1;
>>
>> You may think that this prevents the reload. But this might be even worse.
>> The compiler might run out of registers, spill arg1 to the stack and then
>> access it from the stack. Or it might need to spill something else, or
>> shuffle registers around.
>>
>> So what can you do? You can mark another_fn() as pure if it is so, which
>> does help in some very specific cases. There are various limitations though.
>> IIRC, gcc (or is it clang?) ignores it for inline functions. So if you have an
>> inline function which does some write that you don’t care about, you cannot
>> force the compiler to ignore it.
>>
>> Note that at least gcc (IIRC) regards inline assembly as something that might
>> write to arbitrary memory addresses. So having BUG_ON() would require a reload
>> of the argument from the struct.
>
> Ah, I never knew that side of BUG_ON()..

I was wrong about this part. Actually BUG_ON() does not have this effect.
Sorry for misleading you.
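
For anyone who wants to look at the generated code, scenarios A and B can be written out as a small standalone file. This is only a sketch: struct fn_args, call_count, pure_fn() and the scenario_*() wrappers are hypothetical names added for illustration, and another_fn() is defined locally just so the file builds on its own (in the case discussed above it would be opaque to the compiler).

```c
/* Hypothetical argument bundle; field names are illustrative only. */
struct fn_args {
	unsigned long arg1;
	unsigned long arg2;
};

/* Counts calls so that another_fn() has a visible side effect. */
static unsigned long call_count;

/*
 * Stand-in for a function the compiler cannot see through. It is
 * defined here only so that the file links on its own.
 */
__attribute__((noinline)) void another_fn(void)
{
	call_count++;
}

/* Reads memory but writes nothing, so marking it pure is legitimate. */
__attribute__((pure, noinline))
unsigned long pure_fn(const struct fn_args *args)
{
	return args->arg2;
}

/* Scenario A: args->arg1 is read on both sides of the call. */
unsigned long scenario_a(const struct fn_args *args)
{
	unsigned long local1, local2;

	local1 = args->arg1;
	another_fn();
	local2 = args->arg1;	/* candidate for a reload from memory */

	return local1 + local2;
}

/* Scenario B: the value is cached in a local across the call. */
unsigned long scenario_b(const struct fn_args *args)
{
	unsigned long arg1, local1, local2;

	arg1 = args->arg1;
	local1 = arg1;
	another_fn();
	local2 = arg1;		/* arg1 must stay live across the call */

	return local1 + local2;
}

/* Same shape as A, but the intervening call is declared pure. */
unsigned long scenario_pure(const struct fn_args *args)
{
	unsigned long local1, local2, tmp;

	local1 = args->arg1;
	tmp = pure_fn(args);
	local2 = args->arg1;

	return local1 + local2 + tmp;
}
```

Comparing the assembly of the three functions (for example with `gcc -O2 -S`) is one way to check which loads and spills a given compiler actually emits; the outcome depends on the compiler, its version and the target.
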