From: Nadav Amit
To: Peter Xu
CC: Linux MM, Mike Kravetz, Hugh Dickins, Andrew Morton, Axel Rasmussen, David Hildenbrand, Mike Rapoport, Nadav Amit
Subject: Re: [PATCH v1 2/5] userfaultfd: introduce access-likely mode for common operations
Date: Tue, 12 Jul 2022 06:19:08 +0000
Message-ID: <5D85870C-CBDF-45F7-A3A5-5F889521BE41@vmware.com>
References: <20220622185038.71740-1-namit@vmware.com> <20220622185038.71740-3-namit@vmware.com>
In-Reply-To: <20220622185038.71740-3-namit@vmware.com>

On Jun 22, 2022, at 11:50 AM, Nadav Amit wrote:

> From: Nadav Amit
>
> Using a PTE on x86 with cleared access-bit (aka young-bit)
> takes ~600 cycles more than when the access bit is set. At the same
> time, setting the access-bit for memory that is not used (e.g.,
> prefetched) can introduce greater overheads, as the prefetched memory is
> reclaimed later than it should be.
>
> Userfaultfd currently does not set the access-bit (excluding the
> huge-pages case).
> Arguably, it is best to let the user control whether
> the access bit should be set or not. The expected use is to request
> userfaultfd to set the access-bit when the copy/wp operation is done to
> resolve a page-fault, and not to set the access-bit when the memory is
> prefetched.
>
> Introduce UFFDIO_[op]_ACCESS_LIKELY to enable userspace to request the
> young bit to be set.

I am replying to my own email, but this mostly addresses the concerns that
Peter has raised.

So I ran the test below on my Haswell (x86), which showed two things:

1. Accessing an address using a clean PTE or old PTE takes ~500 cycles more
than with dirty+young (depending on the access, of course: dirty does not
matter for read, dirty+young both matter for write).

2. I made a mistake in my implementation. PTEs are - at least on x86 -
created as young with mk_pte(). So the logic should be similar to
do_set_pte():

	if (prefault && arch_wants_old_prefaulted_pte())
		entry = pte_mkold(entry);
	else
		entry = pte_sw_mkyoung(entry);

Based on these results, I will send another version for both young and
dirty. Let me know if these results are not convincing.

I will add, as we discussed (well, I think I raised these things, so
hopefully you agree):

1. On x86, avoid flush if changing WP->RO and PTE is clean.

2. When write-unprotecting entry, if PTE is exclusive, set it as writable.
[ I considered not setting it as writable if write-hint is not provided, but
with the change in (1), it does not provide any real value.
]

---

#define _GNU_SOURCE
/* Header names were lost in the archived copy; these are the ones the code needs. */
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

#define errExit(msg)	do { perror(msg); exit(EXIT_FAILURE);	\
			} while (0)

static inline uint64_t rdtscp(void)
{
	uint64_t rax, rdx;
	uint32_t aux;

	asm volatile ("rdtscp" : "=a" (rax), "=d" (rdx), "=c" (aux):: "memory");

	/* RDTSCP returns the counter split across EDX:EAX; combine the halves. */
	return (rdx << 32) | rax;
}

int
main(int argc, char *argv[])
{
	long uffd;		/* userfaultfd file descriptor */
	char *addr;		/* Start of region handled by userfaultfd */
	unsigned long len;	/* Length of region handled by userfaultfd */
	pthread_t thr;		/* ID of thread that handles page faults */
	bool young, dirty, write;
	struct uffdio_api uffdio_api;
	struct uffdio_register uffdio_register;
	unsigned long l;	/* unsigned, to match len in the loops below */
	static char *page = NULL;
	struct uffdio_copy uffdio_copy;
	ssize_t nread;
	int page_size;

	if (argc != 5) {
		fprintf(stderr, "Usage: %s [num-pages] [write] [young] [dirty]\n", argv[0]);
		exit(EXIT_FAILURE);
	}

	page_size = sysconf(_SC_PAGE_SIZE);
	len = strtoul(argv[1], NULL, 0) * page_size;
	write = !!strtoul(argv[2], NULL, 0);
	young = !!strtoul(argv[3], NULL, 0);
	dirty = !!strtoul(argv[4], NULL, 0);

	/* Source page for UFFDIO_COPY */
	page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		errExit("mmap");

	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (uffd == -1)
		errExit("userfaultfd");

	uffdio_api.api = UFFD_API;
	uffdio_api.features = (1<<11); //UFFD_FEATURE_EXACT_ADDRESS;
	if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
		errExit("ioctl-UFFDIO_API");

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		errExit("mmap");

	uffdio_register.range.start = (unsigned long) addr;
	uffdio_register.range.len = len;
	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
		errExit("ioctl-UFFDIO_REGISTER");

	uffdio_copy.src = (unsigned long) page;
	uffdio_copy.mode = 0;
	/* Mode bits under test: young and dirty hints */
	if (young)
		uffdio_copy.mode |= (1ul << 2);
	if (dirty)
		uffdio_copy.mode |= (1ul << 3);
	uffdio_copy.len = page_size;
	uffdio_copy.copy = 0;

	/* Populate every page up front, so the timed loop takes no page faults */
	for (l = 0; l < len; l += page_size) {
		uffdio_copy.dst = (unsigned long)(&addr[l]);
		if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
			errExit("ioctl-UFFDIO_COPY");
	}

	/* Time the first access to each page and print the cycle count */
	for (l = 0; l < len; l += page_size) {
		char c;
		uint64_t start;

		start = rdtscp();
		if (write)
			addr[l] = 5;
		else
			c = *(volatile char *)(&addr[l]);
		printf("%lu\n", (unsigned long)(rdtscp() - start));
	}

	exit(EXIT_SUCCESS);
}
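
The test prints one cycle count per page, and individual samples can be noisy, so comparing a median across the young/dirty combinations may be easier than reading the raw output. A minimal sketch of such a helper follows; it is not part of the test above, and the names median_cycles() and cmp_u64() are made up for illustration:

```c
#include <stdint.h>
#include <stdlib.h>

/* qsort() comparator for uint64_t; compares instead of subtracting to avoid overflow. */
static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: sorts the n samples in place and returns the median
 * (the upper of the two middle elements when n is even). Returns 0 if n == 0.
 */
static uint64_t median_cycles(uint64_t *samples, size_t n)
{
	if (n == 0)
		return 0;

	qsort(samples, n, sizeof(*samples), cmp_u64);
	return samples[n / 2];
}
```

One way to use it would be to store the per-page deltas in an array instead of printing them, and print median_cycles() once per run.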