From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DCE9C636D7 for ; Tue, 21 Feb 2023 17:51:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D06C56B0071; Tue, 21 Feb 2023 12:51:20 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CB6CF6B0072; Tue, 21 Feb 2023 12:51:20 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B56F96B0073; Tue, 21 Feb 2023 12:51:20 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id A335A6B0071 for ; Tue, 21 Feb 2023 12:51:20 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 4E16E120695 for ; Tue, 21 Feb 2023 17:51:20 +0000 (UTC) X-FDA: 80492040720.07.CEECC14 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2074.outbound.protection.outlook.com [40.107.244.74]) by imf08.hostedemail.com (Postfix) with ESMTP id 7A2A4160008 for ; Tue, 21 Feb 2023 17:51:17 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=gLKYzJNG; spf=pass (imf08.hostedemail.com: domain of jgg@nvidia.com designates 40.107.244.74 as permitted sender) smtp.mailfrom=jgg@nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); dmarc=pass (policy=reject) header.from=nvidia.com ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1677001877; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nfsPA5ydF9JSAYHxDEO2XlmhJB5FFetsi9LZA1Lbr0A=; b=miniKOoUXqQsVOo2M0A9dOaBaRL2ZkNGr5tmWtlHR5aMu00maWh5ynZGAMmGcZt69P1WsX ARg31pVpO4Y6n/eMfioFPxgLXOUYO+i5WrMTKi49gaIWkntqa4Hnad7LfEkL4Tbb/rDjTU l8aVMn5vGwk+qu3l443w7Kju05+36xE= ARC-Authentication-Results: i=2; imf08.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=gLKYzJNG; spf=pass (imf08.hostedemail.com: domain of jgg@nvidia.com designates 40.107.244.74 as permitted sender) smtp.mailfrom=jgg@nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); dmarc=pass (policy=reject) header.from=nvidia.com ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1677001877; a=rsa-sha256; cv=pass; b=RUsJYcPROZlhjmgweNucyFq/+3HeXi60yUAkhjjmxPH9qKcxJCRq2GVJRI4uXzC1JCvzMY g1O32250gCALFWUudlljeP2xaIbUkk0k/pWuOF8Sb9PXvf98W+T28ccSestufr/GO+LQ1U Q/GXT3G48FHvQPlvFvZWtuLbfXR6KjI= ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=McMNpgzGNJdocKCpDAOLho/m9/k4biVjAh/0BHttlDfA78wi8N69XlfV7AExFZlFBU5nBXttlyQVKu+6Bit9KAuPKqU3fAwnB2H3VNrEaF69jfQ6sliHCBo74bjuidJjNoKQWohgU+fmJxpwKlxc6mPgHCQJV6bmLDZQKw2lHistdsoaKfQ0WV4SueSrp3zNun9wSFcrD5e4mwbkQOg8ukQddbjFqsL2ei7FAYaxX8oG1fVtGkDz4vAvZBsa8WRJju8KH0m5HAF7NpEMQeH1IVev1jraS16W525fm6uYvC5eGUMyNWLpU+LFXC/r0ttK1JgSS3bYxtnHUMLO51hcWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=nfsPA5ydF9JSAYHxDEO2XlmhJB5FFetsi9LZA1Lbr0A=; b=BWJxoUSZ6xCt74dr3klqsOb6waeDih1OeH8a7II/dyOcEVD2gtaQ5Sf8Pm3z8+Qey3LglloxCemyb6b1RHtVydinJMKmp/NRkxbnQQcEoRft+v5oaI5i4RoShf/lROcBaD6R42dCY+EqyOFSqWEBL2YHJ+goHxw+t3oWudxf39WLEUAPrdj3w4Jc7Tk48GmwQ2ICBqEcZUAH/pqXr9lHg9guqHQJYolD7DqMcx8StqgT6WFBUeiOH1TZ4K4BT5/bJ0Uwp69PE+vu9Lhmr8OKjLp9+4XIWzgNBbXHnO63HBAO8OwrSpKXAtZZsFAt9PIO23kFR/Q4ioKQSa/mKg1Ahg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=nfsPA5ydF9JSAYHxDEO2XlmhJB5FFetsi9LZA1Lbr0A=; b=gLKYzJNG53oPf6n9IcMdkdK58kNvECq+/1LpO9BTcXcB8r939w50XFMzMSshDC7dwiPGqbtvskfWAErXvW7JQhux3x21jH3D72Gp1Y3/wz36UQmQZfw5LD9/aOGAY5+Q3OKb93K2vS42gBS8SPlM2lDgi6Au2mfIRwgZKxdpNLN7mVtie6NnhkGgISfWvZmdWlo7l3YpyfqzJYx/v52oEvx1AEqITBNJw6Wz65DLnvkdqVTPRsosmINqCB+kKcSnZV/uHO7c4MFjGQHzxZ0/MdiqvB+d8Vid26ppJqpcrUEZacOFjDxoSTbB8354e+hQyoYbXDw8GtbJLK+e5iUY+Q== Received: from LV2PR12MB5869.namprd12.prod.outlook.com (2603:10b6:408:176::16) by SJ1PR12MB6242.namprd12.prod.outlook.com (2603:10b6:a03:457::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6111.21; Tue, 21 Feb 2023 17:51:14 +0000 Received: from LV2PR12MB5869.namprd12.prod.outlook.com ([fe80::3cb3:2fce:5c8f:82ee]) by LV2PR12MB5869.namprd12.prod.outlook.com ([fe80::3cb3:2fce:5c8f:82ee%4]) with mapi id 15.20.6111.021; Tue, 21 Feb 2023 17:51:14 +0000 Date: Tue, 21 Feb 2023 13:51:12 -0400 From: Jason Gunthorpe To: Tejun Heo Cc: Michal Hocko , Yosry Ahmed , Alistair Popple , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, jhubbard@nvidia.com, tjmercier@google.com, hannes@cmpxchg.org, surenb@google.com, mkoutny@suse.com, daniel@ffwll.ch, "Daniel P . Berrange" , Alex Williamson , Zefan Li , Andrew Morton Subject: Re: [PATCH 14/19] mm: Introduce a cgroup for pinned memory Message-ID: References: Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: MN2PR05CA0062.namprd05.prod.outlook.com (2603:10b6:208:236::31) To LV2PR12MB5869.namprd12.prod.outlook.com (2603:10b6:408:176::16) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: LV2PR12MB5869:EE_|SJ1PR12MB6242:EE_ X-MS-Office365-Filtering-Correlation-Id: 06a87e99-ebce-427d-cf13-08db14343993 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 8/sOPIqDSERu1SM0e/bOPA138NGTMv9AC7ZN/xtLZCkm8H0BsVJ3TLUUHcDodMo1JATZGPUr3MOHxdN8Q8pN5gDwY7pYlVnT189Vwdcorf5z61hRdx6gnJjh14YU0rt2vgGrD6hMra0R+bkIpwaxzbkabgfTw7fl5WZz/6uDCUCx1vMq2KiBo+E+e4FRU9gIoC0JNR1NMUO2o9BGY20jyNLt9HNwe8RHJe7EhFFxrLWVym+1GcZdLrInUVb1OC5axBfuizvaLUbpopjU7S2HfrMpxmrCnEF+UtHUaobx4mN0IshC1+fbsHucWnuSpVVoYsGCtESAoQHwHEIQJbgC/5rC2M9ZkjNuoAYQHDwhSzTHP6iDYrV33R1JJz3QZXrCS2z8RTPXygjxEFnz0ycbl7e/OKNMMhts6L+yEZhFX4uyuMp4RgEZoX1OyBNKHdhcrkdaI707Fl6REqINfrfR1pduwnNwTqzlBCfLqhkwfJhYkuQ7XUExTOFY7Eg8ntbCJN89BFvjO8cD3+TM3wz7Lc9a03uSVv0ooW9HkvUBDQGn0hsItandgSmJlJl46CWN4tH+yFvWrYdugQqoy48NusRDiUfvZ5uMcp530Vl6xm+qWOl7bGMP31co1dq9JKUoQFd0RBke8xrnTNb1D5omTg== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:LV2PR12MB5869.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230025)(4636009)(39860400002)(366004)(396003)(346002)(376002)(136003)(451199018)(83380400001)(66556008)(26005)(478600001)(316002)(4326008)(6916009)(66946007)(66476007)(6506007)(8676002)(186003)(6512007)(36756003)(86362001)(38100700002)(7416002)(6486002)(2906002)(41300700001)(8936002)(5660300002)(54906003)(2616005);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?lf4EcomXGO5nzT5kRqJS1/g3RIvOTcpsIzQrVvVqZ9iRE5fuLo+ITv0zhFGe?= =?us-ascii?Q?CKTb1ITjQxey/oxawmAdbBYejbA4c9YVuH9hD+nx5bA5Hk55v1XHPdnI+sJK?= =?us-ascii?Q?0hoNTnD/HAH3iuJs7vDnAxACJgLFYRlI2+98yanX9blZm/u2r8Eltrs1fMjc?= =?us-ascii?Q?JDhJVI9N9UZnDeGOCdwx8dOf8c2suErbGzw1YyxniBhqUtLRhF6hWOiuMsDS?= =?us-ascii?Q?NaUOKfTOnwbNaQwM/+GUlzaXUpgSezcmmx6zdEcX/Gfsyaa/SWI+0tE+HH07?= =?us-ascii?Q?byu7FldYiQdvMnJCwBMWQV0U3Mo0gPWofYXoBSWQp7f6uIvqf1kWjdvPCVD/?= =?us-ascii?Q?+qJ571hn0oMrFGgKlIdKuI+SGM2K62RdQAY3StuF/RfoVDqFUmN5vzEgIo3F?= =?us-ascii?Q?fgTAv2f5nijDJUEJY6kdRRCNmfvM0q8ATRx/4WPfyB5LalDga4pboJzsOF+D?= =?us-ascii?Q?Azt31uKDdkr6tG3JJErSQPymaD6xbLSJPtobXJ8pnoNcidDEZ2/RM4GWWnSR?= =?us-ascii?Q?v7EAnzQK5vBtu9/6EogB5d84+Z75m3YUhMpqjAAGW/f2L27kZvDIruUGyvTp?= =?us-ascii?Q?Q5h5tLXcBOUt1kpnRNVBOwgH9x5teZ9cr6p80wQIy8uPF//vXw5cy6TWIyTs?= =?us-ascii?Q?/Uy+pzI+XoKo8PwWgJYm221YnNl1DHGt2IV1K7K1ZRimO29apCw7Fi7Jq1T/?= =?us-ascii?Q?dzfmhkYAr331ZQYH3TvcHwxL/zGgUcpPYJOBNaGmTpqUIh9nt9kSaxcrTI74?= =?us-ascii?Q?Qv7Lox5inOz9/44Z7m4uQtnw8F7lqtzLAfH09YiklxVQFip51A6YnuEhnyOQ?= =?us-ascii?Q?lgWSCtPyNNKqx8i9M9IjSiaT8Vq4dgTyIcmmCWGWnYYRfrOr5AsyRw8yJMPr?= =?us-ascii?Q?+G79XDRDkB/GKh7GTlvl1ZfShBaUBa5tqXC8R5yCzqwBFRjclPcId7+fNYfE?= =?us-ascii?Q?Q4MkKTKzN/OXI4bfxTUdl7s0+Ra91th0yAXLpXlV+H/73RL0uob7ScJV3uoW?= =?us-ascii?Q?Oy6BAUPotMrZnFCDtEuyo3cNkEkpZ3qqZhTKdAXYR41uWn+vyZzBJLVkhb71?= =?us-ascii?Q?Eb/LydiX+fLMjvhtlY/3ZB7yxgvCetSB6c88fR25KFU+LK0Xmg/vGUzQ2c9J?= =?us-ascii?Q?85VkfbQ6xjwco2nKNng9SP9j2wDuwZmtt3ILG/EJyf60aOeHa05cNi8Uqutf?= =?us-ascii?Q?65Lp3x5dQCDitMDfGhTmRWQX25aasVTjaVp8+U2A4pm/AGqoTJvmgPX+UC7A?= =?us-ascii?Q?hWDXrQOwEmGNU6I08hrtiFFn4X6ypand9yuPTabdx8722oRB+Z+eijhNVQAg?= =?us-ascii?Q?WrUOZYho4Wj4/nUCcWfAl8XnxmfSdU0s0WxWCldO4TnKj0v9AkSU1uTd7oZg?= =?us-ascii?Q?PYYXBSXMDbDQCC3lpYPqVfk/pZknVabp2SP2K65ytgJ/MMaSGAHi5nu50yNc?= =?us-ascii?Q?DbEk9oBqLm61b5xsG0Htf16FKkvs272dlQd/IgKrz0hvRRlAIGENab+OP4Gs?= =?us-ascii?Q?tlfYRKha9Ueq/vtRbqMnNnGlT7VI5P8WG98u1XIy8UEz8L20QcwBMmuUmKma?= =?us-ascii?Q?t6rA06XF5mblAo2j5wmVTsX65ghVECucz4Z3C0R0?= X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 06a87e99-ebce-427d-cf13-08db14343993 X-MS-Exchange-CrossTenant-AuthSource: LV2PR12MB5869.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Feb 2023 17:51:14.1462 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: jJ68VzYnEnQBD9EPJBqsYYC6Wy2UvKvSUi0y6+6fxRl+68c9+nMigxUKu5T1TPKE X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ1PR12MB6242 X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 7A2A4160008 X-Stat-Signature: ibsusis3h4inhxh5mw7d1son7yp3mg38 X-HE-Tag: 1677001877-608435 X-HE-Meta: U2FsdGVkX1+2Zn5xfaHmPVPQwZUki2a9KFJZS5L4ehcQQ+sUAWMvWPN2Dgqy4cYBvE2osNQJxk6q9vn8TF4hKBJOhmK3E/evmEztFYAds+2FEVYsNONBXT4L8u0KWL1v0uuhMKn78TsGmqddVDO4dZeotwbw25Z9UIVh24qbfvDsU3hak5S/F3kDOuRfMHVOg/P9sr377zTlwWCe1ZLB5zQt3b81G4heLUQwxiKUuWJU67uGtgIxIhXdIDrrzEXpt87i3CpJjwtpXf1acfZyArsZPVwo+4GueX/Wj9Au5jcAHwjAOFC9QMtyOoQN6af6PNSGMdj6fh9h+cbmTIczdLgDk5nDvnqkn2SgtYwWbo+KRAJytNrJ5pITBcAoKBqk4+gJXz9Dccf89alN9zbOFUnASP0WMDizl1wdh17gCsy+tyY6l7X1II6C5FPcagfpkIzdW4sddcO2nUmNuZELUqlxh39mnxmNS29b/telcqWr3pvg5TKOyK4qtNHjMOK2ZVofjcb/khYia+Uv6J9lkrZiatWHuIMVFIE1TKRG4Dm8n7ryBkJ6/WVPbPzocKGxqjd0IWI/Jmvoq+iLPyeHr5d1F0l/XZSIsR/qjg9GlqcQRLmqq5jlL77JTKz94MFV46CwG1+xDIwER1fDLLeYhrcy4uibpJ0Cye2PwJto/RmBntdnlowmeVjFgChWGycJEGBgW0RWDaE8mlaorMqfVvQMwO3kT9w32HaXMmqS7/Ux+RvYngB/zDQ0FG/eWN1GeMR+2UGy+pmuREy61czMgRfHGrmIgGOYddo8xD5jsEHqBh7JvBG4m3F75zpyoBYPF15OtkhtN5DF5KwKoSx3qiIDA7dGLTEwqrufbUAqIcFcUOwQvqfqjuiAA5sRN8ylAOOPmST6Y+GLL4ktqWcwBDCozCc9iCf5S0kJ/2rcaOn1P75V01KXEZQD5sVOk9efbCjaFX40CFKYaxROjQ+ WgZICQVT sx4Y2Bliq/BJ4RSh7CNXIbRq1i7hTyMhqhpC7ZzLZ/VauT32FATi2QFpWdhaMT+1q/lRXja0gDscKrUySj+Z94GNsLzUgCwcrzbghFpprtsM/ycrYa5DHISiqJ5VnjqeAV2lRNqTdH7PxhhAgninfKclxCP78qNfB0t2TP/2EM9b2ISBevfEKVJGMJdkTgqsMdq9X+jF3H6FsAQSsp4S9IaRt4Lp52RV79WhowL271KxzLWU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Feb 21, 2023 at 07:29:18AM -1000, Tejun Heo wrote: > On Tue, Feb 21, 2023 at 01:25:59PM -0400, Jason Gunthorpe wrote: > > On Tue, Feb 21, 2023 at 06:51:48AM -1000, Tejun Heo wrote: > > > cgroup, right? It makes little sense to me to separate the owner of the > > > memory page and the pinner of it. They should be one and the same. > > > > The owner and pinner are not always the same entity or we could just > > use the page's cgroup. > > Yeah, so, what I'm trying to say is that that might be the source of the > problem. Is the current page ownership attribution correct It should be correct. This mechanism is driven by pin_user_page(), (as it is the only API that can actually create a pin) so the cgroup owner of the page is broadly related to the "owner" of the VMA's inode. The owner of the pin is the caller of pin_user_page(), which is initated by some FD/proces that is not necessarily related to the VMA's inode. Eg concretely, something like io_uring will do something like: buffer = mmap() <- Charge memcg for the pages fd = io_uring_setup(..) io_uring_register(fd,xx,buffer,..); <- Charge the pincg for the pin If mmap is a private anonymous VMA created by the same process then it is likely the pages will have the same cgroup as io_uring_register and the FD. Otherwise the page cgroup is unconstrained. MAP_SHARED mappings will have the page cgroup point at whatever cgroup was first to allocate the page for the VMA's inode. AFAIK there are few real use cases to establish a pin on MAP_SHARED mappings outside your cgroup. However, it is possible, the APIs allow it, and for security sandbox purposes we can't allow a process inside a cgroup to triger a charge on a different cgroup. That breaks the sandbox goal. If memcg could support multiple owners then it would be logical that the pinner would be one of the memcg owners. > for whatever reason is determining the pinning ownership or should the page > ownership be attributed the same way too? If they indeed need to differ, > that probably would need pretty strong justifications. It is inherent to how pin_user_pages() works. It is an API that establishs pins on existing pages. There is nothing about it that says who the page's memcg owner is. I don't think we can do anything about this without breaking things. Jason