Date: Thu, 16 Feb 2023 08:45:38 -0400
From: Jason Gunthorpe
To: Michal Hocko
Cc: Tejun Heo, Yosry Ahmed, Alistair Popple, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, jhubbard@nvidia.com,
 tjmercier@google.com, hannes@cmpxchg.org, surenb@google.com,
 mkoutny@suse.com, daniel@ffwll.ch, "Daniel P. Berrange", Alex Williamson,
 Zefan Li, Andrew Morton
Subject: Re: [PATCH 14/19] mm: Introduce a cgroup for pinned memory

On Thu, Feb 16, 2023 at 09:04:03AM +0100, Michal Hocko wrote:

> > In most cases the ownership traces back to a file descriptor. When the
> > file is closed the pin goes away.
>
> This assumes a specific use of {un}pin_user_page*, right? IIUC the
> cgroup charging is meant to be used from vm_account but that doesn't
> really tell anything about the lifetime nor the ownership. Maybe this is
> just a matter of documentation update...

Yes, documentation.

> > > The interface itself doesn't talk about
> > > anything like that and so it seems perfectly fine to unpin from a
> > > completely different context than pinning.
> >
> > Yes, conceivably the close of the FD can be in a totally different
> > process with a different cgroup.
>
> Wouldn't you get unbalanced charges then? How can an admin recover from
> that situation?

No, the approach in this patch series captures the cgroup that was
charged and stores it in the FD until uncharge.

This is the same as we do today for rlimit. The user/process that is
charged is captured and the uncharge always applies to the user/process
that was charged, not the user/process that happens to be associated
with the uncharging context.

cgroup just adds another option so it is user/process/cgroup that can
hold the charge.

It is conceptually similar to how each struct page has the memcg that
its allocation was charged to - we just record this in the FD, not the
page.

> > > Another thing that is not really clear to me is how the limit is
> > > actually going to be used in practice. As there is no concept of a
> > > reclaim for pins then I can imagine that it would be quite easy to
> > > reach the hard limit and essentially DoS any further use of pins.
> >
> > Yes, that is the purpose. It is to sandbox pin users to put some limit
> > on the effect they have on the full machine.
> >
> > It replaces the rlimit mess that was doing the same thing.
>
> arguably rlimit has a concept of the owner at least AFAICS. I do realize
> this is not really great wrt a high level resource control though.

rlimit uses either the user or the process as the "owner". In this model
we view a cgroup as the "owner".

The lifetime logic is all the same: you figure out the owner
(cgroup/user/process) when the charge is made and record it; when the
uncharge comes, the recorded owner is uncharged.

It never allows unbalanced charge/uncharge because that would be a
security problem even for rlimit cases today.

Jason
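
To illustrate the record-the-owner scheme described above, here is a
minimal userspace sketch in C. It is not code from the patch series:
struct owner, struct pin_account, pin_charge() and pin_uncharge() are
hypothetical names made up for this example, and the single-owner-per-
account rule is an assumption to keep it short. The point is only that
the owner is recorded at charge time and the uncharge goes to the
recorded owner, whatever context performs it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for whatever holds the charge (cgroup/user/process). */
struct owner {
	const char *name;
	unsigned long pinned;	/* pages currently charged to this owner */
	unsigned long limit;	/* hard limit; there is no reclaim for pins */
};

/* Hypothetical per-FD accounting state: remembers who was charged. */
struct pin_account {
	struct owner *charged;	/* recorded when the charge is made */
	unsigned long npages;	/* pages charged through this account */
};

/* Charge the owner of the pinning context and record it in the account. */
bool pin_charge(struct pin_account *acct, struct owner *cur,
		unsigned long npages)
{
	/* Assumption for this sketch: one recorded owner per account. */
	if (acct->charged && acct->charged != cur)
		return false;
	/* Hard limit: fail the pin rather than trying to reclaim. */
	if (npages > cur->limit - cur->pinned)
		return false;
	cur->pinned += npages;
	acct->charged = cur;
	acct->npages += npages;
	return true;
}

/* Uncharge the recorded owner; the caller's own context is never consulted. */
void pin_uncharge(struct pin_account *acct)
{
	if (!acct->charged)
		return;
	/* Balanced by construction: we only undo what this account charged. */
	assert(acct->charged->pinned >= acct->npages);
	acct->charged->pinned -= acct->npages;
	acct->charged = NULL;
	acct->npages = 0;
}
```

pin_uncharge() deliberately takes no "current owner" argument, so a
close of the FD from a task in a different cgroup cannot move the
uncharge to the wrong place or leave the original owner's count
unbalanced.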