Date: Thu, 23 May 2024 14:31:05 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Shakeel Butt
Cc: Kefeng Wang, Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, linux-mm@kvack.org, cgroups@vger.kernel.org, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes
Subject: Re: [PATCH] mm: memcontrol: remove page_memcg()
References: <20240521131556.142176-1-wangkefeng.wang@huawei.com>
On Tue, May 21, 2024 at 12:29:39PM -0700, Shakeel Butt wrote:
> On Tue, May 21, 2024 at 03:44:21PM +0100, Matthew Wilcox wrote:
> > The memcg should not be attached to the individual pages that make up a
> > vmalloc allocation.  Rather, it should be managed by the vmalloc
> > allocation itself.  I don't have the knowledge to poke around inside
> > vmalloc right now, but maybe somebody else could take that on.
>
> Are you concerned about accessing just memcg or any field of the
> sub-page? There are drivers accessing fields of pages allocated through
> vmalloc. Some details at 3b8000ae185c ("mm/vmalloc: huge vmalloc backing
> pages should be split rather than compound").
Thanks for the pointer, and fb_deferred_io_fault() is already on my hitlist for abusing struct page.

My primary concern is that we should track the entire allocation as a single object rather than tracking each page individually. That means assigning the vmalloc allocation to a memcg rather than assigning each page to a memcg. It's a lot less overhead to increment the counter once per allocation rather than once per page in the allocation!

But secondarily, yes, pages allocated by vmalloc probably don't need any per-page state, other than tracking the vmalloc allocation they're assigned to. We'll see how that theory turns out.