From: Shakeel Butt
To: Matthew Wilcox
Cc: Johannes Weiner, Andrew Morton, Christoph Hellwig, linux-mm@kvack.org, Michal Hocko, Roman Gushchin, Muchun Song, cgroups@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH 2/2] vmalloc: Account memcg per vmalloc
Date: Wed, 11 Dec 2024 12:58:36 -0800
Message-ID: <3bgedgrbu73dovgcy2keqjud6jafqxenceihtyre2hkego7oyb@opc5u53jef5a>
References: <20241211043252.3295947-1-willy@infradead.org> <20241211043252.3295947-2-willy@infradead.org> <20241211160956.GB3136251@cmpxchg.org>

On Wed, Dec 11, 2024 at 08:20:36PM +0000, Matthew Wilcox wrote:
> On Wed, Dec 11, 2024 at 11:32:13AM -0800, Shakeel Butt wrote:
> > On Wed, Dec 11, 2024 at 04:50:39PM +0000, Matthew Wilcox wrote:
> > > Perhaps you'd be more persuaded by:
> > >
> > > (a) If we clear __GFP_ACCOUNT then alloc_pages_bulk() will work, and
> > > that's a pretty significant performance win over calling alloc_pages()
> > > in a loop.
> > >
> > > (b) Once we get to memdescs, calling alloc_pages() with __GFP_ACCOUNT
> > > set is going to require allocating a memdesc to store the obj_cgroup
> > > in, so in the future we'll save an allocation.
> > >
> > > Your proposed alternative will work and is way less churn. But it's
> > > not preparing us for memdescs ;-)
> >
> > We can make alloc_pages_bulk() work with __GFP_ACCOUNT but your second
> > argument is more compelling.
> >
> > I am trying to think of what will we miss if we remove this per-page
> > memcg metadata. One thing I can think of is debugging a live system
> > or kdump where I need to track where a given page came from. I think
>
> Umm, I don't think you know which vmalloc allocation a page came from
> today? I've sent patches to add that information before, but they were
> rejected.

Do you have a link handy for that discussion?

> In fact, I don't think we know even _that_ a page belongs to
> vmalloc today, do we? Yes, we know that the page is accounted, and
> which memcg it belongs to ... but nothing more.

Yes, you are correct. At the moment it is guesswork and an exhaustive
search through multiple sources.

>
> I actually want to improve this, without adding additional overhead.
> What I'm working on right now (before I got waylaid by this bug) is:
>
> +struct choir {
> +	struct kref refcount;
> +	unsigned int nr;
> +	struct page *pages[] __counted_by(nr);
> +};
>
> and rewriting vmalloc to be based on choirs instead of its own pages.
> One thing I've come to realise today is that the obj_cgroup pointer
> needs to be in the choir and not in the vm_struct so that we uncharge the
> allocation when the choir refcount drops to 0, not when the allocation
> is unmapped.

What/who else can take a reference on a choir?

>
> A regular choir allocation will (today) mark the pages in it as being
> allocated to a choir (and thus not having their own refcount / mapcount),
> but I'll give vmalloc a way to mark the pages as specifically being
> from vmalloc.

This sounds good. Thanks for the awesome work.
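
To make the refcount point above concrete, here is a minimal userspace
sketch in C of why the obj_cgroup pointer would sit in the choir rather
than in the vm_struct. It is not the kernel implementation. The names
struct choir_model, struct obj_cgroup_model, choir_alloc(), choir_get()
and choir_put() are hypothetical, a plain counter stands in for struct
kref, and void * stands in for struct page *. The second reference
holder is assumed purely for illustration, since the thread leaves open
who else may take a reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's obj_cgroup: it only counts
 * how many pages are currently charged to it. */
struct obj_cgroup_model {
	long charged_pages;
};

/* Userspace model of the proposed choir (assumption: simplified). */
struct choir_model {
	unsigned int refcount;           /* models struct kref */
	struct obj_cgroup_model *objcg;  /* kept in the choir, not the vm_struct */
	unsigned int nr;
	void *pages[];                   /* models struct page *pages[] */
};

/* Allocate a choir of nr pages and charge them to objcg. */
struct choir_model *choir_alloc(struct obj_cgroup_model *objcg,
				unsigned int nr)
{
	struct choir_model *c;

	c = calloc(1, sizeof(*c) + nr * sizeof(c->pages[0]));
	if (!c)
		return NULL;
	c->refcount = 1;
	c->objcg = objcg;
	c->nr = nr;
	objcg->charged_pages += nr;
	return c;
}

/* Take an extra reference on a live choir. */
void choir_get(struct choir_model *c)
{
	assert(c->refcount > 0);
	c->refcount++;
}

/* Drop a reference. The uncharge happens only when the last reference
 * goes away. Returns 1 if the choir was freed, 0 otherwise. */
int choir_put(struct choir_model *c)
{
	assert(c->refcount > 0);
	if (--c->refcount)
		return 0;
	c->objcg->charged_pages -= c->nr;
	free(c);
	return 1;
}
```

If some other holder has called choir_get(), the choir_put() that
models the unmap leaves the charge in place, and only the final
choir_put() uncharges the pages and frees the choir.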