Date: Sun, 21 Jan 2024 18:56:16 -0500
From: Kent Overstreet
To: Pasha Tatashin
Cc: Vlastimil Babka, Suren Baghdasaryan, lsf-pc@lists.linux-foundation.org, linux-fsdevel, linux-mm
Subject: Re: [LSF/MM/BPF TOPIC] Memory profiling using code tagging
References: <115288f8-bd28-f01f-dd91-63015dcc635d@suse.cz>

On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> On Wed, May 10, 2023 at 12:28 PM Kent Overstreet wrote:
> >
> > On Tue, Mar 28, 2023 at 06:28:21PM +0200, Vlastimil Babka wrote:
> > > On 2/22/23 20:31, Suren Baghdasaryan wrote:
> > > > We would like to continue the discussion about code tagging use for
> > > > memory allocation profiling. The code tagging framework [1] and its
> > > > applications were posted as an RFC [2] and discussed at LPC 2022. It
> > > > has many applications proposed in the RFC but we would like to focus
> > > > on its application for memory profiling. It can be used as a
> > > > low-overhead solution to track memory leaks, rank memory consumers by
> > > > the amount of memory they use, identify memory allocation hot paths
> > > > and possible other use cases.
> > > > Kent Overstreet and I worked on simplifying the solution, minimizing
> > > > the overhead and implementing features requested during RFC review.
> > >
> > > IIRC one large objection was the use of page_ext, I don't recall if you
> > > found another solution to that?
> >
> > Hasn't been addressed yet, but we were just talking about moving the
> > codetag pointer from page_ext to page last night for memory overhead
> > reasons.
> >
> > The disadvantage then is that the memory overhead doesn't go down if you
> > disable memory allocation profiling at boot time...
> >
> > But perhaps the performance overhead is low enough now that this is not
> > something we expect to be doing as much?
> >
> > Choices, choices...
>
> I would like to participate in this discussion, specifically to
> discuss how to make this profiling applicable at the scale
> environment. Where we have many machines in the fleet, but the memory
> and performance overheads must be much smaller compared to what is
> currently proposed.
>
> There are several ideas that we can discuss:
> 1. Filtering files that are going to be tagged at the build time.
> For example, If a specific driver does not need to be tagged it can be
> filtered out during build time.

Not a bad idea - but do we have a concrete reason we want this? Our goal
has been low enough overhead to be enabled in production, and I think
we're delivering on that; perhaps we could wait and see if anyone
complains.

We've already got the runtime switch (via a static branch), so if
overhead is the concern, that should cover it.

> 2. Reducing the memory overhead by not using page_ext pointer, but
> instead use n-bits in the page->flags.
>
> The number of buckets is actually not that large, there is no need to
> keep 8-byte pointer in page_ext, it could be an idx in an array of a
> specific size. There could be buckets that contain several stacks.

A single tag index maps directly to the pointer it replaces, so we
should be able to do this.

> 3. Using static branches for performance optimizations, especially for
> the cases when profiling is disabled.

Already are :)

> 4. Optionally enable only a specific allocator profiling:
> kmalloc/pgalloc/vmalloc/pcp etc.

See above - I'd prefer to be restrained with the knobs we add.
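
The index-instead-of-pointer idea from point 2 can be sketched as a small
model. This is Python rather than kernel C, and it is only an illustration
under stated assumptions: TAG_BITS, TAG_SHIFT, CodeTag and TagTable are
hypothetical names, and the bit positions are made up and do not reflect
the real page->flags layout. It shows a few spare bits in a flags word
holding an index into a fixed-size tag array, so no 8-byte pointer is
needed per page.

```python
from dataclasses import dataclass

# Hypothetical layout: widths and positions are illustrative only,
# not the real page->flags layout.
TAG_BITS = 12                      # room for 4095 tags plus "no tag"
TAG_SHIFT = 52                     # assumed position of the spare bits
TAG_MASK = ((1 << TAG_BITS) - 1) << TAG_SHIFT
FLAGS_MASK = (1 << 64) - 1         # model flags as a 64-bit word


@dataclass(frozen=True)
class CodeTag:
    """Stand-in for one allocation call site."""
    filename: str
    lineno: int


class TagTable:
    """Bounded array of tags; index 0 is reserved for 'untagged'."""

    def __init__(self):
        self._tags = [None]        # slot 0 means "no tag"
        self._index = {}           # tag -> slot, so repeats reuse a slot

    def register(self, tag):
        """Return the index for tag, assigning a new slot on first use."""
        if tag in self._index:
            return self._index[tag]
        if len(self._tags) >= (1 << TAG_BITS):
            raise OverflowError("more call sites than TAG_BITS can index")
        self._tags.append(tag)
        self._index[tag] = len(self._tags) - 1
        return self._index[tag]

    def lookup(self, idx):
        """Map an index back to the tag it stands for."""
        return self._tags[idx]


def set_page_tag(flags, idx):
    """Store idx in the tag bits, leaving all other flag bits unchanged."""
    if not 0 <= idx < (1 << TAG_BITS):
        raise ValueError("tag index does not fit in TAG_BITS")
    return ((flags & ~TAG_MASK) | (idx << TAG_SHIFT)) & FLAGS_MASK


def get_page_tag(flags):
    """Extract the tag index from the flags word."""
    return (flags & TAG_MASK) >> TAG_SHIFT
```

In this model a lookup is get_page_tag() followed by TagTable.lookup(),
which is the "single tag index maps directly to the pointer it replaces"
property: the index costs TAG_BITS bits per page instead of a full
pointer, at the price of a hard cap on the number of distinct tags.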