Date: Mon, 24 Feb 2025 19:46:52 +0100
From: Mateusz Guzik
To: Shakeel Butt
Cc: Vlastimil Babka, lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, bpf, Christoph Lameter, David Rientjes, Hyeonggon Yoo <42.hyeyoo@gmail.com>, "Uladzislau Rezki (Sony)", Alexei Starovoitov
Subject: Re: [LSF/MM/BPF TOPIC] SLUB allocator, mainly the sheaves caching layer
References: <14422cf1-4a63-4115-87cb-92685e7dd91b@suse.cz>

On Mon, Feb 24, 2025 at 10:02:09AM -0800, Shakeel Butt wrote:
> What about pre-memcg-charged sheaves? We had to disable memcg charging
> of some kernel allocations and I think sheaves can help in reenabling
> it.

It has been several months since I last looked at memcg, so the details
are fuzzy and I don't have time to refresh everything.

However, if memory serves, the primary problem was the irq off/on trip
associated with charged allocations (sometimes happening twice, the
second time in refill_obj_stock()).

I think the real fix(tm) would recognize that only some allocations need
interrupt safety, as in some slabs should not be allowed to be used
outside of process context. This is somewhat what sheaves are doing, but
it can be applied without fronting the current kmem caching mechanism.
This may be a tough sell, and even then it means playing whack-a-mole
patching up all consumers.

Suppose that is not an option.

Then there are two approaches I considered.

The easiest splits memcg accounting into irq and process level, similar
to what the localtry thing is doing. This would only cost a preemption
off/on trip in the common case, plus a branch on the current state. But
suppose this is a no-go as well.
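
For illustration only, the irq/process split described above can be modeled in userspace C roughly as follows. This is a sketch, not kernel code: every name in it (stock_model, cpu_stock_model, model_in_irq, model_charge) is made up for the example, the "am I in irq context" check is a plain flag, and the model simply assumes that giving each context its own slot is what removes the need for the irq off/on trip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per-cpu charge cache. */
struct stock_model {
	uint32_t nr_bytes;
};

/* One slot per context, so irq code never touches the process slot. */
enum ctx_model { CTX_PROCESS, CTX_IRQ, CTX_MAX };

struct cpu_stock_model {
	struct stock_model slot[CTX_MAX];
};

/* Stand-in for asking "is this CPU currently in interrupt context?". */
static bool model_in_irq;

static void model_charge(struct cpu_stock_model *cpu, uint32_t bytes)
{
	/* The single branch on the current state. */
	enum ctx_model ctx = model_in_irq ? CTX_IRQ : CTX_PROCESS;

	/*
	 * Assumption of this model: in the kernel the update would sit
	 * inside a preemption off/on pair instead of an irq off/on pair,
	 * since the other context works on a different slot.
	 */
	cpu->slot[ctx].nr_bytes += bytes;
}

static uint32_t model_total(const struct cpu_stock_model *cpu)
{
	return cpu->slot[CTX_PROCESS].nr_bytes + cpu->slot[CTX_IRQ].nr_bytes;
}

/* Usage: a process-level charge followed by an irq-level one. */
static void model_split_demo(void)
{
	struct cpu_stock_model cpu = { { { 0 } } };

	model_in_irq = false;
	model_charge(&cpu, 100);
	model_in_irq = true;
	model_charge(&cpu, 28);
	model_in_irq = false;

	assert(cpu.slot[CTX_PROCESS].nr_bytes == 100);
	assert(cpu.slot[CTX_IRQ].nr_bytes == 28);
	assert(model_total(&cpu) == 128);
}
```

Calling model_split_demo() from a main() exercises both contexts. The price of the split is that anything wanting the full picture has to sum the slots, as model_total() does.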

My primary idea was to use hand-rolled sequence counters and a local
8-byte cmpxchg (*without* the lock prefix, and not to be confused with
the 16-byte one used by the current slub fast path). Should this work,
it would be significantly faster than irq trips.

The irq thing is there only so that several fields can be updated, or
the memcg itself replaced, atomically with respect to process vs
interrupt context. The observation is that all values which are getting
updated are 4 bytes. Then perhaps an additional counter can be added
next to each one, so that an 8-byte cmpxchg is going to fail should an
irq swoop in and change stuff from under us.

The percpu state would have a sequence counter associated with the
assigned memcg_stock_pcp. The memcg_stock_pcp object would have the same
value replicated inside for every var which can be updated in the fast
path. Then the fast path would only succeed if the value read off from
per-cpu did not change vs what's in the stock thing. Any change to
memcg_stock_pcp (e.g., rolling up bytes after passing the page size
threshold) would disable interrupts and modify all these counters.

There is some more work needed to make sure the stock obj can be safely
swapped out for a new one without accidentally having a value which
lines up with the previous one. I don't remember what I had for that
(and yes, I recognize a 4-byte value will invariably roll over and *in
principle* a conflict will be possible).

This is a rough outline since Vlasta keeps prodding me about it.

That said, maybe someone will have a better idea. The above is up for
grabs if someone wants to do it; I can't commit to looking at it.
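
To make the sequence-counter idea above a bit more concrete, here is a rough userspace model of a single fast-path variable. It is a sketch under stated assumptions, not the proposed kernel implementation: all names (seq_stock_model, pack_word, fast_snapshot, fast_commit, slow_update) are invented for the example, the layout of the 8-byte word is a guess, and C11 atomics stand in for the CPU-local cmpxchg. On x86 the C11 compare-exchange is emitted with the lock prefix, which is exactly what the real thing would avoid, presumably with something along the lines of this_cpu_cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: high 32 bits = sequence, low 32 bits = value. */
static uint64_t pack_word(uint32_t seq, uint32_t val)
{
	return ((uint64_t)seq << 32) | val;
}

static uint32_t word_seq(uint64_t w) { return (uint32_t)(w >> 32); }
static uint32_t word_val(uint64_t w) { return (uint32_t)w; }

/* Stand-in for one per-cpu stock with a single fast-path variable. */
struct seq_stock_model {
	uint32_t seq;              /* per-cpu sequence counter */
	_Atomic uint64_t nr_bytes; /* (seq copy, 4-byte value) as one unit */
};

static void seq_stock_init(struct seq_stock_model *s)
{
	s->seq = 0;
	atomic_init(&s->nr_bytes, pack_word(0, 0));
}

/* Fast path, step 1: read the pair and check its seq against per-cpu. */
static bool fast_snapshot(struct seq_stock_model *s, uint64_t *snap)
{
	uint64_t w = atomic_load_explicit(&s->nr_bytes, memory_order_relaxed);

	if (word_seq(w) != s->seq)
		return false;
	*snap = w;
	return true;
}

/* Fast path, step 2: one 8-byte cmpxchg; fails if the pair changed. */
static bool fast_commit(struct seq_stock_model *s, uint64_t snap,
			uint32_t bytes)
{
	uint64_t next = pack_word(word_seq(snap), word_val(snap) + bytes);

	return atomic_compare_exchange_strong_explicit(&s->nr_bytes, &snap,
			next, memory_order_relaxed, memory_order_relaxed);
}

/* Slow path (irqs would be disabled in the kernel): bump seq everywhere. */
static void slow_update(struct seq_stock_model *s, uint32_t new_val)
{
	s->seq++;
	atomic_store_explicit(&s->nr_bytes, pack_word(s->seq, new_val),
			      memory_order_relaxed);
}

static void seq_demo(void)
{
	struct seq_stock_model s;
	uint64_t snap;
	bool ok;

	seq_stock_init(&s);

	/* Undisturbed fast path succeeds. */
	ok = fast_snapshot(&s, &snap) && fast_commit(&s, snap, 64);
	assert(ok);
	assert(word_val(atomic_load(&s.nr_bytes)) == 64);

	/* An "irq" runs the slow path between snapshot and commit. */
	ok = fast_snapshot(&s, &snap);
	assert(ok);
	slow_update(&s, 64); /* same value, new sequence */
	ok = fast_commit(&s, snap, 8);
	assert(!ok);
	assert(word_val(atomic_load(&s.nr_bytes)) == 64);

	/* Retrying from a fresh snapshot succeeds. */
	ok = fast_snapshot(&s, &snap) && fast_commit(&s, snap, 8);
	assert(ok);
	assert(word_val(atomic_load(&s.nr_bytes)) == 72);
	(void)ok;
}
```

The middle case in seq_demo() is the interesting one: the slow path leaves the 4-byte value unchanged, yet the commit still fails because the sequence half of the word moved. The model does not address swapping the stock object for a new one or the 32-bit rollover, both of which the mail above leaves open.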