From: Suren Baghdasaryan
Date: Mon, 19 May 2025 09:42:50 -0700
Subject: Re: Memory allocation profiling warnings in memory bound systems
To: Johannes Weiner
Cc: Usama Arif, Shakeel Butt, Linux Memory Management List, kent.overstreet@linux.dev, vlad.wing@gmail.com
References: <17fab2d6-5a74-4573-bcc3-b75951508f0a@gmail.com> <20250519160846.GA773385@cmpxchg.org>
In-Reply-To: <20250519160846.GA773385@cmpxchg.org>

On Mon, May 19, 2025 at 9:08 AM Johannes Weiner wrote:
>
> On Mon, May 19, 2025 at 08:50:28AM -0700, Suren Baghdasaryan wrote:
> > On Mon, May 19, 2025 at 6:33 AM Usama Arif wrote:
> > >
> > >
> > > +cc Vlad
> > >
> > > On 19/05/2025 14:31, Usama Arif wrote:
> > > > Hi,
> > > >
> > > > We have started enabling memory allocation profiling (with kernel 6.13) in our fleet
> > > > and are seeing a large number of warnings (reported by Vlad Poenaru) due to failure
> > > > in allocation of slab object extensions on services that are memory bound. I have attached
> > > > one of the logs at the end.
> > > >
> > > > Does it make sense to change the slabobj_ext to be allocated via kvcalloc and also change
> > > > the WARN to WARN_ONCE (or maybe even pr_debug?) like the diff below? A large number of
> > > > prints for this in a short time may mask any real issues in the system during memory
> > > > pressure being reported in dmesg. I tried to see if there were any changes after 6.13
> > > > to this code but didn't find any, but thought will check before sending below as a patch.
> > > >
> > > > diff --git a/mm/slub.c b/mm/slub.c
> > > > index c2151c9fee22..4595ca190cd9 100644
> > > > --- a/mm/slub.c
> > > > +++ b/mm/slub.c
> > > > @@ -1961,7 +1961,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> > > >         gfp &= ~OBJCGS_CLEAR_MASK;
> > > >         /* Prevent recursive extension vector allocation */
> > > >         gfp |= __GFP_NO_OBJ_EXT;
> > > > -       vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> > > > +       vec = kvcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> >
> > Hi Usama,
> > Is the allocation larger than page size? IIUC, unless allocation size
> > is over PAGE_SIZE, kvcalloc_node() will not fall back to vmalloc (see:
> > https://elixir.bootlin.com/linux/v6.14.7/source/mm/util.c#L668). How
> > big is the allocation when it fails in your case?
>
> Digging through the reports, it appears we're encountering both. We've
> seen a zswap slab where the slab is order-0 and slabext is
> higher-order (8 byte objects, 512 objsperslab, 1 pageperslab), but
> also biovec-max where it's the other way round (4k byte objects, 8
> objsperslab, 8 pagesperslab).
>
> In the first case, vmalloc would help. In the second it wouldn't.

Ok, then I don't see any downside to changing to kvcalloc_node() here.
Let's do it.
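
For reference, a rough user-space sketch of the arithmetic behind the two cases above. It is not kernel code. It assumes 4 KiB pages and a 16-byte struct slabobj_ext (an objcg pointer plus a codetag reference on a 64-bit build with both memcg and allocation profiling enabled); neither number is stated in this thread. The helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumptions for illustration only; real values depend on arch and config. */
#define ASSUMED_PAGE_SIZE        4096u
#define ASSUMED_SLABOBJ_EXT_SIZE 16u

/* Size in bytes of the extension vector for a slab holding `objects` objects. */
size_t ext_vector_bytes(size_t objects)
{
	return objects * ASSUMED_SLABOBJ_EXT_SIZE;
}

/*
 * The kvmalloc family only considers the vmalloc fallback when the
 * request is larger than one page; smaller requests stay on the
 * kmalloc path and simply fail if that fails.
 */
bool may_fall_back_to_vmalloc(size_t bytes)
{
	return bytes > ASSUMED_PAGE_SIZE;
}
```

Under these assumptions the zswap cache (512 objects per slab) needs an 8192-byte vector, which is above one page, so the vmalloc fallback can apply. The biovec-max cache (8 objects per slab) needs only 128 bytes, so kvcalloc_node() behaves like kcalloc_node() there.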
>
> The second case is interesting. The higher-order slab succeeds because
> bios use a mempool; but the system is so depleted that the order-0 for
> the slabext fails.

I see.

>
> I'm not sure there is much we can do about this tbh. It would seem
> overkill to add a mempool or grant the tracking access to system-wide
> emergency reserves.

Yeah, with the system under so much memory pressure we probably have
bigger issues than extension vector allocation failures.

>
> A warn-once would probably make sense nonetheless.

Agree.

>
> It might also make sense to flag the line item for that callsite in
> the reporting file, to make it obvious that the counter is compromised
> and is missing allocations?

Good idea. We could output something like 'X' instead of the number if
the value is known to be invalid. I can look into it. Will also have
to raise the file version so that parsers can handle this change.
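
To make the two ideas above concrete, here is a minimal user-space sketch of warning once and printing 'X' for a counter known to be missing allocations. Everything in it is hypothetical: struct callsite_stat, the function names, and the column layout are invented for illustration and are not the kernel's data structures or the actual reporting file format.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-callsite record; the kernel's real structures differ. */
struct callsite_stat {
	const char *location;      /* illustrative "file:line" string */
	unsigned long long bytes;  /* bytes currently accounted */
	unsigned long long calls;  /* allocations currently accounted */
	bool inaccurate;           /* set once an allocation went untracked */
};

/*
 * Record that an extension vector allocation failed for this callsite.
 * The counter is flagged every time, but the warning is printed only on
 * the first failure. Returns true if this call emitted the warning.
 */
bool note_ext_alloc_failure(struct callsite_stat *st)
{
	static bool warned;

	st->inaccurate = true;
	if (warned)
		return false;
	warned = true;
	fprintf(stderr, "extension vector allocation failed; some counters are incomplete\n");
	return true;
}

/* Format one report line, printing 'X' in place of untrustworthy numbers. */
int format_stat_line(char *buf, size_t len, const struct callsite_stat *st)
{
	if (st->inaccurate)
		return snprintf(buf, len, "%12s %8s %s", "X", "X", st->location);
	return snprintf(buf, len, "%12llu %8llu %s",
			st->bytes, st->calls, st->location);
}
```

A parser that expects numbers in the first two columns would break on 'X', which is why a change along these lines would need the file version bump mentioned above.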