From: Roman Gushchin
To: Shakeel Butt
Cc: Michal Hocko, Christoph Lameter, Matthew Wilcox, linux-mm@kvack.org, cgroups@vger.kernel.org
Subject: Re: cgroups: warning for metadata allocation with GFP_NOFAIL (was Re: folio_alloc_buffers() doing allocations > order 1 with GFP_NOFAIL)
Date: Thu, 9 Nov 2023 09:36:26 -0800
References: <6b42243e-f197-600a-5d22-56bd728a5ad8@gentwo.org> <8f6d3d89-3632-01a8-80b8-6a788a4ba7a8@linux.com>

On Wed, Nov 08, 2023 at 10:37:00PM -0800, Shakeel Butt wrote:
> On Wed, Nov 8, 2023 at 2:33 AM Michal Hocko wrote:
> >
> > On Tue 07-11-23 10:05:24, Roman Gushchin wrote:
> > > On Mon, Nov 06, 2023 at 06:57:05PM -0800, Christoph Lameter wrote:
> > > > Right.. Well lets add the cgoup folks to this.
> > >
> > > Hello!
> > >
> > > I think it's the best thing we can do now. Thoughts?
> > >
> > > >From 5ed3e88f4f052b6ce8dbec0545dfc80eb7534a1a Mon Sep 17 00:00:00 2001
> > > From: Roman Gushchin
> > > Date: Tue, 7 Nov 2023 09:18:02 -0800
> > > Subject: [PATCH] mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
> > >
> > > Objcg vectors attached to slab pages to store slab object ownership
> > > information are allocated using gfp flags for the original slab
> > > allocation. Depending on slab page order and the size of slab objects,
> > > objcg vector can take several pages.
> > >
> > > If the original allocation was done with the __GFP_NOFAIL flag, it
> > > triggered a warning in the page allocation code. Indeed, order > 1
> > > pages should not been allocated with the __GFP_NOFAIL flag.
> > >
> > > Fix this by simple dropping the __GFP_NOFAIL flag when allocating
> > > the objcg vector. It effectively allows to skip the accounting of a
> > > single slab object under a heavy memory pressure.
> >
> > It would be really good to describe what happens if the memcg metadata
> > allocation fails. AFAICS both callers of memcg_alloc_slab_cgroups -
> > memcg_slab_post_alloc_hook and account_slab will simply skip the
> > accounting which is rather curious but probably tolerable (does this
> > allow to runaway from memcg limits). If that is intended then it should
> > be documented so that new users do not get it wrong. We do not want to
> > error ever propagate down to the allocator caller which doesn't expect
> > it.
>
> The memcg metadata allocation failure is a situation kind of similar
> to how we used to have per-memcg kmem caches for accounting slab
> memory. The first allocation from a memcg triggers kmem cache creation
> and lets the allocation pass through.
>
> >
> > Btw. if the large allocation is really necessary, which hasn't been
> > explained so far AFAIK, would vmalloc fallback be an option?
> >
>
> For this specific scenario, large allocation is kind of unexpected,
> like a large (multi-order) slab having tiny objects. Roman, do you
> know the slab settings where this failure occurs?

No, I hope Christoph will shed some light here.

> Anyways, I think kvmalloc is a better option. Most of the time we
> should have order 0 allocation here and for weird settings we fallback
> to vmalloc.

I'm not sure about kvmalloc, because it's not fast. I think the better
option would be to force the slab allocator to fall back to order-0 pages.
Theoretically, we don't even need to free and re-allocate slab objects:
we could just break the slab folio into pages and release all but the
first page.

But I'd like to learn more about the use case before committing any time
to this effort.

Thanks!
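
To put rough numbers on the point in the quoted patch description that an objcg vector can take several pages, here is a small userspace model. It is only a sketch: it assumes 4 KiB pages, one 8-byte pointer slot per slab object, and a made-up flag bit standing in for __GFP_NOFAIL. Every model_* and MODEL_* name is hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these values are illustrative assumptions. */
#define MODEL_PAGE_SIZE  4096UL
#define MODEL_PTR_SIZE   8UL    /* assumed 64-bit pointer per object slot */
#define MODEL_GFP_NOFAIL 0x1u   /* hypothetical bit, stands in for __GFP_NOFAIL */

/* Smallest order such that (MODEL_PAGE_SIZE << order) >= bytes. */
static unsigned int model_get_order(size_t bytes)
{
    unsigned int order = 0;

    while ((MODEL_PAGE_SIZE << order) < bytes)
        order++;
    return order;
}

/* Bytes needed for one ownership slot per object that fits in the slab. */
static size_t model_objcg_vec_bytes(unsigned int slab_order, size_t obj_size)
{
    size_t objects;

    assert(obj_size > 0);
    objects = (MODEL_PAGE_SIZE << slab_order) / obj_size;
    return objects * MODEL_PTR_SIZE;
}

/* Flags for the vector allocation: the slab's flags minus the NOFAIL bit. */
static unsigned int model_vec_gfp(unsigned int slab_gfp)
{
    return slab_gfp & ~MODEL_GFP_NOFAIL;
}
```

Under these assumptions, an order-3 slab of 8-byte objects holds 4096 objects, so the vector needs 32768 bytes, which is itself an order-3 allocation and therefore above the order-1 limit mentioned for __GFP_NOFAIL. An order-0 slab of 64-byte objects needs only 512 bytes, which matches the expectation that the common case stays within a single page.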