From: Linus Torvalds
Date: Fri, 1 Mar 2024 09:51:18 -0800
Subject: Re: [PATCH RFC 4/4] UNFINISHED mm, fs: use kmem_cache_charge() in path_openat()
To: Vlastimil Babka
Cc: Josh Poimboeuf, Jeff Layton, Chuck Lever, Kees Cook, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Alexander Viro, Christian Brauner, Jan Kara, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-fsdevel@vger.kernel.org
References: <20240301-slab-memcg-v1-0-359328a46596@suse.cz> <20240301-slab-memcg-v1-4-359328a46596@suse.cz>
In-Reply-To: <20240301-slab-memcg-v1-4-359328a46596@suse.cz>

On Fri, 1 Mar 2024 at 09:07, Vlastimil Babka wrote:
>
> This is just an example of using the kmem_cache_charge() API. I think
> it's placed in a place that's applicable for Linus's example [1]
> although he mentions do_dentry_open() - I have followed from strace()
> showing openat(2) to path_openat() doing the alloc_empty_file().

Thanks. This is not the right patch, but yes, patches 1-3 look very nice to me.

> The idea is that filp_cachep stops being SLAB_ACCOUNT. Allocations that
> want to be accounted immediately can use GFP_KERNEL_ACCOUNT. I did that
> in alloc_empty_file_noaccount() (despite the contradictory name but the
> noaccount refers to something else, right?) as IIUC it's about
> kernel-internal opens.

Yeah, the "noaccount" function is about not accounting it towards nr_files.

That said, I don't think it necessarily needs to do the memory
accounting either - it's literally for cases where we're never going
to install the file descriptor in any user space.

Your change to use GFP_KERNEL_ACCOUNT isn't exactly wrong, but I don't
think it's really the right thing either, because

> Why is this unfinished:
>
> - there are other callers of alloc_empty_file() which I didn't adjust so
>   they simply became memcg-unaccounted. I haven't investigated for which
>   ones it would make also sense to separate the allocation and accounting.
>   Maybe alloc_empty_file() would need to get a parameter to control
>   this.

Right. I think the natural and logical way to deal with this is to
just say "we account when we add the file to the fdtable".

IOW, just have fd_install() do it. That's the really natural point,
and also makes it very logical why alloc_empty_file_noaccount()
wouldn't need to do the GFP_KERNEL_ACCOUNT.

> - I don't know how to properly unwind the accounting failure case. It
>   seems like a new case because when we succeed the open, there's no
>   further error path at least in path_openat().

Yeah, let me think about this part. Because fd_install() is the right
point, but that too does not really allow for error handling.

Yes, we could close things and fail it, but it really is much too late
at this point.

What I *think* I'd want for this case is

 (a) allow the accounting to go over by a bit

 (b) make sure there's a cheap way to ask (before) about "did we go
     over the limit"

IOW, the accounting never needed to be byte-accurate to begin with,
and making it fail (cheaply and early) on the next file allocation is
fine.

Just make it really cheap. Can we do that?

For example, maybe don't bother with the whole "bytes and pages"
stuff. Just a simple "are we more than one page over?" kind of
question. Without the 'stock_lock' mess for sub-page bytes etc.

How would that look? Would it result in something that can be done
cheaply without locking and atomics and without excessive pointer
indirection through many levels of memcg data structures?

             Linus
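
For illustration only, the (a)/(b) scheme above could be modelled in
plain userspace C roughly as follows. This is a sketch under loud
assumptions, not kernel code: struct toy_cgroup, toy_over_limit(),
toy_charge_nofail() and toy_open() are all hypothetical names, each
file is pretended to cost one whole page, and the model is
single-threaded, so it says nothing about how to make the check cheap
across CPUs. It only shows the control flow: the limit check happens
early and can fail, while the charge at the install point cannot fail
and may overshoot.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg: page-granular usage and limit only. */
struct toy_cgroup {
	unsigned long usage_pages;	/* pages charged so far */
	unsigned long limit_pages;	/* configured limit */
};

/* (b) Cheap pre-check: "are we more than one page over?" */
static bool toy_over_limit(const struct toy_cgroup *cg)
{
	return cg->usage_pages > cg->limit_pages + 1;
}

/* (a) Charge that never fails; usage is allowed to go past the limit. */
static void toy_charge_nofail(struct toy_cgroup *cg, unsigned long nr_pages)
{
	assert(nr_pages > 0);
	cg->usage_pages += nr_pages;
}

/* Model of an open: fail early if already over, charge at install time. */
static int toy_open(struct toy_cgroup *cg)
{
	if (toy_over_limit(cg))
		return -ENOMEM;		/* early failure, before any allocation */

	/* ... allocation and the rest of the open would happen here ... */

	toy_charge_nofail(cg, 1);	/* install point: no error path needed */
	return 0;
}
```

With a limit of 2 pages, this model lets usage reach 4 before the next
open is refused, which is the "go over by a bit, fail on the next
allocation" behaviour described above.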