From: Vlastimil Babka
Date: Wed, 15 Jan 2025 17:07:07 +0100
Subject: Re: [PATCH bpf-next v5 4/7] memcg: Use trylock to access memcg stock_lock.
To: Alexei Starovoitov, bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
 peterz@infradead.org, bigeasy@linutronix.de, rostedt@goodmis.org,
 houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev,
 mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com,
 tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Message-ID: <0676a504-43dc-42a4-a215-040470539cb0@suse.cz>
In-Reply-To: <20250115021746.34691-5-alexei.starovoitov@gmail.com>
References: <20250115021746.34691-1-alexei.starovoitov@gmail.com>
 <20250115021746.34691-5-alexei.starovoitov@gmail.com>

On 1/15/25 03:17, Alexei Starovoitov wrote:
> From: Alexei Starovoitov
>
> Teach memcg to operate under trylock conditions when spinning locks
> cannot be used.
>
> local_trylock might fail and this would lead to charge cache bypass if
> the calling context doesn't allow spinning (gfpflags_allow_spinning).
> In those cases charge the memcg counter directly and fail early if
> that is not possible. This might cause a pre-mature charge failing
> but it will allow an opportunistic charging that is safe from
> try_alloc_pages path.
>
> Acked-by: Michal Hocko
> Signed-off-by: Alexei Starovoitov

Acked-by: Vlastimil Babka

> ---
>  mm/memcontrol.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7b3503d12aaf..e4c7049465e0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1756,7 +1756,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
>   *
>   * returns true if successful, false otherwise.
>   */
> -static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> +static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
> +			  gfp_t gfp_mask)
>  {
>  	struct memcg_stock_pcp *stock;
>  	unsigned int stock_pages;
> @@ -1766,7 +1767,11 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  	if (nr_pages > MEMCG_CHARGE_BATCH)
>  		return ret;
>
> -	local_lock_irqsave(&memcg_stock.stock_lock, flags);
> +	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> +		if (!gfpflags_allow_spinning(gfp_mask))
> +			return ret;
> +		local_lock_irqsave(&memcg_stock.stock_lock, flags);

The last line can practically only happen on RT, right? On non-RT, irqsave
means we could only fail the trylock from an NMI, and then we should have
gfp_flags that don't allow spinning.

So suppose we used local_trylock(), local_lock() and local_unlock() (no
_irqsave) instead, as I mentioned in reply to 3/7. The RT implementation
would be AFAICS the same. On !RT the trylock could now fail from an IRQ
context in addition to NMI context, but that should also have a gfp_mask
that does not allow spinning, so it should work fine.

It would however mean converting all users of the lock, i.e. also
consume_obj_stock() etc., but AFAIU that will be necessary anyway to have
opportunistic slab allocations?

> +	}
>
>  	stock = this_cpu_ptr(&memcg_stock);
>  	stock_pages = READ_ONCE(stock->nr_pages);
> @@ -1851,7 +1856,14 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  {
>  	unsigned long flags;
>
> -	local_lock_irqsave(&memcg_stock.stock_lock, flags);
> +	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> +		/*
> +		 * In case of unlikely failure to lock percpu stock_lock
> +		 * uncharge memcg directly.
> +		 */
> +		mem_cgroup_cancel_charge(memcg, nr_pages);
> +		return;
> +	}
>  	__refill_stock(memcg, nr_pages);
>  	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
>  }
> @@ -2196,9 +2208,13 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	unsigned long pflags;
>
>  retry:
> -	if (consume_stock(memcg, nr_pages))
> +	if (consume_stock(memcg, nr_pages, gfp_mask))
>  		return 0;
>
> +	if (!gfpflags_allow_spinning(gfp_mask))
> +		/* Avoid the refill and flush of the older stock */
> +		batch = nr_pages;
> +
>  	if (!do_memsw_account() ||
>  	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
>  		if (page_counter_try_charge(&memcg->memory, batch, &counter))
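
For anyone following along, the trylock-or-bypass flow discussed in this
thread can be modelled in user space. This is only a rough sketch, not the
kernel code: a pthread mutex stands in for the local lock, a plain bool
stands in for gfpflags_allow_spinning(gfp_mask), and every name here
(struct stock, struct counter, consume_stock_sketch(), charge_direct(),
try_charge_sketch(), CHARGE_BATCH) is hypothetical.

```c
/* User-space analogy only; none of these names are kernel API. */
#include <pthread.h>
#include <stdbool.h>

#define CHARGE_BATCH 64U	/* stand-in for MEMCG_CHARGE_BATCH */

struct stock {
	pthread_mutex_t lock;	/* stand-in for memcg_stock.stock_lock */
	unsigned int nr_pages;	/* pre-charged pages held in the cache */
};

struct counter {
	unsigned long usage;
	unsigned long limit;
};

/* Try to satisfy the charge from the cache. */
static bool consume_stock_sketch(struct stock *s, unsigned int nr_pages,
				 bool allow_spinning)
{
	bool ret = false;

	if (nr_pages > CHARGE_BATCH)
		return ret;

	if (pthread_mutex_trylock(&s->lock) != 0) {
		/* Lock is busy: bypass the cache if we must not wait. */
		if (!allow_spinning)
			return ret;
		pthread_mutex_lock(&s->lock);
	}

	if (s->nr_pages >= nr_pages) {
		s->nr_pages -= nr_pages;
		ret = true;
	}
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Charge the counter directly; fail early when over the limit. */
static bool charge_direct(struct counter *c, unsigned int nr_pages)
{
	if (c->usage + nr_pages > c->limit)
		return false;
	c->usage += nr_pages;
	return true;
}

/* Returns 0 on success, -1 if the charge does not fit. */
static int try_charge_sketch(struct stock *s, struct counter *c,
			     unsigned int nr_pages, bool allow_spinning)
{
	unsigned int batch = nr_pages > CHARGE_BATCH ? nr_pages : CHARGE_BATCH;

	if (consume_stock_sketch(s, nr_pages, allow_spinning))
		return 0;

	/* No surplus means no refill, so the lock is never needed below. */
	if (!allow_spinning)
		batch = nr_pages;

	if (!charge_direct(c, batch)) {
		if (batch == nr_pages || !charge_direct(c, nr_pages))
			return -1;
		batch = nr_pages;
	}

	if (batch > nr_pages) {
		/* Only reachable when waiting for the lock is allowed. */
		pthread_mutex_lock(&s->lock);
		s->nr_pages += batch - nr_pages;
		pthread_mutex_unlock(&s->lock);
	}
	return 0;
}
```

Holding the mutex in the calling thread and then calling
try_charge_sketch() with allow_spinning == false mimics the re-entrant
NMI/IRQ case: the trylock fails, the cache is skipped, and exactly
nr_pages is charged to the counter with no refill.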