Date: Wed, 11 Oct 2023 21:36:59 -0700 (PDT)
From: Hugh Dickins
To: Dave Chinner
cc: Hugh Dickins, Andrew Morton, Tim Chen, Dave Chinner,
    "Darrick J. Wong", Christian Brauner, Carlos Maiolino, Chuck Lever,
    Jan Kara, Matthew Wilcox, Johannes Weiner, Axel Rasmussen, Dennis Zhou,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH 8/8] shmem,percpu_counter: add _limited_add(fbc, limit, amount)
References: <2451f678-38b3-46c7-82fe-8eaf4d50a3a6@google.com>

On Mon, 9 Oct 2023, Dave Chinner wrote:
> On Thu, Oct 05, 2023 at 10:35:33PM -0700, Hugh Dickins wrote:
> > On Thu, 5 Oct 2023, Dave Chinner wrote:
> > >
> > > Hmmmm. IIUC, this only works for addition that approaches the limit
> > > from below?
> >
> > That's certainly how I was thinking about it, and what I need for tmpfs.
> > Precisely what its limitations (haha) are, I'll have to take care to
> > spell out.
> >
> > (IIRC - it's a while since I wrote it - it can be used for subtraction,
> > but goes the very slow way when it could go the fast way - uncompared
> > percpu_counter_sub() much better for that. You might be proposing that
> > a tweak could adjust it to going the fast way when coming down from the
> > "limit", but going the slow way as it approaches 0 - that would be neat,
> > but I've not yet looked into whether it's feasily done.)

Easily done once I'd looked at it from the right angle.

> >
> > >
> > > So if we are approaching the limit from above (i.e. add of a
> > > negative amount, limit is zero) then this code doesn't work the same
> > > as the open-coded compare+add operation would?
> >
> > To it and to me, a limit of 0 means nothing positive can be added
> > (and it immediately returns false for that case); and adding anything
> > negative would be an error since the positive would not have been allowed.
> >
> > Would a negative limit have any use?

There was no reason to exclude it, once I was thinking clearly about
the comparisons.

>
> I don't have any use for it, but the XFS case is decrementing free
> space to determine if ENOSPC has been hit. It's the opposite
> implementation to shmem, which increments used space to determine if
> ENOSPC is hit.

Right.

>
> > It's definitely not allowing all the possibilities that you could arrange
> > with a separate compare and add; whether it's ruling out some useful
> > possibilities to which it can easily be generalized, I'm not sure.
> >
> > Well worth a look - but it'll be easier for me to break it than get
> > it right, so I might just stick to adding some comments.
> >
> > I might find that actually I prefer your way round: getting slower
> > as approaching 0, without any need for specifying a limit?? That the
> > tmpfs case pushed it in this direction, when it's better reversed? Or
> > that might be an embarrassing delusion which I'll regret having mentioned.
>
> I think there's cases for both approaching an upper limit from
> below and a lower limit from above. Both are the same "compare and
> add" algorithm, just with minor logic differences...

Good, thanks, you've saved me: I was getting a bit fundamentalist there,
thinking to offer one simplest primitive from which anything could be
built. But when it came down to it, I had no enthusiasm for rewriting
tmpfs's used_blocks as free_blocks, just to avoid that limit argument.

>
> > > Hence I think this looks like a "add if result is less than"
> > > operation, which is distinct from the "add if result is greater
> > > than" operation that we use this same pattern for in XFS and ext4.
> > > Perhaps a better name is in order?
> >
> > The name still seems good to me, but a comment above it on its
> > assumptions/limitations well worth adding.
> >
> > I didn't find a percpu_counter_compare() in ext4, and haven't got
>
> Go search for EXT4_FREECLUSTERS_WATERMARK....

Ah, not a percpu_counter_compare() user, but doing its own thing.

>
> > far yet with understanding the XFS ones: tomorrow...
>
> XFS detects being near ENOSPC to change the batch update size so
> that when near ENOSPC the percpu counter always aggregates to the
> global sum on every modification. i.e. it becomes more accurate (but
> slower) near the ENOSPC threshold. Then if the result of the
> subtraction ends up being less than zero, it takes a lock (i.e. goes
> even slower!), undoes the subtraction that took it below zero, and
> determines if it can dip into the reserve pool or ENOSPC should be
> reported.
>
> Some of that could be optimised, but we need that external "lock and
> undo" mechanism to manage the reserve pool space atomically at
> ENOSPC...

Thanks for going above and beyond with the description; but I'll be
honest and admit that I only looked quickly, and did not reach any
conclusion as to whether such usage could or should be converted
to percpu_counter_limited_add() - which would never take any XFS
locks, of course, so might just end up doubling the slow work.

But absolutely I agree with you, and thank you for pointing out,
how stupidly useless percpu_counter_limited_add() was for decrementing -
it was nothing more than a slow way of doing percpu_counter_sub().

I'm about to send in a 9/8, extending it to be more useful: thanks.

Hugh
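
For readers following the thread, the "compare and add" in both
directions (adding towards an upper limit from below, subtracting
towards a lower limit from above) can be sketched in userspace C as
below. This is only an illustration under stated assumptions: it is
not the kernel's percpu_counter code and not the 9/8 patch. The names
toy_counter, toy_sum(), toy_limited_add(), NCPUS and BATCH are all
hypothetical, there is no locking or preemption handling, and the
"CPU" is simply passed in as an index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPUS 4   /* hypothetical number of CPUs in this toy model */
#define BATCH 32  /* hypothetical per-CPU batch size */

/* Toy stand-in for a per-CPU counter: single-threaded, no locking. */
struct toy_counter {
	int64_t count;         /* approximate global value */
	int32_t delta[NCPUS];  /* unflushed per-CPU deltas, |delta| < BATCH */
};

/* Slow path: exact value is the global count plus every per-CPU delta. */
static int64_t toy_sum(const struct toy_counter *c)
{
	int64_t sum = c->count;

	for (int i = 0; i < NCPUS; i++)
		sum += c->delta[i];
	return sum;
}

/*
 * Add amount only if the exact result stays on the allowed side of limit:
 *   amount > 0: result must be <= limit (approach from below);
 *   amount < 0: result must be >= limit (approach from above).
 * Returns false, leaving the counter unchanged, if that would be violated.
 */
static bool toy_limited_add(struct toy_counter *c, int cpu,
			    int64_t limit, int64_t amount)
{
	/* The exact sum differs from c->count by less than this much. */
	const int64_t slack = (int64_t)NCPUS * BATCH;
	int64_t approx, d;
	bool safe;

	assert(cpu >= 0 && cpu < NCPUS);
	if (amount == 0)
		return true;

	/* Fast path: far enough from the limit that no exact sum is needed. */
	approx = c->count + amount;
	safe = amount > 0 ? approx + slack <= limit
			  : approx - slack >= limit;
	if (!safe) {
		/* Slow path near the limit: compare against the exact sum. */
		int64_t exact = toy_sum(c) + amount;

		if (amount > 0 ? exact > limit : exact < limit)
			return false;
	}

	/* Apply to the per-CPU delta, folding into count once it is large. */
	d = c->delta[cpu] + amount;
	if (d >= BATCH || d <= -BATCH) {
		c->count += d;
		c->delta[cpu] = 0;
	} else {
		c->delta[cpu] = (int32_t)d;
	}
	return true;
}
```

In this sketch a tmpfs-style caller would pass a positive amount with
the block limit, while a free-space-style caller would pass a negative
amount with a limit of 0; both go the fast way when far from the limit
and the slow way only when close to it. It does not attempt the XFS
"lock and undo" handling of a reserve pool described above.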