linux-mm.kvack.org archive mirror
From: David Finkel <davidf@vimeo.com>
To: "Michal Koutný" <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>, Tejun Heo <tj@kernel.org>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	 core-services@vimeo.com, Jonathan Corbet <corbet@lwn.net>,
	Michal Hocko <mhocko@kernel.org>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	Shuah Khan <shuah@kernel.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Zefan Li <lizefan.x@bytedance.com>,
	cgroups@vger.kernel.org,  linux-doc@vger.kernel.org,
	linux-mm@kvack.org,  linux-kselftest@vger.kernel.org,
	Waiman Long <longman@redhat.com>
Subject: Re: [PATCH v5 1/2] mm, memcg: cg2 memory{.swap,}.peak write handlers
Date: Mon, 29 Jul 2024 09:34:23 -0400
Message-ID: <CAFUnj5O9bijcu6grPoFh0h7mTVAP-bajeJDq1-jtqWuaJbv8XQ@mail.gmail.com>
In-Reply-To: <5xlwzzz3gs4rk5df32kfh7fx5ftj3a4iwryqxdb4c3oniuehwk@d5kum5xr4uw6>

Hi Michal,

On Fri, Jul 26, 2024 at 10:16 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> Hello David.
>
> On Wed, Jul 24, 2024 at 12:19:41PM GMT, David Finkel <davidf@vimeo.com> wrote:
> > Writing a specific string to the memory.peak and memory.swap.peak
> > pseudo-files resets the high watermark to the current usage for
> > subsequent reads through that same fd.
>
> This is elegant and nice work! (Caught my attention, so a few nits below.)

Thanks!

You can thank Johannes for the algorithm.
>
> > --- a/include/linux/cgroup-defs.h
> > +++ b/include/linux/cgroup-defs.h
> > @@ -775,6 +775,11 @@ struct cgroup_subsys {
> >
> >  extern struct percpu_rw_semaphore cgroup_threadgroup_rwsem;
> >
> > +struct cgroup_of_peak {
> > +     long                    value;
>
> Wouldn't this better be unsigned like watermarks themselves?

Hmm, interesting question.
I originally made it signed to handle the special value of -1.
However, that's largely irrelevant, since I'm casting it to a u64
in the only place the value is handled.

I've switched this over now.

>
> > +     struct list_head        list;
> > +};
>
>
> > --- a/include/linux/page_counter.h
> > +++ b/include/linux/page_counter.h
> > @@ -26,6 +26,7 @@ struct page_counter {
> >       atomic_long_t children_low_usage;
> >
> >       unsigned long watermark;
> > +     unsigned long local_watermark;
>
> At first, I struggled understanding what the locality is (when the local
> value is actually in of_peak). IIUC, it's more about temporal position.
>
> I'd suggest a comment (if not a name) like:
>         /* latest reset watermark */
> > +     unsigned long local_watermark;

Yeah, I had a comment before that was a bit inaccurate, and was
advised to remove it instead of trying to fix it in a previous round.

I've added one that says "Latest cg2 reset watermark".

>
>
> > +
> > +     /* User wants global or local peak? */
> > +     if (fd_peak == -1UL)
>
> Here you use typed -1UL but not in other places. (Maybe define an
> explicit macro value ((unsigned long)-1)?)

Good idea!
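
For illustration, here is a minimal userspace sketch of what such a macro could look like. The name PEAK_UNSET and the helper effective_peak() are hypothetical and not taken from the patch; the sketch only shows the sentinel being defined once and compared consistently, not the exact logic of peak_show().

```c
#include <limits.h>

/* Hypothetical name: sentinel meaning "this fd has never reset the peak". */
#define PEAK_UNSET ((unsigned long)-1)

/* Converting -1 to unsigned long yields ULONG_MAX under C's conversion rules. */
_Static_assert(PEAK_UNSET == ULONG_MAX, "sentinel must be the maximum value");

/*
 * Illustrative helper (an assumption, not the patch's code): report the
 * per-fd peak if this fd has reset it, otherwise the global watermark.
 */
static unsigned long effective_peak(unsigned long fd_peak,
				    unsigned long global_watermark)
{
	if (fd_peak == PEAK_UNSET)
		return global_watermark;
	return fd_peak;
}
```

With a single named constant, every comparison uses the same type and value, so a bare -1 cannot accidentally be compared as a signed int somewhere else.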

>
> > +static ssize_t peak_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
> > +                       loff_t off, struct page_counter *pc,
> > +                       struct list_head *watchers)
> > +{
> ...
> > +     list_for_each_entry(peer_ctx, watchers, list)
> > +             if (usage > peer_ctx->value)
> > +                     peer_ctx->value = usage;
>
> The READ_ONCE() in peak_show() suggests it could be WRITE_ONCE() here.

Good point. I've sprinkled a few more READ_ONCE and WRITE_ONCE calls.
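
As a rough sketch of the pairing Michal describes, the update in the loop could be written as below. The two macros are simplified userspace stand-ins for the kernel's READ_ONCE() and WRITE_ONCE() (an assumption for the sake of a compilable example; they rely on the GNU C __typeof__ extension), and struct peak_ctx and raise_peak() are hypothetical names.

```c
/* Simplified stand-ins for the kernel macros: force a single volatile access. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for the per-fd peak context. */
struct peak_ctx {
	unsigned long value;
};

/* Raise the recorded value if the current usage exceeds it. */
static void raise_peak(struct peak_ctx *ctx, unsigned long usage)
{
	if (usage > READ_ONCE(ctx->value))
		WRITE_ONCE(ctx->value, usage);
}
```

The point of the pairing is that a lockless reader using READ_ONCE() should see stores that the compiler cannot tear or merge, which is what WRITE_ONCE() on the writer side is meant to provide.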

>
> > +
> > +     /* initial write, register watcher */
> > +     if (ofp->value == -1)
> > +             list_add(&ofp->list, watchers);
> > +
> > +     ofp->value = usage;
>
> Move the registration before iteration and drop the extra assignment?

My original reason was that I could avoid an extra list hop and conditional,
but at this point I see a few reasons to keep it separate:
 - We need to reset this value either way. If it has already been reset, it
   may not get reset by the loop.
 - Since these are now unsigned, -1 compares greater than everything,
   so it would need a special case (or an additional cast). (Assuming we're
   on a system that uses two's complement.)
 - I think it's a bit clearer this way.
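
The ordering argument above can be sketched in userspace roughly as follows. This is not the patch's code: the singly linked list, struct watcher, peak_reset() and PEAK_UNSET are all hypothetical simplifications. Note that C defines the conversion of -1 to an unsigned type as the type's maximum value, so the sentinel compares greater than any real usage regardless of the machine's integer representation.

```c
#include <stddef.h>

#define PEAK_UNSET ((unsigned long)-1)	/* hypothetical sentinel name */

struct watcher {			/* stand-in for struct cgroup_of_peak */
	unsigned long value;
	struct watcher *next;		/* simplified singly linked list */
};

/* Record a reset by 'self' at 'usage'; returns the (possibly new) list head. */
static struct watcher *peak_reset(struct watcher *head, struct watcher *self,
				  unsigned long usage)
{
	struct watcher *w;

	/* Raise every registered watcher that has recorded less than 'usage'. */
	for (w = head; w; w = w->next)
		if (usage > w->value)
			w->value = usage;

	/*
	 * First write on this fd: register it. Had this run before the loop,
	 * the sentinel (ULONG_MAX) would never compare below 'usage', so the
	 * loop would need a special case to skip or reset it.
	 */
	if (self->value == PEAK_UNSET) {
		self->next = head;
		head = self;
	}

	/* The loop only raises values, so the reset needs an explicit store. */
	self->value = usage;
	return head;
}
```

For example, if a watcher last reset at a usage of 300 and writes again when usage is 50, the loop leaves its value at 300, and only the final assignment brings it down to 50.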

>
> Thanks,
> Michal

Thanks for the review!

--
David Finkel
Senior Principal Software Engineer, Core Services

