From: Mina Almasry <almasrymina@google.com>
To: Marco Elver <elver@google.com>
Cc: Shakeel Butt <shakeelb@google.com>,
paulmck@kernel.org, Mike Kravetz <mike.kravetz@oracle.com>,
Muchun Song <songmuchun@bytedance.com>,
Andrew Morton <akpm@linux-foundation.org>,
Shuah Khan <shuah@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@suse.com>,
David Rientjes <rientjes@google.com>, Jue Wang <juew@google.com>,
Yang Yao <ygyao@google.com>, Joanna Li <joannali@google.com>,
Cannon Matthews <cannonmatthews@google.com>,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
kasan-dev@googlegroups.com
Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
Date: Tue, 16 Nov 2021 13:53:51 -0800 [thread overview]
Message-ID: <CAHS8izNH282748JeKeT_W6KC9G9=mJww4k9n5WrtoStDqTfQqA@mail.gmail.com> (raw)
In-Reply-To: <YZQnBoPqMGhtLxnJ@elver.google.com>
On Tue, Nov 16, 2021 at 1:48 PM Marco Elver <elver@google.com> wrote:
>
> On Tue, Nov 16, 2021 at 12:59PM -0800, Shakeel Butt wrote:
> > On Tue, Nov 16, 2021 at 12:48 PM Mina Almasry <almasrymina@google.com> wrote:
> [...]
> > > > Per above, probably unlikely, but allowed. WRITE_ONCE should prevent it,
> > > > and at least relieve you to not worry about it (and shift the burden to
> > > > WRITE_ONCE's implementation).
> > > >
> > >
> > > Thank you very much for the detailed response. I can add READ_ONCE()
> > > at the no-lock read site, that is no issue.
> > >
> > > However, for the writes that happen while holding the lock, the write
> > > is like so:
> > > + h_cg->nodeinfo[page_to_nid(page)]->usage[idx] += nr_pages;
> > >
> > > And like so:
> > > + h_cg->nodeinfo[page_to_nid(page)]->usage[idx] -= nr_pages;
> > >
> > > I.e. they are increments/decrements. Sorry if I missed it, but I can't
> > > find an INC_ONCE(), and it seems wrong to me to do something like:
> > >
> > > + WRITE_ONCE(h_cg->nodeinfo[page_to_nid(page)]->usage[idx],
> > > +            h_cg->nodeinfo[page_to_nid(page)]->usage[idx] + nr_pages);
>
> From what I gather there are no concurrent writers, right?
>
> WRITE_ONCE(a, a + X) is perfectly fine. What it says is that you can
> have concurrent readers here, but no concurrent writers (and KCSAN will
> still check that). Maybe we need a more convenient macro for this idiom
> one day..
>
> Though I think for something like
>
> h_cg->nodeinfo[page_to_nid(page)]->usage[idx] += nr_pages;
>
> it seems there wants to be a temporary long* so that you could write
> WRITE_ONCE(*usage, *usage + nr_pages) or something.
>
Ah, perfect. OK, I can do this, and maybe add a comment explaining that
we don't have concurrent writers.
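
Something like this on the charge side, maybe (just a sketch against
this patch; I'm assuming usage[] stays a plain unsigned long and that
the update still happens with the lock held):

    unsigned long *usage = &h_cg->nodeinfo[page_to_nid(page)]->usage[idx];

    /*
     * Updates are serialized by the lock held here; WRITE_ONCE()
     * only pairs with the lockless READ_ONCE() on the stat read
     * side.
     */
    WRITE_ONCE(*usage, *usage + nr_pages);

with the uncharge path doing the same thing with '- nr_pages'.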
> > > I know we're holding a lock anyway so there is no race, but to the
> > > casual reader this looks wrong, as there appears to be a race between
> > > the fetch of the value and the WRITE_ONCE(). What to do here? It seems
> > > to me the most reasonable thing to do is just READ_ONCE() and leave
> > > the write plain?
> > >
> > >
> >
> > How about atomic_long_t?
>
> That would work, of course, but if this is very hot path code it might
> be excessive given that you don't have concurrent writers.
>
> Looking at the patch in more detail, the counter is a stat counter that
> can be read from a stat file, correct? Hypothetically, what would happen
> if the reader of 'usage' reads approximate values?
>
> If the answer is "nothing", then this could classify as an entirely
> "benign" data race and you could only use data_race() on the reader and
> leave the writers unmarked using normal +=/-=. Check if it fits
> "Data-Racy Reads for Approximate Diagnostics" [1].
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt#n74
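
If I understand that option correctly, the writers would stay plain
+=/-= and only the stat-file reader gets marked, roughly like this
(a sketch only; 'nid' stands in for whatever node the show function is
currently printing):

    /* Unsynchronized read; an occasionally stale value is tolerated. */
    unsigned long val = data_race(h_cg->nodeinfo[nid]->usage[idx]);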
Thank you very much for your quick responses. I think if the usage file
returns a garbage/approximate value once in a while, people will notice,
and I can see it causing issues. I think it's worth doing it 'properly'
here. I'll upload another version with these changes.
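
Concretely, I'm thinking the lockless read site in the numa_stat
handler ends up looking something like this (again just a sketch;
'seq' and 'nid' are placeholders for the seq_file and the node being
printed, and the exact output format may still change in the next
version):

    /*
     * No lock is held here; this READ_ONCE() pairs with the
     * WRITE_ONCE() updates done under the lock on the
     * charge/uncharge paths.
     */
    seq_printf(seq, " N%d=%lu", nid,
               READ_ONCE(h_cg->nodeinfo[nid]->usage[idx]));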
Thread overview: 18+ messages
2021-11-11 1:50 Mina Almasry
2021-11-11 2:04 ` Shakeel Butt
2021-11-11 2:36 ` Muchun Song
2021-11-11 2:59 ` Shakeel Butt
2021-11-12 23:36 ` Mike Kravetz
2021-11-13 2:44 ` Muchun Song
2021-11-13 14:48 ` Mina Almasry
2021-11-13 19:15 ` Shakeel Butt
2021-11-14 13:43 ` Muchun Song
2021-11-15 18:22 ` Mike Kravetz
2021-11-15 18:55 ` Mina Almasry
2021-11-15 19:59 ` Shakeel Butt
2021-11-16 12:04 ` Marco Elver
2021-11-16 20:48 ` Mina Almasry
2021-11-16 20:59 ` Shakeel Butt
2021-11-16 21:47 ` Marco Elver
2021-11-16 21:53 ` Mina Almasry [this message]
2021-11-17 5:54 ` Muchun Song