From: Muchun Song <songmuchun@bytedance.com>
To: Shakeel Butt <shakeelb@google.com>,
Mina Almasry <almasrymina@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
Shuah Khan <shuah@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@suse.com>,
David Rientjes <rientjes@google.com>, Jue Wang <juew@google.com>,
Yang Yao <ygyao@google.com>, Joanna Li <joannali@google.com>,
Cannon Matthews <cannonmatthews@google.com>,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
Date: Sun, 14 Nov 2021 21:43:55 +0800 [thread overview]
Message-ID: <CAMZfGtWJGqbji3OexrGi-uuZ6_LzdUs0q9Vd66SwH93_nfLJLA@mail.gmail.com> (raw)
In-Reply-To: <CALvZod7VPD1rn6E9_1q6VzvXQeHDeE=zPRpr9dBcj5iGPTGKfA@mail.gmail.com>
On Sun, Nov 14, 2021 at 3:15 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Sat, Nov 13, 2021 at 6:48 AM Mina Almasry <almasrymina@google.com> wrote:
> >
> > On Fri, Nov 12, 2021 at 6:45 PM Muchun Song <songmuchun@bytedance.com> wrote:
> > >
> > > On Sat, Nov 13, 2021 at 7:36 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > > >
> > > > Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
> > > >
> > > > To: Muchun Song <songmuchun@bytedance.com>, Mina Almasry <almasrymina@google.com>
> > > >
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>, Shuah Khan <shuah@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>, Oscar Salvador <osalvador@suse.de>, Michal Hocko <mhocko@suse.com>, David Rientjes <rientjes@google.com>, Shakeel Butt <shakeelb@google.com>, Jue Wang <juew@google.com>, Yang Yao <ygyao@google.com>, Joanna Li <joannali@google.com>, Cannon Matthews <cannonmatthews@google.com>, Linux Memory Management List <linux-mm@kvack.org>, LKML <linux-kernel@vger.kernel.org>
> > > >
> > > > Bcc:
> > > >
> > > > -=-=-=-=-=-=-=-=-=# Don't remove this line #=-=-=-=-=-=-=-=-=-
> > > >
> > > > On 11/10/21 6:36 PM, Muchun Song wrote:
> > > >
> > > > > On Thu, Nov 11, 2021 at 9:50 AM Mina Almasry <almasrymina@google.com> wrote:
> > > >
> > > > >>
> > > >
> > > > >> +struct hugetlb_cgroup_per_node {
> > > >
> > > > >> + /* hugetlb usage in pages over all hstates. */
> > > >
> > > > >> + atomic_long_t usage[HUGE_MAX_HSTATE];
> > > >
> > > > >
> > > >
> > > > > Why do you use atomic? IIUC, 'usage' is always
> > > >
> > > > > increased/decreased under hugetlb_lock except
> > > >
> > > > > hugetlb_cgroup_read_numa_stat() which is always
> > > >
> > > > > reading it. So I think WRITE_ONCE/READ_ONCE
> > > >
> > > > > is enough.
> > > >
> > > >
> > > >
> > > > Thanks for continuing to work this, I was traveling and unable to
> > > >
> > > > comment.
> > >
> > > Have a good time.
> > >
> > > >
> > > >
> > > >
> > > > Unless I am missing something, I do not see a reason for WRITE_ONCE/READ_ONCE
> > >
> > > Because __hugetlb_cgroup_commit_charge and
> > > hugetlb_cgroup_read_numa_stat can run parallely,
> > > which meets the definition of data race. I believe
> > > KCSAN could report this race. I'm not strongly
> > > suggest using WRITE/READ_ONCE() here. But
> > > in theory it should be like this. Right?
> > >
> >
> > My understanding is that the (only) potential problem here is
> > read_numa_stat() reading an intermediate garbage value while
> > commit_charge() is happening concurrently. This will only happen on
> > archs where the writes to an unsigned long aren't atomic. On archs
> > where writes to an unsigned long are atomic, there is no race, because
> > read_numa_stat() will only ever read the value before the concurrent
> > write or after the concurrent write, both of which are valid. To cater
> > to archs where the writes to unsigned long aren't atomic, we need to
> > use an atomic data type.
> >
> > I'm not too familiar but my understanding from reading the
> > documentation is that WRITE_ONCE/READ_ONCE don't contribute anything
> > meaningful here:
> >
> > /*
> > * Prevent the compiler from merging or refetching reads or writes. The
> > * compiler is also forbidden from reordering successive instances of
> > * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
> > * particular ordering. One way to make the compiler aware of ordering is to
> > * put the two invocations of READ_ONCE or WRITE_ONCE in different C
> > * statements.
> > ...
> >
> > I can't see a reason why we care about the compiler merging or
> > refetching reads or writes here. As far as I can tell the problem is
> > atomicy of the write.
> >
>
> We have following options:
>
> 1) Use atomic type for usage.
> 2) Use "unsigned long" for usage along with WRITE_ONCE/READ_ONCE.
> 3) Use hugetlb_lock for hugetlb_cgroup_read_numa_stat as well.
>
> All options are valid but we would like to avoid (3).
>
> What if we use "unsigned long" type but without READ_ONCE/WRITE_ONCE.
> The potential issues with that are KCSAN will report this as race and
> possible garbage value on archs which do not support atomic writes to
> unsigned long.
At least I totally agree with you. Thanks for your detailed explanation.
>
> Shakeel
next prev parent reply other threads:[~2021-11-14 13:44 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-11 1:50 Mina Almasry
2021-11-11 2:04 ` Shakeel Butt
2021-11-11 2:36 ` Muchun Song
2021-11-11 2:59 ` Shakeel Butt
2021-11-12 23:36 ` Mike Kravetz
2021-11-13 2:44 ` Muchun Song
2021-11-13 14:48 ` Mina Almasry
2021-11-13 19:15 ` Shakeel Butt
2021-11-14 13:43 ` Muchun Song [this message]
2021-11-15 18:22 ` Mike Kravetz
2021-11-15 18:55 ` Mina Almasry
2021-11-15 19:59 ` Shakeel Butt
2021-11-16 12:04 ` Marco Elver
2021-11-16 20:48 ` Mina Almasry
2021-11-16 20:59 ` Shakeel Butt
2021-11-16 21:47 ` Marco Elver
2021-11-16 21:53 ` Mina Almasry
2021-11-17 5:54 ` Muchun Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMZfGtWJGqbji3OexrGi-uuZ6_LzdUs0q9Vd66SwH93_nfLJLA@mail.gmail.com \
--to=songmuchun@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=almasrymina@google.com \
--cc=cannonmatthews@google.com \
--cc=joannali@google.com \
--cc=juew@google.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=osalvador@suse.de \
--cc=rientjes@google.com \
--cc=shakeelb@google.com \
--cc=shuah@kernel.org \
--cc=ygyao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox