linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Muchun Song <songmuchun@bytedance.com>
To: Shakeel Butt <shakeelb@google.com>,
	Mina Almasry <almasrymina@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Shuah Khan <shuah@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>,
	 Oscar Salvador <osalvador@suse.de>,
	Michal Hocko <mhocko@suse.com>,
	 David Rientjes <rientjes@google.com>, Jue Wang <juew@google.com>,
	Yang Yao <ygyao@google.com>,  Joanna Li <joannali@google.com>,
	Cannon Matthews <cannonmatthews@google.com>,
	 Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
Date: Sun, 14 Nov 2021 21:43:55 +0800	[thread overview]
Message-ID: <CAMZfGtWJGqbji3OexrGi-uuZ6_LzdUs0q9Vd66SwH93_nfLJLA@mail.gmail.com> (raw)
In-Reply-To: <CALvZod7VPD1rn6E9_1q6VzvXQeHDeE=zPRpr9dBcj5iGPTGKfA@mail.gmail.com>

On Sun, Nov 14, 2021 at 3:15 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Sat, Nov 13, 2021 at 6:48 AM Mina Almasry <almasrymina@google.com> wrote:
> >
> > On Fri, Nov 12, 2021 at 6:45 PM Muchun Song <songmuchun@bytedance.com> wrote:
> > >
> > > On Sat, Nov 13, 2021 at 7:36 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > > >
> > > > Subject:   Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
> > > >
> > > > To:        Muchun Song <songmuchun@bytedance.com>, Mina Almasry <almasrymina@google.com>
> > > >
> > > > Cc:        Andrew Morton <akpm@linux-foundation.org>, Shuah Khan <shuah@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>, Oscar Salvador <osalvador@suse.de>, Michal Hocko <mhocko@suse.com>, David Rientjes <rientjes@google.com>, Shakeel Butt <shakeelb@google.com>, Jue Wang <juew@google.com>, Yang Yao <ygyao@google.com>, Joanna Li <joannali@google.com>, Cannon Matthews <cannonmatthews@google.com>, Linux Memory Management List <linux-mm@kvack.org>, LKML <linux-kernel@vger.kernel.org>
> > > >
> > > > Bcc:
> > > >
> > > > -=-=-=-=-=-=-=-=-=# Don't remove this line #=-=-=-=-=-=-=-=-=-
> > > >
> > > > On 11/10/21 6:36 PM, Muchun Song wrote:
> > > >
> > > > > On Thu, Nov 11, 2021 at 9:50 AM Mina Almasry <almasrymina@google.com> wrote:
> > > >
> > > > >>
> > > >
> > > > >> +struct hugetlb_cgroup_per_node {
> > > >
> > > > >> +       /* hugetlb usage in pages over all hstates. */
> > > >
> > > > >> +       atomic_long_t usage[HUGE_MAX_HSTATE];
> > > >
> > > > >
> > > >
> > > > > Why do you use atomic? IIUC, 'usage' is always
> > > >
> > > > > increased/decreased under hugetlb_lock except
> > > >
> > > > > hugetlb_cgroup_read_numa_stat() which is always
> > > >
> > > > > reading it. So I think WRITE_ONCE/READ_ONCE
> > > >
> > > > > is enough.
> > > >
> > > >
> > > >
> > > > Thanks for continuing to work this, I was traveling and unable to
> > > >
> > > > comment.
> > >
> > > Have a good time.
> > >
> > > >
> > > >
> > > >
> > > > Unless I am missing something, I do not see a reason for WRITE_ONCE/READ_ONCE
> > >
> > > Because __hugetlb_cgroup_commit_charge and
> > > hugetlb_cgroup_read_numa_stat can run parallely,
> > > which meets the definition of data race. I believe
> > > KCSAN could report this race. I'm not strongly
> > > suggest using WRITE/READ_ONCE() here. But
> > > in theory it should be like this. Right?
> > >
> >
> > My understanding is that the (only) potential problem here is
> > read_numa_stat() reading an intermediate garbage value while
> > commit_charge() is happening concurrently. This will only happen on
> > archs where the writes to an unsigned long aren't atomic. On archs
> > where writes to an unsigned long are atomic, there is no race, because
> > read_numa_stat() will only ever read the value before the concurrent
> > write or after the concurrent write, both of which are valid. To cater
> > to archs where the writes to unsigned long aren't atomic, we need to
> > use an atomic data type.
> >
> > I'm not too familiar but my understanding from reading the
> > documentation is that WRITE_ONCE/READ_ONCE don't contribute anything
> > meaningful here:
> >
> > /*
> > * Prevent the compiler from merging or refetching reads or writes. The
> > * compiler is also forbidden from reordering successive instances of
> > * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
> > * particular ordering. One way to make the compiler aware of ordering is to
> > * put the two invocations of READ_ONCE or WRITE_ONCE in different C
> > * statements.
> > ...
> >
> > I can't see a reason why we care about the compiler merging or
> > refetching reads or writes here. As far as I can tell the problem is
> > atomicy of the write.
> >
>
> We have following options:
>
> 1) Use atomic type for usage.
> 2) Use "unsigned long" for usage along with WRITE_ONCE/READ_ONCE.
> 3) Use hugetlb_lock for hugetlb_cgroup_read_numa_stat as well.
>
> All options are valid but we would like to avoid (3).
>
> What if we use "unsigned long" type but without READ_ONCE/WRITE_ONCE.
> The potential issues with that are KCSAN will report this as race and
> possible garbage value on archs which do not support atomic writes to
> unsigned long.

At least I totally agree with you. Thanks for your detailed explanation.

>
> Shakeel


  reply	other threads:[~2021-11-14 13:44 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-11  1:50 Mina Almasry
2021-11-11  2:04 ` Shakeel Butt
2021-11-11  2:36 ` Muchun Song
2021-11-11  2:59   ` Shakeel Butt
2021-11-12 23:36   ` Mike Kravetz
2021-11-13  2:44     ` Muchun Song
2021-11-13 14:48       ` Mina Almasry
2021-11-13 19:15         ` Shakeel Butt
2021-11-14 13:43           ` Muchun Song [this message]
2021-11-15 18:22             ` Mike Kravetz
2021-11-15 18:55               ` Mina Almasry
2021-11-15 19:59                 ` Shakeel Butt
2021-11-16 12:04                   ` Marco Elver
2021-11-16 20:48                     ` Mina Almasry
2021-11-16 20:59                       ` Shakeel Butt
2021-11-16 21:47                         ` Marco Elver
2021-11-16 21:53                           ` Mina Almasry
2021-11-17  5:54                       ` Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMZfGtWJGqbji3OexrGi-uuZ6_LzdUs0q9Vd66SwH93_nfLJLA@mail.gmail.com \
    --to=songmuchun@bytedance.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=cannonmatthews@google.com \
    --cc=joannali@google.com \
    --cc=juew@google.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=osalvador@suse.de \
    --cc=rientjes@google.com \
    --cc=shakeelb@google.com \
    --cc=shuah@kernel.org \
    --cc=ygyao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox