linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shakeel Butt <shakeelb@google.com>
To: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>,  Miaohe Lin <linmiaohe@huawei.com>,
	Oscar Salvador <osalvador@suse.de>,
	Michal Hocko <mhocko@suse.com>,
	 David Rientjes <rientjes@google.com>, Jue Wang <juew@google.com>,
	Yang Yao <ygyao@google.com>,  Joanna Li <joannali@google.com>,
	Cannon Matthews <cannonmatthews@google.com>,
	 Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
Date: Sat, 13 Nov 2021 11:15:02 -0800	[thread overview]
Message-ID: <CALvZod7VPD1rn6E9_1q6VzvXQeHDeE=zPRpr9dBcj5iGPTGKfA@mail.gmail.com> (raw)
In-Reply-To: <CAHS8izPjJRf50yAtB0iZmVBi1LNKVHGmLb6ayx7U2+j8fzSgJA@mail.gmail.com>

On Sat, Nov 13, 2021 at 6:48 AM Mina Almasry <almasrymina@google.com> wrote:
>
> On Fri, Nov 12, 2021 at 6:45 PM Muchun Song <songmuchun@bytedance.com> wrote:
> >
> > On Sat, Nov 13, 2021 at 7:36 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > >
> > > Subject:   Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file
> > >
> > > To:        Muchun Song <songmuchun@bytedance.com>, Mina Almasry <almasrymina@google.com>
> > >
> > > Cc:        Andrew Morton <akpm@linux-foundation.org>, Shuah Khan <shuah@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>, Oscar Salvador <osalvador@suse.de>, Michal Hocko <mhocko@suse.com>, David Rientjes <rientjes@google.com>, Shakeel Butt <shakeelb@google.com>, Jue Wang <juew@google.com>, Yang Yao <ygyao@google.com>, Joanna Li <joannali@google.com>, Cannon Matthews <cannonmatthews@google.com>, Linux Memory Management List <linux-mm@kvack.org>, LKML <linux-kernel@vger.kernel.org>
> > >
> > > Bcc:
> > >
> > > -=-=-=-=-=-=-=-=-=# Don't remove this line #=-=-=-=-=-=-=-=-=-
> > >
> > > On 11/10/21 6:36 PM, Muchun Song wrote:
> > >
> > > > On Thu, Nov 11, 2021 at 9:50 AM Mina Almasry <almasrymina@google.com> wrote:
> > >
> > > >>
> > >
> > > >> +struct hugetlb_cgroup_per_node {
> > >
> > > >> +       /* hugetlb usage in pages over all hstates. */
> > >
> > > >> +       atomic_long_t usage[HUGE_MAX_HSTATE];
> > >
> > > >
> > >
> > > > Why do you use atomic? IIUC, 'usage' is always
> > >
> > > > increased/decreased under hugetlb_lock except
> > >
> > > > hugetlb_cgroup_read_numa_stat() which is always
> > >
> > > > reading it. So I think WRITE_ONCE/READ_ONCE
> > >
> > > > is enough.
> > >
> > >
> > >
> > > Thanks for continuing to work this, I was traveling and unable to
> > >
> > > comment.
> >
> > Have a good time.
> >
> > >
> > >
> > >
> > > Unless I am missing something, I do not see a reason for WRITE_ONCE/READ_ONCE
> >
> > Because __hugetlb_cgroup_commit_charge and
> > hugetlb_cgroup_read_numa_stat can run parallely,
> > which meets the definition of data race. I believe
> > KCSAN could report this race. I'm not strongly
> > suggest using WRITE/READ_ONCE() here. But
> > in theory it should be like this. Right?
> >
>
> My understanding is that the (only) potential problem here is
> read_numa_stat() reading an intermediate garbage value while
> commit_charge() is happening concurrently. This will only happen on
> archs where the writes to an unsigned long aren't atomic. On archs
> where writes to an unsigned long are atomic, there is no race, because
> read_numa_stat() will only ever read the value before the concurrent
> write or after the concurrent write, both of which are valid. To cater
> to archs where the writes to unsigned long aren't atomic, we need to
> use an atomic data type.
>
> I'm not too familiar but my understanding from reading the
> documentation is that WRITE_ONCE/READ_ONCE don't contribute anything
> meaningful here:
>
> /*
> * Prevent the compiler from merging or refetching reads or writes. The
> * compiler is also forbidden from reordering successive instances of
> * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
> * particular ordering. One way to make the compiler aware of ordering is to
> * put the two invocations of READ_ONCE or WRITE_ONCE in different C
> * statements.
> ...
>
> I can't see a reason why we care about the compiler merging or
> refetching reads or writes here. As far as I can tell the problem is
> atomicy of the write.
>

We have following options:

1) Use atomic type for usage.
2) Use "unsigned long" for usage along with WRITE_ONCE/READ_ONCE.
3) Use hugetlb_lock for hugetlb_cgroup_read_numa_stat as well.

All options are valid but we would like to avoid (3).

What if we use "unsigned long" type but without READ_ONCE/WRITE_ONCE.
The potential issues with that are KCSAN will report this as race and
possible garbage value on archs which do not support atomic writes to
unsigned long.

Shakeel


  reply	other threads:[~2021-11-13 19:15 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-11  1:50 Mina Almasry
2021-11-11  2:04 ` Shakeel Butt
2021-11-11  2:36 ` Muchun Song
2021-11-11  2:59   ` Shakeel Butt
2021-11-12 23:36   ` Mike Kravetz
2021-11-13  2:44     ` Muchun Song
2021-11-13 14:48       ` Mina Almasry
2021-11-13 19:15         ` Shakeel Butt [this message]
2021-11-14 13:43           ` Muchun Song
2021-11-15 18:22             ` Mike Kravetz
2021-11-15 18:55               ` Mina Almasry
2021-11-15 19:59                 ` Shakeel Butt
2021-11-16 12:04                   ` Marco Elver
2021-11-16 20:48                     ` Mina Almasry
2021-11-16 20:59                       ` Shakeel Butt
2021-11-16 21:47                         ` Marco Elver
2021-11-16 21:53                           ` Mina Almasry
2021-11-17  5:54                       ` Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALvZod7VPD1rn6E9_1q6VzvXQeHDeE=zPRpr9dBcj5iGPTGKfA@mail.gmail.com' \
    --to=shakeelb@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=cannonmatthews@google.com \
    --cc=joannali@google.com \
    --cc=juew@google.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=osalvador@suse.de \
    --cc=rientjes@google.com \
    --cc=shuah@kernel.org \
    --cc=songmuchun@bytedance.com \
    --cc=ygyao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox