From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4509C433FE for ; Sun, 14 Nov 2021 13:44:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 521366112E for ; Sun, 14 Nov 2021 13:44:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 521366112E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7379C6B007B; Sun, 14 Nov 2021 08:44:36 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6E5406B007D; Sun, 14 Nov 2021 08:44:36 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5D4016B007E; Sun, 14 Nov 2021 08:44:36 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0113.hostedemail.com [216.40.44.113]) by kanga.kvack.org (Postfix) with ESMTP id 4C2646B007B for ; Sun, 14 Nov 2021 08:44:36 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id F3EBF8249980 for ; Sun, 14 Nov 2021 13:44:35 +0000 (UTC) X-FDA: 78807655752.30.769A6FE Received: from mail-yb1-f177.google.com (mail-yb1-f177.google.com [209.85.219.177]) by imf11.hostedemail.com (Postfix) with ESMTP id 63BBCF000209 for ; Sun, 14 Nov 2021 13:44:34 +0000 (UTC) Received: by mail-yb1-f177.google.com with SMTP id q74so38356918ybq.11 for ; Sun, 14 Nov 2021 05:44:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=XiL4sazgjnRcq78QU16A4FxwkDV5n8t9TKCfKPhOGWc=; b=3oNlfvHeHCnq0x3+Hdkw6nCI++VSaUBJeBDm8k3FtH/emg9+7H++TSlsQwxngCJjy9 vHzAlSJH7NAx/0lBfGH+UrUgnxP14gIQGX0rg1fBcdT/La5xo7/xjCSTlponZk70XuqH Fx8TkRTd0dI5gt0K+hnZgXDLgUzp+DILMu8i4sHWxWa2EHQqI28VzT/jkkoY+MoDn2Cv oTKGqHPRX6Wk7FgMPuKsKwSq7WENNeCr83h+fIhADsRx3bjNjWO5islwxVW+axDCCfMh alDDLsaPbwKU7hEh3YB4B/3Y9l980afHwT5m+eovTsejcJ99Z6SeU2EsGnKVFPs3eSBw ZJFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=XiL4sazgjnRcq78QU16A4FxwkDV5n8t9TKCfKPhOGWc=; b=gOH5gJmW0lVhaYdvWcvsiP1dl0WdHSVWcnNer4YOmC1zr9OLm6lqkfyORsb47dQnV/ RDpn2XSTI4xh1qfhoepxxXWQEyS3dILcIOe5KztvMjNF/SwmdjvU301W8dS+Z2G5y8W7 UzfAfgmtNZLAi2dpB6tbT/31yr9tjUqbt/KkymxTtQb8/Kje1PEyvqZeoK80b42VYpub svDCgPGw9xqIJZ9x/jg91TzdIHAAZUM+6UQVBaiNTQFicOWUHeJG+hWYrDs84jTZN0fG NoSZJ+W0jH7WXZFQCZcXaM2PgaaHT+ECy4sBSXVdGEotnbdgIitvLxTUqZCKaxZP5r3p +/BQ== X-Gm-Message-State: AOAM5321HWmyCLISS9exNDwRYR44PxsHnL51bE4BoxWk24uEejSi0yzE lEXadodju19+nL5YdKDb16Xv0uJRg5dV+j7zuu+dQg== X-Google-Smtp-Source: ABdhPJw//e9xfUXRSj/P3NnV+MQYvHcC5UtGgf5hdCBInyairlNUjoM/JYQydjIh8rF5bHLUFfdH+0IP3JdX9BGZGTs= X-Received: by 2002:a25:ef0b:: with SMTP id g11mr32894341ybd.404.1636897473416; Sun, 14 Nov 2021 05:44:33 -0800 (PST) MIME-Version: 1.0 References: <20211111015037.4092956-1-almasrymina@google.com> In-Reply-To: From: Muchun Song Date: Sun, 14 Nov 2021 21:43:55 +0800 Message-ID: Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file To: Shakeel Butt , Mina Almasry Cc: Mike Kravetz , Andrew Morton , Shuah Khan , Miaohe Lin , Oscar Salvador , Michal Hocko , David Rientjes , Jue Wang , Yang Yao , Joanna Li , Cannon Matthews , Linux Memory Management List , LKML Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=3oNlfvHe; spf=pass (imf11.hostedemail.com: domain of songmuchun@bytedance.com designates 209.85.219.177 as permitted sender) smtp.mailfrom=songmuchun@bytedance.com; dmarc=pass (policy=none) header.from=bytedance.com X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 63BBCF000209 X-Stat-Signature: igxqitopjy7syj1h93sucy7dr66pcysz X-HE-Tag: 1636897474-878566 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Sun, Nov 14, 2021 at 3:15 AM Shakeel Butt wrote: > > On Sat, Nov 13, 2021 at 6:48 AM Mina Almasry wro= te: > > > > On Fri, Nov 12, 2021 at 6:45 PM Muchun Song = wrote: > > > > > > On Sat, Nov 13, 2021 at 7:36 AM Mike Kravetz wrote: > > > > > > > > Subject: Re: [PATCH v6] hugetlb: Add hugetlb.*.numa_stat file > > > > > > > > To: Muchun Song , Mina Almasry > > > > > > > > Cc: Andrew Morton , Shuah Khan , Miaohe Lin , Oscar Salvador , Michal Hocko , David Rientjes , Shakeel Butt , Jue Wang , Ya= ng Yao , Joanna Li , Cannon Matthews= , Linux Memory Management List , LKML > > > > > > > > Bcc: > > > > > > > > -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D# Don't remove this line #=3D-= =3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- > > > > > > > > On 11/10/21 6:36 PM, Muchun Song wrote: > > > > > > > > > On Thu, Nov 11, 2021 at 9:50 AM Mina Almasry wrote: > > > > > > > > >> > > > > > > > > >> +struct hugetlb_cgroup_per_node { > > > > > > > > >> + /* hugetlb usage in pages over all hstates. */ > > > > > > > > >> + atomic_long_t usage[HUGE_MAX_HSTATE]; > > > > > > > > > > > > > > > > > > Why do you use atomic? IIUC, 'usage' is always > > > > > > > > > increased/decreased under hugetlb_lock except > > > > > > > > > hugetlb_cgroup_read_numa_stat() which is always > > > > > > > > > reading it. So I think WRITE_ONCE/READ_ONCE > > > > > > > > > is enough. > > > > > > > > > > > > > > > > Thanks for continuing to work this, I was traveling and unable to > > > > > > > > comment. > > > > > > Have a good time. > > > > > > > > > > > > > > > > > > > Unless I am missing something, I do not see a reason for WRITE_ONCE= /READ_ONCE > > > > > > Because __hugetlb_cgroup_commit_charge and > > > hugetlb_cgroup_read_numa_stat can run parallely, > > > which meets the definition of data race. I believe > > > KCSAN could report this race. I'm not strongly > > > suggest using WRITE/READ_ONCE() here. But > > > in theory it should be like this. Right? > > > > > > > My understanding is that the (only) potential problem here is > > read_numa_stat() reading an intermediate garbage value while > > commit_charge() is happening concurrently. This will only happen on > > archs where the writes to an unsigned long aren't atomic. On archs > > where writes to an unsigned long are atomic, there is no race, because > > read_numa_stat() will only ever read the value before the concurrent > > write or after the concurrent write, both of which are valid. To cater > > to archs where the writes to unsigned long aren't atomic, we need to > > use an atomic data type. > > > > I'm not too familiar but my understanding from reading the > > documentation is that WRITE_ONCE/READ_ONCE don't contribute anything > > meaningful here: > > > > /* > > * Prevent the compiler from merging or refetching reads or writes. The > > * compiler is also forbidden from reordering successive instances of > > * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some > > * particular ordering. One way to make the compiler aware of ordering i= s to > > * put the two invocations of READ_ONCE or WRITE_ONCE in different C > > * statements. > > ... > > > > I can't see a reason why we care about the compiler merging or > > refetching reads or writes here. As far as I can tell the problem is > > atomicy of the write. > > > > We have following options: > > 1) Use atomic type for usage. > 2) Use "unsigned long" for usage along with WRITE_ONCE/READ_ONCE. > 3) Use hugetlb_lock for hugetlb_cgroup_read_numa_stat as well. > > All options are valid but we would like to avoid (3). > > What if we use "unsigned long" type but without READ_ONCE/WRITE_ONCE. > The potential issues with that are KCSAN will report this as race and > possible garbage value on archs which do not support atomic writes to > unsigned long. At least I totally agree with you. Thanks for your detailed explanation. > > Shakeel