Date: Tue, 29 Oct 2019 19:28:02 +0100
From: Marco Elver
To: Shakeel Butt
Cc: Michal Hocko, Roman Gushchin, Johannes Weiner, Andrew Morton, Linux MM, Cgroups, LKML, Eric Dumazet, Greg Thelen, syzbot+13f93c99c06988391efe@syzkaller.appspotmail.com
Subject: Re: [PATCH] mm: memcontrol: fix data race in mem_cgroup_select_victim_node
Message-ID: <20191029182802.GA193152@google.com>
References: <20191029005405.201986-1-shakeelb@google.com> <20191029090347.GG31513@dhcp22.suse.cz>

On Tue, 29 Oct 2019, Shakeel Butt wrote:

> +Marco
>
> On Tue, Oct 29, 2019 at 2:03 AM Michal Hocko wrote:
> >
> > On Mon 28-10-19 17:54:05, Shakeel Butt wrote:
> > > Syzbot reported the following bug:
> > >
> > > BUG: KCSAN: data-race in mem_cgroup_select_victim_node / mem_cgroup_select_victim_node
> > >
> > > write to 0xffff88809fade9b0 of 4 bytes by task 8603 on cpu 0:
> > >  mem_cgroup_select_victim_node+0xb5/0x3d0 mm/memcontrol.c:1686
> > >  try_to_free_mem_cgroup_pages+0x175/0x4c0 mm/vmscan.c:3376
> > >  reclaim_high.constprop.0+0xf7/0x140 mm/memcontrol.c:2349
> > >  mem_cgroup_handle_over_high+0x96/0x180 mm/memcontrol.c:2430
> > >  tracehook_notify_resume include/linux/tracehook.h:197 [inline]
> > >  exit_to_usermode_loop+0x20c/0x2c0 arch/x86/entry/common.c:163
> > >  prepare_exit_to_usermode+0x180/0x1a0 arch/x86/entry/common.c:194
> > >  swapgs_restore_regs_and_return_to_usermode+0x0/0x40
> > >
> > > read to 0xffff88809fade9b0 of 4 bytes by task 7290 on cpu 1:
> > >  mem_cgroup_select_victim_node+0x92/0x3d0 mm/memcontrol.c:1675
> > >  try_to_free_mem_cgroup_pages+0x175/0x4c0 mm/vmscan.c:3376
> > >  reclaim_high.constprop.0+0xf7/0x140 mm/memcontrol.c:2349
> > >  mem_cgroup_handle_over_high+0x96/0x180 mm/memcontrol.c:2430
> > >  tracehook_notify_resume include/linux/tracehook.h:197 [inline]
> > >  exit_to_usermode_loop+0x20c/0x2c0 arch/x86/entry/common.c:163
> > >  prepare_exit_to_usermode+0x180/0x1a0 arch/x86/entry/common.c:194
> > >  swapgs_restore_regs_and_return_to_usermode+0x0/0x40
> > >
> > > mem_cgroup_select_victim_node() can be called concurrently which reads
> > > and modifies memcg->last_scanned_node without any synchronization. So,
> > > read and modify memcg->last_scanned_node with READ_ONCE()/WRITE_ONCE()
> > > to stop potential reordering.

Strictly speaking, READ_ONCE/WRITE_ONCE alone avoid various bad compiler
optimizations, including store tearing, load tearing, etc. They do not
add memory barriers to constrain memory ordering. (If this code needs
some memory ordering guarantees w.r.t. previous loads/stores then this
alone is not enough.)

> > I am sorry but I do not understand the problem and the fix. Why does the
> > race happen and why does _ONCE fix it? There is still no
> > synchronization. Do you want to prevent memcg->last_scanned_node
> > from being reloaded?
> >
>
> The problem is memcg->last_scanned_node can be read and modified
> concurrently. Though to me it seems like a tolerable race and not
> worth adding an explicit lock. My aim was to make KCSAN happy here to
> look elsewhere for the concurrency bugs. However I see that it might
> complain next on memcg->scan_nodes.

The plain concurrent reads/writes are a data race, which may manifest as
various kinds of undefined behaviour due to compiler optimizations. The
_ONCE accessors prevent these (KCSAN only reports data races). Note that
"data race" does not necessarily imply "race condition"; some data races
are race conditions (usually the more interesting bugs) -- but not *all*
data races are race conditions. If there is no race condition here that
warrants heavier synchronization (locking etc.), then this patch is all
that should be needed.

I can't comment on the rest.

Thanks,
-- Marco
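
To illustrate the distinction drawn above, here is a minimal user-space
sketch of marked accesses. Everything in it is an assumption made for
illustration: struct fake_memcg, the round-robin logic, and the
simplified READ_ONCE/WRITE_ONCE definitions (the kernel's own macros do
more, and __typeof__ is a GNU C extension) are stand-ins, not the code
from mm/memcontrol.c or from the patch.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros (assumption: volatile
 * casts only; the real definitions in the kernel headers do more). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, stripped-down stand-in for struct mem_cgroup. */
struct fake_memcg {
	int last_scanned_node;
};

/*
 * Hypothetical round-robin victim selection. The shared field is
 * loaded once and stored once through marked accesses, so the
 * compiler may not tear, fuse, or repeat them. No memory barrier is
 * implied, and two concurrent callers can still pick the same node:
 * the marked accesses do not remove that (tolerated) race.
 */
static int select_victim_node(struct fake_memcg *memcg, int nr_nodes)
{
	int node;

	assert(nr_nodes > 0);

	node = READ_ONCE(memcg->last_scanned_node);  /* single marked load */
	node = (node + 1) % nr_nodes;                /* work on the local copy */
	WRITE_ONCE(memcg->last_scanned_node, node);  /* single marked store */

	return node;
}
```

If the selection logic ever needed two callers never to pick the same
node, or needed ordering against other loads/stores, marked accesses
like these would not be enough; that would call for atomics with the
appropriate ordering, or a lock.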