From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Qian Cai, Andrew Morton, Marco Elver, Matthew Wilcox, Linus Torvalds, Sasha Levin, linux-mm@kvack.org
Subject: [PATCH AUTOSEL 4.19 139/206] mm/vmscan.c: fix data races using kswapd_classzone_idx
Date: Thu, 17 Sep 2020 22:06:55 -0400
Message-Id: <20200918020802.2065198-139-sashal@kernel.org>
In-Reply-To: <20200918020802.2065198-1-sashal@kernel.org>
References: <20200918020802.2065198-1-sashal@kernel.org>

From: Qian Cai

[ Upstream commit 5644e1fbbfe15ad06785502bbfe5751223e5841d ]

pgdat->kswapd_classzone_idx could be accessed concurrently in wakeup_kswapd().
Plain writes and reads without any lock protection result in data races.
Fix them by adding a pair of READ|WRITE_ONCE() as well as saving a branch
(compilers might well optimize the original code in an unintentional way
anyway). While at it, also take care of pgdat->kswapd_order and
non-kswapd threads in allow_direct_reclaim(). The data races were
reported by KCSAN,

  BUG: KCSAN: data-race in wakeup_kswapd / wakeup_kswapd

  write to 0xffff9f427ffff2dc of 4 bytes by task 7454 on cpu 13:
   wakeup_kswapd+0xf1/0x400
   wakeup_kswapd at mm/vmscan.c:3967
   wake_all_kswapds+0x59/0xc0
   wake_all_kswapds at mm/page_alloc.c:4241
   __alloc_pages_slowpath+0xdcc/0x1290
   __alloc_pages_slowpath at mm/page_alloc.c:4512
   __alloc_pages_nodemask+0x3bb/0x450
   alloc_pages_vma+0x8a/0x2c0
   do_anonymous_page+0x16e/0x6f0
   __handle_mm_fault+0xcd5/0xd40
   handle_mm_fault+0xfc/0x2f0
   do_page_fault+0x263/0x6f9
   page_fault+0x34/0x40

  1 lock held by mtest01/7454:
   #0: ffff9f425afe8808 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x143/0x6f9
   do_user_addr_fault at arch/x86/mm/fault.c:1405
   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
  irq event stamp: 6944085
  count_memcg_event_mm+0x1a6/0x270
  count_memcg_event_mm+0x119/0x270
  __do_softirq+0x34c/0x57c
  irq_exit+0xa2/0xc0

  read to 0xffff9f427ffff2dc of 4 bytes by task 7472 on cpu 38:
   wakeup_kswapd+0xc8/0x400
   wake_all_kswapds+0x59/0xc0
   __alloc_pages_slowpath+0xdcc/0x1290
   __alloc_pages_nodemask+0x3bb/0x450
   alloc_pages_vma+0x8a/0x2c0
   do_anonymous_page+0x16e/0x6f0
   __handle_mm_fault+0xcd5/0xd40
   handle_mm_fault+0xfc/0x2f0
   do_page_fault+0x263/0x6f9
   page_fault+0x34/0x40

  1 lock held by mtest01/7472:
   #0: ffff9f425a9ac148 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x143/0x6f9
  irq event stamp: 6793561
  count_memcg_event_mm+0x1a6/0x270
  count_memcg_event_mm+0x119/0x270
  __do_softirq+0x34c/0x57c
  irq_exit+0xa2/0xc0

  BUG: KCSAN: data-race in kswapd / wakeup_kswapd

  write to 0xffff90973ffff2dc of 4 bytes by task 820 on cpu 6:
   kswapd+0x27c/0x8d0
   kthread+0x1e0/0x200
   ret_from_fork+0x27/0x50

  read to 0xffff90973ffff2dc of 4 bytes by task 6299 on cpu 0:
   wakeup_kswapd+0xf3/0x450
   wake_all_kswapds+0x59/0xc0
   __alloc_pages_slowpath+0xdcc/0x1290
   __alloc_pages_nodemask+0x3bb/0x450
   alloc_pages_vma+0x8a/0x2c0
   do_anonymous_page+0x170/0x700
   __handle_mm_fault+0xc9f/0xd00
   handle_mm_fault+0xfc/0x2f0
   do_page_fault+0x263/0x6f9
   page_fault+0x34/0x40

Signed-off-by: Qian Cai
Signed-off-by: Andrew Morton
Reviewed-by: Andrew Morton
Cc: Marco Elver
Cc: Matthew Wilcox
Link: http://lkml.kernel.org/r/1582749472-5171-1-git-send-email-cai@lca.pw
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
---
 mm/vmscan.c | 45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bc2ecd43251ad..da09b741d08a0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3101,8 +3101,9 @@ static bool allow_direct_reclaim(pg_data_t *pgdat)
 
 	/* kswapd must be awake if processes are being throttled */
 	if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
-		pgdat->kswapd_classzone_idx = min(pgdat->kswapd_classzone_idx,
-						(enum zone_type)ZONE_NORMAL);
+		if (READ_ONCE(pgdat->kswapd_classzone_idx) > ZONE_NORMAL)
+			WRITE_ONCE(pgdat->kswapd_classzone_idx, ZONE_NORMAL);
+
 		wake_up_interruptible(&pgdat->kswapd_wait);
 	}
 
@@ -3618,9 +3619,9 @@ out:
 static enum zone_type kswapd_classzone_idx(pg_data_t *pgdat,
 					   enum zone_type prev_classzone_idx)
 {
-	if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
-		return prev_classzone_idx;
-	return pgdat->kswapd_classzone_idx;
+	enum zone_type curr_idx = READ_ONCE(pgdat->kswapd_classzone_idx);
+
+	return curr_idx == MAX_NR_ZONES ? prev_classzone_idx : curr_idx;
 }
 
 static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order,
@@ -3664,8 +3665,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 	 * the previous request that slept prematurely.
 	 */
 	if (remaining) {
-		pgdat->kswapd_classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
-		pgdat->kswapd_order = max(pgdat->kswapd_order, reclaim_order);
+		WRITE_ONCE(pgdat->kswapd_classzone_idx,
+			   kswapd_classzone_idx(pgdat, classzone_idx));
+
+		if (READ_ONCE(pgdat->kswapd_order) < reclaim_order)
+			WRITE_ONCE(pgdat->kswapd_order, reclaim_order);
 	}
 
 	finish_wait(&pgdat->kswapd_wait, &wait);
@@ -3747,12 +3751,12 @@ static int kswapd(void *p)
 	tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
 	set_freezable();
 
-	pgdat->kswapd_order = 0;
-	pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
+	WRITE_ONCE(pgdat->kswapd_order, 0);
+	WRITE_ONCE(pgdat->kswapd_classzone_idx, MAX_NR_ZONES);
 	for ( ; ; ) {
 		bool ret;
 
-		alloc_order = reclaim_order = pgdat->kswapd_order;
+		alloc_order = reclaim_order = READ_ONCE(pgdat->kswapd_order);
 		classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
 
 kswapd_try_sleep:
@@ -3760,10 +3764,10 @@ kswapd_try_sleep:
 					classzone_idx);
 
 		/* Read the new order and classzone_idx */
-		alloc_order = reclaim_order = pgdat->kswapd_order;
+		alloc_order = reclaim_order = READ_ONCE(pgdat->kswapd_order);
 		classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
-		pgdat->kswapd_order = 0;
-		pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
+		WRITE_ONCE(pgdat->kswapd_order, 0);
+		WRITE_ONCE(pgdat->kswapd_classzone_idx, MAX_NR_ZONES);
 
 		ret = try_to_freeze();
 		if (kthread_should_stop())
@@ -3808,20 +3812,23 @@ void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
 		   enum zone_type classzone_idx)
 {
 	pg_data_t *pgdat;
+	enum zone_type curr_idx;
 
 	if (!managed_zone(zone))
 		return;
 
 	if (!cpuset_zone_allowed(zone, gfp_flags))
 		return;
+
 	pgdat = zone->zone_pgdat;
+	curr_idx = READ_ONCE(pgdat->kswapd_classzone_idx);
+
+	if (curr_idx == MAX_NR_ZONES || curr_idx < classzone_idx)
+		WRITE_ONCE(pgdat->kswapd_classzone_idx, classzone_idx);
+
+	if (READ_ONCE(pgdat->kswapd_order) < order)
+		WRITE_ONCE(pgdat->kswapd_order, order);
 
-	if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
-		pgdat->kswapd_classzone_idx = classzone_idx;
-	else
-		pgdat->kswapd_classzone_idx = max(pgdat->kswapd_classzone_idx,
-						  classzone_idx);
-	pgdat->kswapd_order = max(pgdat->kswapd_order, order);
 	if (!waitqueue_active(&pgdat->kswapd_wait))
 		return;
 
-- 
2.25.1
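
For readers who want to try the pattern outside the kernel, the
read-once-then-conditionally-write shape that this patch applies in
wakeup_kswapd() and kswapd_classzone_idx() can be sketched in userspace C.
This is an illustration only, not kernel code. The READ_ONCE()/WRITE_ONCE()
macros below are simplified volatile-cast stand-ins (the kernel's real
definitions do more, and __typeof__ assumes GCC or Clang), and
struct node_state, IDX_NONE, request_reclaim() and pending_idx() are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: simplified stand-ins for the kernel macros. They force a
 * single volatile load or store, so the compiler cannot refetch the
 * value between the comparison and its later use.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical sentinel for "no request pending", playing the role of MAX_NR_ZONES. */
enum { IDX_NONE = 4 };

/* Hypothetical, cut-down analogue of the two pgdat fields touched by the patch. */
struct node_state {
	int classzone_idx;
	int order;
};

/*
 * Same shape as the new wakeup_kswapd() code: load each field once,
 * then store only when the request actually raises the value.
 */
static void request_reclaim(struct node_state *ns, int classzone_idx, int order)
{
	int curr_idx;

	assert(ns != NULL);
	curr_idx = READ_ONCE(ns->classzone_idx);

	if (curr_idx == IDX_NONE || curr_idx < classzone_idx)
		WRITE_ONCE(ns->classzone_idx, classzone_idx);

	if (READ_ONCE(ns->order) < order)
		WRITE_ONCE(ns->order, order);
}

/*
 * Same shape as the new kswapd_classzone_idx(): one load into a local,
 * and both the test and the return value use that local.
 */
static int pending_idx(struct node_state *ns, int prev_idx)
{
	int curr_idx = READ_ONCE(ns->classzone_idx);

	return curr_idx == IDX_NONE ? prev_idx : curr_idx;
}
```

Starting from { IDX_NONE, 0 }, a call such as request_reclaim(&ns, 1, 3)
records both values, and a later request_reclaim(&ns, 0, 1) leaves them
unchanged because neither value is raised. Note that the marked accesses
only make each individual load and store well defined. The check followed
by the store is still not atomic as a whole, so concurrent callers can
overwrite each other's value.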