Subject: Re: [PATCH] mm/page-writeback.c: avoid potential division by zero
To: Qian Cai
Cc: Andrew Morton, xlpang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, julia.lawall@lip6.fr
References: <20200101093204.3592-1-wenyang@linux.alibaba.com> <230E8A87-2900-427B-9EA3-CC48B4DCA5FC@lca.pw>
From: Wen Yang
Message-ID: <62482b58-81e1-0295-1e28-e11261404831@linux.alibaba.com>
Date: Thu, 2 Jan 2020 11:57:50 +0800
In-Reply-To: <230E8A87-2900-427B-9EA3-CC48B4DCA5FC@lca.pw>

On 2020/1/1 8:39 PM, Qian Cai wrote:
>
>
>> On Jan 1, 2020, at 4:32 AM, Wen Yang wrote:
>>
>> The variables 'min', 'max' and 'bw' are unsigned long and
>> do_div truncates them to 32 bits, which means it can test
>> non-zero and be truncated to zero for division.
>> Fix this issue by using div64_ul() instead.
>
> How did you find out the issue? If it is caught by compilers, can you paste the original warnings? Also, can you figure out which commit introduced the issue in the first place, so it could be backported to stable if needed?
>

Thanks for your comments.
There are no compilation warnings here.
We found this issue by following these steps:

We were first inspired by commit b0ab99e7736a ("sched: Fix possible
divide by zero in avg_atom() calculation"). Combining that with the mm
code we had recently been analyzing, we found this suspicious place.

We also disassembled the code and confirmed it:

201         if (min) {
202                 min *= this_bw;
203                 do_div(min, tot_bw);
204         }

/usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 201
0xffffffff811c37da <__wb_calc_thresh+234>:      xor    %r10d,%r10d
0xffffffff811c37dd <__wb_calc_thresh+237>:      test   %rax,%rax
0xffffffff811c37e0 <__wb_calc_thresh+240>:      je     0xffffffff811c3800 <__wb_calc_thresh+272>
/usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 202
0xffffffff811c37e2 <__wb_calc_thresh+242>:      imul   %r8,%rax
/usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 203
0xffffffff811c37e6 <__wb_calc_thresh+246>:      mov    %r9d,%r10d    ---> truncates it to 32 bits here
0xffffffff811c37e9 <__wb_calc_thresh+249>:      xor    %edx,%edx
0xffffffff811c37eb <__wb_calc_thresh+251>:      div    %r10
0xffffffff811c37ee <__wb_calc_thresh+254>:      imul   %rbx,%rax
0xffffffff811c37f2 <__wb_calc_thresh+258>:      shr    $0x2,%rax
0xffffffff811c37f6 <__wb_calc_thresh+262>:      mul    %rcx
0xffffffff811c37f9 <__wb_calc_thresh+265>:      shr    $0x2,%rdx
0xffffffff811c37fd <__wb_calc_thresh+269>:      mov    %rdx,%r10

This issue was introduced by commit 693108a8a667 ("writeback: make
bdi->min/max_ratio handling cgroup writeback aware").

Finally, we plan to summarize the cases above and write a general
Coccinelle rule to check for similar problems.

>>
>> For the two variables 'numerator' and 'denominator',
>> though they are declared as long, they should actually be
>> unsigned long (according to the implementation of
>> the fprop_fraction_percpu() function).