From: Randy Dunlap
To: Toralf Förster, jgg@ziepe.ca
Cc: Linux Kernel, Linux MM, Andrew Morton, axboe
Subject: Re: 5.10.1: UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:1
Date: Sat, 19 Dec 2020 17:09:52 -0800
Message-ID: <1affc309-709b-556e-fe51-72e59e83f90c@infradead.org>
In-Reply-To: <43d52285-a10e-692d-daa6-6f5eb07e3132@gmx.de>
References: <5c172fad-a9cf-c29d-0a27-f2b0505dc33d@infradead.org> <43d52285-a10e-692d-daa6-6f5eb07e3132@gmx.de>

On 12/18/20 2:20 AM, Toralf Förster wrote:
> On 12/18/20 7:54 AM, Randy Dunlap wrote:
>> Hi,
>>
>> [adding linux-mm]
>>
>> On 12/16/20 1:54 AM, Toralf Förster wrote:
>>> Hi,
>>>
>>> I got this recently at this hardened Gentoo Linux server:
>>>
>>> Linux mr-fox 5.10.1 #1 SMP Tue Dec 15 22:09:42 CET 2020 x86_64 Intel(R)
>>> Xeon(R) CPU E5-1650 v3 @ 3.50GHz GenuineIntel GNU/Linux
>>>
>>>
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206972]
>>> ================================================================================
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206977] UBSAN: shift-out-of-bounds
>>> in ./include/linux/log2.h:57:13
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206980] shift exponent 64 is too
>>> large for 64-bit type 'long unsigned int'
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206982] CPU: 11 PID: 21051 Comm:
>>> cc1 Tainted: G                T 5.10.1 #1
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206984] Hardware name: ASUSTeK
>>> COMPUTER INC. Z10PA-U8 Series/Z10PA-U8 Series, BIOS 3703 08/02/2018
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206985] Call Trace:
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206993]  dump_stack+0x57/0x6a
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206996]  ubsan_epilogue+0x5/0x40
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.206999]
>>> __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207002]
>>> ondemand_readahead.cold+0x16/0x21
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207007]
>>> generic_file_buffered_read+0x452/0x890
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207011]  new_sync_read+0x156/0x200
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207014]  vfs_read+0xf8/0x190
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207016]  ksys_read+0x65/0xe0
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207018]  do_syscall_64+0x33/0x40
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207021]
>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207024] RIP: 0033:0x7f01b2df198e
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207026] Code: c0 e9 b6 fe ff ff 50
>>> 48 8d 3d 66 c3 09 00 e8 59 e2 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04
>>> 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f
>>> 84 00 00 00 00 00 48 83 ec 28
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207028] RSP: 002b:00007fff2167e998
>>> EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207030] RAX: ffffffffffffffda RBX:
>>> 0000000000000000 RCX: 00007f01b2df198e
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207032] RDX: 0000000000000000 RSI:
>>> 00000000054dcc50 RDI: 0000000000000004
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207033] RBP: 00000000054dcc50 R08:
>>> 00000000054dcc50 R09: 0000000000000000
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207034] R10: 0000000000000000 R11:
>>> 0000000000000246 R12: 00000000054dc3b0
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207035] R13: 0000000000008000 R14:
>>> 00000000054c9800 R15: 0000000000000000
>>> Dec 15 23:31:51 mr-fox kernel: [ 1974.207037]
>>> ================================================================================
>>>
>>>
>>> Known issue ?
>>
>> Not that I have heard about, but that's not conclusive.
>>
>> Looks to me like this is in mm/readahead.c:
>>
>> static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
>> {
>>     unsigned long newsize = roundup_pow_of_two(size);
>>
>>
>> What filesystem?  What workload?
>
> / is a 32 GB ext4 filesystem.
> Data are at 3 BTRFS filesystems, 1x 500 GB and 2x 1.6TB.
>
> 2 Tor relays run at 100% each and utilizes the 1 GBit/s by 50%-60% [1]
>
> 7 build bots are running over the Gentoo software repostory [2]
> 1 AFL bot fuzzies the Tor sources.
> Those 8 jobs are contained by a cgroup of 9 CPUs and 120 GB RAM [3],
> each job is contained further by an own sub cgroup of 1.5 CPU and 20 GB
> RAM [4]
>
> The host is monitored using sysstat, the load is about 11.8, CPU[all] at
> 80%, proc/s at 1800, cswchs/s at 20000 and so on.
>
>
> [1] https://metrics.torproject.org/rs.html#search/zwiebeltoralf
> [2] https://zwiebeltoralf.de/tinderbox.html
> [3] https://github.com/toralf/tinderbox/blob/master/bin/cgroup.sh
> [4] https://github.com/toralf/tinderbox/blob/master/bin/bwrap.sh#L15
>
> --

Hi Toralf,

Is this something that happens more than once?
I think we would like to find out what is causing it.

I see a couple of problems.

(a) UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
    shift exponent 64 is too large for 64-bit type 'long unsigned int'

:57: is like so:

50 /**
51  * __roundup_pow_of_two() - round up to nearest power of two
52  * @n: value to round up
53  */
54 static inline __attribute__((const))
55 unsigned long __roundup_pow_of_two(unsigned long n)
56 {
57 	return 1UL << fls_long(n - 1);
58 }

It's OK/valid for fls_long() [fls64()] to return 64 for a bit position --
it just means the high-order bit in a 64-bit value.

So this code should either always subtract 1 from fls_long() or do that if
fls_long() == 64.

(b) in mm/readahead.c:get_init_ra_size():

/*
 *  Set the initial window size, round to next power of 2 and square
 *  for small size, x 4 for medium, and x 2 for large
 *  for 128k (32 page) max ra
 *  1-8 page = 32k initial, > 8 page = 128k initial
 */
static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
{
	unsigned long newsize = roundup_pow_of_two(size);

It looks like 'size' is either extremely large or it might be negative
if it were a signed long instead of unsigned, so maybe it's 0x80000000_00000000
or 0xffffffff_ffffffff or something similar.

I think that we should add a WARN_ON_ONCE() there to try to catch whatever
it is.  Is this something that you could test if I send some patches?

Unless other people have some other ideas, that is...

thanks.
-- 
~Randy