Subject: Re: [bug, 5.2.16] kswapd/compaction null pointer crash [was Re: xfs_inode not reclaimed/memory leak on 5.2.16]
From: Vlastimil Babka
To: Florian Weimer
Cc: Dave Chinner, linux-mm@kvack.org, Mel Gorman
Date: Wed, 16 Oct 2019 22:03:39 +0200
Message-ID: <3560f07e-03a7-9291-6494-e0580eeaa6bd@suse.cz>
In-Reply-To: <87blugh452.fsf@mid.deneb.enyo.de>
References: <87pnji8cpw.fsf@mid.deneb.enyo.de> <20190930085406.GP16973@dread.disaster.area> <87o8z1fvqu.fsf@mid.deneb.enyo.de> <20190930211727.GQ16973@dread.disaster.area> <96023250-6168-3806-320a-a3468f1cd8c9@suse.cz> <87blugh452.fsf@mid.deneb.enyo.de>

On 10/16/19 9:38 PM, Florian Weimer wrote:
> This time, I've got a kernel with debugging information (still
> 5.2.18).
> The crash is at offset 0x39f:
>
>         if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>  384:   48 c1 ea 35             shr    $0x35,%rdx
>  388:   48 8b 14 d7             mov    (%rdi,%rdx,8),%rdx
>  38c:   48 c1 e8 2d             shr    $0x2d,%rax
>  390:   48 85 d2                test   %rdx,%rdx
>  393:   74 0a                   je     39f <__reset_isolation_pfn+0x27f>
>         return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>  395:   0f b6 c0                movzbl %al,%eax
>  398:   48 c1 e0 04             shl    $0x4,%rax
>  39c:   48 01 c2                add    %rax,%rdx
>         unsigned long map = section->section_mem_map;
>  39f:   48 8b 02                mov    (%rdx),%rax
>                 clear_pageblock_skip(page);
>  3a2:   4c 89 f2                mov    %r14,%rdx
>  3a5:   41 b8 01 00 00 00       mov    $0x1,%r8d
>  3ab:   31 f6                   xor    %esi,%esi
>  3ad:   b9 03 00 00 00          mov    $0x3,%ecx
>  3b2:   4c 89 f7                mov    %r14,%rdi
>
> Hmm, -l output is likely more helpful here:
>
> /home/fw/src/linux/linux/mm/compaction.c:293
>  37a:   a8 10                   test   $0x10,%al
>  37c:   74 bc                   je     33a <__reset_isolation_pfn+0x21a>
> page_to_section():
> /home/fw/src/linux/linux/./include/linux/mm.h:1265
>  37e:   49 8b 16                mov    (%r14),%rdx
>  381:   48 89 d0                mov    %rdx,%rax
> __nr_to_section():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1218
>  384:   48 c1 ea 35             shr    $0x35,%rdx
>  388:   48 8b 14 d7             mov    (%rdi,%rdx,8),%rdx
> page_to_section():
> /home/fw/src/linux/linux/./include/linux/mm.h:1265
>  38c:   48 c1 e8 2d             shr    $0x2d,%rax
> __nr_to_section():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1218
>  390:   48 85 d2                test   %rdx,%rdx
>  393:   74 0a                   je     39f <__reset_isolation_pfn+0x27f>
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1220
>  395:   0f b6 c0                movzbl %al,%eax
>  398:   48 c1 e0 04             shl    $0x4,%rax
>  39c:   48 01 c2                add    %rax,%rdx
> __section_mem_map_addr():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1247
>  39f:   48 8b 02                mov    (%rdx),%rax
> __reset_isolation_pfn():
> /home/fw/src/linux/linux/mm/compaction.c:294
>  3a2:   4c 89 f2                mov    %r14,%rdx
>  3a5:   41 b8 01 00 00 00       mov    $0x1,%r8d
>  3ab:   31 f6                   xor    %esi,%esi
>
> It's this loop:
>
>    286          /*
>    287           * Only clear the hint if a sample indicates there is either a
>    288           * free page or an LRU page in the block. One or other condition
>    289           * is necessary for the block to be a migration source/target.
>    290           */
>    291          do {
>    292                  if (pfn_valid_within(pfn)) {
>    293                          if (check_source && PageLRU(page)) {
>    294                                  clear_pageblock_skip(page);

Thanks. Looks like it's indeed here in the page_to_pfn() embedded in the
clear_pageblock_skip() expansion. We've got a wrong struct page pointer,
so page_to_section gives us a bogus value, __nr_to_section() a null
pointer, and __section_mem_map_addr then accesses it.

Hopefully the commit [1] should address the reason why we got a wrong
page pointer. You could try cherry-picking it to your stable tree, or
wait until it appears in a (hopefully near) future stable 5.3.y (5.2 is
EOL, so it won't appear there).

Thanks,
Vlastimil

>    295                                  return true;
>    296                          }
>    297
>    298                          if (check_target && PageBuddy(page)) {
>    299                                  clear_pageblock_skip(page);
>    300                                  return true;
>    301                          }
>    302                  }
>    303
>    304                  page += (1 << PAGE_ALLOC_COSTLY_ORDER);
>    305                  pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
>    306          } while (page < end_page);
>
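
For anyone following the thread, the failure chain described in the reply (bogus section number, NULL from __nr_to_section(), then a dereference in __section_mem_map_addr()) can be modelled in user space. What follows is only a rough sketch, not kernel code: the constants are inferred from the shifts in the disassembly above (shr $0x2d, shr $0x35, movzbl %al, shl $0x4) and are assumptions about this particular build, and every name with a _model suffix is made up for illustration. Unlike the kernel path, the model checks the NULL and returns an error where the real code faulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout, inferred from the disassembly rather than from headers. */
#define SECTIONS_PGSHIFT   45   /* shr $0x2d: section nr = flags >> 45 */
#define SECTIONS_PER_ROOT  256  /* movzbl %al: low 8 bits index into a root */
#define SECTION_ROOT_MASK  (SECTIONS_PER_ROOT - 1)
#define NR_SECTION_ROOTS   (1UL << (64 - SECTIONS_PGSHIFT - 8))

/* Hypothetical stand-in for struct mem_section; 16 bytes, matching shl $0x4. */
struct mem_section_model {
    uint64_t section_mem_map;
    uint64_t pad;
};

/* Two-level table: an array of root pointers, most of them NULL. */
static struct mem_section_model *roots_model[NR_SECTION_ROOTS];

/* Models page_to_section(): the section number sits in the top flag bits. */
static uint64_t page_to_section_model(uint64_t page_flags)
{
    return page_flags >> SECTIONS_PGSHIFT;
}

/* Models __nr_to_section(): returns NULL when the root is not populated. */
static struct mem_section_model *nr_to_section_model(uint64_t nr)
{
    struct mem_section_model *root = roots_model[nr / SECTIONS_PER_ROOT];

    if (!root)
        return NULL;
    return &root[nr & SECTION_ROOT_MASK];
}

/*
 * Models the __section_mem_map_addr() step. The crashing code dereferenced
 * the section pointer unconditionally (offset 0x39f); this model reports
 * -1 instead so the bad case can be observed safely.
 */
static int section_mem_map_model(uint64_t page_flags, uint64_t *map)
{
    struct mem_section_model *section =
        nr_to_section_model(page_to_section_model(page_flags));

    if (!section)
        return -1;
    *map = section->section_mem_map;
    return 0;
}

/* Test helper: populate one section so that valid lookups succeed. */
static void install_section_model(uint64_t nr, uint64_t map)
{
    uint64_t root = nr / SECTIONS_PER_ROOT;

    assert(root < NR_SECTION_ROOTS);
    if (!roots_model[root]) {
        roots_model[root] = calloc(SECTIONS_PER_ROOT, sizeof(**roots_model));
        assert(roots_model[root]);
    }
    roots_model[root][nr & SECTION_ROOT_MASK].section_mem_map = map;
}
```

With one section installed, flags that encode that section resolve normally, while flags read through a wrong struct page pointer (modelled here as arbitrary bits) land on an unpopulated root and yield the NULL that the kernel then dereferenced.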