Message-ID: <1570565898.5576.314.camel@lca.pw>
Subject: "Shrink zones before removing memory" causes kernel panic with kpagecount
From: Qian Cai
To: David Hildenbrand
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Date: Tue, 08 Oct 2019 16:18:18 -0400

The linux-next series "mm/memory_hotplug: Shrink zones before removing memory"
[1] causes a kernel panic while reading /proc/kpagecount after offlining a
memory section. It was reproduced on both x86 and powerpc. Reverting the whole
series fixed the problem.

[1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/

# echo offline > /sys/devices/system/memory/memory124/state
# cat /proc/kpagecount

[  133.268032][ T8809] remove from free list 7c000 256 7d000
[  133.268134][ T8809] remove from free list 7c100 256 7d000
[  133.268153][ T8809] remove from free list 7c200 256 7d000
[  133.268182][ T8809] remove from free list 7c300 256 7d000
[  133.268212][ T8809] remove from free list 7c400 256 7d000
[  133.268241][ T8809] remove from free list 7c500 256 7d000
[  133.268260][ T8809] remove from free list 7c600 256 7d000
[  133.268289][ T8809] remove from free list 7c700 256 7d000
[  133.268329][ T8809] remove from free list 7c800 256 7d000
[  133.268359][ T8809] remove from free list 7c900 256 7d000
[  133.268399][ T8809] remove from free list 7ca00 256 7d000
[  133.268429][ T8809] remove from free list 7cb00 256 7d000
[  133.268458][ T8809] remove from free list 7cc00 256 7d000
[  133.268488][ T8809] remove from free list 7cd00 256 7d000
[  133.268517][ T8809] remove from free list 7ce00 256 7d000
[  133.268546][ T8809] remove from free list 7cf00 256 7d000
[  133.268580][ T8809] Offlined Pages 4096
[  144.038732][ T8944] BUG: Unable to handle kernel data access at 0xfffffffffffffffe
[  144.038769][ T8944] Faulting instruction address: 0xc000000000590c08
[  144.038794][ T8944] Oops: Kernel access of bad area, sig: 11 [#1]
[  144.038807][ T8944] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[  144.038822][ T8944] Modules linked in: ip_tables x_tables xfs sd_mod bnx2x mdio ahci libahci tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
[  144.038864][ T8944] CPU: 116 PID: 8944 Comm: cat Not tainted 5.4.0-rc2+ #6
[  144.038898][ T8944] NIP:  c000000000590c08 LR: c000000000577330 CTR: c0000000005909d0
[  144.038945][ T8944] REGS: c00020196bd6fa30 TRAP: 0380   Not tainted  (5.4.0-rc2+)
[  144.038989][ T8944] MSR:  9000000000009033   CR: 48022428  XER: 20040000
[  144.039028][ T8944] CFAR: c000000000590ad0 IRQMASK: 0
[  144.039028][ T8944] GPR00: c000000000577330 c00020196bd6fcc0 c000000001122d00 c0002009d3d4a880
[  144.039028][ T8944] GPR04: 00007fffb6870000 0000000000020000 fffffffffffffffe c00c000000000000
[  144.039028][ T8944] GPR08: 0000000001f00000 c00c000001f00000 0000000000000001 c0000000009413d0
[  144.039028][ T8944] GPR12: c0000000005909d0 c000201fff677000 0000000000000000 0000000000000000
[  144.039028][ T8944] GPR16: 0000000000000002 00007fffca34cfa8 ffffffffffffffff 0000000000000000
[  144.039028][ T8944] GPR20: 0000000000000000 0000000000000000 c000000000000000 c00020196bd6fdf0
[  144.039028][ T8944] GPR24: 00007fffb6870000 0000000007ffffff 0000000000000000 c000000000aa6c20
[  144.039028][ T8944] GPR28: 00007fffb6890000 0000000000000008 000000000007c000 00007fffb6870000
[  144.039240][ T8944] NIP [c000000000590c08] kpagecount_read+0x238/0x3f0
[  144.039263][ T8944] LR [c000000000577330] proc_reg_read+0x90/0x130
[  144.039274][ T8944] Call Trace:
[  144.039304][ T8944] [c00020196bd6fd30] [c000000000577330] proc_reg_read+0x90/0x130
[  144.039342][ T8944] [c00020196bd6fd60] [c0000000004978bc] __vfs_read+0x3c/0x70
[  144.039377][ T8944] [c00020196bd6fd80] [c00000000049799c] vfs_read+0xac/0x170
[  144.039423][ T8944] [c00020196bd6fdd0] [c000000000497dfc] ksys_read+0x7c/0x140
[  144.039472][ T8944] [c00020196bd6fe20] [c00000000000b378] system_call+0x5c/0x68
[  144.039495][ T8944] Instruction dump:
[  144.039513][ T8944] 4e800020 60000000 3d22000d 3929c098 7bc83664 e8e90000 7d274215 418200ac
[  144.039540][ T8944] e9490008 38caffff 714a0001 7cc9309e 2faaffff e9490008 419e00fc
[  144.039580][ T8944] ---[ end trace 96fb2ea2d503fda9 ]---
[  144.492072][ T8944]
[  145.492172][ T8944] Kernel panic - not syncing: Fatal exception
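
To narrow the reproducer down from "cat" of the whole file to just the
offlined block, something like the following might help. It is only a sketch
and not part of the original report: block_pfn_range() and read_kpagecount()
are hypothetical helper names, and the 256 MiB block size is an assumption
inferred from "Offlined Pages 4096" with PAGE_SIZE=64K in the log above. It
relies on /proc/kpagecount holding one 64-bit counter per PFN, indexed by PFN.

```python
import struct

ENTRY_SIZE = 8  # /proc/kpagecount stores one 64-bit counter per PFN


def block_pfn_range(block_id, block_size_bytes, page_size):
    """Return the half-open PFN range [start, end) covered by a memory block."""
    pages_per_block = block_size_bytes // page_size
    start = block_id * pages_per_block
    return start, start + pages_per_block


def read_kpagecount(start_pfn, end_pfn, path="/proc/kpagecount"):
    """Read the counters for PFNs in [start_pfn, end_pfn) from `path`."""
    # Unbuffered, so no bytes (and hence no PFNs) beyond the range are read.
    with open(path, "rb", buffering=0) as f:
        f.seek(start_pfn * ENTRY_SIZE)
        data = f.read((end_pfn - start_pfn) * ENTRY_SIZE)
    n = len(data) // ENTRY_SIZE  # a short read yields fewer entries
    return list(struct.unpack("=%dQ" % n, data[: n * ENTRY_SIZE]))


# Values taken or inferred from the log: memory124, 64K pages, and an
# assumed block size of 4096 pages * 64 KiB = 256 MiB.
start, end = block_pfn_range(124, 256 << 20, 64 << 10)
print(hex(start), hex(end))  # 0x7c000 0x7d000
```

The computed range matches the "remove from free list 7c000 ... 7d000" lines
in the log. Calling read_kpagecount(start, end) as root after offlining
memory124 would then touch only that block; on an affected kernel this is
expected to hit the same oops, so it should only be run on a test machine.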