From: Yang Shi
Date: Fri, 23 Apr 2021 14:05:05 -0700
Subject: Re: kernel BUG at mm/huge_memory.c:2736(linux 5.10.29)
References: <20210423101654.1242.409509F4@e16-tech.com> <20210423160753.6A51.409509F4@e16-tech.com>
In-Reply-To: <20210423160753.6A51.409509F4@e16-tech.com>
To: Wang Yugui
Cc: "Kirill A. Shutemov", Linux MM

On Fri, Apr 23, 2021 at 1:07 AM Wang Yugui wrote:
>
> Hi,
>
> > With this patch, the problem yet not happen after 4 tests(5.10.x).
>
> With this patch , another problem happened at 6th test.
>
> kernel BUG at mm/huge_memory.c:2343!
> static void unmap_page(struct page *page)
> {
>         enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK |
>                 TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
>         bool unmap_success;
>
>         VM_BUG_ON_PAGE(!PageHead(page), page);
>
>         if (PageAnon(page))
>                 ttu_flags |= TTU_SPLIT_FREEZE;
>
>         unmap_success = try_to_unmap(page, ttu_flags);
> L2343:VM_BUG_ON_PAGE(!unmap_success,page);

Thanks for running the test. This is what I expected from the debug
patch: try_to_unmap() did not unmap the huge page successfully. The
huge page is PTE-mapped, so try_to_unmap() is supposed to unmap every
mapped subpage, but it seems it didn't unmap any subpage at all (the
refcount of the huge page is 512 per the log from the earlier email).

From reading the code, I haven't figured out what went wrong yet. You
mentioned that the 5.4.x kernel is fine, so could you try a bisect?

> }
>
>
> This is the full dmesg output.
>
> T7610 login: [59085.082973] page:000000008becb0e6 refcount:512 mapcount:0 mapping:0000000000000000 index:0x7f3eb7382 pfn:0x2804a00
> [59085.093430] head:000000008becb0e6 order:9 compound_mapcount:0 compound_pincount:0
> [59085.100999] anon flags: 0x57ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
> [59085.108750] raw: 0057ffffc009001d ffffc140640e0008 ffffc1405fc80008 ffff8afa82038581
> [59085.116572] raw: 00000007f3eb7382 0000000000000000 00000200ffffffff ffff8b05c2a1c000
> [59085.124388] page dumped because: VM_BUG_ON_PAGE(!unmap_success)
> [59085.130361] page->mem_cgroup:ffff8b05c2a1c000
> [59085.134766] ------------[ cut here ]------------
> [59085.139426] kernel BUG at mm/huge_memory.c:2343!
> [59085.144091] invalid opcode: 0000 [#1] SMP NOPTI
> [59085.145083] CPU: 19 PID: 377 Comm: kswapd1 Tainted: G S 5.10.32-2.el7.x86_64 #1
> [59085.145083] Hardware name: Dell Inc. Precision T7610/0NK70N, BIOS A18 09/11/2019
> [59085.145083] RIP: 0010:split_huge_page_to_list+0x7a2/0xb30
> [59085.145083] Code: e8 b3 be fc ff e9 42 fb ff ff 48 c7 c6 98 6b 3a 98 4c 89 e7 e8 bf 7f f9 ff 0f 0b 48 c7 c6 88 f5 3a 98 4c 89 e7 e8 ae 7f f9 ff <0f> 0b 48 c7 c6 a8 f5 3a 98 4c 89 e7 e8 9d 7f f9 ff 0f 0b 49 8b 54
> [59085.145083] RSP: 0018:ffff9a234d183b10 EFLAGS: 00010286
> [59085.145083] RAX: 0000000000000000 RBX: ffff8b05c2a1cae0 RCX: 0000000000000000
> [59085.145083] RDX: 0000000000000000 RSI: ffff8b156fa58a80 RDI: ffff8b156fa58a80
> [59085.145083] RBP: ffffc14060128080 R08: 0000000000000000 R09: c0000000ffffbfff
> [59085.145083] R10: 0000000000000001 R11: ffff9a234d1837e8 R12: ffffc14060128000
> [59085.145083] R13: 0000000000000000 R14: ffff8afa82038580 R15: ffff8b15affd3000
> [59085.145083] FS: 0000000000000000(0000) GS:ffff8b156fa40000(0000) knlGS:0000000000000000
> [59085.145083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [59085.145083] CR2: 00007f5c226c01a8 CR3: 00000020792b6006 CR4: 00000000001706e0
> [59085.145083] Call Trace:
> [59085.145083] ? free_unref_page_commit+0x9b/0x110
> [59085.145083] deferred_split_scan+0x1ca/0x320
> [59085.145083] do_shrink_slab+0x11f/0x250
> [59085.145083] shrink_slab+0x20f/0x2c0
> [59085.145083] shrink_node+0x24b/0x6d0
> [59085.145083] balance_pgdat+0x2db/0x550
> [59085.145083] kswapd+0x201/0x390
> [59085.145083] ? finish_wait+0x80/0x80
> [59085.145083] ? balance_pgdat+0x550/0x550
> [59085.145083] kthread+0x116/0x130
> [59085.145083] ? kthread_park+0x80/0x80
> [59085.145083] ret_from_fork+0x1f/0x30
> [59085.145083] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rfkill rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_umad snd_hda_codec_realtek intel_rapl_msr snd_hda_codec_generic intel_rapl_common ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation snd_soc_core sb_edac x86_pkg_temp_thermal snd_compress intel_powerclamp snd_pcm_dmaengine coretemp soundwire_cadence iTCO_wdt dcdbas intel_pmc_bxt mei_hdcp mei_wdt iTCO_vendor_support snd_hda_codec dell_smm_hwmon kvm_intel snd_hda_core ac97_bus snd_hwdep snd_seq kvm snd_seq_device irqbypass snd_pcm rapl snd_timer mei_me intel_cstate i2c_i801 intel_uncore i2c_smbus mei lpc_ich snd soundcore nvme_rdma nvme_fabrics rdma_cm iw_cm ib_cm nfsd rdmavt rdma_rxe ib_uverbs ip6_udp_tunnel auth_rpcgss udp_tunnel ib_core nfs_acl lockd grace nfs_ssc ip_tables xfs radeon i2c_algo_bit ttm
> [59085.145083] drm_kms_helper cec bnx2x crct10dif_pclmul crc32_pclmul crc32c_intel nvme drm ghash_clmulni_intel mpt3sas e1000e pcspkr mdio nvme_core raid_class scsi_transport_sas wmi dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua btrfs xor raid6_pq sunrpc i2c_dev
> [59085.410667] ---[ end trace c12d9c5dce775958 ]---
> [59085.583739] RIP: 0010:split_huge_page_to_list+0x7a2/0xb30
> [59085.589189] Code: e8 b3 be fc ff e9 42 fb ff ff 48 c7 c6 98 6b 3a 98 4c 89 e7 e8 bf 7f f9 ff 0f 0b 48 c7 c6 88 f5 3a 98 4c 89 e7 e8 ae 7f f9 ff <0f> 0b 48 c7 c6 a8 f5 3a 98 4c 89 e7 e8 9d 7f f9 ff 0f 0b 49 8b 54
> [59085.608129] RSP: 0018:ffff9a234d183b10 EFLAGS: 00010286
> [59085.613405] RAX: 0000000000000000 RBX: ffff8b05c2a1cae0 RCX: 0000000000000000
> [59085.620606] RDX: 0000000000000000 RSI: ffff8b156fa58a80 RDI: ffff8b156fa58a80
> [59085.627806] RBP: ffffc14060128080 R08: 0000000000000000 R09: c0000000ffffbfff
> [59085.635016] R10: 0000000000000001 R11: ffff9a234d1837e8 R12: ffffc14060128000
> [59085.642218] R13: 0000000000000000 R14: ffff8afa82038580 R15: ffff8b15affd3000
> [59085.649422] FS: 0000000000000000(0000) GS:ffff8b156fa40000(0000) knlGS:0000000000000000
> [59085.657588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [59085.663388] CR2: 00007f5c226c01a8 CR3: 00000020792b6006 CR4: 00000000001706e0
> [59085.670590] Kernel panic - not syncing: Fatal exception
> [59085.671587] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [59085.671587] ---[ end Kernel panic - not syncing: Fatal exception ]---
>
>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2021/04/23
>
> > Hi,
> >
> > > On Sat, Apr 17, 2021 at 1:33 AM Wang Yugui wrote:
> > > >
> > > > Hi,
> > > >
> > > > > On Mon, Apr 12, 2021 at 3:07 AM Wang Yugui wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > kernel BUG at mm/huge_memory.c:2736(linux 5.10.29) is triggered
> > > > > > by some files write test.
> > > > > >
> > > > > > mm/huge_memory.c:
> > > > > >         if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
> > > > > >                 pr_alert("total_mapcount: %u, page_count(): %u\n",
> > > > > >                                 mapcount, count);
> > > > > >                 if (PageTail(page))
> > > > > >                         dump_page(head, NULL);
> > > > > >                 dump_page(page, "total_mapcount(head) > 0");
> > > > > > L2736:          BUG();
> > > > > >         }
> > > > >
> > > > > We just can tell the mapcount of the page is not zero from the current
> > > > > log, it might mean the unmap_page() call is failed. It seems you have
> > > > > CONFIG_DEBUG_VM enabled, could you please paste more log? There is
> > > > > "VM_BUG_ON_PAGE(!unmap_success, page)" in unmap_page(). It should be
> > > > > able to tell us if unmap_page() is failed or not, or something else
> > > > > happened.
> > > >
> > > > This is the full dmesg output
> > > >
> > > > [63080.331513] huge_memory: total_mapcount: 511, page_count(): 512
> > > > [63080.332167] page:00000000d2e1a982 refcount:512 mapcount:0 mapping:0000000000000000 index:0x7fe260582 pfn:0x676a00
> > > > [63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0
> > > > [63080.332167] anon flags: 0x17ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
> > > > [63080.332167] raw: 0017ffffc009001d ffffc93cda0d0008 ffffc93cd9ab0008 ffff8f21be9f0cb9
> > > > [63080.332167] raw: 00000007fe260582 0000000000000000 00000200ffffffff ffff8f1021810000
> > > > [63080.332167] page->mem_cgroup:ffff8f1021810000
> > > > [63080.332167] page:00000000bc78ac24 refcount:512 mapcount:1 mapping:0000000000000000 index:0x7fe260584 pfn:0x676a02
> > > > [63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0
> > > > [63080.332167] anon flags: 0x17ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
> > > > [63080.332167] raw: 0017ffffc0000000 ffffc93cd9da8001 dead000000000000 ffffc93d428d0098
> > > > [63080.332167] raw: ffffa002cd183bf0 0000000000000000 0000000000000000 0000000000000000
> > > > [63080.332167] head: 0017ffffc009001d ffffc93cda0d0008 ffffc93cd9ab0008 ffff8f21be9f0cb9
> > > > [63080.332167] head: 00000007fe260582 0000000000000000 00000200ffffffff ffff8f1021810000
> > > > [63080.332167] page dumped because: total_mapcount(head) > 0
> > >
> > > Added Kirill in this loop too, he may have some insights.
> > >
> > > Thanks a lot for pasting the full log. It seems the BUG_ON in
> > > unmap_page() and VM_BUG_ON_PAGE(compound_mapcount(head), head) were
> > > not triggered. But the dumped page shows its total_mapcount is 511. It
> > > means 511 subpages of the huge page are PTE mapped. It seems all tail
> > > pages are PTE mapped. It may be because unmap_page() is failed or they
> > > are mapped again after unmap_page().
> > >
> > > But the VM_BUG_ON_PAGE just checks compound_mapcount, and it seems
> > > page_mapcount() call in unmap_page() also just checks
> > > compound_mapcount and the mapcount of the head page. If the mapcount
> > > of the head page is 0 and compound_mapcount is also 0, try_to_unmap()
> > > considers unmap is successful.
> > >
> > > So we can't tell which case it is although I don't think of how
> > > unmap_page() could fail for this case. I think we should check the
> > > total mapcount in try_to_unmap() instead.
> > >
> > > Can you please try the below debug patch (untested) to help narrow
> > > down the problem?
> > >
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index ae907a9c2050..c10e89be1c99 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -2726,7 +2726,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> > >         }
> > >
> > >         unmap_page(head);
> > > -       VM_BUG_ON_PAGE(compound_mapcount(head), head);
> > > +       VM_BUG_ON_PAGE(total_mapcount(head), head);
> > >
> > >         /* block interrupt reentry in xa_lock and spinlock */
> > >         local_irq_disable();
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index b0fc27e77d6d..537dfc557744 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> > > @@ -1777,7 +1777,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags)
> > >         else
> > >                 rmap_walk(page, &rwc);
> > >
> > > -       return !page_mapcount(page) ? true : false;
> > > +       return !total_mapcount(page) ? true : false;
> > >  }
> > >
> > >  /**
> > >
> > >
> >
> > With this patch, the problem yet not happen after 4 tests(5.10.x).
> >
> > By the way, the problem does not happen in 5.4.x.(>about 120 tests)
> > does this match the code version?
> >
> > Best Regards
> > Wang Yugui (wangyugui@e16-tech.com)
> > 2021/04/23
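
To make the point of the debug patch concrete, here is a small userspace
sketch in C. It is a toy model, not kernel code: struct toy_hugepage and
the toy_* helpers are hypothetical names invented for illustration, and
the real page_mapcount() and total_mapcount() handle more cases (hugetlb,
file pages, PageDoubleMap). It only shows how a check on the head page
can report zero mappings while the count summed over all 512 subpages is
511, which is the state in the first log.

```c
#include <stddef.h>

#define HPAGE_NR 512 /* subpages in an order-9 huge page, as in the logs */

/* Toy stand-in for a compound page; NOT the kernel's struct page. */
struct toy_hugepage {
    int compound_mapcount;          /* PMD-level mappings of the whole page */
    int subpage_mapcount[HPAGE_NR]; /* PTE-level mappings of each subpage */
};

/* Roughly what the original check saw: compound count plus head subpage. */
static int toy_head_mapcount(const struct toy_hugepage *p)
{
    return p->compound_mapcount + p->subpage_mapcount[0];
}

/* What the debug patch checks instead: mappings summed over every subpage. */
static int toy_total_mapcount(const struct toy_hugepage *p)
{
    int total = p->compound_mapcount;

    for (size_t i = 0; i < HPAGE_NR; i++)
        total += p->subpage_mapcount[i];
    return total;
}

/* State from the first report: head unmapped, every tail still PTE-mapped. */
static void toy_setup_reported_state(struct toy_hugepage *p)
{
    p->compound_mapcount = 0;
    p->subpage_mapcount[0] = 0;
    for (size_t i = 1; i < HPAGE_NR; i++)
        p->subpage_mapcount[i] = 1;
}
```

After toy_setup_reported_state(), toy_head_mapcount() returns 0 while
toy_total_mapcount() returns 511, so a success test built on the first
would pass even though no tail page was unmapped.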