Dear Andrew folks, dear Linux folks,


On a four-socket Dell PowerEdge R815 with AMD Opteron 6276 processors, Linux 5.4.14 hit the kernel bug below. We have *not* been able to reproduce it yet.

    [10834.604899] ------------[ cut here ]------------
    [10834.604906] kernel BUG at mm/vmscan.c:1740!
    [10834.604917] invalid opcode: 0000 [#1] SMP NOPTI
    [10834.609485] CPU: 46 PID: 409 Comm: kswapd3 Kdump: loaded Not tainted 5.4.14.mx64.317 #1
    [10834.617505] Hardware name: Dell Inc. PowerEdge R815/0THJFH, BIOS 3.2.2 09/15/2014
    [10834.625014] RIP: 0010:isolate_lru_pages+0x367/0x370
    [10834.629904] Code: e9 53 4d 89 f8 41 54 48 8b 4c 24 18 8b 54 24 28 8b 74 24 40 e8 4a 3b c4 00 49 8b 06 48 83 c4 18 48 85 c0 75 d0 e9 c4 fe ff ff <0f> 0b e8 42 c0 ea ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54
    [10834.648716] RSP: 0018:ffffc9000d407ae0 EFLAGS: 00010082
    [10834.653955] RAX: 00000000ffffffea RBX: ffffea00e0800008 RCX: ffff88bfdc595420
    [10834.661103] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea00e0800000
    [10834.668253] RBP: ffffc9000d407de8 R08: ffffc9000d407de8 R09: 0000000000000020
    [10834.675401] R10: 00000000f0000000 R11: 0000000000000000 R12: ffff88bfdc595420
    [10834.682549] R13: 0000000000000001 R14: 0000000000000010 R15: 0000000000000010
    [10834.689698] FS: 0000000000000000(0000) GS:ffff88bfdfb80000(0000) knlGS:0000000000000000
    [10834.697802] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [10834.703559] CR2: 000000000060d348 CR3: 0000004fd3876000 CR4: 00000000000406e0
    [10834.710709] Call Trace:
    [10834.713170] shrink_inactive_list+0x113/0x3d0
    [10834.717543] shrink_node_memcg+0x3c8/0x800
    [10834.721655] ? shrink_slab+0x295/0x2c0
    [10834.725417] ? shrink_slab+0x295/0x2c0
    [10834.729179] ? shrink_node+0xb6/0x420
    [10834.732866] shrink_node+0xb6/0x420
    [10834.736367] balance_pgdat+0x250/0x550
    [10834.740130] kswapd+0x15d/0x3f0
    [10834.743286] ? wait_woken+0x80/0x80
    [10834.746785] ? balance_pgdat+0x550/0x550
    [10834.750724] kthread+0x117/0x130
    [10834.753968] ? kthread_create_worker_on_cpu+0x70/0x70
    [10834.759039] ret_from_fork+0x22/0x40
    [10834.762627] Modules linked in: nfsv4 nfs rpcsec_gss_krb5 ext4 mbcache jbd2 8021q garp stp mrp llc input_leds led_class mgag200 drm_vram_helper ttm kvm_amd drm_kms_helper kvm drm fb_sys_fops syscopyarea sysfillrect sysimgblt ixgbe irqbypass 3w_9xxx crc32c_intel acpi_cpufreq nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables unix ipv6 nf_defrag_ipv6 autofs4
    [10834.796386] ---[ end trace 28611096f6473c90 ]---

More traces follow in the log. Please find the full log attached.

Here is the code in question from `mm/vmscan.c`:

    1699         while (scan < nr_to_scan && !list_empty(src)) {
    1700                 struct page *page;
    1701
    1702                 page = lru_to_page(src);
    1703                 prefetchw_prev_lru_page(page, src, flags);
    1704
    1705                 VM_BUG_ON_PAGE(!PageLRU(page), page);
    1706
    1707                 nr_pages = compound_nr(page);
    1708                 total_scan += nr_pages;
    1709
    1710                 if (page_zonenum(page) > sc->reclaim_idx) {
    1711                         list_move(&page->lru, &pages_skipped);
    1712                         nr_skipped[page_zonenum(page)] += nr_pages;
    1713                         continue;
    1714                 }
    1715
    1716                 /*
    1717                  * Do not count skipped pages because that makes the function
    1718                  * return with no isolated pages if the LRU mostly contains
    1719                  * ineligible pages. This causes the VM to not reclaim any
    1720                  * pages, triggering a premature OOM.
    1721                  *
    1722                  * Account all tail pages of THP. This would not cause
    1723                  * premature OOM since __isolate_lru_page() returns -EBUSY
    1724                  * only when the page is being freed somewhere else.
    1725                  */
    1726                 scan += nr_pages;
    1727                 switch (__isolate_lru_page(page, mode)) {
    1728                 case 0:
    1729                         nr_taken += nr_pages;
    1730                         nr_zone_taken[page_zonenum(page)] += nr_pages;
    1731                         list_move(&page->lru, dst);
    1732                         break;
    1733
    1734                 case -EBUSY:
    1735                         /* else it is being freed elsewhere */
    1736                         list_move(&page->lru, src);
    1737                         continue;
    1738
    1739                 default:
    1740                         BUG();
    1741                 }
    1742         }

We haven’t seen this before with Linux versions up to 4.19.57, but we also only now started using this system as a cluster node. Before, it was used interactively.

Could this be a regression from 4.19.x to 5.4?


Kind regards,

Paul
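
PS: One observation on the register dump, offered as a sketch and not as a diagnosis. If RAX at the time of the BUG() still holds the return value of `__isolate_lru_page()` (an assumption; the compiler may have reused the register), its low 32 bits, 0xffffffea, read as a signed int are -22, which is -EINVAL on Linux. That value is neither 0 nor -EBUSY, so it would land in the `default:` branch of the switch above. The helper names `decode_ret` and `hits_default_branch` below are made up for illustration and are not kernel functions:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Interpret the low 32 bits of a dumped 64-bit register as a signed int,
 * the way an int return value is read on x86-64.
 * Relies on two's-complement conversion, as done by GCC and Clang.
 */
int decode_ret(uint64_t reg)
{
	return (int)(int32_t)(uint32_t)reg;
}

/*
 * Userspace mirror of the switch in the quoted isolate_lru_pages() loop:
 * returns 1 if the value matches neither "case 0" nor "case -EBUSY",
 * i.e. it would reach the default branch containing BUG().
 */
int hits_default_branch(int ret)
{
	return ret != 0 && ret != -EBUSY;
}
```

Calling `decode_ret(0x00000000ffffffeaULL)` yields -22, and `hits_default_branch(-EINVAL)` returns 1, which would be consistent with the BUG() at line 1740 under the assumption stated above.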