From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 17 Jun 2024 16:28:39 -0700 (PDT)
From: Hugh Dickins
To: kernel test robot
cc: Zi Yan, oe-lkp@lists.linux.dev, lkp@intel.com,
    linux-kernel@vger.kernel.org, Andrew Morton, Baolin Wang,
    David Hildenbrand, "Huang, Ying", "Kirill A. Shutemov",
    Matthew Wilcox, Ryan Roberts, SeongJae Park, Yang Shi, Yin Fengwei,
    linux-mm@kvack.org
Subject: Re: [linus:master] [mm/migrate] 7262f208ca: kernel_BUG_at_mm/migrate.c
In-Reply-To: <202406171436.a30c129-oliver.sang@intel.com>
Message-ID: <8a3aa391-d3f1-200f-fa5a-82352b1cc161@google.com>
References: <202406171436.a30c129-oliver.sang@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Mon, 17 Jun 2024, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "kernel_BUG_at_mm/migrate.c" on:
>
> commit: 7262f208ca681385d133844be8a58d9b4ca185f7 ("mm/migrate: split source folio if it is on deferred split list")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [test failed on linus/master 32f88d65f01bf6f45476d7edbe675e44fb9e1d58]
> [test failed on linux-next/master 234cb065ad82915ff8d06ce01e01c3e640b674d2]
>
> in testcase: vm-scalability
> version: vm-scalability-x86_64-6f4ef16-0_20240303
> with following parameters:
>
> runtime: 300s
> size: 8T
> test: anon-cow-seq
> cpufreq_governor: performance
>
>
>
> compiler: gcc-13
> test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot
> | Closes: https://lore.kernel.org/oe-lkp/202406171436.a30c129-oliver.sang@intel.com
>
>
> [ 84.214952][ T6581] ------------[ cut here ]------------
> [ 84.219158][ T1289] 781916337 bytes / 1533701 usecs = 497874 KB/s
> [ 84.219928][ T6581] kernel BUG at mm/migrate.c:2634!
> [ 84.225273][ T1289]
> [ 84.226742][ T1289] 781916337 bytes / 1533702 usecs = 497873 KB/s
> [ 84.231379][ T6581] invalid opcode: 0000 [#1] SMP NOPTI
> [ 84.236360][ T1289]
> [ 84.238534][ T6581] CPU: 15 PID: 6581 Comm: usemem Tainted: G S 6.9.0-rc4-00136-g7262f208ca68 #1
> [ 84.238538][ T6581] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.0D.01.0286.011120190816 01/11/2019
> [ 84.246187][ T1289] 781916337 bytes / 1533701 usecs = 497874 KB/s
> [ 84.249854][ T6581] RIP: 0010:migrate_misplaced_folio (mm/migrate.c:2634 (discriminator 1))
> [ 84.252050][ T1289]
> [ 84.262214][ T6581] Code: a0 b4 1b 83 e8 a8 23 f6 ff 48 89 df e8 a0 3f f5 ff 45 31 e4 8b 44 24 1c 85 c0 75 10 48 8b 44 24 20 48 39 e8 0f 84 27 fe ff ff <0f> 0b 41 89 c5 65 4c 01 2d ba 1d bf 7e 48 8b 3b 48 c1 ef 36 e8 2e
> All code
> ========
> 0: a0 b4 1b 83 e8 a8 23 movabs 0xfff623a8e8831bb4,%al
> 7: f6 ff
> 9: 48 89 df mov %rbx,%rdi
> c: e8 a0 3f f5 ff callq 0xfffffffffff53fb1
> 11: 45 31 e4 xor %r12d,%r12d
> 14: 8b 44 24 1c mov 0x1c(%rsp),%eax
> 18: 85 c0 test %eax,%eax
> 1a: 75 10 jne 0x2c
> 1c: 48 8b 44 24 20 mov 0x20(%rsp),%rax
> 21: 48 39 e8 cmp %rbp,%rax
> 24: 0f 84 27 fe ff ff je 0xfffffffffffffe51
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: 41 89 c5 mov %eax,%r13d
> 2f: 65 4c 01 2d ba 1d bf add %r13,%gs:0x7ebf1dba(%rip) # 0x7ebf1df1
> 36: 7e
> 37: 48 8b 3b mov (%rbx),%rdi
> 3a: 48 c1 ef 36 shr $0x36,%rdi
> 3e: e8 .byte 0xe8
> 3f: 2e cs
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: 41 89 c5 mov %eax,%r13d
> 5: 65 4c 01 2d ba 1d bf add %r13,%gs:0x7ebf1dba(%rip) # 0x7ebf1dc7
> c: 7e
> d: 48 8b 3b mov (%rbx),%rdi
> 10: 48 c1 ef 36 shr $0x36,%rdi
> 14: e8 .byte 0xe8
> 15: 2e cs
> [ 84.262217][ T6581] RSP: 0000:ffffc9002080fd08 EFLAGS: 00010206
> [ 84.262221][ T6581] RAX: ffffea01487467c8 RBX: ffffea0148740000 RCX: 0000000000000000
> [ 84.262223][ T6581] RDX: 000000000000027f RSI: 00000000000001ff RDI: 0000000000000001
> [ 84.262225][ T6581] RBP: ffffc9002080fd28 R08: 0000000000000000 R09: 0000000000000001
> [ 84.262226][ T6581] R10: 000000000000080c R11: 0000000000000000 R12: 0000000000000001
> [ 84.274946][ T1289] 781916337 bytes / 1533699 usecs = 497874 KB/s
> [ 84.279439][ T6581] R13: 00000000000001ff R14: 0000000000000200 R15: ffff88907ffd5000
> [ 84.279441][ T6581] FS: 00007f1b11213740(0000) GS:ffff88903f9c0000(0000) knlGS:0000000000000000
> [ 84.285537][ T1289]
> [ 84.287725][ T6581] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 84.287728][ T6581] CR2: 00007f1b0fe00000 CR3: 0000005f54c9e003 CR4: 00000000007706f0
> [ 84.287730][ T6581] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 84.287731][ T6581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 84.308684][ T1289] 781916337 bytes / 1533744 usecs = 497860 KB/s
> [ 84.313094][ T6581] PKRU: 55555554
> [ 84.313097][ T6581] Call Trace:
> [ 84.313100][ T6581]
> [ 84.320929][ T1289]
> [ 84.328756][ T6581] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
> [ 84.337626][ T1289] 781916337 bytes / 1533702 usecs = 497873 KB/s
> [ 84.344414][ T6581] ? do_trap (arch/x86/kernel/traps.c:114 arch/x86/kernel/traps.c:155)
> [ 84.350509][ T1289]
> [ 84.358331][ T6581] ? migrate_misplaced_folio (mm/migrate.c:2634 (discriminator 1))
> [ 84.368158][ T1289] 781916337 bytes / 1533806 usecs = 497840 KB/s
> [ 84.369304][ T6581] ? do_error_trap (arch/x86/include/asm/traps.h:58 arch/x86/kernel/traps.c:176)
> [ 84.369306][ T6581] ? migrate_misplaced_folio (mm/migrate.c:2634 (discriminator 1))
> [ 84.375745][ T1289]
> [ 84.383566][ T6581] ? exc_invalid_op (arch/x86/kernel/traps.c:267)
> [ 84.392383][ T1289] 781916337 bytes / 1537066 usecs = 496784 KB/s
> [ 84.399219][ T6581] ? migrate_misplaced_folio (mm/migrate.c:2634 (discriminator 1))
> [ 84.399222][ T6581] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
> [ 84.405314][ T1289]
> [ 84.408718][ T6581] ? migrate_misplaced_folio (mm/migrate.c:2634 (discriminator 1))
> [ 84.408721][ T6581] ? migrate_misplaced_folio (mm/migrate.c:2630 (discriminator 2))
> [ 84.412861][ T1289] 781916337 bytes / 1533702 usecs = 497873 KB/s
> [ 84.414661][ T6581] do_huge_pmd_numa_page (mm/huge_memory.c:1759)
> [ 84.416855][ T1289]
> [ 84.420436][ T6581] __handle_mm_fault (mm/memory.c:5429)
> [ 84.427485][ T1289] 781916337 bytes / 1545381 usecs = 494111 KB/s
> [ 84.430542][ T6581] handle_mm_fault (mm/memory.c:5608)
> [ 84.432735][ T1289]
> [ 84.438220][ T6581] do_user_addr_fault (arch/x86/mm/fault.c:1364)
> [ 84.445291][ T1289] 781916337 bytes / 1545380 usecs = 494111 KB/s
> [ 84.448765][ T6581] exc_page_fault (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:72 arch/x86/mm/fault.c:1514 arch/x86/mm/fault.c:1564)
> [ 84.454252][ T1289]
> [ 84.456445][ T6581] asm_exc_page_fault (arch/x86/include/asm/idtentry.h:623)
> [ 84.456448][ T6581] RIP: 0033:0x561f6c00dad4
> [ 84.461953][ T1289] 781916337 bytes / 1545382 usecs = 494110 KB/s
> [ 84.467077][ T6581] Code: 01 00 00 00 e8 0d f9 ff ff 89 c7 e8 6c ff ff ff bf 00 00 00 00 e8 fc f8 ff ff 85 d2 74 08 48 8d 04 f7 48 8b 00 c3 48 8d 04 f7 <48> 89 30 b8 00 00 00 00 c3 41 54 55 53 48 85 ff 0f 84 23 01 00 00
> All code
> ========
> 0: 01 00 add %eax,(%rax)
> 2: 00 00 add %al,(%rax)
> 4: e8 0d f9 ff ff callq 0xfffffffffffff916
> 9: 89 c7 mov %eax,%edi
> b: e8 6c ff ff ff callq 0xffffffffffffff7c
> 10: bf 00 00 00 00 mov $0x0,%edi
> 15: e8 fc f8 ff ff callq 0xfffffffffffff916
> 1a: 85 d2 test %edx,%edx
> 1c: 74 08 je 0x26
> 1e: 48 8d 04 f7 lea (%rdi,%rsi,8),%rax
> 22: 48 8b 00 mov (%rax),%rax
> 25: c3 retq
> 26: 48 8d 04 f7 lea (%rdi,%rsi,8),%rax
> 2a:* 48 89 30 mov %rsi,(%rax) <-- trapping instruction
> 2d: b8 00 00 00 00 mov $0x0,%eax
> 32: c3 retq
> 33: 41 54 push %r12
> 35: 55 push %rbp
> 36: 53 push %rbx
> 37: 48 85 ff test %rdi,%rdi
> 3a: 0f 84 23 01 00 00 je 0x163
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 89 30 mov %rsi,(%rax)
> 3: b8 00 00 00 00 mov $0x0,%eax
> 8: c3 retq
> 9: 41 54 push %r12
> b: 55 push %rbp
> c: 53 push %rbx
> d: 48 85 ff test %rdi,%rdi
> 10: 0f 84 23 01 00 00 je 0x139
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240617/202406171436.a30c129-oliver.sang@intel.com
>
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

7262f208ca68:mm/migrate.c line 2634 is BUG_ON(!list_empty(&migratepages)):
in a different caller of migrate_pages() than the VM_BUG_ON I hit, but
8e279f970b5c ("mm/migrate: fix kernel BUG at mm/compaction.c:2761!"),
which went into Linus's tree today, is very likely to fix it.

Hugh