References: <6f65e3a6-5f1a-4fda-b406-17598f4a72d5@leemhuis.info> <02D2DA66-4A91-4033-8B98-ED25FC2E0CD6@gmail.com>
In-Reply-To: <02D2DA66-4A91-4033-8B98-ED25FC2E0CD6@gmail.com>
From: Nhat Pham
Date: Fri, 23 Aug 2024 12:16:53 -0400
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
To: Piotr Oniszczuk
Cc: Matthew Wilcox, Linux regressions mailing list, LKML, Johannes Weiner, Yosry Ahmed, Linux-MM

On Fri, Aug 23, 2024 at 11:07 AM Piotr Oniszczuk wrote:
>
> > Message written by Matthew Wilcox on 23.08.2024, at 15:13:
> >
> > I wouldn't be surprised if this were dodgy ram.
>
> Well - that was my initial hypothesis.
>
> in fact i had few of them. Ranked (and ordered) like this:
> 1. downstream kernel patches
> 2. hw (ram) issue
> 3. kernel bug
>
> So full history was:
> -build myself archlinux 6.10.2 kernel; upgrade builder OS (only kernel; nothing else)
> -run normal devel process and (to my surprise) discover interrupted CI/CD builds by kernel oops
> -downgrade to 6.8.2 and done 4 full builds (full takes 8..9h of constant 12c/24/t compile). all good.
> -prepare vanilla 6.10.6 (to exclude potential downstream (ArchLinux) root causes)
> -run normal devel process and still discover oops
> -make sure hw is ok by week of test with 6.8.2 (recompiling for 3 architectures on 4 OS (3 in kvm). This was almost 5 full days of 12c/24 compiling. All good
> -because last steep was all good - decide to go to you :-)
>
> sure - this is possible that 6.8.2 had luck with my ram and 6.10.6 had no luck….but i personally don’t believe this is a case….

Have you tried with 6.9 yet? IIRC, there are two major changes to
zswap architecture in recent versions.

1. In 6.9, we range-partition zswap's rbtrees to reduce lock contention.

2. In 6.10, we replace zswap's rbtrees with xarrays.

If 6.9 is fine, then the latter is the suspect, and vice versa. Of
course, the minor changes are still suspect - but you get the idea :)

>
> btw: we can go with elimination strategy.
> So what i need to change/disable to be closer to finding root cause?

Could you let me know more about the setup? A couple of things come to mind:

1. zswap configs (allocator - is it zsmalloc? compressor?)

2. Is mTHP enabled? mTHP swapout was merged in 6.10, and there seem
to be some conflicts with zswap, but Yosry will know more about this
than me...

3. Is there any proprietary driver etc.?

> swap?
> now it is swapfile on system nvme
>
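
For reference, the setup details asked about above (zswap configs, mTHP
state, proprietary drivers, swap backing) could be gathered in one go with
a short script. This is only a rough sketch, not something from the thread:
it assumes the usual locations /sys/module/zswap/parameters,
/sys/kernel/mm/transparent_hugepage/hugepages-*kB/enabled,
/proc/sys/kernel/tainted and /proc/swaps, and the function names are made
up for illustration.

```python
from pathlib import Path


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def zswap_params(sys_root="/sys"):
    """Map each zswap module parameter (compressor, zpool, ...) to its value."""
    base = Path(sys_root) / "module/zswap/parameters"
    if not base.is_dir():
        return {}
    return {p.name: read_text(p) for p in sorted(base.iterdir()) if p.is_file()}


def mthp_settings(sys_root="/sys"):
    """Map each per-size mTHP directory name to its 'enabled' setting."""
    base = Path(sys_root) / "kernel/mm/transparent_hugepage"
    return {p.parent.name: read_text(p)
            for p in sorted(base.glob("hugepages-*kB/enabled"))}


def taint_flags(value):
    """Decode the two taint bits relevant to question 3 (module origin)."""
    flags = []
    if value & (1 << 0):    # bit 0: proprietary module was loaded
        flags.append("proprietary module loaded")
    if value & (1 << 12):   # bit 12: externally built (out-of-tree) module
        flags.append("out-of-tree module loaded")
    return flags


def collect(sys_root="/sys", proc_root="/proc"):
    """Gather everything into one dict that can be pasted into a reply."""
    tainted = read_text(Path(proc_root) / "sys/kernel/tainted")
    return {
        "zswap": zswap_params(sys_root),
        "mthp": mthp_settings(sys_root),
        "taint": taint_flags(int(tainted)) if tainted and tainted.isdigit() else None,
        "swaps": read_text(Path(proc_root) / "swaps"),
    }


if __name__ == "__main__":
    for key, value in collect().items():
        print(f"{key}: {value}")
```

Running it on the affected machine and pasting the output into a reply
should cover questions 1-3 plus the swap device in one message. On a kernel
without zswap or per-size mTHP knobs, the corresponding entries just come
back empty.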