From: "Schmitz, Christoph"
To: "linux-mm@kvack.org"
CC: Vlastimil Babka, Mel Gorman, Nitin Gupta, Minchan Kim, Muchun Song
Subject: [BUG REPORT][ARM] Compacting glibc code pages causes random process crashes in user space (SIGILL, SIGSEGV)
Date: Thu, 14 Dec 2023 01:07:09 +0000

Hi,

I am part of a team of Linux developers at HPE who work on various
embedded boards, one of which is based on Freescale's IMX6 SOC design.
Our IMX6DL is a two core ARM solution (ARMv7 / ARM Cortex A9) which runs
at 1.2GHz. We have 1GB of main memory.

We have puzzled over this problem for many months now, so we are
desperate to develop a "clean" solution (not based on avoidance
strategies).
We are not 100% sure that the problem does not reside elsewhere, so we
must consult with the subject matter experts.

It seems that we have uncovered a race in the mm/compaction.c and
mm/migrate.c code which can cause random crashes in user space
applications rather indiscriminately and with detectable probability.

Due to heightened focus on security concerns, we were starting to
upgrade the kernel from our previous builds (based on 4.9.11 -
https://github.com/Freescale/linux-fslc/tree/4.9-2.3.x-imx)
to a much more contemporary build (based on 5.15.77 -
https://github.com/Freescale/linux-fslc/tree/5.15.x+fslc).

This is where the trouble started: During weekly regression tests, we
observed at least 2 core dumps in every run. Over many weeks, it became
apparent that these were no ordinary stability issues:

* Core dumps affected LOTS of different processes. Open source processes
  such as gawk, python or apache (normally super stable) were affected.
* Core dumps were due to either SIGSEGV (80%) or SIGILL (20%).
* Crashes tended to affect processes that were scheduled often ("CPU
  hogs") and seemed to prefer the ones that were scheduled with elevated
  priorities (e.g. corosync - nice -20). Some processes even use
  SCHED_FIFO, prio 90 at times (e.g. proprietary broadcom daemon).
* The core dumps were hard to analyze:
  - Stack content was often corrupted, making unwinding impossible.
    Including frame pointers helped a little bit, but not always.
  - SIGILL never revealed any true illegal instructions - the code
    always looked OK in the core files (and in line with what we
    compiled).
  - Crash sites were varied. The only common denominator was that they
    appeared in library code and tended to cluster around blocking
    synchronization primitives (e.g. pthread_mutex_lock,
    pthread_cond_wait, etc.)

We had noticed before that turning kernel tracing ON would aggravate the
core dumps, but tracing gave us novel insight into what was going on
right before the fatal signal was generated. We noticed that kcompactd0
was ALWAYS running right before a core dump was observed.

As an experiment, we turned compaction off (CONFIG_COMPACTION=n) and
that FIXED the issue. Our stress test (firmware upgrade) would usually
reproduce a core dump every 20 iterations or so, but now we ran 1500+
iterations without any issues. This, of course, is not recommended ...
(Avoidance Strategy 1)

We started investigating why this was never an issue in 4.9.11 (previous
kernel) before. We noticed two areas that had changed.

* New in Kernel 5.15 "proactive compaction"
* New in Kernel 4.20 "watermark_boost_factor": This feature seemed to
  always provoke a huge compaction step of order 13 (pageblock_order) in
  our architecture.

    vmscan.c balance_pgdat 4065
        wakeup_kcompactd(pgdat, pageblock_order, highest_zoneidx);
    commit 1c30844d2dfe272d58c8fc000960b835d13aa2ac

We were able to prove that setting compaction_proactiveness=0 and
watermark_boost_factor=0 would fix the issue for us as well (Avoidance
Strategy 2)

None of these features existed in 4.9.11! This explains why compaction
has never been an issue before (even though enabled in 4.9.11).

We also tried to root cause the migration process:

* The dying process always ran in parallel with kcompactd (sometimes on
  the same core (context switch), but most often on the alternate core).
* The dying process was always executing code pages in glibc which had
  been migrated just a split second and a few migration steps before.

Locking glibc memory (via mlockall()) and forbidding migration of locked
pages fixes the issue as well (Avoidance Strategy 3)

Additional experiments:

* Tried to disable core 1 temporarily (cpu_remove/cpu_add) and pin
  kcompactd0 to boot core 0.
  This is unfortunately not viable for us, since we have real time
  processes running with tight scheduling constraints (corosync).
  Running kcompactd0 exclusively for many 100s of milliseconds is not
  possible.
* Tried to invalidate cache page/TLB very explicitly - I noticed that
  for our architecture update_mmu_cache is a NOOP. Added
  flush_cache_page() ... flush_tlb_page() for each
  "remove_migration_pte" step. This did not help (possibly not a cache
  coherency issue - this was my pet theory based on
  https://gitlab.eclipse.org/eclipse/oniro-core/linux/-/commit/4774a369518091f46435e0539de6a45bf0681c74).

Any help, reply or tip would be greatly appreciated!

Christoph Schmitz
Firmware Engineer
Hewlett Packard Enterprises
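
For anyone who wants to try Avoidance Strategy 2 at runtime, the two
settings can be written through /proc/sys/vm without rebuilding the
kernel. The following is only a minimal sketch, not necessarily the
procedure used for the tests described above: the helper names
apply_vm_tunables and restore_vm_tunables and the "root" parameter are
made up for illustration, while compaction_proactiveness and
watermark_boost_factor are the tunables named in the report. Writing to
the live sysctls requires root.

```python
from pathlib import Path

# Values from Avoidance Strategy 2 in the report.
STRATEGY_2 = {
    "compaction_proactiveness": 0,
    "watermark_boost_factor": 0,
}


def apply_vm_tunables(settings, root="/proc/sys/vm"):
    """Write each tunable under `root` and return the previous values.

    `root` is a hypothetical parameter, added so the helper can be tried
    against a scratch directory instead of the live sysctl files.
    """
    previous = {}
    for name, value in settings.items():
        path = Path(root) / name
        previous[name] = path.read_text().strip()  # remember old value
        path.write_text(f"{value}\n")
    return previous


def restore_vm_tunables(previous, root="/proc/sys/vm"):
    """Write back the values returned by apply_vm_tunables()."""
    for name, value in previous.items():
        (Path(root) / name).write_text(f"{value}\n")
```

A typical use, as root, would be saved = apply_vm_tunables(STRATEGY_2)
before starting the stress test and restore_vm_tunables(saved)
afterwards. Values written this way do not survive a reboot.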