From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yang Shi
Date: Sat, 12 Oct 2024 13:05:58 -0700
Subject: Re: Possible regression with file madvise(MADV_COLLAPSE)
To: Avi Kivity
Cc: linux-mm
References: <8ac28fb858a2394cc72c3dc5924f1fd031fc6fe0.camel@scylladb.com>

On Sat, Oct 12, 2024 at 8:38 AM Avi Kivity wrote:
>
> On Fri, 2024-10-11 at 15:29 -0700, Yang Shi wrote:
> > On Wed, Oct 9, 2024 at 9:04 AM Avi Kivity wrote:
> > >
> > > On Linux 6.10.10 with CONFIG_READ_ONLY_THP_FOR_FS=y,
> > > madvise(MADV_COLLAPSE) on program text fails with EINVAL.
> > >
> > > To reproduce, compile the reproducer with
> > >
> > >     clang -g -o text-hugepage text-hugepage.c \
> > >       -fuse-ld=lld \
> > >       -Wl,-zcommon-page-size=2097152 -Wl,-zmax-page-size=2097152 \
> > >       -Wl,-z,separate-loadable-segments
> > >
> > > and run:
> >
> > Didn't clang make the page cache dirty?
> >
> > Having sync between clang and the execution made the problem go away
> > for me.
> >
>
> I see it even with sync (and msync just before the madvise calls).

Did you stop khugepaged? It may race with MADV_COLLAPSE. But if the
failure were due to a race with khugepaged, you should see -EAGAIN
rather than -EINVAL.

I ran the commands below in a loop 1000 times and it never failed (I
modified the test program slightly to print a message if MADV_COLLAPSE
returns failure). I had khugepaged stopped and ran the test on a
v6.12-rc1 kernel on my AmpereOne machine.

rm text-hugepage
clang -g -o text-hugepage text-hugepage.c -fuse-ld=lld \
  -Wl,-zcommon-page-size=2097152 -Wl,-zmax-page-size=2097152 \
  -Wl,-z,separate-loadable-segments
sync
./text-hugepage

>
>
> Tracing shows this (last lines before syscall exit):
>
> | hpage_collapse_scan_file() {
> |   __rcu_read_lock();
> |   __rcu_read_unlock();
> | }

This means collapse_file() was not called at all;
hpage_collapse_scan_file() itself failed. Several things can make it
fail, for example an unexpected refcount, a page not on the LRU, etc.
You can trace huge_memory:mm_khugepaged_scan_file to get more
information about the failure.

>
>
> so, it's not clear what the root cause is.
>
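
For reference, the kind of change mentioned above (printing a message
when MADV_COLLAPSE returns failure) could look roughly like the sketch
below. This is not the actual modification to text-hugepage.c, which
is not shown in this thread. The helper names collapse_hint() and
try_collapse() are hypothetical, and the fallback value 25 for
MADV_COLLAPSE is only there for C libraries whose headers predate the
constant.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* Linux UAPI value; fallback for older libc headers */
#endif

/*
 * Hypothetical helper: turn the errno left by a failed MADV_COLLAPSE
 * into a short hint, following the distinction drawn in this thread
 * (EAGAIN for a race with khugepaged, EINVAL for the reported failure).
 */
static const char *collapse_hint(int err)
{
	switch (err) {
	case 0:
		return "collapsed";
	case EAGAIN:
		return "transient failure, possibly a race with khugepaged; retry";
	case EINVAL:
		return "rejected; trace huge_memory:mm_khugepaged_scan_file for the reason";
	default:
		return "other failure";
	}
}

/*
 * Hypothetical helper: ask the kernel to collapse [addr, addr + len)
 * into huge pages. Returns 0 on success, or the errno value on failure
 * after printing it to stderr.
 */
static int try_collapse(void *addr, size_t len)
{
	int err;

	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;

	err = errno; /* save before calling anything that may clobber it */
	fprintf(stderr, "MADV_COLLAPSE(%p, %zu) failed: %s (%s)\n",
		addr, len, strerror(err), collapse_hint(err));
	return err;
}
```

A caller would pass the 2 MiB-aligned start of the text segment and
its length to try_collapse(), and treat a nonzero return as the
failure to report.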