Message-ID: <04a11431e9edffb85470da3611c287cb7caf3281.camel@scylladb.com>
Subject: Re: Possible regression with file madvise(MADV_COLLAPSE)
From: Avi Kivity
To: Yang Shi
Cc: linux-mm
Date: Sat, 12 Oct 2024 23:24:45 +0300
References: <8ac28fb858a2394cc72c3dc5924f1fd031fc6fe0.camel@scylladb.com>

On Sat, 2024-10-12 at 13:05 -0700, Yang Shi wrote:
> On Sat, Oct 12, 2024 at 8:38 AM Avi Kivity wrote:
> >
> > On Fri, 2024-10-11 at 15:29 -0700, Yang Shi wrote:
> > > On Wed, Oct 9, 2024 at 9:04 AM Avi Kivity wrote:
> > > >
> > > > On Linux 6.10.10 with CONFIG_READ_ONLY_THP_FOR_FS=y,
> > > > madvise(MADV_COLLAPSE) on program text fails with EINVAL.
> > > >
> > > > To reproduce, compile the reproducer with
> > > >
> > > > clang -g -o text-hugepage text-hugepage.c \
> > > >         -fuse-ld=lld \
> > > >         -Wl,-zcommon-page-size=2097152 -Wl,-zmax-page-size=2097152 \
> > > >         -Wl,-z,separate-loadable-segments
> > > >
> > > > and run:
> > >
> > > Didn't clang make the page cache dirty?
> > >
> > > Having sync between clang and the execution made the problem go
> > > away for me.
> > >
> >
> > I see it even with sync (and msync just before the madvise calls).
>
> Did you stop khugepaged? It may race with MADV_COLLAPSE. If it failed
> due to race with khugepaged, you should see -EAGAIN instead of
> -EINVAL.

I did not, but I don't imagine I hit the race in all my attempts.

>
> I did the below commands in a loop for 1000 times, it never failed (I
> modified the test program a little bit to print out failure if
> MADV_COLLAPSE returns failure). I had khugepaged stopped and ran the
> test on v6.12-rc1 kernel on my AmpereOne machine.
>
> rm text-hugepage
> clang -g -o text-hugepage text-hugepage.c -fuse-ld=lld
> -Wl,-zcommon-page-size=2097152 -Wl,-zmax-page-size=2097152
> -Wl,-z,separate-loadable-segments
> sync
> ./text-hugepage
>
> >
> >
> > Tracing shows this (last lines before syscall exit):
> >
> > >          hpage_collapse_scan_file() {
> > >            __rcu_read_lock();
> > >            __rcu_read_unlock();
> > >          }
>
> It meant collapse_file() was not called at all.
> hpage_collapse_scan_file() failed. A couple of reasons may fail it,
> for example, refcount is not expected, not on lru, etc. You can trace
> huge_memory:mm_khugepaged_scan_file to get more information about the
> failure.

text-hugepage-689146 [023] 200457.073794: mm_khugepaged_scan_file: mm=0xffff92fc512aac00, scan_pfn=0x5a4310, filename=text-hugepage, present=0, swap=0, result=page_compound
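
For anyone repeating the test, here is a rough sketch of how a reproducer might report the MADV_COLLAPSE result so that -EAGAIN and -EINVAL are easy to tell apart, as discussed above. It is not the actual text-hugepage.c (which is not shown in this thread). The helper names, the -ERANGE return for an empty range, and the hint strings are made up for illustration, and the fallback value 25 for MADV_COLLAPSE is the asm-generic one, so it should be checked against <linux/mman.h> on the target architecture.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Fallback for older libc headers. 25 is the asm-generic value
 * (assumption for this sketch; verify for your architecture). */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* 2 MiB, matching -Wl,-zmax-page-size=2097152 in the build command. */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/* Round an address down to a huge-page boundary. */
uintptr_t hpage_align_down(uintptr_t addr)
{
    return addr & ~(HPAGE_SIZE - 1);
}

/* Round an address up to a huge-page boundary. */
uintptr_t hpage_align_up(uintptr_t addr)
{
    return (addr + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

/*
 * Hypothetical helper: collapse the huge-page-aligned part of [start, end).
 * Returns 0 on success, the negated errno from madvise() on failure, or
 * -ERANGE (this sketch's own convention) if the range holds no full
 * aligned huge page.
 */
int try_collapse(void *start, void *end)
{
    uintptr_t lo = hpage_align_up((uintptr_t)start);
    uintptr_t hi = hpage_align_down((uintptr_t)end);

    if (lo >= hi)
        return -ERANGE;
    if (madvise((void *)lo, hi - lo, MADV_COLLAPSE) != 0)
        return -errno;
    return 0;
}

/* Map a try_collapse() result to a short hint, following the thread. */
const char *collapse_hint(int ret)
{
    switch (ret) {
    case 0:
        return "collapsed";
    case -EAGAIN:
        return "transient failure, possibly a race with khugepaged";
    case -EINVAL:
        return "rejected; trace huge_memory:mm_khugepaged_scan_file";
    default:
        return strerror(-ret);
    }
}
```

A reproducer would call try_collapse() on the bounds of its text mapping and print collapse_hint() of the result. With the 2 MiB segment alignment from the link flags above, the start of the text segment should already be aligned, so only the tail is trimmed.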