From: Hillf Danton
To: Matthew Wilcox
Cc: Hugh Dickins, "Liam R. Howlett", David Hildenbrand, Sanan Hasanov, "akpm@linux-foundation.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "syzkaller@googlegroups.com", Huang Ying
Subject: Re: kernel BUG in page_add_anon_rmap
Date: Tue, 31 Jan 2023 09:16:46 +0800
Message-Id: <20230131011646.167-1-hdanton@sina.com>
References: <713c6242-be65-c212-b790-2b908627c1b4@google.com> <9d8fb9c-1b81-67cd-e55b-34517388e1ab@google.com>

On Mon, 30 Jan 2023 16:11:58 +0000 Matthew Wilcox
> On Sat, Jan 28, 2023 at 10:49:31PM -0800, Hugh Dickins wrote:
> > I guess it will turn out not to be relevant to this particular syzbug,
> > but what do we expect an mbind() of just 0x1000 of a THP to do?
> >
> > It's a subject I've wrestled with unsuccessfully in the past: I found
> > myself arriving at one conclusion (split THP) in one place, and a contrary
> > conclusion (widen range) in another place, and never had time to work out
> > one unified answer.
> >
> > So I do wonder what pte replaces the migration entry when the bug here
> > is fixed: is it a pte pointing into the THP as before, in which case
> > what was the point of "migration"? is it a Copy-On-Bind page?
> > or has the whole THP been migrated?
>
> I have an Opinion!
>
> The important thing about THP (IMO) is the Transparency part.
> Applications don't need to do anything special to get memory managed
> in larger chunks, the only difference is in performance. That is, they
> get better performance if the kernel can do it, and thinks it worthwhile.
>
> The tradeoff with THP is that we treat all memory in this 2MB chunk the
> same way; we track its dirtiness and age as a single thing (position
> on LRU, etc). That assumes we're doing no harm, or less harm than we
> would be tracking each page independently.
>
> If userspace gives us a hint like "I want this range of memory on that
> node", that's a strong signal that *this* range of memory is considered
> by userspace to be a single unit. So my opinion is that userspace is
> letting us know that we previously made a bad decision and we should
> rectify it by splitting now.

Apart from MADV_HUGEPAGE, what do you need wrt tracking THP and splitting it?