Date: Mon, 30 Jan 2023 16:11:58 +0000
From: Matthew Wilcox
To: Hugh Dickins
Cc: "Liam R. Howlett", David Hildenbrand, Sanan Hasanov,
    "akpm@linux-foundation.org", "linux-mm@kvack.org",
    "linux-kernel@vger.kernel.org", "contact@pgazz.com",
    "syzkaller@googlegroups.com", Huang Ying
Subject: Re: kernel BUG in page_add_anon_rmap
References: <713c6242-be65-c212-b790-2b908627c1b4@google.com>
    <9d8fb9c-1b81-67cd-e55b-34517388e1ab@google.com>
In-Reply-To: <9d8fb9c-1b81-67cd-e55b-34517388e1ab@google.com>

On Sat, Jan 28, 2023 at 10:49:31PM -0800, Hugh Dickins wrote:
> I guess it will turn out not to be relevant to this particular syzbug,
> but what do we expect an mbind() of just 0x1000 of a THP to do?
>
> It's a subject I've wrestled with unsuccessfully in the past: I found
> myself arriving at one conclusion (split THP) in one place, and a contrary
> conclusion (widen range) in another place, and never had time to work out
> one unified answer.
>
> So I do wonder what pte replaces the migration entry when the bug here
> is fixed: is it a pte pointing into the THP as before, in which case
> what was the point of "migration"? is it a Copy-On-Bind page?
> or has the whole THP been migrated?

I have an Opinion!

The important thing about THP (IMO) is the Transparency part.
Applications don't need to do anything special to get memory managed
in larger chunks, the only difference is in performance. That is, they
get better performance if the kernel can do it, and thinks it worthwhile.

The tradeoff with THP is that we treat all memory in this 2MB chunk the
same way; we track its dirtiness and age as a single thing (position on
LRU, etc). That assumes we're doing no harm, or less harm than we would
be tracking each page independently.

If userspace gives us a hint like "I want this range of memory on that
node", that's a strong signal that *this* range of memory is considered
by userspace to be a single unit. So my opinion is that userspace is
letting us know that we previously made a bad decision and we should
rectify it by splitting now.

Zi Yan has a patch to allow pages to be split to arbitrary orders instead
of 0. We should probably give that a review so that we're not making the
opposite mistake of tracking at too fine a granularity.

> I ought to read through those "estimated mapcount" threads more
> carefully: might be relevant, but I've not paid enough attention.
I'm not sure they're relevant to this, although obviously I'd love your thoughts on how we could handle mapcount more efficiently.
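
To make the granularity tradeoff above a little more concrete, here is a toy model of the two answers Hugh mentions ("widen range" versus "split THP") for an mbind() that covers only part of a 2MB THP. This is only a sketch under stated assumptions, not kernel code and not the logic of Zi Yan's patch: it assumes 4KB base pages, an order-9 THP, a page-aligned range that lies inside a single THP, and a uniform split order. The names widen_to_thp() and largest_split_order() are hypothetical.

```python
# Toy model only: assumes 4 KiB base pages and a 2 MiB (order-9) THP.
PAGE_SHIFT = 12
PMD_ORDER = 9
PAGE_SIZE = 1 << PAGE_SHIFT
THP_SIZE = PAGE_SIZE << PMD_ORDER


def widen_to_thp(start, length):
    """'Widen range': round the requested range out to THP boundaries.

    Returns (new_start, new_length).
    """
    lo = start & ~(THP_SIZE - 1)
    hi = (start + length + THP_SIZE - 1) & ~(THP_SIZE - 1)
    return lo, hi - lo


def largest_split_order(start, length):
    """'Split THP': largest uniform order whose pieces do not straddle
    either end of the range (hypothetical policy, for illustration).

    Assumes the range is page aligned and lies within one THP.
    """
    end = start + length
    for order in range(PMD_ORDER, -1, -1):
        chunk = PAGE_SIZE << order
        if start % chunk == 0 and end % chunk == 0:
            return order
    raise ValueError("range is not page aligned")


if __name__ == "__main__":
    thp_base = 5 * THP_SIZE  # some 2 MiB-aligned address
    for off, length in [(0x1000, 0x1000), (0x10000, 0x10000)]:
        order = largest_split_order(thp_base + off, length)
        pieces = 1 << (PMD_ORDER - order)
        print(hex(off), hex(length), "-> order", order, "pieces", pieces)
```

In this model, a single 0x1000 page in the middle of a THP forces order 0 (512 pieces), whereas a 64KB range aligned to 64KB could be honoured with an order-4 split (32 pieces). That difference is the "too fine a granularity" concern in miniature.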