Date: Fri, 7 Jul 2023 18:26:44 +0100
From: Matthew Wilcox
To: Yin Fengwei
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, yuzhao@google.com, ryan.roberts@arm.com, shy828301@gmail.com, akpm@linux-foundation.org, david@redhat.com
Subject: Re: [RFC PATCH 0/3] support large folio for mlock
In-Reply-To: <20230707165221.4076590-1-fengwei.yin@intel.com>

On Sat, Jul 08, 2023 at 12:52:18AM +0800, Yin Fengwei wrote:
> This series identified the large folio for mlock to two types:
>   - The large folio is in VM_LOCKED VMA range
>   - The large folio cross VM_LOCKED VMA boundary

This is somewhere that I think our fixation on MUST USE PMD ENTRIES has led us astray. Today when the arguments to mlock() cross a folio boundary, we split the PMD entry but leave the folio intact.
That means that we continue to manage the folio as a single entry on the LRU list. But userspace may have no idea that we're doing this. It may have made several calls to mmap(), 256kB at a time; they've all been coalesced into a single VMA, and khugepaged has come along behind its back and created a 2MB THP. Now userspace calls mlock() and, instead of treating that as a hint that oops, maybe we shouldn't've done that, we do our utmost to preserve the 2MB folio.

I think this whole approach needs rethinking. IMO, anonymous folios should not cross VMA boundaries. Tell me why I'm wrong.
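
To make the two cases in the quoted cover letter concrete, here is a minimal userspace sketch of the address arithmetic, assuming half-open byte ranges that do not wrap around. It is not kernel code: classify_folio() and the enum names are hypothetical and exist only for illustration. It reports whether a folio lies wholly inside an mlock()ed range, straddles its boundary, or is untouched by it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of this is kernel API. */
enum folio_lock_overlap {
    FOLIO_OUTSIDE_RANGE,    /* no byte of the folio is in the locked range */
    FOLIO_WITHIN_RANGE,     /* the whole folio lies inside the locked range */
    FOLIO_CROSSES_BOUNDARY  /* the folio is partly inside, partly outside */
};

/*
 * Classify a folio covering [folio_start, folio_start + folio_size)
 * against a locked range [lock_start, lock_start + lock_len).
 * Assumes neither range wraps around the end of the address space.
 */
static enum folio_lock_overlap
classify_folio(uint64_t folio_start, uint64_t folio_size,
               uint64_t lock_start, uint64_t lock_len)
{
    uint64_t folio_end, lock_end;

    assert(folio_size > 0);

    folio_end = folio_start + folio_size;  /* exclusive end */
    lock_end = lock_start + lock_len;      /* exclusive end */

    /* An empty lock range, or two disjoint ranges, do not overlap. */
    if (lock_len == 0 || folio_end <= lock_start || lock_end <= folio_start)
        return FOLIO_OUTSIDE_RANGE;

    /* The locked range covers every byte of the folio. */
    if (lock_start <= folio_start && folio_end <= lock_end)
        return FOLIO_WITHIN_RANGE;

    /* Some, but not all, of the folio is locked. */
    return FOLIO_CROSSES_BOUNDARY;
}
```

With a 2MB folio at 0x200000, locking only its first 256kB (classify_folio(0x200000, 0x200000, 0x200000, 0x40000)) gives FOLIO_CROSSES_BOUNDARY, which is the situation described above where userspace mlock()s one of its original 256kB mappings after khugepaged has built a 2MB THP over all of them.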