Date: Tue, 21 Feb 2023 15:55:21 +0000
From: Matthew Wilcox
To: Pasha Tatashin
Cc: lsf-pc@lists.linux-foundation.org, linux-mm
Subject: Re: [LSF/MM/BPF TOPIC] Single Owner Memory

On Mon, Feb 20, 2023 at 02:10:24PM -0500, Pasha Tatashin wrote:
> The discussion should include the following topics:
> - Interaction with folio and the proposed struct page {memdesc}.
> - Handling for migrate_pages() and friends.
> - Handling for FOLL_PIN and FOLL_LONGTERM.
> - What type of madvise() properties the som memory should handle

Something I didn't see covered was how you'd want to handle memory
pressure.

The answer for memdescs is that we'd treat each userspace allocation as
a single object; if you allocate a 256kB folio, that has one accessed
bit (set every time any of the PTEs which reference that folio is
accessed), one dirty bit, is aged on the LRU as a single unit and will
be written to swap as a single unit.

Assuming we're dealing with objects smaller than PMDs, we have a number
of PTEs each of which has its own A and D bits, so we can determine at
each revolution of the LRU clock whether it still makes sense to be
treating the folio as a single unit, or whether pages in the first half
of the folio are no longer being accessed and we should split the folio
in half and age the two halves separately.

All of that is still theoretical; we don't allocate anon memory in sizes
other than PAGE_SIZE and PMD size.  And we don't track page cache A and
D bits to see whether the decision to allocate a particular page size
was the right one (most page cache memory is never mapped into
userspace, so it might be of limited value, but I'm sure we could track
the equivalent information with read() and write()).
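
To make the split-or-keep decision above concrete, here is a rough
userspace sketch in C. It is not kernel code and does not reflect any
existing implementation: the enum, the function names, and the bool
array standing in for the per-PTE A bits are all hypothetical, and the
"compare the two halves" policy is just the simplest reading of the
description. With 4kB pages (an assumption), a 256kB folio would be
described by 64 entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of one LRU-clock revolution over a folio. */
enum folio_aging_decision {
    FOLIO_KEEP_WHOLE,   /* both halves were touched: keep aging as one unit */
    FOLIO_SPLIT_HALVES, /* only one half was touched: age halves separately */
    FOLIO_ALL_IDLE,     /* nothing was touched: the whole folio is cold */
};

/* Count the set accessed bits in pte_accessed[start, end). */
static size_t count_accessed(const bool *pte_accessed, size_t start, size_t end)
{
    size_t n = 0;

    for (size_t i = start; i < end; i++)
        n += pte_accessed[i] ? 1 : 0;
    return n;
}

/*
 * The folio-wide accessed bit: set if any PTE referencing the folio
 * has its A bit set.
 */
bool folio_accessed(const bool *pte_accessed, size_t nr_ptes)
{
    return count_accessed(pte_accessed, 0, nr_ptes) > 0;
}

/*
 * Decide whether the folio should still be treated as a single unit.
 * pte_accessed[i] stands in for the A bit of the PTE mapping page i.
 */
enum folio_aging_decision
folio_aging_decide(const bool *pte_accessed, size_t nr_ptes)
{
    assert(nr_ptes >= 2 && nr_ptes % 2 == 0);

    size_t half = nr_ptes / 2;
    size_t first = count_accessed(pte_accessed, 0, half);
    size_t second = count_accessed(pte_accessed, half, nr_ptes);

    if (first == 0 && second == 0)
        return FOLIO_ALL_IDLE;
    if (first == 0 || second == 0)
        return FOLIO_SPLIT_HALVES;
    return FOLIO_KEEP_WHOLE;
}
```

A real policy would presumably want some hysteresis (several idle
revolutions before splitting) and would look at D bits as well; the
sketch only shows where the per-PTE information enters the decision.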