From: Pasha Tatashin
Date: Tue, 21 Feb 2023 12:20:38 -0500
Subject: Re: [LSF/MM/BPF TOPIC] Single Owner Memory
To: Matthew Wilcox
Cc: lsf-pc@lists.linux-foundation.org, linux-mm

On Tue, Feb 21, 2023 at 10:55 AM Matthew Wilcox wrote:
>
> On Mon, Feb 20, 2023 at 02:10:24PM -0500, Pasha Tatashin wrote:
> > The discussion should include the following topics:
> > - Interaction with folio and the proposed struct page {memdesc}.
> > - Handling for migrate_pages() and friends.
> > - Handling for FOLL_PIN and FOLL_LONGTERM.
> > - What type of madvise() properties the som memory should handle
>
> Something I didn't see covered was how you'd want to handle memory
> pressure. The answer for memdescs is that we'd treat each userspace

Indeed, this is something that should be covered. I had a few thoughts
about that, but it needs more work. Some possibilities:

1. Under memory pressure, we can migrate pages to normal memory, which
   would make that memory swappable, etc.
2. Teach in-memory compression such as zswap/zram to work directly
   with /dev/som.

> allocation as a single object; if you allocate a 256kB folio, that has
> one accessed bit (set every time any of the PTEs which reference that
> folio is accessed), one dirty bit, is aged on the LRU as a single unit
> and will be written to swap as a single unit.
>
> Assuming we're dealing with objects smaller than PMDs, we have a number
> of PTEs each of which has its own A and D bits, so we can determine
> at each revolution of the LRU clock whether it still makes sense to be
> treating the folio as a single unit, or whether pages in the first half
> of the folio are no longer being accessed and we should split the folio
> in half and age the two halves separately.

Interesting

>
> All of that is still theoretical; we don't allocate anon memory in sizes
> other than PAGE_SIZE and PMD size. And we don't track page cache A and
> D bits to see whether the decision to allocate a particular page size was
> the right one (most page cache memory is never mapped into userspace, so
> it might be of limited value, but I'm sure we could track the equivalent
> information with read() and write()).

Thanks,
Pasha
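
For illustration only, the split heuristic quoted above (look at the per-PTE A bits once per LRU clock revolution and split the folio when one half has gone idle) could be modelled roughly as below. This is a toy userspace sketch, not kernel code: the function name should_split(), the list-of-booleans representation of the A bits, and the "exactly one half idle" policy are all assumptions made for the example, as is the 4 KiB base page size used to get 64 PTEs for a 256kB folio.

```python
from typing import Sequence


def should_split(accessed_bits: Sequence[bool]) -> bool:
    """Decide whether a folio should be split in half on this LRU pass.

    accessed_bits is a hypothetical snapshot of the per-PTE accessed (A)
    bits of one folio, one entry per base page, in address order.

    Assumed policy: split only when exactly one half was touched since
    the previous pass. If both halves are hot, or both are cold, the
    folio keeps being aged as a single unit.
    """
    mid = len(accessed_bits) // 2
    if mid == 0:
        # A single-page (or empty) folio cannot be split further.
        return False
    first_half_hot = any(accessed_bits[:mid])
    second_half_hot = any(accessed_bits[mid:])
    return first_half_hot != second_half_hot


if __name__ == "__main__":
    # 256kB folio with an assumed 4 KiB base page size: 64 PTEs.
    idle_first_half = [False] * 32 + [True] * 32
    print(should_split(idle_first_half))
    print(should_split([True] * 64))
```

A real implementation would also have to clear the A bits after each pass and consider the D bits; the sketch covers only the keep-or-split decision.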