From: Yang Shi
Date: Fri, 10 May 2024 14:18:45 -0700
Subject: Re: [LSF/MM/BPF TOPIC]mTHP reliable allocation and reclamation
To: Barry Song <21cnbao@gmail.com>
Cc: lsf-pc@lists.linux-foundation.org, Linux-MM

On Thu, May 9, 2024 at 7:22 PM Barry Song <21cnbao@gmail.com> wrote:
>
> Hi,
>
> I'd like to propose a session about the allocation and reclamation of
> mTHP. This is related to Yu Zhao's TAO[1] but not the same.
>
> OPPO has implemented mTHP-like large folios across thousands of
> genuine Android devices, utilizing ARM64 CONT-PTE. However, we've
> encountered challenges:
>
> - The allocation of mTHP isn't consistently reliable; even after
>   prolonged use, obtaining large folios remains uncertain.
>   For instance, following a few hours of operation, the likelihood
>   of successfully allocating large folios on a phone may decrease to
>   just 2%.
>
> - Mixing large and small folios in the same LRU list can lead to
>   mutual blocking and unpredictable latency during
>   reclamation/allocation.

I'm also curious how much large folios can improve reclamation
efficiency. Having large folios is supposed to reduce the scan time,
since there should be fewer folios on the LRU. But IIRC I haven't seen
much data or many benchmarks (particularly real-life workloads)
regarding this.

>
> For instance, if you require large folios, the LRU list's tail could
> be filled with small folios.
> LRU (LF - large folio, SF - small folio):
>
> LF - LF - LF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF
>
> You might end up reclaiming many small folios yet still struggle to
> allocate large folios. Conversely, the inverse scenario can occur when
> the LRU list's tail is populated with large folios.
>
> SF - SF - SF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF
>
> In OPPO's products, we allocate dedicated pageblocks solely for large
> folio allocation, and we've fine-tuned the LRU mechanism to support
> dual LRUs: one for small folios and another for large ones.
> Dedicated pageblocks offer a fundamental guarantee of allocating
> large folios. Additionally, segregating small and large folios into
> two LRUs ensures that both can be efficiently reclaimed for their
> respective users' requests. However, while the implementation may lack
> aesthetic appeal and is primarily tailored for product purposes, it
> isn't fully upstreamable.
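
To make the mixed-LRU scenario above concrete, here is a toy model in
Python. It is not kernel code and not OPPO's implementation. It assumes
reclaim walks the list strictly from the tail, that only a reclaimed
large folio gives back a block usable for a new large folio, and that
pages freed from small folios are scattered. The function names are
hypothetical.

```python
def scan_until_large(lru):
    """Reclaim from the tail of `lru` until one large folio has been freed.

    `lru` is a list of "LF"/"SF" strings with the tail (coldest entry) last.
    Returns (folios_scanned, small_folios_reclaimed). The small folios
    reclaimed on the way are treated as useless for a large allocation.
    """
    scanned = 0
    small_reclaimed = 0
    for folio in reversed(lru):
        scanned += 1
        if folio == "LF":
            return scanned, small_reclaimed
        small_reclaimed += 1
    # Reached the head without finding any large folio.
    return scanned, small_reclaimed


def scan_dual(lru):
    """Same request served from a separate large-folio list (dual-LRU model)."""
    large_only = [folio for folio in lru if folio == "LF"]
    return scan_until_large(large_only)


# First diagram above: three large folios at the head, small folios at the tail.
mixed = ["LF"] * 3 + ["SF"] * 13
print(scan_until_large(mixed))  # (14, 13)
print(scan_dual(mixed))         # (1, 0)
```

In this model the single mixed list gives up 13 small folios before it
reaches a large one, while the split list serves the same request after
scanning one entry. Real reclaim has many more inputs (references,
dirty state, compaction), so this only illustrates the ordering problem.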
>
> You can obtain the architectural diagram of OPPO's approach from link[2].
>
> Therefore, my plan is to present:
>
> - Introduce the architecture of OPPO's mTHP-like approach, which
>   encompasses additional optimizations we've made to address swap
>   fragmentation issues and improve swap performance, such as dual-zRAM
>   and compression/decompression of large folios [3].
>
> - Present OPPO's method of utilizing dedicated page blocks and a
>   dual-LRU system for mTHP.
>
> - Share our observations from employing Yu Zhao's TAO on Pixel 6 phones.
>
> - Discuss our future direction: are we leaning towards TAO or dedicated
>   page blocks? If we opt for page blocks, how do we plan to resolve the
>   LRU issue?
>
> [1] https://lore.kernel.org/linux-mm/20240229183436.4110845-1-yuzhao@google.com/
> [2] https://github.com/21cnbao/mTHP/blob/main/largefoliosarch.png
> [3] https://lore.kernel.org/linux-mm/20240327214816.31191-1-21cnbao@gmail.com/
>
> Thanks,
> Barry
>
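
On the scan-time question raised earlier in this reply, a
back-of-the-envelope sketch of how the number of LRU entries shrinks as
more memory sits in large folios. The 4 GiB / 4 KiB-page / 16-page-folio
numbers are assumptions chosen for illustration, not measurements, and
`lru_entries` is a hypothetical helper. Entry count is only a proxy for
scan time, since per-folio cost may differ between small and large
folios, which is why real-workload data would still be needed.

```python
def lru_entries(total_pages, large_percent, pages_per_large):
    """Count folios on the LRU when `large_percent` of pages are in large folios."""
    large_pages = total_pages * large_percent // 100
    # Only whole large folios count; the remainder stays as small folios.
    large_pages -= large_pages % pages_per_large
    small_pages = total_pages - large_pages
    return small_pages + large_pages // pages_per_large


TOTAL_PAGES = 1 << 20   # assumed: 4 GiB of 4 KiB base pages
PAGES_PER_LARGE = 16    # assumed: 64 KiB large folios

for percent in (0, 50, 100):
    print(percent, lru_entries(TOTAL_PAGES, percent, PAGES_PER_LARGE))
```

With these assumed numbers the list goes from 1048576 entries with no
large folios to 557056 at 50% and 65536 at 100%.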