From: Barry Song <21cnbao@gmail.com>
Date: Fri, 10 May 2024 14:22:02 +1200
Subject: [LSF/MM/BPF TOPIC] mTHP reliable allocation and reclamation
To: lsf-pc@lists.linux-foundation.org
Cc: Linux-MM

Hi,

I'd like to propose a session about the allocation and reclamation of
mTHP. This is related to Yu Zhao's TAO [1] but not the same.

OPPO has implemented mTHP-like large folios across thousands of genuine
Android devices, utilizing ARM64 CONT-PTE.
However, we've encountered challenges:

- The allocation of mTHP isn't reliable: after prolonged use, obtaining
  large folios becomes uncertain. For instance, after a few hours of
  operation, the likelihood of successfully allocating large folios on
  a phone may decrease to just 2%.

- Mixing large and small folios in the same LRU list can lead to mutual
  blocking and unpredictable latency during reclamation/allocation. For
  instance, if you require large folios, the LRU list's tail could be
  filled with small folios.

  LRU (LF: large folio, SF: small folio):

  LF - LF - LF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF - SF

  You might end up reclaiming many small folios yet still struggle to
  allocate large folios. Conversely, the inverse scenario can occur
  when the LRU list's tail is populated with large folios.

  SF - SF - SF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF - LF

In OPPO's products, we allocate dedicated pageblocks solely for large
folio allocation, and we've fine-tuned the LRU mechanism to support dual
LRUs: one for small folios and another for large ones. Dedicated
pageblocks offer a fundamental guarantee of allocating large folios.
Additionally, segregating small and large folios into two LRUs ensures
that both can be efficiently reclaimed for their respective users'
requests. However, the implementation lacks aesthetic appeal, is
primarily tailored for product purposes, and isn't fully upstreamable.
You can obtain the architectural diagram of OPPO's approach from
link [2].

Therefore, my plan is to:

- Introduce the architecture of OPPO's mTHP-like approach, which
  encompasses additional optimizations we've made to address swap
  fragmentation issues and improve swap performance, such as dual-zRAM
  and compression/decompression of large folios [3].
- Present OPPO's method of utilizing dedicated pageblocks and a
  dual-LRU system for mTHP.
- Share our observations from employing Yu Zhao's TAO on Pixel 6 phones.
- Discuss our future direction: are we leaning towards TAO or dedicated
  pageblocks? If we opt for pageblocks, how do we plan to resolve the
  LRU issue?

[1] https://lore.kernel.org/linux-mm/20240229183436.4110845-1-yuzhao@google.com/
[2] https://github.com/21cnbao/mTHP/blob/main/largefoliosarch.png
[3] https://lore.kernel.org/linux-mm/20240327214816.31191-1-21cnbao@gmail.com/

Thanks,
Barry
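
Appendix: the mixed-LRU example in the message can be reproduced with a
toy model. This is only an illustrative sketch, not OPPO's
implementation and not kernel code. The function name reclaim_until and
the "LF"/"SF" string encoding are hypothetical, and the model assumes
that reclaiming small folios never helps a large-folio request, which
ignores compaction and any contiguity the freed pages happen to have.

```python
from collections import deque


def reclaim_until(lru, want_kind, want_count):
    """Evict folios from the tail of `lru` until `want_count` folios of
    `want_kind` have been reclaimed. Return the total number evicted."""
    evicted = 0
    got = 0
    while lru and got < want_count:
        kind = lru.pop()  # right end of the deque = tail = coldest folio
        evicted += 1
        if kind == want_kind:
            got += 1
    return evicted


# Single mixed LRU from the message: 3 large folios at the head,
# 13 small folios at the tail.
mixed = deque(["LF"] * 3 + ["SF"] * 13)
mixed_cost = reclaim_until(mixed, "LF", 1)

# Dual LRU (assumed layout): the same folios split by size, so a
# large-folio request only scans the large-folio list.
large_lru = deque(["LF"] * 3)
small_lru = deque(["SF"] * 13)
dual_cost = reclaim_until(large_lru, "LF", 1)

print("mixed LRU evictions:", mixed_cost)
print("dual LRU evictions:", dual_cost)
```

In this model the mixed list evicts every small folio at the tail before
it reaches a large one, while the split list reaches a large folio
immediately. Swapping "LF" and "SF" gives the inverse scenario.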