From: Barry Song <21cnbao@gmail.com>
Date: Tue, 30 Jul 2024 08:03:05 +1200
Subject: Re: [PATCH v5 3/4] mm: support large folios swapin as a whole for zRAM-like swapfile
To: Matthew Wilcox
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, ying.huang@intel.com,
 baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com,
 hannes@cmpxchg.org, hughd@google.com, kaleshsingh@google.com,
 kasong@tencent.com, linux-kernel@vger.kernel.org, mhocko@suse.com,
 minchan@kernel.org, nphamcs@gmail.com, ryan.roberts@arm.com,
 senozhatsky@chromium.org, shakeel.butt@linux.dev, shy828301@gmail.com,
 surenb@google.com, v-songbaohua@oppo.com, xiang@kernel.org,
 yosryahmed@google.com, Chuanhua Han
References: <20240726094618.401593-1-21cnbao@gmail.com> <20240726094618.401593-4-21cnbao@gmail.com>

On Tue, Jul 30, 2024 at 3:13 AM Matthew Wilcox wrote:
>
> On Tue, Jul 30, 2024 at 01:11:31AM +1200, Barry Song wrote:
> > for this zRAM case, it is a new allocated large folio, only
> > while all conditions are met, we will allocate and map
> > the whole folio. you can check can_swapin_thp() and
> > thp_swap_suitable_orders().
>
> YOU ARE DOING THIS WRONGLY!
>
> All of you anonymous memory people are utterly fixated on TLBs AND THIS
> IS WRONG. Yes, TLB performance is important, particularly with crappy
> ARM designs, which I know a lot of you are paid to work on. But you
> seem to think this is the only consideration, and you're making bad
> design choices as a result. It's overly complicated, and you're leaving
> performance on the table.
>
> Look back at the results Ryan showed in the early days of working on
> large anonymous folios. Half of the performance win on his system came
> from using larger TLBs. But the other half came from _reduced software
> overhead_. The LRU lock is a huge problem, and using large folios cuts
> the length of the LRU list, hence LRU lock hold time.
>
> Your _own_ data on how hard it is to get hold of a large folio due to
> fragmentation should be enough to convince you that the more large folios
> in the system, the better the whole system runs. We should not decline to
> allocate large folios just because they can't be mapped with a single TLB!

I am not convinced. For a newly allocated large folio, even
alloc_anon_folio() in do_anonymous_page() does exactly the same thing:

alloc_anon_folio()
{
	/*
	 * Get a list of all the (large) orders below PMD_ORDER that are enabled
	 * for this vma. Then filter out the orders that can't be allocated over
	 * the faulting address and still be fully contained in the vma.
	 */
	orders = thp_vma_allowable_orders(vma, vma->vm_flags,
			TVA_IN_PF | TVA_ENFORCE_SYSFS, BIT(PMD_ORDER) - 1);
	orders = thp_vma_suitable_orders(vma, vmf->address, orders);
}

You are not going to allocate an mTHP for an unaligned address on a new
page fault. Please point out where this is wrong.

Thanks
Barry
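
For illustration only, here is a rough userspace model of the filtering
that the comment in alloc_anon_folio() describes: drop every order whose
naturally aligned block around the faulting address is not fully contained
in the vma. This is a simplified sketch, not the kernel code. struct range,
order_fits(), suitable_orders(), MAX_ORDER_SKETCH and the 4 KiB page size
are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT		12UL	/* assumption: 4 KiB base pages */
#define PAGE_SIZE		(1UL << PAGE_SHIFT)
#define MAX_ORDER_SKETCH	9	/* assumption: only orders 0..8 are considered */

/* Hypothetical stand-in for the [vm_start, vm_end) range of a vma. */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * True if the naturally aligned block of (PAGE_SIZE << order) bytes that
 * contains addr lies entirely inside r.
 */
static bool order_fits(const struct range *r, unsigned long addr, int order)
{
	unsigned long size = PAGE_SIZE << order;
	unsigned long base = addr & ~(size - 1);	/* round down to block size */

	assert(r->start < r->end);
	return base >= r->start && base + size <= r->end;
}

/*
 * Clear orders from the bitmask, highest first, until one fits. A smaller
 * aligned block is nested inside the larger one, so once an order fits,
 * every lower order fits too and the loop can stop.
 */
static unsigned long suitable_orders(const struct range *r, unsigned long addr,
				     unsigned long orders)
{
	int order;

	orders &= (1UL << MAX_ORDER_SKETCH) - 1;	/* ignore out-of-range bits */

	for (order = MAX_ORDER_SKETCH - 1; order >= 0; order--) {
		if (!(orders & (1UL << order)))
			continue;
		if (order_fits(r, addr, order))
			break;
		orders &= ~(1UL << order);
	}
	return orders;
}
```

In this model, with orders 2..5 enabled (mask 0x3C) and a fault at 0x14000,
a range of [0x10000, 0x20000) keeps orders 2..4 (0x1C), while the same-sized
range shifted by one page, [0x11000, 0x21000), keeps only order 2 (0x04).
A fault at 0x11000 in the shifted range keeps no large order at all.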