From: Yu Zhao
Date: Wed, 6 Apr 2022 21:04:27 -0600
Subject: Re: [PATCH v9 07/14] mm: multi-gen LRU: exploit locality in rmap
To: Barry Song <21cnbao@gmail.com>
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Aneesh Kumar, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes, Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman, Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel, Vlastimil Babka, Will Deacon, Ying Huang, LAK, Linux Doc Mailing List, LKML, Linux-MM, Kernel Page Reclaim v2, x86, Brian Geffon, Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett, Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain
References: <20220309021230.721028-1-yuzhao@google.com> <20220309021230.721028-8-yuzhao@google.com>

On Wed, Apr 6, 2022 at 8:29 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Wed, Mar 9, 2022 at 3:48 PM Yu Zhao wrote:
> >
> > Searching the rmap for PTEs mapping each page on an LRU list (to test
> > and clear the accessed bit) can be expensive because pages from
> > different VMAs (PA space) are not cache friendly to the rmap (VA
> > space). For workloads mostly using mapped pages, the rmap has a high
> > CPU cost in the reclaim path.
> >
> > This patch exploits spatial locality to reduce the trips into the
> > rmap. When shrink_page_list() walks the rmap and finds a young PTE, a
> > new function lru_gen_look_around() scans at most BITS_PER_LONG-1
> > adjacent PTEs. On finding another young PTE, it clears the accessed
> > bit and updates the gen counter of the page mapped by this PTE to
> > (max_seq%MAX_NR_GENS)+1.
>
> Hi Yu,
> It seems an interesting feature to save the cost of rmap. but will it lead to
> possible judging of cold pages as hot pages?
> In case a page is mapped by 20 processes, and it has been accessed
> by 5 of them, when we look around one of the 5 processes, the page
> will be young and this pte is cleared. but we still have 4 ptes which are not
> cleared. then we don't access the page for a long time, but the 4 uncleared
> PTEs will still make the page "hot" since they are not cleared, we will find
> the page is hot either due to look-arounding the 4 processes or rmapping
> the page later?

Why would the remaining 4 accessed PTEs be skipped? The rmap should
check all 20 PTEs.

Even if they were skipped, it wouldn't matter. The same argument could
be made for the rest of the 1 million minus 1 pages that have been
timely scanned, on a 4GB laptop. The fundamental principle (assumption)
of MGLRU is never about making the best choices. Nothing can, because
it's impossible to predict the future that well given the complexity
of today's workloads: not on a phone, and definitely not on a server
that runs mixed types of workloads.

The primary goal is to avoid the worst choices at a minimum (scanning)
cost. The second goal is to pick good ones at an acceptable cost, and
those are probably half of all possible choices.
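
To make the first point concrete, here is a minimal toy model in C of the
scenario from the question: one page mapped by 20 PTEs, 5 of them young.
It is only a sketch, not kernel code. `struct toy_pte` and
`toy_rmap_test_clear_young()` are hypothetical names made up for this
illustration, and the only thing modelled is the accessed bit. The
assumption, taken from the reply above, is that an rmap walk visits every
PTE mapping the page and tests and clears each one.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_MAPPINGS 20	/* processes mapping the page, as in the question */

/* Hypothetical stand-in for a PTE: only the accessed ("young") bit. */
struct toy_pte {
	bool young;
};

/*
 * Toy rmap walk: visit every PTE that maps the page, test and clear its
 * accessed bit, and return how many of them were young.
 */
static int toy_rmap_test_clear_young(struct toy_pte *ptes, int nr)
{
	int referenced = 0;

	assert(ptes);

	for (int i = 0; i < nr; i++) {
		if (ptes[i].young) {
			ptes[i].young = false;
			referenced++;
		}
	}

	return referenced;
}
```

With 5 of the 20 toy PTEs marked young, the first walk reports 5 and
leaves none of them set, so a second walk with no accesses in between
reports 0. In this model, stale accessed bits cannot keep the page "hot"
beyond the walk that finds them.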