From: Yu Zhao
Date: Thu, 23 Feb 2023 13:09:33 -0700
Subject: Re: [PATCH mm-unstable v1 5/5] mm: multi-gen LRU: use mmu_notifier_test_clear_young()
To: Sean Christopherson
Cc: Johannes Weiner, Andrew Morton, Paolo Bonzini, Jonathan Corbet, Michael Larabel, kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-mm@google.com
References: <20230217041230.2417228-1-yuzhao@google.com> <20230217041230.2417228-6-yuzhao@google.com>

On Thu, Feb 23, 2023 at 12:58 PM Sean Christopherson wrote:
>
> On Thu, Feb 23, 2023, Yu Zhao wrote:
> > On Thu, Feb 23, 2023 at 12:11 PM Sean Christopherson wrote:
> > >
> > > On Thu, Feb 23, 2023, Yu Zhao wrote:
> > > > > As alluded to in patch 1, unless batching the walks even if KVM does _not_ support
> > > > > a lockless walk is somehow _worse_ than using the existing mmu_notifier_clear_flush_young(),
> > > > > I think batching the calls should be conditional only on LRU_GEN_SPTE_WALK.
> > > > > Or if we want to avoid batching when there are no mmu_notifier listeners, probe
> > > > > mmu_notifiers.  But don't call into KVM directly.
> > > >
> > > > I'm not sure I fully understand. Let's present the problem on the MM
> > > > side: assuming KVM supports lockless walks, batching can still be
> > > > worse (very unlikely), because GFNs can exhibit no memory locality at
> > > > all. So this option allows userspace to disable batching.
> > >
> > > I'm asking the opposite.  Is there a scenario where batching+lock is worse than
> > > !batching+lock?  If not, then don't make batching depend on lockless walks.
> >
> > Yes, absolutely. batching+lock means we take/release mmu_lock for
> > every single PTE in the entire VA space -- each small batch contains
> > 64 PTEs but the entire batch is the whole KVM.
>
> Who is "we"?

Oops -- shouldn't have used "we".

> I don't see anything in the kernel that triggers walking the whole
> VMA, e.g. lru_gen_look_around() limits the walk to a single PMD.  I feel like I'm
> missing something...

walk_mm() -> walk_pud_range() -> walk_pmd_range() -> walk_pte_range()
-> test_spte_young() -> mmu_notifier_test_clear_young().

MGLRU takes two passes: during the first pass, it sweeps entire VA
space on each MM (per MM/KVM); during the second pass, it uses the
rmap on each folio (per folio). The look around exploits the (spatial)
locality in the second pass, to get the best out of the expensive per
folio rmap walk.

(The first pass can't handle shared mappings; the second pass can.)
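
To make the difference in scope between the two passes concrete, here is a
rough toy model in C. It is not kernel code: the toy_* names are hypothetical,
and the 512 PTEs per PMD figure is an assumption (x86-64 with 4 KiB pages).
Only the 64-PTE batch size and the "single PMD" bound come from this thread.
The first function counts how many small batches a full sweep of an address
space needs; the second clamps a look-around window to the PMD that contains
a given PTE.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration: 512 PTEs per PMD (x86-64, 4 KiB pages). */
#define TOY_PTES_PER_PMD 512u
/* From the thread: each small batch contains 64 PTEs. */
#define TOY_BATCH_PTES 64u

/*
 * Pass 1 (per MM/KVM): the sweep covers every PTE in the address space,
 * so the number of small batches grows with the size of the whole space.
 */
static size_t toy_first_pass_batches(size_t nr_ptes)
{
	/* Round up so a partial trailing batch still counts as one. */
	return (nr_ptes + TOY_BATCH_PTES - 1) / TOY_BATCH_PTES;
}

/*
 * Pass 2 (per folio): the look-around window never leaves the PMD that
 * contains the PTE found through the rmap. The window is [*start, *end).
 */
static void toy_look_around(size_t pte_index, size_t *start, size_t *end)
{
	*start = pte_index - pte_index % TOY_PTES_PER_PMD;
	*end = *start + TOY_PTES_PER_PMD;
	assert(*start <= pte_index && pte_index < *end);
}
```

Under these assumptions, the first pass scales with the number of PTEs in the
address space, while the second pass is bounded by one PMD per folio no matter
how large the space is.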