From: Yu Zhao
Date: Thu, 12 Sep 2024 21:59:00 -0600
Subject: Re: Questions about TLB flushing and lru_gen_look_around
To: Phil Elwell
Cc: Andrew Morton, linux-mm@kvack.org, linux-rpi-kernel@lists.infradead.org, Linux ARM, Will Deacon

Hi Phil,

On Thu, Sep 12, 2024 at 7:03 AM Phil Elwell wrote:
>
> Hi,
>
> I've spent many hours recently trying to diagnose a problem that
> manifests as a CPU spin, under load and memory pressure, that can last
> for many seconds. The problem can be seen on our downstream kernels
> from 6.5 onwards, when built for ARCH=arm, running on a Pi 3B (BCM2837
> - quad A53). I've not tested a pure Linux 6.5, but this is not a bug
> report.
>
> Pi 3B has limited RAM (1GB), and it was discovered that restricting
> this further to 512MB made the spins more frequent, as did adding
> other processes. Running an ARM64 kernel in the same configuration
> leads to normal OOM behaviour.
>
> I traced the spin to a loop in __copy_to_user_memcpy where
> pin_page_for_write fails repeatedly, sometimes for hundreds of
> thousands of times. The pin is failing because the user page in
> question is marked as being old (L_PTE_YOUNG is unset). When this
> happens, the code tries to freshen the page using __put_user, but in
> this case it is not triggering the required page fault. Digging
> deeper, it can be seen that the PTE in the ARM's shadow hardware PTE
> is 0 as expected, but clearly the MMU is not seeing this otherwise it
> would be faulting; a TLB flush for that PTE fixes it.
>
> The TLB non-coherency for that PTE can be attributed to a call to
> ptep_test_and_clear_young from lru_gen_look_around, which clears the
> L_PTE_YOUNG bit in the Linux PTE

Yes, it does that.

> and zeroes the hardware PTE

I don't see how that can happen, or why it's needed. Could you explain?

> but doesn't call flush_tlb_cache.

Correct, and this is because that arch-specific API currently doesn't
require TLB flushes, from the MM's POV. None of the current callers
does; I doubt any of them, except MGLRU, were ever used on arm (32 bit).

> Two possible "fixes" are:
>
> a. Replace ptep_test_and_clear_young with ptep_clear_flush_young,
> which includes the TLB flush.
> b. After the loop over the page range from "start" to "end", include a
> call to flush_tlb_range from "start" to "end" if the "young" count is
> non-zero.
>
> My questions are:
>
> 1. Which bit of code is meant to take care of TLB coherency where
> lru_gen_look_around has made changes?

None, since the API doesn't explicitly require it (or at least the MM
assumes it doesn't), as I mentioned above.

> 2. Between the two patches a) and b), which is preferable?
> b) would seem better if IPIs are needed to broadcast the TLB flushes,
> but it seems that BCM2837 has new enough CPU cores not to require such
> broadcasts.

Could this be fixed within arm? If not, we would have to update the
requirements of that arch-specific API. This would affect other archs
that don't require TLB flushes, assuming they exist. And we would need
to fix all callers of ptep_test_and_clear_young() in MM.

> 3. walk_pte_range has a similar loop, but it seems it doesn't need to
> be patched to fix my spin, possibly because it isn't called.

Correct.

> If a
> patch to lru_gen_look_around is needed, might one be needed here as
> well?

No, because that code is disabled unless the hardware can set the
A-bit, e.g., arm64 v8.2.

Thanks.
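
For reference, the difference between options a) and b) in the quoted
text can be compared in a small userspace model. This is only a sketch
under loud assumptions: struct toy_mm and every toy_* function are
hypothetical names invented for illustration, not kernel APIs, and the
TLB is reduced to one "translation cached" flag per page. It only shows
why clearing the young bit without a flush can leave a page stuck as
old, and how many flushes each option would issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy model, not kernel code: a software young bit per page plus a TLB
 * that may still hold a translation cached before the bit was cleared. */
struct toy_mm {
    bool pte_young[NPAGES]; /* young bit in the page table */
    bool tlb_valid[NPAGES]; /* a translation is cached in the TLB */
    unsigned flushes;       /* flush operations issued so far */
};

/* Start with every page young and cached, and no flushes issued. */
static void toy_mm_init(struct toy_mm *mm)
{
    for (size_t i = 0; i < NPAGES; i++) {
        mm->pte_young[i] = true;
        mm->tlb_valid[i] = true;
    }
    mm->flushes = 0;
}

/* Clear the young bit and return its old value; the TLB is untouched. */
static bool toy_test_and_clear_young(struct toy_mm *mm, size_t i)
{
    bool young = mm->pte_young[i];

    mm->pte_young[i] = false;
    return young;
}

/* Drop cached translations for pages [start, end); counts as one flush. */
static void toy_flush_range(struct toy_mm *mm, size_t start, size_t end)
{
    assert(start <= end && end <= NPAGES);
    for (size_t i = start; i < end; i++)
        mm->tlb_valid[i] = false;
    mm->flushes++;
}

/* Model one access; returns true if it faulted. A TLB hit bypasses the
 * page table, so an old page with a cached translation never faults. */
static bool toy_access(struct toy_mm *mm, size_t i)
{
    bool faulted;

    if (mm->tlb_valid[i])
        return false;
    faulted = !mm->pte_young[i];
    mm->pte_young[i] = true; /* the fault handler marks the page young */
    mm->tlb_valid[i] = true; /* the translation is cached again */
    return faulted;
}

/* Option a): flush every page whose young bit was set. */
static unsigned toy_scan_flush_each(struct toy_mm *mm, size_t start, size_t end)
{
    unsigned young = 0;

    for (size_t i = start; i < end; i++) {
        if (toy_test_and_clear_young(mm, i)) {
            toy_flush_range(mm, i, i + 1);
            young++;
        }
    }
    return young;
}

/* Option b): one range flush after the loop if any page was young. */
static unsigned toy_scan_flush_once(struct toy_mm *mm, size_t start, size_t end)
{
    unsigned young = 0;

    for (size_t i = start; i < end; i++)
        if (toy_test_and_clear_young(mm, i))
            young++;
    if (young)
        toy_flush_range(mm, start, end);
    return young;
}
```

In this model, clearing the bit without any flush leaves toy_access()
hitting the cached translation forever, which mirrors the spin in the
report. Both options restore the fault; they differ only in the number
of flush operations, which matters most when each flush is expensive.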