From: Phil Elwell
Date: Thu, 12 Sep 2024 14:03:13 +0100
Subject: Questions about TLB flushing and lru_gen_look_around
To: Andrew Morton, linux-mm@kvack.org, linux-rpi-kernel@lists.infradead.org

Hi,

I've spent many hours recently trying to diagnose a problem that
manifests as a CPU spin, under load and memory pressure, that can last
for many seconds. The problem can be seen on our downstream kernels
from 6.5 onwards, when built for ARCH=arm, running on a Pi 3B (BCM2837
- quad A53). I've not tested a pure Linux 6.5, but this is not a bug
report.

Pi 3B has limited RAM (1GB), and it was discovered that restricting
this further to 512MB made the spins more frequent, as did adding
other processes.
Running an ARM64 kernel in the same configuration leads to normal OOM
behaviour.

I traced the spin to a loop in __copy_to_user_memcpy where
pin_page_for_write fails repeatedly, sometimes hundreds of thousands
of times. The pin is failing because the user page in question is
marked as being old (L_PTE_YOUNG is unset). When this happens, the
code tries to freshen the page using __put_user, but in this case it
is not triggering the required page fault. Digging deeper, it can be
seen that the ARM's shadow hardware PTE is 0 as expected, but clearly
the MMU is not seeing this, otherwise it would be faulting; a TLB
flush for that PTE fixes it.

The TLB non-coherency for that PTE can be attributed to a call to
ptep_test_and_clear_young from lru_gen_look_around, which clears the
L_PTE_YOUNG bit in the Linux PTE and zeroes the hardware PTE but
doesn't flush the corresponding TLB entry.

Two possible "fixes" are:

a. Replace ptep_test_and_clear_young with ptep_clear_flush_young,
   which includes the TLB flush.

b. After the loop over the page range from "start" to "end", include
   a call to flush_tlb_range from "start" to "end" if the "young"
   count is non-zero.

My questions are:

1. Which bit of code is meant to take care of TLB coherency where
   lru_gen_look_around has made changes?

2. Between the two patches a) and b), which is preferable? b) would
   seem better if IPIs are needed to broadcast the TLB flushes, but it
   seems that BCM2837 has new enough CPU cores not to require such
   broadcasts.

3. walk_pte_range has a similar loop, but it seems it doesn't need to
   be patched to fix my spin, possibly because it isn't called. If a
   patch to lru_gen_look_around is needed, might one be needed here as
   well?

Thanks for your time,

Phil
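
The mechanism described above can be modelled in user space. What
follows is a minimal sketch in C, not kernel code: every toy_* name,
the three-flag PTE structure and the flush counter are hypothetical
and exist only for illustration. It assumes that a valid TLB entry
lets a user write proceed without consulting the page tables, and
that fix (a) flushes once per young PTE while fix (b) issues one
range flush after the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPTES 8

/* Toy model of one page: Linux PTE, hardware PTE and cached TLB entry. */
struct toy_pte {
    bool linux_young; /* models L_PTE_YOUNG in the Linux PTE */
    bool hw_valid;    /* models a non-zero hardware PTE */
    bool tlb_valid;   /* models a translation still cached in the TLB */
};

enum toy_fix { TOY_NO_FIX, TOY_FIX_A, TOY_FIX_B };

static unsigned long tlb_flushes; /* flush operations issued so far */

static void toy_init(struct toy_pte *ptes)
{
    for (size_t i = 0; i < NPTES; i++)
        ptes[i] = (struct toy_pte){ true, true, true };
    tlb_flushes = 0;
}

/* Models ptep_test_and_clear_young(): no TLB maintenance at all. */
static bool toy_test_and_clear_young(struct toy_pte *p)
{
    bool was_young = p->linux_young;
    p->linux_young = false;
    p->hw_valid = false;
    return was_young;
}

/* Models the look-around loop over [start, end) with an optional fix. */
static size_t toy_look_around(struct toy_pte *ptes, size_t start,
                              size_t end, enum toy_fix fix)
{
    size_t young = 0;

    assert(start <= end && end <= NPTES);
    for (size_t i = start; i < end; i++) {
        if (!toy_test_and_clear_young(&ptes[i]))
            continue;
        young++;
        if (fix == TOY_FIX_A) {           /* per-PTE flush, as in (a) */
            ptes[i].tlb_valid = false;
            tlb_flushes++;
        }
    }
    if (fix == TOY_FIX_B && young) {      /* one range flush, as in (b) */
        for (size_t i = start; i < end; i++)
            ptes[i].tlb_valid = false;
        tlb_flushes++;
    }
    return young;
}

/* Models a user write such as __put_user(); true if it faulted. */
static bool toy_user_write(struct toy_pte *p)
{
    if (p->tlb_valid)
        return false;           /* stale entry hit: no walk, no fault */
    if (!p->hw_valid) {         /* walk finds a zero hardware PTE */
        p->linux_young = true;  /* fault handler marks the page young */
        p->hw_valid = true;
        p->tlb_valid = true;
        return true;
    }
    p->tlb_valid = true;        /* plain TLB refill */
    return false;
}

/* Models the retry loop: the pin fails for as long as the page is old. */
static unsigned long toy_copy_spin(struct toy_pte *p, unsigned long max_tries)
{
    unsigned long tries = 0;

    while (!p->linux_young && tries < max_tries) {
        toy_user_write(p);      /* attempt to freshen the page */
        tries++;
    }
    return tries;
}
```

In this model, calling toy_look_around() with TOY_NO_FIX and then
toy_copy_spin() on any page in the range runs until max_tries is
exhausted, because the stale TLB entry hides the zeroed hardware PTE.
With TOY_FIX_A or TOY_FIX_B the first write faults and the loop exits
after one try; the two differ only in the number of flush operations
(one per young PTE versus one per range). The model never evicts TLB
entries, so it says nothing about how long a spin lasts on real
hardware.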