Date: Tue, 16 Jan 2024 09:20:39 -0800
References: <832697b9-3652-422d-a019-8c0574a188ac@proxmox.com>
Subject: Re: Temporary KVM guest hangs connected to KSM and NUMA balancer
From: Sean Christopherson
To: Friedrich Weber
Cc: kvm@vger.kernel.org, Paolo Bonzini, Andrew Morton,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org

On Tue, Jan 16, 2024, Friedrich Weber wrote:
> Hi Sean,
>
> On 11/01/2024 17:00, Sean Christopherson wrote:
> > This is a known issue.  It's mostly a KVM bug[...]
> > (fix posted[...]), but I suspect
> > that a bug in the dynamic preemption model logic[...] is also contributing to the
> > behavior by causing KVM to yield on preempt models where it really shouldn't.
>
> I tried the following variants now, each applied on top of 6.7 (0dd3ee31):
>
> * [1], the initial patch series mentioned in the bugreport ("[PATCH 0/2]
>   KVM: Pre-check mmu_notifier retry on x86")
> * [2], its v2 that you linked above ("[PATCH v2] KVM: x86/mmu: Retry
>   fault before acquiring mmu_lock if mapping is changing")
> * [3], the scheduler patch you linked above ("[PATCH] sched/core: Drop
>   spinlocks on contention iff kernel is preemptible")
> * both [2] & [3]
>
> My kernel is PREEMPT_DYNAMIC and, according to
> /sys/kernel/debug/sched/preempt, defaults to preempt=voluntary. For case
> [3], I additionally tried manually switching to preempt=full.
>
> Provided I did not mess up, I get the following results for the
> reproducer I posted:
>
> * [1] (the initial patch series): no hangs
> * [2] (its v2): hangs
> * [3] (the scheduler patch) with preempt=voluntary: no hangs
> * [3] (the scheduler patch) with preempt=full: hangs
> * [2] & [3]: no hangs
>
> So it seems like:
>
> * [1] (the initial patch series) fixes the hangs, which is consistent
>   with the feedback in the bugreport [4].
> * But weirdly, its v2 [2] does not fix the hangs.
> * As long as I stay with preempt=voluntary, [3] (the scheduler patch)
>   alone is already enough to fix the hangs in my case -- this I did not
>   expect :)
>
> Does this make sense to you? Happy to double-check or run more tests if
> anything seems off.

Ha!  It took me a few minutes to realize what went sideways with v2.  KVM has
an in-flight change that switches from host virtual addresses (HVA) to guest
physical frame numbers (GFN) for the retry check, commit 8569992d64b8 ("KVM:
Use gfn instead of hva for mmu_notifier_retry").
That commit is in the KVM pull request for 6.8, and so v2 is based on top of a
branch that contains said commit.  But for better or worse (probably worse),
the switch from HVA=>GFN didn't change the _names_ of
mmu_invalidate_range_{start,end}, only the type.  So v2 applies and compiles
cleanly on 6.7, but it's subtly broken because checking for a GFN match against
an HVA range is all but guaranteed to get false negatives.

If you can try v2 on top of `git://git.kernel.org/pub/scm/virt/kvm/kvm.git next`,
that would be helpful to confirm that I didn't screw up something else.

Thanks very much for reporting back!  I'm pretty sure we would have missed the
semantic conflict when backporting the fix to 6.7 and earlier, i.e. you likely
saved us from another round of bug reports for various stable trees.
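
For anyone following along, here is a minimal userspace sketch of why the
GFN-vs-HVA mismatch yields false negatives.  This is NOT the KVM code: the
structs, the helper names (retry_needed, gfn_to_hva), the 4 KiB page size and
the addresses are all assumptions made up for illustration.  The only point
is that a range check happily compiles when both sides are plain integers,
even if one side is an HVA and the other a GFN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the in-progress invalidation range. */
struct invalidate_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* Hypothetical memslot: guest frames starting at base_gfn, backed at userspace_addr. */
struct memslot {
	uint64_t base_gfn;
	uint64_t userspace_addr;
};

/* Translate a guest frame number to the host virtual address backing it. */
static uint64_t gfn_to_hva(const struct memslot *slot, uint64_t gfn)
{
	return slot->userspace_addr + ((gfn - slot->base_gfn) << PAGE_SHIFT);
}

/*
 * The fault must be retried iff 'val' falls inside the range.  Nothing here
 * knows whether 'val' and the range are HVAs or GFNs; the caller has to pass
 * values from the same address space.
 */
static bool retry_needed(const struct invalidate_range *range, uint64_t val)
{
	assert(range->start <= range->end);
	return val >= range->start && val < range->end;
}
```

With a slot of { .base_gfn = 0x100, .userspace_addr = 0x7f0000000000 } and a
range covering the HVAs of GFNs 0x100..0x10f, passing gfn_to_hva(&slot, 0x105)
reports that a retry is needed, whereas passing the raw GFN 0x105 reports that
no retry is needed, because a small frame number essentially never lands inside
a range of userspace addresses.  That is the false negative in miniature.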