From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <6d989ca2330748ed682c81fc5f43e054a70e70a8.camel@redhat.com>
Subject: Re: Stalls in qemu with host running 6.1 (everything stuck at mmap_read_lock())
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Jiri Slaby, Pedro Falcato
Cc: Paolo Bonzini, kvm@vger.kernel.org, Andrew Morton, mm, yuzhao@google.com, Michal Hocko, Vlastimil Babka, shy828301@gmail.com
Date: Thu, 12 Jan 2023 12:31:29 +0200
In-Reply-To: <7aa90802-d25c-baa3-9c03-2502ad3c708a@kernel.org>
References: <7aa90802-d25c-baa3-9c03-2502ad3c708a@kernel.org>
User-Agent: Evolution 3.36.5 (3.36.5-2.fc32)
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit

On Thu, 2023-01-12 at 07:07 +0100, Jiri Slaby wrote:
> Hi,
>
> On 12. 01. 23, 1:37, Pedro Falcato wrote:
> > I just want to chime in and say that I've also hit this regression
> > right as I (Arch) updated to 6.1 a few weeks ago.
> > This completely ruined my qemu workflow such that I had to fall back
> > to using an LTS kernel.
> >
> > Some data I've gathered:
> > 1) It seems to not happen right after booting - I'm unsure if this is
> > due to memory pressure or less CPU load or any other factor
>
> +1 as I wrote.
>
> > 2) It seems to intensify after swapping a fair amount? At least this
> > has been my experience.
>
> I have no swap.
>
> > 3) The largest slowdown seems to be when qemu is booting the guest,
> > possibly during heavy memory allocation - problems range from "takes
> > tens of seconds to boot" to "qemu is completely blocked and needs a
> > SIGKILL spam".
>
> +1
>
> > 4) While traditional process monitoring tools break (likely due to
> > mmap_lock getting hogged), I can (empirically, using /bin/free) tell
> > that the system seems to be swapping in/out quite a fair bit
>
> Yes, htop/top/ps and such are stuck at the read of /proc/<pid>/cmdline
> as I wrote (waiting for the mmap lock).
>
> > My 4) is particularly confusing to me, as I had originally blamed the
> > problem on the MGLRU changes, while you don't seem to be swapping at
> > all.
> > Could this be related to the maple tree patches? Should we CC both
> > the MGLRU folks and the maple tree folks?
> >
> > I have little insight into what the kernel's state actually is apart
> > from this - perf seems to break, and I have no kernel debugger as
> > this is my live personal machine :/
> > I would love it if someone hinted at possible things I/we could try
> > in order to track this down. Is this not git-bisectable at all?
>
> I have rebooted to a fresh kernel which 1) has lockdep enabled, and 2)
> I have debuginfo for. So next time this happens, I can print the held
> locks and dump a kcore (kdump is set up).
> regards,

It is also possible that I noticed something like that on 6.1.

For me it happens when my system (also no swap; 96G, of which 48 are
permanently reserved as 1G hugepages, and this happens with VMs which
don't use this hugepage reserve) is somewhat low on memory and qemu
tries to lock all memory (I use -overcommit mem-lock=on).

It usually happens when I start a 32 GB VM while having a lot of stuff
open in the background, but still not nearly close to 16GB. As a
workaround I lowered the reserved area to 32G.

I also see indications that things like htop, or even opening a new
shell, hang quite hard.

What almost instantly helps is 'echo 3 | sudo tee /proc/sys/vm/drop_caches' -
e.g. that makes the VM start booting, and unlocks everything.

Best regards,
	Maxim Levitsky
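[Editor's note] For anyone else chasing this: a rough way to confirm that tasks really are piled up on mmap_lock, without using ps/htop (whose reads of /proc/<pid>/cmdline take the same lock and hang too), is to walk /proc by hand. This is only a sketch, not something from the thread; it assumes a root shell (needed to read /proc/<pid>/stack) and a kernel built with stack tracing, and the function name is made up for illustration:

```shell
#!/bin/sh
# Sketch only: list tasks in uninterruptible sleep (D state) whose kernel
# stack suggests they are waiting on a rw_semaphore such as mmap_lock.
# Assumes root (for /proc/<pid>/stack); function name is hypothetical.
find_mmap_lock_waiters() {
    for pid in /proc/[0-9]*; do
        # "State:" from /proc/<pid>/status is robust even when the
        # command name in /proc/<pid>/stat contains spaces
        state=$(awk '/^State:/ {print $2}' "$pid/status" 2>/dev/null)
        [ "$state" = "D" ] || continue
        comm=$(cat "$pid/comm" 2>/dev/null)
        # mmap_read_lock() is usually inlined; the rwsem slowpath symbols
        # are what actually shows up in the stack dump
        if grep -qE 'rwsem_down_(read|write)' "$pid/stack" 2>/dev/null; then
            printf 'blocked on rwsem: %s (%s)\n' "${pid##*/}" "$comm"
        fi
    done
    return 0
}

find_mmap_lock_waiters
```

Reading /proc/<pid>/comm, status and stack is not supposed to block on mmap_lock the way /proc/<pid>/cmdline does, so this should still work while the system is wedged. SysRq-w ('echo w > /proc/sysrq-trigger') gives roughly the same picture of blocked tasks from the kernel side.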