From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 19 May 2025 10:04:45 -0700
In-Reply-To: (message from Ackerley Tng on Wed, 23 Apr 2025 13:30:16 -0700)
Mime-Version: 1.0
Subject: Re: [PATCH 3/5] KVM: gmem: Hold filemap invalidate lock while allocating/preparing folios
From: Ackerley Tng
To: Ackerley Tng
Cc: yan.y.zhao@intel.com, michael.roth@amd.com, kvm@vger.kernel.org,
 linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 jroedel@suse.de, thomas.lendacky@amd.com, pbonzini@redhat.com,
 seanjc@google.com, vbabka@suse.cz, amit.shah@amd.com,
 pratikrajesh.sampat@amd.com, ashish.kalra@amd.com, liam.merwick@oracle.com,
 david@redhat.com, vannapurve@google.com, quic_eberman@quicinc.com
Content-Type: text/plain; charset="UTF-8"

Ackerley Tng writes:

> Yan Zhao writes:
>
>> On Fri, Mar 14, 2025 at 05:20:21PM +0800, Yan Zhao wrote:
>>> This patch would cause host deadlock when booting up a TDX VM even if huge page
>>> is turned off. I currently reverted this patch. No further debug yet.
>> This is because kvm_gmem_populate() takes filemap invalidation lock, and for
>> TDX, kvm_gmem_populate() further invokes kvm_gmem_get_pfn(), causing deadlock.
>>
>> kvm_gmem_populate
>>   filemap_invalidate_lock
>>   post_populate
>>     tdx_gmem_post_populate
>>       kvm_tdp_map_page
>>         kvm_mmu_do_page_fault
>>           kvm_tdp_page_fault
>>             kvm_tdp_mmu_page_fault
>>               kvm_mmu_faultin_pfn
>>                 __kvm_mmu_faultin_pfn
>>                   kvm_mmu_faultin_pfn_private
>>                     kvm_gmem_get_pfn
>>                       filemap_invalidate_lock_shared
>>
>> Though, kvm_gmem_populate() is able to take shared filemap invalidation lock,
>> (then no deadlock), lockdep would still warn "Possible unsafe locking scenario:
>> ...DEADLOCK" due to the recursive shared lock, since commit e918188611f0
>> ("locking: More accurate annotations for read_lock()").
>>
>
> Thank you for investigating. This should be fixed in the next revision.
>

This was not fixed in v2 [1]; I misunderstood this locking issue.

IIUC kvm_gmem_populate() gets a pfn via __kvm_gmem_get_pfn(), then calls
part of the KVM fault handler to map the pfn into secure EPTs, then
calls the TDX module for the copy+encrypt.

Regarding this lock, it seems like KVM's MMU lock is already held while
TDX does the copy+encrypt. Why must the filemap_invalidate_lock() also
be held throughout the process?

If we don't have to hold the filemap_invalidate_lock() throughout,

1. Would it be possible to call kvm_gmem_get_pfn() to get the pfn
   instead of calling __kvm_gmem_get_pfn() and managing the lock in a
   loop?

2. Would it be possible to trigger the KVM fault path from
   kvm_gmem_populate() so that we don't rebuild the get_pfn+mapping
   logic and reuse the entire faulting code? That way the
   filemap_invalidate_lock() will only be held while getting a pfn.

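To make the two locking shapes concrete, here is a toy, single-threaded
user-space model. It is only a sketch and not kernel code: every toy_*
and model_* name is made up for illustration, the lock is a simplified
stand-in for the filemap invalidate lock, and -EDEADLK marks the point
where the real lock would block on itself. It does not answer whether
dropping the lock early is safe against invalidation, which is still
the open question above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy, single-threaded stand-in for the filemap invalidate lock. */
struct toy_rwsem {
	int readers;	/* number of shared holders */
	bool writer;	/* true while held exclusively */
};

/* The lock helpers return false where a real rwsem would block. */
static bool toy_lock_exclusive(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void toy_unlock_exclusive(struct toy_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
}

static bool toy_lock_shared(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void toy_unlock_shared(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Models kvm_gmem_get_pfn() with this patch: shared lock around the lookup. */
static int model_get_pfn(struct toy_rwsem *s, unsigned long *pfn)
{
	if (!toy_lock_shared(s))
		return -EDEADLK;	/* the only thread would block on itself */
	*pfn = 0x1234;			/* placeholder value, not a real pfn */
	toy_unlock_shared(s);
	return 0;
}

/*
 * Models the reported chain: populate holds the lock exclusively, then
 * the post-populate path re-enters model_get_pfn().
 */
static int model_populate_nested(struct toy_rwsem *s)
{
	unsigned long pfn;
	int r;

	if (!toy_lock_exclusive(s))
		return -EDEADLK;
	r = model_get_pfn(s, &pfn);
	toy_unlock_exclusive(s);
	return r;
}

/* Models the idea in (1): hold the lock only while getting the pfn. */
static int model_populate_narrow(struct toy_rwsem *s)
{
	unsigned long pfn;
	int r;

	r = model_get_pfn(s, &pfn);
	if (r)
		return r;
	/* Mapping and copy+encrypt would run here without the lock. */
	return 0;
}
```

With a zero-initialized struct toy_rwsem, model_populate_nested()
returns -EDEADLK and model_populate_narrow() returns 0, which matches
the exclusive-then-shared chain Yan reported.
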
[1] https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com/T/

>>> > @@ -819,12 +827,16 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>>> >  	pgoff_t index = kvm_gmem_get_index(slot, gfn);
>>> >  	struct file *file = kvm_gmem_get_file(slot);
>>> >  	int max_order_local;
>>> > +	struct address_space *mapping;
>>> >  	struct folio *folio;
>>> >  	int r = 0;
>>> >
>>> >  	if (!file)
>>> >  		return -EFAULT;
>>> >
>>> > +	mapping = file->f_inode->i_mapping;
>>> > +	filemap_invalidate_lock_shared(mapping);
>>> > +
>>> >  	/*
>>> >  	 * The caller might pass a NULL 'max_order', but internally this
>>> >  	 * function needs to be aware of any order limitations set by
>>> > @@ -838,6 +850,7 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>>> >  	folio = __kvm_gmem_get_pfn(file, slot, index, pfn, &max_order_local);
>>> >  	if (IS_ERR(folio)) {
>>> >  		r = PTR_ERR(folio);
>>> > +		filemap_invalidate_unlock_shared(mapping);
>>> >  		goto out;
>>> >  	}
>>> >
>>> > @@ -845,6 +858,7 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>>> >  	r = kvm_gmem_prepare_folio(kvm, file, slot, gfn, folio, max_order_local);
>>> >
>>> >  	folio_unlock(folio);
>>> > +	filemap_invalidate_unlock_shared(mapping);
>>> >
>>> >  	if (!r)
>>> >  		*page = folio_file_page(folio, index);
>>> > --
>>> > 2.25.1
>>> >
>>> >