Date: Mon, 22 Jul 2024 11:15:10 -0400
From: Peter Xu
To: "Liam R. Howlett", linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Andrew Morton, Alex Williamson, Jason Gunthorpe, Al Viro, Dave Hansen,
    Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, "Kirill A . Shutemov", x86@kernel.org, Yan Zhao,
    Kevin Tian, Pei Li, David Hildenbrand, David Wang <00107082@163.com>,
    Bert Karwatzki, Sergey Senozhatsky
Subject: Re: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region
References: <20240712144244.3090089-1-peterx@redhat.com>

On Fri, Jul 19, 2024 at 10:18:12PM -0400, Liam R. Howlett wrote:
> * Peter Xu [240712 10:43]:
> > This patch is one patch of an old series [1] that got reposted standalone
> > here, in the hope of fixing some untrack_pfn() issues reported recently
> > [2,3]. There used to be another fix [4], but unfortunately it looks like
> > it causes other issues. The hope is that this patch can fix it the right
> > way.
> >
> > X86 uses pfn tracking to do pfnmaps. AFAICT, the tracking should normally
> > start at mmap() of device drivers, and end at munmap(). However, in the
> > current code the untrack is done in unmap_single_vma(). This might be
> > problematic.
> >
> > For example, unmap_single_vma() can be used nowadays even for zapping a
> > single page rather than the whole vmas.
> > It's very confusing to do whole
> > vma untracking in this function even if a caller would like to zap one
> > page. It could simply be wrong.
> >
> > Such an issue won't be exposed by things like MADV_DONTNEED, which won't
> > ever work for pfnmaps: it'll fail the madvise() already before reaching
> > here. However, it looks like it can be triggered as in the reports, where
> > it was invoked from an unmap request from a file vma.
> >
> > There's also work [5] on VFIO (merged now) to allow tearing down MMIO
> > pgtables before an munmap(), in which case we may not want to untrack the
> > pfns if we're only tearing down the pgtables. IOW, we may want to keep the
> > pfn tracking information as those pfn mappings can be restored later with
> > the same vma object. Currently it's not an immediate problem for VFIO, as
> > VFIO uses UC- by default, but it looks like there's a plan to extend that
> > in the near future.
> >
> > IIUC, this was overlooked when zap_page_range_single() was introduced,
> > while in the past it was only used in the munmap() path which wants to
> > always unmap the region completely. E.g., commit f5cc4eef9987 ("VM: make
> > zap_page_range() callers that act on a single VMA use separate helper") is
> > the initial commit that introduced unmap_single_vma(), in which the chunk
> > of untrack_pfn() was moved over from unmap_vmas().
> >
> > Restore that behavior: untrack the pfnmap only when unmapping regions.
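
As an aside on the description quoted above: below is a toy user-space C
model of the placement change, not kernel code. Every name in it
(struct toy_vma, toy_untrack_pfn(), the toy_unmap_*() helpers,
TOY_VM_PFNMAP) is made up for illustration, and the real functions do much
more (locking, hugetlb handling, TLB gathering). It only tries to show why
untracking inside the per-vma zap helper drops the tracking state even on
a one-page zap, while untracking in the full-teardown path does not.

```c
/* Toy model only: all names here are hypothetical, none are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SIZE 0x1000UL
#define TOY_VM_PFNMAP 0x1u

struct toy_vma {
	unsigned long start;		/* inclusive */
	unsigned long end;		/* exclusive */
	unsigned int flags;
	bool pfn_tracked;		/* stands in for the pfn tracking state */
	unsigned long mapped_pages;	/* pages that still have page table entries */
};

/* Drops tracking for the WHOLE vma, whatever range was zapped. */
static void toy_untrack_pfn(struct toy_vma *vma)
{
	vma->pfn_tracked = false;
}

/* Tears down page table entries for [start, end) inside the vma. */
static void toy_zap_range(struct toy_vma *vma, unsigned long start,
			  unsigned long end)
{
	unsigned long pages;

	assert(start >= vma->start && end <= vma->end && start <= end);
	pages = (end - start) / TOY_PAGE_SIZE;
	if (pages > vma->mapped_pages)
		pages = vma->mapped_pages;
	vma->mapped_pages -= pages;
}

/* Old placement: untrack happens on every zap, even a partial one. */
static void toy_unmap_single_vma_old(struct toy_vma *vma, unsigned long start,
				     unsigned long end)
{
	if (vma->flags & TOY_VM_PFNMAP)
		toy_untrack_pfn(vma);
	if (start != end)
		toy_zap_range(vma, start, end);
}

/* New placement: the zap helper leaves the tracking state alone. */
static void toy_unmap_single_vma_new(struct toy_vma *vma, unsigned long start,
				     unsigned long end)
{
	if (start != end)
		toy_zap_range(vma, start, end);
}

/* New placement: only the full-teardown path untracks. */
static void toy_unmap_vmas_new(struct toy_vma *vma)
{
	if (vma->flags & TOY_VM_PFNMAP)
		toy_untrack_pfn(vma);
	toy_unmap_single_vma_new(vma, vma->start, vma->end);
}
```

In this model, zapping one page of a four-page vma through
toy_unmap_single_vma_old() already clears pfn_tracked, while the same zap
through toy_unmap_single_vma_new() keeps it set until toy_unmap_vmas_new()
tears the whole vma down.
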
> >
> > [1] https://lore.kernel.org/r/20240523223745.395337-1-peterx@redhat.com
> > [2] https://groups.google.com/g/syzkaller-bugs/c/FeQZvSbqWbQ/m/tHFmoZthAAAJ
> > [3] https://lore.kernel.org/r/20240712131931.20207-1-00107082@163.com
> > [4] https://lore.kernel.org/all/20240710-bug12-v1-1-0e5440f9b8d3@gmail.com/
> > [5] https://lore.kernel.org/r/20240523195629.218043-1-alex.williamson@redhat.com
> >
> > Cc: Alex Williamson
> > Cc: Jason Gunthorpe
> > Cc: Al Viro
> > Cc: Dave Hansen
> > Cc: Andy Lutomirski
> > Cc: Peter Zijlstra
> > Cc: Thomas Gleixner
> > Cc: Ingo Molnar
> > Cc: Borislav Petkov
> > Cc: Kirill A. Shutemov
> > Cc: x86@kernel.org
> > Cc: Yan Zhao
> > Cc: Kevin Tian
> > Cc: Pei Li
> > Cc: David Hildenbrand
> > Cc: David Wang <00107082@163.com>
> > Cc: Bert Karwatzki
> > Cc: Sergey Senozhatsky
> > Signed-off-by: Peter Xu
> > ---
> >
> > NOTE: I massaged the commit message compared to the rfc post [1]; the
> > patch itself is untouched. Also removed the rfc tag, and added more people
> > into the loop. Please kindly help test this patch if you have a reproducer,
> > as I can't reproduce it myself even with the syzbot reproducer on top of
> > mm-unstable. Instead of checking the reproducer further, I decided to send
> > this out first as we have a bunch of reproducers on the list now.
> > ---
> >  mm/memory.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 4bcd79619574..f57cc304b318 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1827,9 +1827,6 @@ static void unmap_single_vma(struct mmu_gather *tlb,
> >  	if (vma->vm_file)
> >  		uprobe_munmap(vma, start, end);
> >
> > -	if (unlikely(vma->vm_flags & VM_PFNMAP))
> > -		untrack_pfn(vma, 0, 0, mm_wr_locked);
> > -
> >  	if (start != end) {
> >  		if (unlikely(is_vm_hugetlb_page(vma))) {
> >  			/*
> > @@ -1894,6 +1891,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
> >  		unsigned long start = start_addr;
> >  		unsigned long end = end_addr;
> >  		hugetlb_zap_begin(vma, &start, &end);
> > +		if (unlikely(vma->vm_flags & VM_PFNMAP))
> > +			untrack_pfn(vma, 0, 0, mm_wr_locked);
> >  		unmap_single_vma(tlb, vma, start, end, &details,
> >  				 mm_wr_locked);
> >  		hugetlb_zap_end(vma, &details);
> > --
> > 2.45.0
> >
> ...Trying to follow this discussion across several threads and bug
> reports. I was looped in when syzbot found that the [4] fix was a
> deadlock.
>
> How are we reaching unmap_vmas() without the mmap lock held in any mode?
> We must be holding the read or write lock - otherwise the vma pointer is
> unsafe...?

The report was not calling unmap_vmas() but unmap_single_vma(), and this
patch proposes to move the untrack operation out of unmap_single_vma() and
into unmap_vmas(). We should always hold the write lock for unmap_vmas(),
afaiu.

>
> In any case, since this will just keep calling unmap_single_vma() it has
> to be an incomplete fix?

I think there's indeed some issue to settle besides this patch; however, I
didn't quickly get why this patch is incomplete with respect to this
specific "untrack pfn within unmap_single_vma()" problem. I thought it was
complete in that regard, or could you elaborate otherwise? For example, I
think it's pretty common to use unmap_single_vma() in a truncation path.

Thanks,

--
Peter Xu