From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F095FC388F9 for ; Fri, 23 Oct 2020 12:26:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 64C112463C for ; Fri, 23 Oct 2020 12:26:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="K/Mlvxqb" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 64C112463C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A82636B005D; Fri, 23 Oct 2020 08:26:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A30616B0068; Fri, 23 Oct 2020 08:26:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8A9F16B006C; Fri, 23 Oct 2020 08:26:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0029.hostedemail.com [216.40.44.29]) by kanga.kvack.org (Postfix) with ESMTP id 5DF4E6B005D for ; Fri, 23 Oct 2020 08:26:07 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id EABA18945726 for ; Fri, 23 Oct 2020 12:26:06 +0000 (UTC) X-FDA: 77403112332.17.flock83_58075f927259 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin17.hostedemail.com (Postfix) with ESMTP id D1C54180D0181 for ; Fri, 23 Oct 2020 12:26:06 +0000 (UTC) X-HE-Tag: flock83_58075f927259 X-Filterd-Recvd-Size: 12098 Received: from mail-wm1-f65.google.com (mail-wm1-f65.google.com [209.85.128.65]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Fri, 23 Oct 2020 12:26:05 +0000 (UTC) Received: by mail-wm1-f65.google.com with SMTP id c16so1311627wmd.2 for ; Fri, 23 Oct 2020 05:26:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=PAFtVAQ3tChjSUzKujX9wl7xWmpuDPW53yjWkRt+YX4=; b=K/MlvxqbXxKBD7NtxzEVlnQQgEo8R7jzzQ6d8aTKu/t7mVOfhin25s4agcsW7aC9Fm Dg+jn2+aY9gWuXi7wduTHjxE/oiqtUmeSjJS9d5bV3rltqi1wTiNFP2rJPUUFWgVs7E6 gRtIWzD7wrVKYbLUc6keZxA0hqL4l6D2l+jUk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-disposition :content-transfer-encoding:in-reply-to; bh=PAFtVAQ3tChjSUzKujX9wl7xWmpuDPW53yjWkRt+YX4=; b=lJDedQMNno5FSH2vp6z6lSomupOeOHrfyPNaT8fWvhf+OVIlGUyBBxGaEzyEh9O3/x sgdklPYn2sr13CuQfEkEIRebYgp6F3EdXUyU9ZCfSVUEqAXQDdHv63tDJqQmkGlbV/PT MVzXKNrzjKd3Jtb/9tbBer8LeKuE7togfHfxDjJ+lqGMFSPE5+wF4SYiSjVqAdFPgl81 WMW67ZOPnZ0xp9la45Js+3yapaXMEuMg95+JMPHJBYnBRdaW6eSKuGXUAqUT6QvqdmcR RfyM12m0DM1lFGrDG8gJNtqTTh0AZuctA2SlHLwHn0AR00sqw2arA1PaF2d9AOUddzf/ Sfwg== X-Gm-Message-State: AOAM533qSFiQ8oiadOTuu4xM8Wn8CftQO04FORc/BGbAWW9gCjie+WRJ TVPXLM5jUU6F7ef2SjEGJKKa2A== X-Google-Smtp-Source: ABdhPJzmCXFy0gj+CVwTI7L79a3sWTdS3hTTu9nmTWuT4VCd2/G9Mq4Y+61MqYgpQ+yiCjJ3FCfC5Q== X-Received: by 2002:a7b:c015:: with SMTP id c21mr2044322wmb.22.1603455964962; Fri, 23 Oct 2020 05:26:04 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id k203sm3232158wmb.37.2020.10.23.05.26.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:26:04 -0700 (PDT) Date: Fri, 23 Oct 2020 14:26:02 +0200 From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , Gerald Schaefer , Daniel Vetter , Jason Gunthorpe , Dan Williams , Kees Cook , Andrew Morton , John Hubbard , =?iso-8859-1?B?Suly9G1l?= Glisse , Jan Kara , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-samsung-soc@vger.kernel.org, linux-media@vger.kernel.org, Niklas Schnelle , linux-s390@vger.kernel.org Subject: Re: [PATCH 29/65] s390/pci: Remove races against pte updates Message-ID: <20201023122602.GB401619@phenom.ffwll.local> Mail-Followup-To: DRI Development , Intel Graphics Development , Gerald Schaefer , Daniel Vetter , Jason Gunthorpe , Dan Williams , Kees Cook , Andrew Morton , John Hubbard , =?iso-8859-1?B?Suly9G1l?= Glisse , Jan Kara , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-samsung-soc@vger.kernel.org, linux-media@vger.kernel.org, Niklas Schnelle , linux-s390@vger.kernel.org References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-29-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20201023122216.2373294-29-daniel.vetter@ffwll.ch> X-Operating-System: Linux phenom 5.7.0-1-amd64 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Oct 23, 2020 at 02:21:40PM +0200, Daniel Vetter wrote: > Way back it was a reasonable assumptions that iomem mappings never > change the pfn range they point at. But this has changed: >=20 > - gpu drivers dynamically manage their memory nowadays, invalidating > ptes with unmap_mapping_range when buffers get moved >=20 > - contiguous dma allocations have moved from dedicated carvetouts to > cma regions. This means if we miss the unmap the pfn might contain > pagecache or anon memory (well anything allocated with GFP_MOVEABLE) >=20 > - even /dev/mem now invalidates mappings when the kernel requests that > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see commit > 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims the > region") >=20 > Accessing pfns obtained from ptes without holding all the locks is > therefore no longer a good idea. Fix this. >=20 > Since zpci_memcpy_from|toio seems to not do anything nefarious with > locks we just need to open code get_pfn and follow_pfn and make sure > we drop the locks only after we're done. The write function also needs > the copy_from_user move, since we can't take userspace faults while > holding the mmap sem. >=20 > v2: Move VM_IO | VM_PFNMAP checks around so they keep returning EINVAL > like before (Gerard) >=20 > v3: Polish commit message (Niklas) >=20 > Reviewed-by: Gerald Schaefer > Signed-off-by: Daniel Vetter > Cc: Jason Gunthorpe > Cc: Dan Williams > Cc: Kees Cook > Cc: Andrew Morton > Cc: John Hubbard > Cc: J=E9r=F4me Glisse > Cc: Jan Kara > Cc: linux-mm@kvack.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-samsung-soc@vger.kernel.org > Cc: linux-media@vger.kernel.org > Cc: Niklas Schnelle > Cc: Gerald Schaefer > Cc: linux-s390@vger.kernel.org > Cc: Niklas Schnelle > Signed-off-by: Daniel Vetter Apologies for this random patch bomb, typoed git send-email command :-/ v4 of this will come properly on Monday or so. -Daniel > --- > arch/s390/pci/pci_mmio.c | 98 +++++++++++++++++++++++----------------- > 1 file changed, 57 insertions(+), 41 deletions(-) >=20 > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c > index 401cf670a243..1a6adbc68ee8 100644 > --- a/arch/s390/pci/pci_mmio.c > +++ b/arch/s390/pci/pci_mmio.c > @@ -119,33 +119,15 @@ static inline int __memcpy_toio_inuser(void __iom= em *dst, > return rc; > } > =20 > -static long get_pfn(unsigned long user_addr, unsigned long access, > - unsigned long *pfn) > -{ > - struct vm_area_struct *vma; > - long ret; > - > - mmap_read_lock(current->mm); > - ret =3D -EINVAL; > - vma =3D find_vma(current->mm, user_addr); > - if (!vma) > - goto out; > - ret =3D -EACCES; > - if (!(vma->vm_flags & access)) > - goto out; > - ret =3D follow_pfn(vma, user_addr, pfn); > -out: > - mmap_read_unlock(current->mm); > - return ret; > -} > - > SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr, > const void __user *, user_buffer, size_t, length) > { > u8 local_buf[64]; > void __iomem *io_addr; > void *buf; > - unsigned long pfn; > + struct vm_area_struct *vma; > + pte_t *ptep; > + spinlock_t *ptl; > long ret; > =20 > if (!zpci_is_enabled()) > @@ -158,7 +140,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long,= mmio_addr, > * We only support write access to MIO capable devices if we are on > * a MIO enabled system. Otherwise we would have to check for every > * address if it is a special ZPCI_ADDR and would have to do > - * a get_pfn() which we don't need for MIO capable devices. Currentl= y > + * a pfn lookup which we don't need for MIO capable devices. Current= ly > * ISM devices are the only devices without MIO support and there is = no > * known need for accessing these from userspace. > */ > @@ -176,21 +158,37 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned lon= g, mmio_addr, > } else > buf =3D local_buf; > =20 > - ret =3D get_pfn(mmio_addr, VM_WRITE, &pfn); > + ret =3D -EFAULT; > + if (copy_from_user(buf, user_buffer, length)) > + goto out_free; > + > + mmap_read_lock(current->mm); > + ret =3D -EINVAL; > + vma =3D find_vma(current->mm, mmio_addr); > + if (!vma) > + goto out_unlock_mmap; > + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) > + goto out_unlock_mmap; > + ret =3D -EACCES; > + if (!(vma->vm_flags & VM_WRITE)) > + goto out_unlock_mmap; > + > + ret =3D follow_pte_pmd(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl= ); > if (ret) > - goto out; > - io_addr =3D (void __iomem *)((pfn << PAGE_SHIFT) | > + goto out_unlock_mmap; > + > + io_addr =3D (void __iomem *)((pte_pfn(*ptep) << PAGE_SHIFT) | > (mmio_addr & ~PAGE_MASK)); > =20 > - ret =3D -EFAULT; > if ((unsigned long) io_addr < ZPCI_IOMAP_ADDR_BASE) > - goto out; > - > - if (copy_from_user(buf, user_buffer, length)) > - goto out; > + goto out_unlock_pt; > =20 > ret =3D zpci_memcpy_toio(io_addr, buf, length); > -out: > +out_unlock_pt: > + pte_unmap_unlock(ptep, ptl); > +out_unlock_mmap: > + mmap_read_unlock(current->mm); > +out_free: > if (buf !=3D local_buf) > kfree(buf); > return ret; > @@ -274,7 +272,9 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, = mmio_addr, > u8 local_buf[64]; > void __iomem *io_addr; > void *buf; > - unsigned long pfn; > + struct vm_area_struct *vma; > + pte_t *ptep; > + spinlock_t *ptl; > long ret; > =20 > if (!zpci_is_enabled()) > @@ -287,7 +287,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, = mmio_addr, > * We only support read access to MIO capable devices if we are on > * a MIO enabled system. Otherwise we would have to check for every > * address if it is a special ZPCI_ADDR and would have to do > - * a get_pfn() which we don't need for MIO capable devices. Currentl= y > + * a pfn lookup which we don't need for MIO capable devices. Current= ly > * ISM devices are the only devices without MIO support and there is = no > * known need for accessing these from userspace. > */ > @@ -306,22 +306,38 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long= , mmio_addr, > buf =3D local_buf; > } > =20 > - ret =3D get_pfn(mmio_addr, VM_READ, &pfn); > + mmap_read_lock(current->mm); > + ret =3D -EINVAL; > + vma =3D find_vma(current->mm, mmio_addr); > + if (!vma) > + goto out_unlock_mmap; > + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) > + goto out_unlock_mmap; > + ret =3D -EACCES; > + if (!(vma->vm_flags & VM_WRITE)) > + goto out_unlock_mmap; > + > + ret =3D follow_pte_pmd(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl= ); > if (ret) > - goto out; > - io_addr =3D (void __iomem *)((pfn << PAGE_SHIFT) | (mmio_addr & ~PAGE= _MASK)); > + goto out_unlock_mmap; > + > + io_addr =3D (void __iomem *)((pte_pfn(*ptep) << PAGE_SHIFT) | > + (mmio_addr & ~PAGE_MASK)); > =20 > if ((unsigned long) io_addr < ZPCI_IOMAP_ADDR_BASE) { > ret =3D -EFAULT; > - goto out; > + goto out_unlock_pt; > } > ret =3D zpci_memcpy_fromio(buf, io_addr, length); > - if (ret) > - goto out; > - if (copy_to_user(user_buffer, buf, length)) > + > +out_unlock_pt: > + pte_unmap_unlock(ptep, ptl); > +out_unlock_mmap: > + mmap_read_unlock(current->mm); > + > + if (!ret && copy_to_user(user_buffer, buf, length)) > ret =3D -EFAULT; > =20 > -out: > if (buf !=3D local_buf) > kfree(buf); > return ret; > --=20 > 2.28.0 >=20 --=20 Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch