References: <20191024120938.11237-1-david@redhat.com>
 <20191024120938.11237-4-david@redhat.com>
 <01adb4cb-6092-638c-0bab-e61322be7cf5@redhat.com>
 <613f3606-748b-0e56-a3ad-1efaffa1a67b@redhat.com>
 <20191105160000.GC8128@linux.intel.com>
 <20191105231316.GE23297@linux.intel.com>
In-Reply-To:
From: Dan Williams
Date: Tue, 5 Nov 2019 15:43:29 -0800
Message-ID:
Subject: Re: [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes
To: Sean Christopherson
Cc: David Hildenbrand, Linux Kernel Mailing List, Linux MM, Michal Hocko,
 Andrew Morton, kvm-ppc@vger.kernel.org, linuxppc-dev, KVM list,
 linux-hyperv@vger.kernel.org, devel@driverdev.osuosl.org, xen-devel,
 X86 ML, Alexander Duyck, Alexander Duyck, Alex Williamson,
 Allison Randal, Andy Lutomirski, "Aneesh Kumar K.V", Anshuman Khandual,
 Anthony Yznaga, Benjamin Herrenschmidt, Borislav Petkov,
 Boris Ostrovsky, Christophe Leroy, Cornelia Huck, Dave Hansen,
 Haiyang Zhang, "H. Peter Anvin", Ingo Molnar, "Isaac J. Manjarres",
 Jim Mattson, Joerg Roedel, Johannes Weiner, Juergen Gross,
 KarimAllah Ahmed, Kees Cook, "K. Y. Srinivasan",
 "Matthew Wilcox (Oracle)", Matt Sickler, Mel Gorman, Michael Ellerman,
 Michal Hocko, Mike Rapoport, Mike Rapoport, Nicholas Piggin,
 Oscar Salvador, Paolo Bonzini, Paul Mackerras, Paul Mackerras,
 Pavel Tatashin, Pavel Tatashin, Peter Zijlstra, Qian Cai,
 Radim Krčmář, Sasha Levin, Stefano Stabellini, Stephen Hemminger,
 Thomas Gleixner, Vitaly Kuznetsov, Vlastimil Babka, Wanpeng Li,
 YueHaibing, Adam Borowski

On Tue, Nov 5, 2019 at 3:30 PM Dan Williams wrote:
>
> On Tue, Nov 5, 2019 at 3:13 PM Sean Christopherson
> wrote:
> >
> > On Tue, Nov 05, 2019 at 03:02:40PM -0800, Dan Williams wrote:
> > > On Tue, Nov 5, 2019 at 12:31 PM David Hildenbrand wrote:
> > > > > The scarier code (for me) is transparent_hugepage_adjust() and
> > > > > kvm_mmu_zap_collapsible_spte(), as I don't at all understand the
> > > > > interaction between THP and _PAGE_DEVMAP.
> > > >
> > > > The x86 KVM MMU code is one of the ugliest code I know (sorry, but it
> > > > had to be said :/ ). Luckily, this should be independent of the
> > > > PG_reserved thingy AFAIKs.
> > >
> > > Both transparent_hugepage_adjust() and kvm_mmu_zap_collapsible_spte()
> > > are honoring kvm_is_reserved_pfn(), so again I'm missing where the
> > > page count gets mismanaged and leads to the reported hang.
> >
> > When mapping pages into the guest, KVM gets the page via gup(), which
> > increments the page count for ZONE_DEVICE pages. But KVM puts the page
> > using kvm_release_pfn_clean(), which skips put_page() if PageReserved()
> > and so never puts its reference to ZONE_DEVICE pages.
>
> Oh, yeah, that's busted.

Ugh, it's extra busted because every other gup user in the kernel
tracks the pages resulting from gup and puts them (put_page()) when
they are done. KVM wants to forget about whether it did a gup to get
the page and optionally trigger put_page() based purely on the pfn.
Outside of VFIO device assignment that needs pages pinned for DMA, why
does KVM itself need to pin pages? If pages are pinned over a return
to userspace that needs to be a FOLL_LONGTERM gup.