Date: Wed, 5 Jul 2023 15:26:13 -0700 (PDT)
From: Hugh Dickins
To: "Aneesh Kumar K.V"
cc: Hugh Dickins, Andrew Morton, Mike Kravetz, Mike Rapoport,
    "Kirill A. Shutemov", Matthew Wilcox, David Hildenbrand,
    Suren Baghdasaryan, Qi Zheng, Yang Shi, Mel Gorman, Peter Xu,
    Peter Zijlstra, Will Deacon, Yu Zhao, Alistair Popple,
    Ralph Campbell, Ira Weiny, Steven Price, SeongJae Park,
    Naoya Horiguchi, Christophe Leroy, Zack Rusin, Jason Gunthorpe,
    Axel Rasmussen, Anshuman Khandual, Pasha Tatashin, Miaohe Lin,
    Minchan Kim, Christoph Hellwig, Song Liu, Thomas Hellstrom,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 04/31] mm/pgtable: allow pte_offset_map[_lock]() to fail
In-Reply-To: <87y1juxvmb.fsf@linux.ibm.com>
Message-ID: <1bb35f43-8556-8654-b11d-98ecc1f1dc0@google.com>
References: <68a97fbe-5c1e-7ac6-72c-7b9c6290b370@google.com>
    <8218ffdc-8be-54e5-0a8-83f5542af283@google.com>
    <87y1juxvmb.fsf@linux.ibm.com>

Hi Aneesh,

On Wed, 5 Jul 2023, Aneesh Kumar K.V wrote:
>
> Hi Hugh,
>
> Sorry for not checking about this before.

I was about to say "No problem" - but it appears I would be lying!

> I am looking at a kernel
> crash (BUG_ON()) on ppc64 with 4K page size. The reason we hit
> BUG_ON() is because we have pmd_same calling BUG_ON on 4K with hash
> translation. We don't support THP with 4k page size and hash
> translation.

I misunderstood you at first.  I was trying to work out what in that
context might lead to *pmd changing suddenly, was going to ask for
stack backtrace (in faulting? or in copying mm? or?), whether you
have PER_VMA_LOCK enabled, etc. etc.

Then I looked at the source: oh, that is gross, and not something I had
expected at all.

> > +pte_t *__pte_offset_map_lock(struct mm_struct *mm, pmd_t *pmd,
> > +			     unsigned long addr, spinlock_t **ptlp)
> > +{
> > +	spinlock_t *ptl;
> > +	pmd_t pmdval;
> > +	pte_t *pte;
> > +again:
> > +	pte = __pte_offset_map(pmd, addr, &pmdval);
> > +	if (unlikely(!pte))
> > +		return pte;
> > +	ptl = pte_lockptr(mm, &pmdval);
> > +	spin_lock(ptl);
> > +	if (likely(pmd_same(pmdval, pmdp_get_lockless(pmd)))) {
> > +		*ptlp = ptl;
> > +		return pte;
> > +	}
> > +	pte_unmap_unlock(pte, ptl);
> > +	goto again;
> > +}
>
> What is expected by that pmd_same check? We are holding pte lock
> and not pmd lock. So contents of pmd can change.

And you don't need me to answer that question: the answer is in the
"likely".  We do not expect *pmd to change there (though maybe some
ancillary bits of it, like "accessed"), unless the page table is on
its way to being freed; and other locking higher up (mmap_lock or
rmap lock) prevents all possibilities of that at present.  Later, we
arrange to hold pte lock as well as pmd lock when removing page table.

So the obvious quick fix is:

--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -138,8 +138,7 @@ static inline int hash__pmd_trans_huge(p

 static inline int hash__pmd_same(pmd_t pmd_a, pmd_t pmd_b)
 {
-	BUG();
-	return 0;
+	return 1;
 }

 static inline pmd_t hash__pmd_mkhuge(pmd_t pmd)

But I hope you will reject that as almost as gross, and instead commit
a patch which makes hash__pmd_same() ... check whether the pmd_ts are the
same - as in ppc64's other implementations.  That will save having to
change it again, when/if someone extends the current
replace-page-table-by-THP work by non-THP work to remove empty page
tables without mmap_lock or rmap lock.

Thanks for finding this, Aneesh, and I'm sorry I didn't notice it before,
and I'm sorry to have given you trouble... but really, that BUG(),

(H)ugh!
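
To illustrate the difference between the quick fix and a real
comparison, here is a minimal userspace sketch.  It is not the kernel
code: sketch_pmd_t, sketch_pmd_same_always() and sketch_pmd_same() are
hypothetical stand-ins invented for this illustration, and a real
hash__pmd_same() might need to mask arch-specific bits before comparing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pmd_t; the kernel type is arch-specific. */
typedef struct { uint64_t pmd; } sketch_pmd_t;

/* Models the quick fix: always reports "same", so a change is never seen. */
static inline int sketch_pmd_same_always(sketch_pmd_t a, sketch_pmd_t b)
{
	(void)a;
	(void)b;
	return 1;
}

/* Models a real comparison: two entries are the same iff they XOR to zero. */
static inline int sketch_pmd_same(sketch_pmd_t a, sketch_pmd_t b)
{
	return (a.pmd ^ b.pmd) == 0;
}

/*
 * Models the recheck done after taking the pte lock: compare the value
 * sampled before locking against the value read again under the lock.
 * Returns 1 if the caller may proceed, 0 if it should unlock and retry.
 */
static inline int sketch_recheck(sketch_pmd_t snapshot, sketch_pmd_t current)
{
	return sketch_pmd_same(snapshot, current);
}

/* Small self-check showing where the two variants disagree. */
static inline void sketch_selfcheck(void)
{
	sketch_pmd_t before = { 0x1000 };
	sketch_pmd_t after  = { 0x2000 };

	assert(sketch_recheck(before, before));        /* unchanged: proceed */
	assert(!sketch_recheck(before, after));        /* changed: retry */
	assert(sketch_pmd_same_always(before, after)); /* change goes unnoticed */
}
```

With the always-true variant the retry path in a loop like
__pte_offset_map_lock() could never be taken, which is harmless only
for as long as nothing can change *pmd underneath it.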