From: David Hildenbrand
Organization: Red Hat
To: Mike Kravetz
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 inuxppc-dev@lists.ozlabs.org, linux-ia64@vger.kernel.org, Baolin Wang,
 "Aneesh Kumar K . V", Naoya Horiguchi, Michael Ellerman, Muchun Song,
 Andrew Morton, Christophe Leroy
Subject: Re: [PATCH] hugetlb: simplify hugetlb handling in follow_page_mask
Date: Wed, 31 Aug 2022 10:07:24 +0200
Message-ID: <739dc825-ece3-a59f-adc5-65861676e0ae@redhat.com>
References: <20220829234053.159158-1-mike.kravetz@oracle.com>
 <608934d4-466d-975e-6458-34a91ccb4669@redhat.com>

On 30.08.22 23:31, Mike Kravetz wrote:
> On 08/30/22 09:52, Mike Kravetz wrote:
>> On 08/30/22 10:11, David Hildenbrand wrote:
>>> On 30.08.22 01:40, Mike Kravetz wrote:
>>>> During discussions of this series [1], it was suggested that hugetlb
>>>> handling code in follow_page_mask could be simplified. At the beginning
>>>
>>> Feel free to use a Suggested-by if you consider it appropriate.
>>>
>>>> of follow_page_mask, there currently is a call to follow_huge_addr which
>>>> 'may' handle hugetlb pages. ia64 is the only architecture which provides
>>>> a follow_huge_addr routine that does not return error. Instead, at each
>>>> level of the page table a check is made for a hugetlb entry. If a hugetlb
>>>> entry is found, a call to a routine associated with that entry is made.
>>>>
>>>> Currently, there are two checks for hugetlb entries at each page table
>>>> level. The first check is of the form:
>>>> 	if (p?d_huge())
>>>> 		page = follow_huge_p?d();
>>>> the second check is of the form:
>>>> 	if (is_hugepd())
>>>> 		page = follow_huge_pd().
>>>
>>> BTW, what about all this hugepd stuff in mm/pagewalk.c?
>>>
>>> Isn't this all dead code as we're essentially routing all hugetlb VMAs
>>> via walk_hugetlb_range? [yes, all that hugepd stuff in generic code that
>>> overcomplicates stuff has been annoying me for a long time]
>>
>> I am 'happy' to look at cleaning up that code next. Perhaps I will just
>> create a cleanup series.
>>
>
> Technically, that code is not dead IIUC. The call to walk_hugetlb_range in
> __walk_page_range is as follows:
>
> 	if (vma && is_vm_hugetlb_page(vma)) {
> 		if (ops->hugetlb_entry)
> 			err = walk_hugetlb_range(start, end, walk);
> 	} else
> 		err = walk_pgd_range(start, end, walk);
>
> We also have the interface walk_page_range_novma() that will call
> __walk_page_range without a value for vma. So, in that case we would
> end up calling walk_pgd_range, etc. walk_pgd_range and related routines
> do have those checks such as:
>
> 		if (is_hugepd(__hugepd(pmd_val(*pmd))))
> 			err = walk_hugepd_range((hugepd_t *)pmd, addr, next, walk, PMD_SHIFT);
>
> So, it looks like in this case we would process 'hugepd' entries but not
> 'normal' hugetlb entries. That does not seem right.

:/ walking a hugetlb range without knowing whether it's a hugetlb range
is certainly questionable.

>
> Christophe Leroy added this code with commit e17eae2b8399 "mm: pagewalk: fix
> walk for hugepage tables". This was part of the series "Convert powerpc to
> GENERIC_PTDUMP". And, the ptdump code uses the walk_page_range_novma
> interface. So, this code is certainly not dead.

Hm, that commit doesn't actually mention how it can happen, what exactly
will happen ("crazy result") and if it ever happened.

>
> Adding Christophe on Cc:
>
> Christophe do you know if is_hugepd is true for all hugetlb entries, not
> just hugepd?
>
> On systems without hugepd entries, I guess ptdump skips all hugetlb entries.
> Sigh!

IIUC, the idea of ptdump_walk_pgd() is to dump page tables even outside
VMAs (for debugging purposes?).

I cannot convince myself that that's a good idea when only holding the
mmap lock in read mode, because we can just see page tables getting
freed concurrently e.g., during concurrent munmap() ... while holding
the mmap lock in read mode we may only walk inside VMA boundaries.

That then raises the question of whether we're only calling this on
special MMs (e.g., init_mm) whereby we cannot really see concurrent
munmap() and where we shouldn't have hugetlb mappings or hugepd entries.

-- 
Thanks,

David / dhildenb
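
For anyone following the thread without the source open, here is a rough
user-space model of the dispatch discussed above. It is only a sketch, not
kernel code: every type and function name below (model_vma, model_ops,
model_walk_page_range, model_pgd_walk_treats_as_hugetlb) is a hypothetical
stand-in, and the second helper merely encodes Mike's reading of the generic
walk, which the thread itself has not confirmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel types. */
enum walker { WALK_NONE, WALK_HUGETLB_RANGE, WALK_PGD_RANGE };
enum entry_kind { ENTRY_NORMAL, ENTRY_HUGEPD, ENTRY_HUGE_LEAF };

struct model_vma { bool is_hugetlb; };
struct model_ops { bool has_hugetlb_entry; };

/*
 * Mirrors the branch structure of the __walk_page_range snippet quoted in
 * the thread: a hugetlb VMA goes to the hugetlb walker (if the ops provide
 * a hugetlb_entry callback), everything else, including the no-VMA case,
 * goes to the generic pgd walker.
 */
static enum walker model_walk_page_range(const struct model_vma *vma,
					 const struct model_ops *ops)
{
	assert(ops != NULL);

	if (vma && vma->is_hugetlb) {
		if (ops->has_hugetlb_entry)
			return WALK_HUGETLB_RANGE;
		return WALK_NONE;	/* hugetlb VMA is skipped entirely */
	}
	return WALK_PGD_RANGE;		/* also taken when vma == NULL */
}

/*
 * Assumption taken from Mike's reading above: inside the generic walk only
 * hugepd entries get hugetlb-specific handling (the is_hugepd() check);
 * 'normal' hugetlb leaf entries do not.
 */
static bool model_pgd_walk_treats_as_hugetlb(enum entry_kind kind)
{
	return kind == ENTRY_HUGEPD;
}
```

With a NULL vma, as in the walk_page_range_novma() case, the model always
lands in WALK_PGD_RANGE, which is where the asymmetry between hugepd and
'normal' hugetlb entries shows up.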