Date: Thu, 5 Jan 2023 10:18:29 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230105101844.1893104-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20230105101844.1893104-32-jthoughton@google.com>
Subject: [PATCH 31/46] hugetlb: sort hstates in hugetlb_init_hstates
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
	"Zach O'Keefe", Manish Mishra, Naoya Horiguchi,
	"Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)",
	Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Content-Type: text/plain; charset="UTF-8"

When
using HugeTLB high-granularity mapping, we need to go through the
supported hugepage sizes in decreasing order so that we pick the largest
size that works. Consider the case where we're faulting in a 1G hugepage
for the first time: we want hugetlb_fault/hugetlb_no_page to map it with
a PUD. By going through the sizes in decreasing order, we will find that
PUD_SIZE works before finding out that PMD_SIZE or PAGE_SIZE work too.

This commit also changes bootmem hugepages from storing hstate pointers
directly to storing the hstate sizes. The hstate pointers used for
boot-time-allocated hugepages become invalid after we sort the hstates.
`gather_bootmem_prealloc`, called after the hstates have been sorted,
now converts the size to the correct hstate.

Signed-off-by: James Houghton
---
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 49 ++++++++++++++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index daf993fdbc38..8a664a9dd0a8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -789,7 +789,7 @@ struct hstate {
 
 struct huge_bootmem_page {
 	struct list_head list;
-	struct hstate *hstate;
+	unsigned long hstate_sz;
 };
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2fb95ecafc63..1e9e149587b3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include
@@ -49,6 +50,10 @@
 
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
+/*
+ * After hugetlb_init_hstates is called, hstates will be sorted from largest
+ * to smallest.
+ */
 struct hstate hstates[HUGE_MAX_HSTATE];
 
 #ifdef CONFIG_CMA
@@ -3347,7 +3352,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
-	m->hstate = h;
+	m->hstate_sz = huge_page_size(h);
 	return 1;
 }
 
@@ -3362,7 +3367,7 @@ static void __init gather_bootmem_prealloc(void)
 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = page_folio(page);
-		struct hstate *h = m->hstate;
+		struct hstate *h = size_to_hstate(m->hstate_sz);
 
 		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(folio_ref_count(folio) != 1);
@@ -3478,9 +3483,38 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 	kfree(node_alloc_noretry);
 }
 
+static int compare_hstates_decreasing(const void *a, const void *b)
+{
+	unsigned long sz_a = huge_page_size((const struct hstate *)a);
+	unsigned long sz_b = huge_page_size((const struct hstate *)b);
+
+	if (sz_a < sz_b)
+		return 1;
+	if (sz_a > sz_b)
+		return -1;
+	return 0;
+}
+
+static void sort_hstates(void)
+{
+	unsigned long default_hstate_sz = huge_page_size(&default_hstate);
+
+	/* Sort from largest to smallest. */
+	sort(hstates, hugetlb_max_hstate, sizeof(*hstates),
+	     compare_hstates_decreasing, NULL);
+
+	/*
+	 * We may have changed the location of the default hstate, so we need to
+	 * update it.
+	 */
+	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_sz));
+}
+
 static void __init hugetlb_init_hstates(void)
 {
-	struct hstate *h, *h2;
+	struct hstate *h;
+
+	sort_hstates();
 
 	for_each_hstate(h) {
 		/* oversize hugepages were init'ed in early boot */
@@ -3499,13 +3533,8 @@ static void __init hugetlb_init_hstates(void)
 			continue;
 		if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER)
 			continue;
-		for_each_hstate(h2) {
-			if (h2 == h)
-				continue;
-			if (h2->order < h->order &&
-			    h2->order > h->demote_order)
-				h->demote_order = h2->order;
-		}
+		if (h - 1 >= &hstates[0])
+			h->demote_order = huge_page_order(h - 1);
 	}
 }
 
-- 
2.39.0.314.g84b9a713c41-goog
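
For readers who want to try the idea outside the kernel, here is a
minimal userspace sketch of the two points the commit message makes:
sizes are sorted largest-first so a walk finds the largest usable size
first, and anything that referred to an entry before the sort has to be
looked up again by size afterwards. Everything in it is hypothetical and
for illustration only: struct demo_hstate, the demo_* helpers, and the
"works" test (address aligned to the size and enough length left) are
assumptions, not kernel code, and qsort() stands in for the kernel's
sort().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hstate: only the page size matters here. */
struct demo_hstate {
	unsigned long size;	/* page size in bytes */
};

/* qsort() comparator: larger sizes sort first. */
static int demo_cmp_size_desc(const void *a, const void *b)
{
	unsigned long sa = ((const struct demo_hstate *)a)->size;
	unsigned long sb = ((const struct demo_hstate *)b)->size;

	if (sa < sb)
		return 1;
	if (sa > sb)
		return -1;
	return 0;
}

/* Linear lookup by size; returns n if no entry has that size. */
static size_t demo_size_to_idx(const struct demo_hstate *hs, size_t n,
			       unsigned long size)
{
	for (size_t i = 0; i < n; i++)
		if (hs[i].size == size)
			return i;
	return n;
}

/*
 * Sort largest-first and return the new index of the entry that was the
 * default before sorting. The size is remembered, not the index or a
 * pointer, because sorting moves the entries around.
 */
static size_t demo_sort_hstates(struct demo_hstate *hs, size_t n,
				size_t default_idx)
{
	unsigned long default_sz;

	assert(default_idx < n);
	default_sz = hs[default_idx].size;
	qsort(hs, n, sizeof(*hs), demo_cmp_size_desc);
	return demo_size_to_idx(hs, n, default_sz);
}

/*
 * Walk a largest-first array and return the index of the first size that
 * "works". Assumed criterion: addr is aligned to the size and at least
 * one full page of that size fits in len. Returns n if nothing fits.
 */
static size_t demo_pick_largest(const struct demo_hstate *hs, size_t n,
				unsigned long addr, unsigned long len)
{
	for (size_t i = 0; i < n; i++)
		if (addr % hs[i].size == 0 && len >= hs[i].size)
			return i;
	return n;
}
```

Called on an array holding 2M, 1G and 64K entries with the 2M entry as
the default, demo_sort_hstates() reorders it to 1G, 2M, 64K and reports
the default at its new position; demo_pick_largest() then returns the 1G
entry for a 1G-aligned range of at least 1G and falls back to smaller
sizes otherwise.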