Date: Thu, 3 Sep 2020 17:23:00 +0300
From: "Kirill A. Shutemov"
To: Zi Yan
Cc: linux-mm@kvack.org, Roman Gushchin, Rik van Riel, "Kirill A. Shutemov", Matthew Wilcox, Shakeel Butt, Yang Shi, David Nellans, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/16] 1GB THP support on x86_64
Message-ID: <20200903142300.bjq2um5y5nwocvar@box>
In-Reply-To: <20200902180628.4052244-1-zi.yan@sent.com>

On Wed, Sep 02, 2020 at 02:06:12PM -0400, Zi Yan wrote:
> From: Zi Yan
>
> Hi all,
>
> This patchset adds support for 1GB THP on x86_64. It is on top of
> v5.9-rc2-mmots-2020-08-25-21-13.
>
> 1GB THP is more flexible for reducing translation overhead and increasing the
> performance of applications with large memory footprint without application
> changes compared to hugetlb.

This statement needs a lot of justification. I don't see 1GB THP as viable
for any workload. Opportunistic 1GB allocation is a very questionable
strategy.

> Design
> =======
>
> 1GB THP implementation looks similar to existing THP code except some new designs
> for the additional page table level.
>
> 1. Page table deposit and withdraw using a new pagechain data structure:
>    instead of one PTE page table page, 1GB THP requires 513 page table pages
>    (one PMD page table page and 512 PTE page table pages) to be deposited
>    at the page allocation time, so that we can split the page later. Currently,
>    the page table deposit is using ->lru, thus only one page can be deposited.

False. Current code can deposit an arbitrary number of page tables.

What may be a problem for you is that these page tables are tied to the
struct page of the PMD page table.

>    A new pagechain data structure is added to enable multi-page deposit.
>
> 2. Triple mapped 1GB THP: 1GB THP can be mapped by a combination of PUD, PMD,
>    and PTE entries. Mixing PUD and PTE mapping can be achieved with existing
>    PageDoubleMap mechanism. To add PMD mapping, PMDPageInPUD and
>    sub_compound_mapcount are introduced. PMDPageInPUD is the 512-aligned base
>    page in a 1GB THP and sub_compound_mapcount counts the PMD mapping by using
>    page[N*512 + 3].compound_mapcount.

I had a hard time reasoning about DoubleMap vs. rmap. Good for you if you
get it right.

> 3. Using CMA allocation for 1GB THP: instead of bumping MAX_ORDER, it is more sane
>    to use something less intrusive. So all 1GB THPs are allocated from reserved
>    CMA areas shared with hugetlb. At page splitting time, the bitmap for the 1GB
>    THP is cleared as the resulting pages can be freed via normal page free path.
>    We can fall back to alloc_contig_pages for 1GB THP if necessary.

-- 
 Kirill A. Shutemov
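
For reference, the arithmetic behind the "513 page table pages" and the
page[N*512 + 3] index quoted above can be sketched in plain userspace C.
This is only an illustration, assuming x86_64 with 4KB base pages and 512
entries per page table. The helper names (pud_thp_bytes,
pud_thp_deposit_count, sub_mapcount_page_index) are made up for this sketch
and are not kernel or patchset functions.

```c
#include <assert.h>

/* Assumption: x86_64 with 4KB base pages, 512 entries per page table. */
#define ENTRIES_PER_TABLE 512UL
#define BASE_PAGE_SIZE    4096UL

/* Hypothetical helper: bytes covered by one PUD entry (512 * 512 * 4KB). */
static unsigned long pud_thp_bytes(void)
{
	return ENTRIES_PER_TABLE * ENTRIES_PER_TABLE * BASE_PAGE_SIZE;
}

/*
 * Hypothetical helper: page tables needed to split a PUD-mapped page all
 * the way down to PTEs: one PMD table plus one PTE table per PMD entry.
 */
static unsigned long pud_thp_deposit_count(void)
{
	return 1 + ENTRIES_PER_TABLE;
}

/*
 * Hypothetical helper: index of the subpage whose compound_mapcount the
 * cover letter reuses for the n-th 2MB chunk, i.e. page[n*512 + 3].
 */
static unsigned long sub_mapcount_page_index(unsigned long n)
{
	assert(n < ENTRIES_PER_TABLE);	/* only 512 2MB chunks per 1GB page */
	return n * ENTRIES_PER_TABLE + 3;
}
```

Calling sub_mapcount_page_index(1) gives the subpage index for the second
2MB chunk. Every chunk uses the same offset of 3 from its own 512-aligned
base page.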