From: "Zach O'Keefe"
Date: Thu, 14 Apr 2022 17:52:43 -0700
Subject: Re: [PATCH v2 00/12] mm: userspace hugepage collapse
To: Peter Xu
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
    Michal Hocko, Pasha Tatashin, SeongJae Park, Song Liu,
    Vlastimil Babka, Yang Shi, Zi Yan, linux-mm@kvack.org,
    Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
    Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
    Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
    "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
    Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer
References: <20220414180612.3844426-1-zokeefe@google.com>

Hey Peter,

Thanks for taking the time to review!

On Thu, Apr 14, 2022 at 5:04 PM Peter Xu wrote:
>
> Hi, Zach,
>
> On Thu, Apr 14, 2022 at 11:06:00AM -0700, Zach O'Keefe wrote:
> > process_madvise(2)
> >
> > Performs a synchronous collapse of the native pages
> > mapped by the list of iovecs into transparent hugepages.
> >
> > Allocation semantics are the same as khugepaged, and depend on
> > (1) the active sysfs settings
> > /sys/kernel/mm/transparent_hugepage/enabled and
> > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag, and (2)
> > the VMA flags of the memory range being collapsed.
> >
> > Collapse eligibility criteria differ from khugepaged in that
> > the sysfs files
> > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_[none|swap|shared]
> > are ignored.
>
> The userspace khugepaged idea definitely makes sense to me, though I'm
> curious how the line is drawn on the different behaviors here by explicitly
> ignoring the max_ptes_* entries.
>
> Let's assume the initiative is to duplicate a more data-aware khugepaged in
> the userspace, then IMHO it makes more sense to start with all the policies
> that apply to khugepaged already, including max_ptes_*.
>
> I can understand the willingness to provide even stronger semantics here
> than khugepaged since the userspace could have very clear knowledge of how
> to provision the memories (better than a kernel scanner). It's just that
> IMHO it could be slightly confusing if the new interface only partially
> applies the khugepaged rules.
>
> No strong opinion here. It could already have been a trade-off after the
> discussion from the RFC with Michal which I read. Just curious about how
> you made that design decision so feel free to read it as a pure question.
>

I understand your point here. The allocation and max_ptes_* semantics are
split between khugepaged-like and fault-like, respectively - which could be
confusing. Originally, I proposed a MADV_F_COLLAPSE_LIMITS flag to control
the latter's behavior, but agreed to keep things simple to start, and
expand the interface if/when necessary.

I opted to ignore max_ptes_* as the default since I envisioned that early
adopters would "just want it to work". One such example would be backing
executable text by hugepages on program load when many pages haven't been
demand-paged in yet.

What do you think?

Thanks,
Zach

> Thanks,
>
> --
> Peter Xu
>