From: Shakeel Butt
Date: Thu, 8 Apr 2021 10:18:47 -0700
Subject: Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory
To: Tim Chen
Cc: Michal Hocko, Johannes Weiner, Andrew Morton, Dave Hansen, Ying Huang,
    Dan Williams, David Rientjes, Linux MM, Cgroups, LKML

Hi Tim,

On Mon, Apr 5, 2021 at 11:08 AM Tim Chen wrote:
>
> Traditionally, all memory is DRAM.  Some DRAM might be closer/faster than
> others NUMA wise, but a byte of media has about the same cost whether it
> is close or far.  But, with new memory tiers such as Persistent Memory
> (PMEM), there is a choice between fast/expensive DRAM and slow/cheap
> PMEM.
>
> The fast/expensive memory lives in the top tier of the memory hierarchy.
>
> Previously, the patchset
> [PATCH 00/10] [v7] Migrate Pages in lieu of discard
> https://lore.kernel.org/linux-mm/20210401183216.443C4443@viggo.jf.intel.com/
> provides a mechanism to demote cold pages from DRAM node into PMEM.
>
> And the patchset
> [PATCH 0/6] [RFC v6] NUMA balancing: optimize memory placement for memory tiering system
> https://lore.kernel.org/linux-mm/20210311081821.138467-1-ying.huang@intel.com/
> provides a mechanism to promote hot pages in PMEM to the DRAM node
> leveraging autonuma.
>
> The two patchsets together keep the hot pages in DRAM and colder pages
> in PMEM.

Thanks for working on this, as it is becoming more and more important,
particularly in data centers where memory is a big portion of the
cost.

I see you have responded to Michal and I will add my more specific
response there. Here I wanted to give my high-level concern regarding
using v1's soft-limit-like semantics for top tier memory.

This patch series aims to distribute/partition top tier memory between
jobs of different priorities.
We want high priority jobs to have preferential access to the top tier
memory and we don't want low priority jobs to hog it.

Using v1's soft-limit-like behavior can potentially cause high
priority jobs to stall on their allocation path while enough space is
made on the top tier, and I think this patchset aims to reduce that
impact by making kswapd do that work. However, I think the more
concerning issue is the low priority job hogging the top tier memory.

The possible ways the low priority job can hog the top tier memory are
by allocating non-movable memory or by mlocking the memory. (Oh, there
is also pinning the memory, but I don't know if there is a user API to
pin memory?) For the mlocked memory, you need to either modify the
reclaim code or use a different mechanism for demoting cold memory.

Basically, I am saying we should put an upfront control (limit) on the
usage of top tier memory by the jobs.
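
The hogging concern above can be illustrated with a toy model. This is
only a sketch, not kernel code: the Job class, the per-job "limit", the
page counts and the simulate() helper are all hypothetical, and the
model simply assumes that reclaim-based demotion skips mlocked and
non-movable pages, as described above. It compares soft-limit-like
behavior (allocate freely, reclaim the excess later) with an upfront
limit enforced at allocation time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    limit: int        # top-tier pages the job is entitled to (hypothetical knob)
    movable: int = 0  # top-tier pages that reclaim could demote
    pinned: int = 0   # top-tier pages reclaim skips (mlocked or non-movable)

    @property
    def usage(self) -> int:
        return self.movable + self.pinned


def allocate(job, pages, free, pinned, upfront):
    """Grant top-tier pages to a job and return how many were granted."""
    granted = min(pages, free)
    if upfront:
        # Upfront limit: refuse anything beyond the limit at allocation time.
        granted = min(granted, max(0, job.limit - job.usage))
    if pinned:
        job.pinned += granted
    else:
        job.movable += granted
    return granted


def reclaim(job):
    """Soft-limit style reclaim: demote only movable pages above the limit."""
    demoted = min(max(0, job.usage - job.limit), job.movable)
    job.movable -= demoted
    return demoted


def simulate(upfront, capacity=100):
    """Return (low priority usage, pages granted to the high priority job)."""
    low = Job("low", limit=20)
    high = Job("high", limit=80)
    free = capacity
    # The low priority job runs first and tries to mlock 80 pages.
    free -= allocate(low, 80, free, pinned=True, upfront=upfront)
    # Background reclaim then tries to push it back under its limit.
    free += reclaim(low)
    # The high priority job now asks for 60 movable pages.
    got = allocate(high, 60, free, pinned=False, upfront=upfront)
    return low.usage, got


if __name__ == "__main__":
    print("soft-limit-like:", simulate(upfront=False))
    print("upfront limit:  ", simulate(upfront=True))
```

In this model the soft-limit-like run leaves the low priority job
holding 80 pinned pages, so the high priority job gets only 20 of the
60 pages it asked for. With the upfront limit the low priority job is
capped at 20 pages and the high priority job gets all 60.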