References: <20200818184122.29C415DF@viggo.jf.intel.com>
 <20200818184131.C972AFCC@viggo.jf.intel.com>
 <87lfi9wxk9.fsf@yhuang-dev.intel.com>
 <6a378a57-a453-0318-924b-05dfa0a10c1f@intel.com>
In-Reply-To: <6a378a57-a453-0318-924b-05dfa0a10c1f@intel.com>
From: Yang Shi
Date: Thu, 20 Aug 2020 09:26:57 -0700
Subject: Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim
To: Dave Hansen
Cc: "Huang, Ying", Dave Hansen, Linux Kernel Mailing List, Yang Shi,
 David Rientjes, Dan Williams, Linux-MM

On Thu, Aug 20, 2020 at 8:22 AM Dave Hansen wrote:
>
> On 8/20/20 1:06 AM, Huang, Ying wrote:
> >> +	/* Migrate pages selected for demotion */
> >> +	nr_reclaimed += demote_page_list(&ret_pages, &demote_pages, pgdat, sc);
> >> +
> >>  	pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
> >>
> >>  	mem_cgroup_uncharge_list(&free_pages);
> >> _
> > Generally, it's good to batch the page migration. But one side effect
> > is that, if the pages are failed to be migrated, they will be placed
> > back to the LRU list instead of falling back to be reclaimed really.
> > This may cause some issue in some situation. For example, if there's no
> > enough space in the PMEM (slow) node, so the page migration fails, OOM
> > may be triggered, because the direct reclaiming on the DRAM (fast) node
> > may make no progress, while it can reclaim some pages really before.
>
> Yes, agreed.

Kind of. But I think that should be transient and very rare. The
kswapd on the pmem nodes will be woken up to drop pages when we try to
allocate migration target pages. It should be very rare that there are
no reclaimable pages on the pmem nodes.

>
> There are a couple of ways we could fix this. Instead of splicing
> 'demote_pages' back into 'ret_pages', we could try to get them back on
> 'page_list' and goto the beginning on shrink_page_list(). This will
> probably yield the best behavior, but might be a bit ugly.
>
> We could also add a field to 'struct scan_control' and just stop trying
> to migrate after it has failed one or more times. The trick will be
> picking a threshold that doesn't mess with either the normal reclaim
> rate or the migration rate.

In my patchset I implemented a fallback mechanism by adding a new
PGDAT_CONTENDED node flag. Please check this out:
https://patchwork.kernel.org/patch/10993839/.

Basically, the PGDAT_CONTENDED flag is set once migrate_pages()
returns -ENOMEM, which indicates that the target pmem node is under
memory pressure; reclaim then falls back to the regular reclaim path.
The flag is cleared by clear_pgdat_congested() once the memory
pressure on the pmem node is gone.

We already use node flags to indicate the state of a node in the
reclaim code, e.g. PGDAT_WRITEBACK, PGDAT_DIRTY, etc. So adding a new
flag sounds more straightforward to me.

>
> This is on my list to fix up next.
>