Date: Wed, 3 Mar 2021 12:23:22 -0800
From: Minchan Kim
To: Michal Hocko
Cc: Andrew Morton, linux-mm, LKML, joaodias@google.com, surenb@google.com, cgoldswo@codeaurora.org,
 willy@infradead.org, david@redhat.com, vbabka@suse.cz, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
References: <20210302210949.2440120-1-minchan@kernel.org>

On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
> On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> > An LRU pagevec holds a refcount on its pages until the pagevec is
> > drained. That can prevent migration, since the refcount of the page
> > is then greater than what the migration logic expects. To mitigate
> > the issue, callers of migrate_pages drain the LRU pagevecs via
> > migrate_prep or lru_add_drain_all before calling migrate_pages.
> >
> > However, that is not enough: pages entering a pagevec after the
> > draining call can still sit in the pagevec and keep preventing page
> > migration. Since some callers of migrate_pages have retry logic with
> > LRU draining, the page would migrate on a later trial, but this is
> > still fragile in that it doesn't close the fundamental race between
> > pages entering the pagevec and migration, so the migration failure
> > could in the end cause a contiguous memory allocation failure.
> >
> > To close the race, this patch disables the LRU caches (i.e.,
> > pagevecs) during ongoing migration, until the migration is done.
> >
> > Since it's really hard to reproduce, I measured how many times
> > migrate_pages retried in force mode with the debug code below.
> >
> > int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > 	..
> > 	..
> >
> > 	if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
> > 		printk(KERN_ERR "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
> > 		dump_page(page, "fail to migrate");
> > 	}
> >
> > The test repeatedly launched Android apps with a CMA allocation in
> > the background every five seconds. The total CMA allocation count
> > was about 500 during the test. With this patch, the dump_page count
> > was reduced from 400 to 30.
>
> Have you seen any improvement on the CMA allocation success rate?

Unfortunately, a cma alloc failure rate with a reasonable margin of
error is really hard to reproduce under a real workload. That's why I
measured the soft metric instead of direct cma failures under a real
workload. (I don't want to build some ad-hoc artificial benchmark and
keep tuning system knobs until it shows an extremely exaggerated result
just to sell the patch's effect.) Please say so if you believe this work
is pointless without stable data under a reproducible scenario; I am
happy to drop it.
>
> > Signed-off-by: Minchan Kim
> > ---
> > * from RFC - http://lore.kernel.org/linux-mm/20210216170348.1513483-1-minchan@kernel.org
> > * use atomic and lru_add_drain_all for strict ordering - mhocko
> > * lru_cache_disable/enable - mhocko
> >
> >  fs/block_dev.c          |  2 +-
> >  include/linux/migrate.h |  6 +++--
> >  include/linux/swap.h    |  4 ++-
> >  mm/compaction.c         |  4 +--
> >  mm/fadvise.c            |  2 +-
> >  mm/gup.c                |  2 +-
> >  mm/khugepaged.c         |  2 +-
> >  mm/ksm.c                |  2 +-
> >  mm/memcontrol.c         |  4 +--
> >  mm/memfd.c              |  2 +-
> >  mm/memory-failure.c     |  2 +-
> >  mm/memory_hotplug.c     |  2 +-
> >  mm/mempolicy.c          |  6 +++++
> >  mm/migrate.c            | 15 ++++++-----
> >  mm/page_alloc.c         |  5 +++-
> >  mm/swap.c               | 55 +++++++++++++++++++++++++++++++++++------
> >  16 files changed, 85 insertions(+), 30 deletions(-)
>
> The churn seems to be quite big for something that should have been a
> very small change. Have you considered not changing lru_add_drain_all
> but rather introducing __lru_add_drain_all that would implement the
> enforced flushing?

Good idea.

> [...]
> > +static atomic_t lru_disable_count = ATOMIC_INIT(0);
> > +
> > +bool lru_cache_disabled(void)
> > +{
> > +	return atomic_read(&lru_disable_count);
> > +}
> > +
> > +void lru_cache_disable(void)
> > +{
> > +	/*
> > +	 * lru_add_drain_all's IPI will make sure no new pages are added
> > +	 * to the pcp lists and drain them all.
> > +	 */
> > +	atomic_inc(&lru_disable_count);
>
> As already mentioned in the last review, the IPI reference is more
> cryptic than useful. I would go with something like this instead:
>
> 	/*
> 	 * lru_add_drain_all in the force mode will schedule draining on
> 	 * all online CPUs so any calls of lru_cache_disabled wrapped by
> 	 * local_lock or preemption disabled would be ordered by that.
> 	 * The atomic operation doesn't need to have stronger ordering
> 	 * requirements because that is enforced by the scheduling
> 	 * guarantees.
> 	 */

Thanks for the nice description. I will use it in the next revision if
you believe this work is useful.