Date: Mon, 8 Mar 2021 18:21:03 -0800
From: Minchan Kim
To: Andrew Morton
Cc: linux-mm, LKML, John Dias, Michal Hocko, David Hildenbrand, Jason Baron
Subject: Re: [PATCH v2] mm: page_alloc: dump migrate-failed pages
References: <20210308202047.1903802-1-minchan@kernel.org> <20210308162128.9b4a7d4c1576a72fd4878bdb@linux-foundation.org>
In-Reply-To: <20210308162128.9b4a7d4c1576a72fd4878bdb@linux-foundation.org>

On Mon, Mar 08, 2021 at 04:21:28PM -0800, Andrew Morton wrote:
> On Mon, 8 Mar 2021 12:20:47 -0800 Minchan Kim wrote:
>
> > alloc_contig_range is usually used on cma area or movable zone.
> > It's critical if the page migration fails on those areas so
> > dump more debugging message.
> >
> > page refcount, mapcount with page flags on dump_page are
> > helpful information to deduce the culprit. Furthermore,
> > dump_page_owner was super helpful to find long term pinner
> > who initiated the page allocation.
> >
> > Admin could enable the dump like this(by default, disabled)
> >
> > 	echo "func dump_migrate_failure_pages +p" > control
> >
> > Admin could disable it.
> >
> > 	echo "func dump_migrate_failure_pages =_" > control
> >
> > ...
> >
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8453,6 +8453,34 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
> > 				pageblock_nr_pages));
> > }
> >
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> > +static DEFINE_RATELIMIT_STATE(alloc_contig_ratelimit_state,
> > +		DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
> > +int alloc_contig_ratelimit(void)
> > +{
> > +	return __ratelimit(&alloc_contig_ratelimit_state);
> > +}
>
> Wow, that's an eyesore. We're missing helpers in the ratelimit code.
> Can we do something like
>
> /* description goes here */
> #define RATELIMIT2(interval, burst)
> ({
> 	static DEFINE_RATELIMIT_STATE(_rs, interval, burst);
>
> 	__ratelimit(_rs);
> })
>
> #define RATELIMIT()
> 	RATELIMIT2(DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST)
>
> > +void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +			"migrate failure");
> > +	if (DYNAMIC_DEBUG_BRANCH(descriptor) &&
> > +			alloc_contig_ratelimit()) {
> > +		struct page *page;
> > +
> > +		WARN(1, "failed callstack");
> > +		list_for_each_entry(page, page_list, lru)
> > +			dump_page(page, "migration failure");
> > +	}
> > +}
>
> Then we can simply do
>
> 	if (DYNAMIC_DEBUG_BRANCH(descriptor) && RATELIMIT())

Sounds like a good idea to me. There are many places that could benefit
from it. However, let me leave it until we have discussed this patch.
We could clean them up in a follow-up patch.

Thank you.
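
For reference, the helper suggested above could be sketched in userspace
roughly as follows. This is only an illustration, not the kernel code:
struct ratelimit_state, DEFINE_RATELIMIT_STATE and __ratelimit() here are
simplified stand-ins (second granularity, no locking, arbitrary default
values), and count_allowed() is a hypothetical helper added to exercise the
macro. Compared with the quoted sketch, it adds the line continuations and
passes &_rs, since __ratelimit() takes a pointer. It relies on GNU statement
expressions, so it assumes gcc or clang.

```c
#include <time.h>

/* Simplified stand-in for the kernel's struct ratelimit_state (assumption). */
struct ratelimit_state {
	int interval;	/* window length in seconds (the kernel uses jiffies) */
	int burst;	/* calls allowed per window */
	int printed;	/* calls allowed so far in the current window */
	time_t begin;	/* start of the current window, 0 = not started */
};

/* Illustrative defaults, not the kernel's definitions. */
#define DEFAULT_RATELIMIT_INTERVAL	5
#define DEFAULT_RATELIMIT_BURST		10

#define DEFINE_RATELIMIT_STATE(name, interval_init, burst_init)	\
	struct ratelimit_state name = {					\
		.interval = (interval_init),				\
		.burst = (burst_init),					\
	}

/* Stand-in: returns 1 if the caller may proceed, 0 if it is suppressed. */
static int __ratelimit(struct ratelimit_state *rs)
{
	time_t now = time(NULL);

	/* Open a new window on first use or once the old one has expired. */
	if (rs->begin == 0 || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}
	return 0;
}

/* Each expansion site gets its own static ratelimit state. */
#define RATELIMIT2(interval, burst)					\
({									\
	static DEFINE_RATELIMIT_STATE(_rs, interval, burst);		\
									\
	__ratelimit(&_rs);						\
})

#define RATELIMIT()							\
	RATELIMIT2(DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST)

/* Hypothetical helper: hit one rate-limited site and count what passes. */
static int count_allowed(int attempts)
{
	int allowed = 0;

	for (int i = 0; i < attempts; i++)
		if (RATELIMIT())
			allowed++;
	return allowed;
}
```

Because the state is a static local inside the statement expression, every
call site that expands RATELIMIT() is limited independently, which is what
lets the caller drop the separate alloc_contig_ratelimit() wrapper.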