Date: Tue, 9 Mar 2021 08:15:41 -0800
From: Minchan Kim
To: Michal Hocko
Cc: Andrew Morton, linux-mm, LKML, John Dias, David Hildenbrand, Jason Baron
Subject: Re: [PATCH v2] mm: 
page_alloc: dump migrate-failed pages
References: <20210308202047.1903802-1-minchan@kernel.org>

On Tue, Mar 09, 2021 at 10:32:51AM +0100, Michal Hocko wrote:
> On Mon 08-03-21 12:20:47, Minchan Kim wrote:
> > alloc_contig_range is usually used on cma area or movable zone.
> > It's critical if the page migration fails on those areas so
> > dump more debugging message.
>
> I disagree with this statement. alloc_contig_range is not a reliable
> allocator. Any user, be it CMA or direct users of alloc_contig_range
> have to deal with allocation failures. Debugging information can be
> still useful but considering migration failures critical is
> overstatement to say the least.

Fair enough. Let's change it.

"Currently, debugging CMA allocation failure is too hard
due to the lack of page information. alloc_contig_range is
the proper place to dump them since it has the migrate-failed
page list."

>
> > page refcount, mapcount with page flags on dump_page are
> > helpful information to deduce the culprit. Furthermore,
> > dump_page_owner was super helpful to find long term pinner
> > who initiated the page allocation.
> >
> > Admin could enable the dump like this(by default, disabled)
> >
> > 	echo "func dump_migrate_failure_pages +p" > control
> >
> > Admin could disable it.
> >
> > 	echo "func dump_migrate_failure_pages =_" > control
>
> My original idea was to add few pr_debug and -DDYNAMIC_DEBUG_MODULE for
> page_alloc.c. It makes sense to enable a whole bunch at once though.
> The naming should better reflect this is alloc_contig_range related
> because the above sounds like a generic migration failure thing.

alloc_contig_dump_pages?

>
> Somebody more familiar with the dynamic debugging infrastructure needs
> to have a look but from a quick look it seems ok.
>
> Do we really need all the ugly ifdefery, though? Don't we want to have
> this compiled in all the time and just rely on the static branch managed
> by the dynamic debugging framework?

I have no further ideas for making it simpler while we keep the
flexibility for arguments and print format.

#if defined(CONFIG_DYNAMIC_DEBUG) || \
	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
static void alloc_contig_dump_pages(struct list_head *page_list)
{
	static DEFINE_RATELIMIT_STATE(_rs,
					DEFAULT_RATELIMIT_INTERVAL,
					DEFAULT_RATELIMIT_BURST);

	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
			"migrate failure");

	if (DYNAMIC_DEBUG_BRANCH(descriptor) && __ratelimit(&_rs)) {
		struct page *page;

		WARN(1, "failed callstack");
		list_for_each_entry(page, page_list, lru)
			dump_page(page, "migration failure");
	}
}
#else
static inline void alloc_contig_dump_pages(struct list_head *page_list)
{
}
#endif

>
> [...]
> > +void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +			"migrate failure");
> > +	if (DYNAMIC_DEBUG_BRANCH(descriptor) &&
> > +			alloc_contig_ratelimit()) {
> > +		struct page *page;
> > +
> > +		WARN(1, "failed callstack");
> > +		list_for_each_entry(page, page_list, lru)
> > +			dump_page(page, "migration failure");
> > +	}
>
> Apart from the above, do we have to warn for something that is a
> debugging aid? A similar concern wrt dump_page which uses pr_warn and

Makes sense.

> page owner is using even pr_alert.
> Would it make sense to add a loglevel parameter both into __dump_page
> and dump_page_owner?

Let me try it.
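
For anyone following the thread who has not used this pattern, here is a rough userspace sketch of what the gating in alloc_contig_dump_pages() amounts to: a switch that is off by default, checked first, and a rate limit that is only consumed once the switch is on. This is only a model, not kernel code. The names dump_ratelimit, dump_enabled, dump_ratelimit_ok and dump_failed_pages are made up for illustration, the plain bool stands in for the dynamic debug static branch, and the window logic is a simplification of what __ratelimit() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's ratelimit state. */
struct dump_ratelimit {
	long interval;	/* window length, in caller-supplied ticks */
	int burst;	/* dumps allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* dumps emitted in the current window */
};

/* Stand-in for the dynamic debug branch: disabled by default. */
static bool dump_enabled;

/* Return true if another dump is allowed at time `now`. */
static bool dump_ratelimit_ok(struct dump_ratelimit *rs, long now)
{
	assert(rs != NULL);

	/* Open a fresh window once the previous one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/*
 * Model of the gated dump: do nothing unless the switch is on and the
 * rate limit allows it. The switch is tested first, so a disabled dump
 * does not use up the burst budget. Returns the number of entries dumped.
 */
static int dump_failed_pages(struct dump_ratelimit *rs, long now,
			     const unsigned long *pfns, int nr)
{
	int i;

	if (!dump_enabled || !dump_ratelimit_ok(rs, now))
		return 0;

	for (i = 0; i < nr; i++)
		fprintf(stderr, "migration failure: pfn %#lx\n", pfns[i]);
	return nr;
}
```

With interval = 5 and burst = 2, calls made while dump_enabled is false print nothing and leave the budget untouched. Once it is set, the first two calls in a window dump their list and later calls in that window are dropped until the window rolls over.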