Date: Thu, 4 Mar 2021 10:11:35 -0800
From: Minchan Kim
To: David Hildenbrand
Cc: Michal Hocko, Andrew Morton, linux-mm, LKML, joaodias@google.com
Subject: Re: [PATCH] mm: be more verbose for alloc_contig_range faliures

On Thu, Mar 04, 2021 at 06:23:09PM +0100, David Hildenbrand wrote:
> > > You want to debug something, so you try triggering it and capturing debug
> > > data. There are not that many alloc_contig_range() users such that this
> > > would really be an issue to isolate ...
> >
> > cma_alloc uses alloc_contig_range and cma_alloc has lots of users.
> > Even, it is exported by dmabuf so any userspace would trigger the
> > allocation by their own. Some of them could be tolerant of the failure,
> > rest of them could be critical. We shouldn't expect it by limited kernel
> > usecase.
>
> Assume you are debugging allocation failures. You either collect the data
> yourself or ask someone to send you that output. You care about any
> alloc_contig_range() allocation failures that shouldn't happen, don't you?
>
> >
> > >
> > > Strictly speaking: any allocation failure on ZONE_MOVABLE or CMA is
> > > problematic (putting aside NORETRY logic and similar aside). So any such
> > > page you hit is worth investigating and, therefore, worth getting logged for
> > > debugging purposes.
> >
> > If you believe the every alloc_contig_range failure is problematic
>
> Every one where we should have guarantees I guess: ZONE_MOVABLE or
> MIGRATE_CMA. On ZONE_NORMAL, there are no guarantees.

Indeed.

>
> > and there is no such real example I mentioned above in the world,
> > I am happy to put this chunk to support dynamic debugging.
> > Okay?
> >
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> > +static DEFINE_RATELIMIT_STATE(alloc_contig_ratelimit_state,
> > +		DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
> > +int alloc_contig_ratelimit(void)
> > +{
> > +	return __ratelimit(&alloc_contig_ratelimit_state);
> > +}
> > +
>
> ^ do we need ratelimiting with dynamic debugging enabled?

The main argument was debug message flooding. Even with dynamic
debugging, that issue never disappears.

>
> > +void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +			"migrate failure");
> > +	if (DYNAMIC_DEBUG_BRANCH(descriptor) &&
> > +			alloc_contig_ratelimit()) {
> > +		struct page *page;
> > +
> > +		WARN(1, "failed callstack");
> > +		list_for_each_entry(page, page_list, lru)
> > +			dump_page(page, "migration failure");
>
> Are all pages on the list guaranteed to be problematic, or only the first
> entry? I assume all.

All.

>
> > +	}
> > +}
> > +#else
> > +static inline void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +}
> > +#endif
> > +
> >  /* [start, end) must belong to a single zone. */
> >  static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  					unsigned long start, unsigned long end)
> > @@ -8496,6 +8522,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
> >  		}
> >  		if (ret < 0) {
> > +			dump_migrate_failure_pages(&cc->migratepages);
> >  			putback_movable_pages(&cc->migratepages);
> >  			return ret;
> >  		}
> >
> >
>
> If that's the way dynamic debugging is configured/enabled (still have to
> look into it) - yes, that goes into the right direction. As I said above,
> you should dump only where we have some kind of guarantees I assume.

Sure, let me wait for your review before sending the next revision.

Thanks for the review!
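
For anyone following the thread, the gating discussed above (dump only when the debug site is enabled, and even then only within a rate limit) can be sketched in plain userspace C. This is a minimal sketch under stated assumptions, not the kernel implementation: `struct ratelimit`, `ratelimit_ok()`, `dump_enabled` and `dump_failures()` are hypothetical names, the boolean flag only stands in for a dynamic debug site, an array of page frame numbers stands in for the page list, and time is passed in as an abstract tick count so the behaviour is deterministic.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a ratelimit state (assumption). */
struct ratelimit {
	long interval;	/* window length, in abstract ticks */
	int burst;	/* calls allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* calls allowed so far in the current window */
};

/* Return true if the caller may emit output at tick 'now'. */
static bool ratelimit_ok(struct ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* The window has expired: open a new one. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/* Stand-in for a dynamic debug site: off unless enabled at run time. */
static bool dump_enabled;

/*
 * Dump every entry of a failed batch, but only if the site is enabled
 * and the limiter allows it. The enable check comes first, so a
 * disabled site does not consume the rate-limit budget.
 * Returns the number of entries dumped.
 */
static int dump_failures(struct ratelimit *rs, long now,
			 const unsigned long *pfns, int n)
{
	int i;

	if (!dump_enabled || !ratelimit_ok(rs, now))
		return 0;
	for (i = 0; i < n; i++)
		printf("migration failure: pfn %#lx\n", pfns[i]);
	return n;
}
```

Note that the limit applies per failed batch rather than per page: once a batch is allowed through, every entry in it is dumped, which matches the "All." answer above. With a burst of 2 and an interval of 5 ticks, two batches in a window are dumped, a third is suppressed, and dumping resumes once the window expires.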