Date: Tue, 31 Aug 2021 09:37:23 -0700
From: Sultan Alsawaf
To: Mel Gorman
Cc: linux-mm@kvack.org, mhocko@suse.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: Stuck looping on list_empty(list) in free_pcppages_bulk()
In-Reply-To: <20210831124449.GB4128@techsingularity.net>

On Tue, Aug 31, 2021 at 01:44:49PM +0100, Mel Gorman wrote:
> That's your answer -- the PCP count has been corrupted or misaccounted.
> Given this is a Fedora kernel, check for any patches affecting
> mm/page_alloc.c that could be accounting related or that would affect
> the IRQ disabling or zone lock acquisition for problems. Another
> possibility is memory corruption -- either kernel or the hardware
> itself.

Hmm, I don't see any changes to mm/page_alloc.c from Fedora for this kernel.

What about a memory allocation originating from inside an NMI? I think this
could occur quite easily with an eBPF program registered to a tracepoint, as
some of the eBPF helpers do allocate memory on the fly for certain map types.

> > I tried to find some way that this could happen, but the only thing I could
> > think of was that maybe an allocation had both __GFP_RECLAIMABLE and
> > __GFP_MOVABLE set in its gfp mask, in which case the rmqueue() call in
> > get_page_from_freelist() would pass in a migratetype equal to MIGRATE_PCPTYPES
> > and then pages could be added to an out-of-bounds pcp list while still
> > incrementing the overall pcp count. This seems pretty unlikely though.
>
> It's unlikely because it would be an outright bug to specify both flags.

Perhaps that VM_WARN_ON should be changed to a VM_BUG_ON?

Thanks,
Sultan
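
For reference, the migratetype arithmetic discussed in the quoted text can be
sketched in user space as below. This is only an illustration, not kernel code:
the SKETCH_* names and both helper functions are hypothetical, and the flag
values, the shift and the enum ordering are assumptions recalled from
include/linux/gfp.h and include/linux/mmzone.h in kernels of roughly this era.
It also assumes the per-cpu lists are indexed directly by migratetype, with
MIGRATE_PCPTYPES entries.

```c
#include <assert.h>

/* Assumed to mirror ___GFP_MOVABLE / ___GFP_RECLAIMABLE (hypothetical names). */
#define SKETCH_GFP_MOVABLE       0x08u
#define SKETCH_GFP_RECLAIMABLE   0x10u
#define SKETCH_GFP_MOVABLE_MASK  (SKETCH_GFP_RECLAIMABLE | SKETCH_GFP_MOVABLE)
#define SKETCH_GFP_MOVABLE_SHIFT 3

/* Assumed ordering of the first migratetypes; PCPTYPES is the pcp list count. */
enum sketch_migratetype {
	SKETCH_MIGRATE_UNMOVABLE,
	SKETCH_MIGRATE_MOVABLE,
	SKETCH_MIGRATE_RECLAIMABLE,
	SKETCH_MIGRATE_PCPTYPES,
};

/* Derive a migratetype from the mobility bits of a gfp mask. */
static int sketch_gfp_migratetype(unsigned int gfp_flags)
{
	return (int)((gfp_flags & SKETCH_GFP_MOVABLE_MASK) >>
		     SKETCH_GFP_MOVABLE_SHIFT);
}

/* Valid pcp list indices are 0 .. SKETCH_MIGRATE_PCPTYPES - 1. */
static int sketch_pcp_index_in_bounds(int migratetype)
{
	return migratetype >= 0 && migratetype < SKETCH_MIGRATE_PCPTYPES;
}

/* Self-check: each flag alone is fine, both together index one past the end. */
static void sketch_self_check(void)
{
	int both = sketch_gfp_migratetype(SKETCH_GFP_MOVABLE |
					  SKETCH_GFP_RECLAIMABLE);

	assert(sketch_gfp_migratetype(0) == SKETCH_MIGRATE_UNMOVABLE);
	assert(sketch_gfp_migratetype(SKETCH_GFP_MOVABLE) ==
	       SKETCH_MIGRATE_MOVABLE);
	assert(sketch_gfp_migratetype(SKETCH_GFP_RECLAIMABLE) ==
	       SKETCH_MIGRATE_RECLAIMABLE);
	assert(both == SKETCH_MIGRATE_PCPTYPES);
	assert(!sketch_pcp_index_in_bounds(both));
}
```

Calling sketch_self_check() from a small main() exercises the four cases. Under
these assumptions, only the mask with both mobility flags set produces an index
equal to MIGRATE_PCPTYPES, which is the out-of-bounds case described above.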