From: Johannes Weiner
To: Mike Kravetz
Cc: Vlastimil Babka, Andrew Morton, Mel Gorman, Miaohe Lin, Kefeng Wang,
	Zi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Tue, 19 Sep 2023 02:49:14 -0400
Message-ID: <20230919064914.GA124289@cmpxchg.org>
In-Reply-To: <20230918174037.GA112714@monkey>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
	<20230914235238.GB129171@monkey>
	<20230915141610.GA104956@cmpxchg.org>
	<20230916195739.GB618858@monkey>
	<20230918145204.GB16104@cmpxchg.org>
	<20230918174037.GA112714@monkey>

On Mon, Sep 18, 2023 at 10:40:37AM -0700, Mike Kravetz wrote:
> On 09/18/23 10:52, Johannes Weiner wrote:
> > On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> > > On 9/16/23 21:57, Mike Kravetz wrote:
> > > > On 09/15/23 10:16, Johannes Weiner wrote:
> > > >> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> > > >
> > > > With the patch below applied, a slightly different workload triggers the
> > > > following warnings. It seems related, and appears to go away when
> > > > reverting the series.
> > > >
> > > > [ 331.595382] ------------[ cut here ]------------
> > > > [ 331.596665] page type is 5, passed migratetype is 1 (nr=512)
> > > > [ 331.598121] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:662 expand+0x1c9/0x200
> > >
> > > Initially I thought this demonstrates the possible race I was suggesting in
> > > reply to 6/6. But, assuming you have CONFIG_CMA, page type 5 is cma and we
> > > are trying to get a MOVABLE page from a CMA page block, which is something
> > > that's normally done and the pageblock stays CMA. So yeah if the warnings
> > > are to stay, they need to handle this case. Maybe the same can happen with
> > > HIGHATOMIC blocks?

Ok, the CMA thing gave me pause because Mike's pagetypeinfo didn't
show any CMA pages.

5 is actually MIGRATE_ISOLATE - see the double use of 3 for PCPTYPES
and HIGHATOMIC.

> > This means we have an order-10 page where one half is MOVABLE and the
> > other is CMA.

This means the scenario is different:

We get a MAX_ORDER page off the MOVABLE freelist. The removal checks
that the first pageblock is indeed MOVABLE. During the expand, the
second pageblock turns out to be of type MIGRATE_ISOLATE.

The page allocator wouldn't have merged those types. It triggers a bit
too fast to be a race condition.

It appears that MIGRATE_ISOLATE is simply set on the tail pageblock
while the head is on the list, and then stranded there.

Could this be an issue in the page_isolation code? Maybe a range
rounding error?

Zi Yan, does this ring a bell for you?

I don't quite see how my patches could have caused this. But AFAICS we
also didn't have warnings for this scenario so it could be an old bug.

> > Mike, could you describe the workload that is triggering this?
>
> This 'slightly different workload' is actually a slightly different
> environment. Sorry for mis-speaking! The slight difference is that this
> environment does not use the 'alloc hugetlb gigantic pages from CMA'
> (hugetlb_cma) feature that triggered the previous issue.
>
> This is still on a 16G VM. Kernel command line here is:
> "BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.6.0-rc1-next-20230913+
> root=UUID=49c13301-2555-44dc-847b-caabe1d62bdf ro console=tty0
> console=ttyS0,115200 audit=0 selinux=0 transparent_hugepage=always
> hugetlb_free_vmemmap=on"
>
> The workload is just running this script:
> while true; do
>  echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>  echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
>  echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> done
>
> >
> > Does this reproduce instantly and reliably?
> >
>
> It is not 'instant' but will reproduce fairly reliably within a minute
> or so.
>
> Note that the 'echo 4 > .../hugepages-1048576kB/nr_hugepages' is going
> to end up calling alloc_contig_pages -> alloc_contig_range. Those pages
> will eventually be freed via __free_pages(folio, 9).

No luck reproducing this yet, but I have a question. In that crash
stack trace, the expand() is called via this:

[ 331.645847] get_page_from_freelist+0x3ed/0x1040
[ 331.646837] ? prepare_alloc_pages.constprop.0+0x197/0x1b0
[ 331.647977] __alloc_pages+0xec/0x240
[ 331.648783] alloc_buddy_hugetlb_folio.isra.0+0x6a/0x150
[ 331.649912] __alloc_fresh_hugetlb_folio+0x157/0x230
[ 331.650938] alloc_pool_huge_folio+0xad/0x110
[ 331.651909] set_max_huge_pages+0x17d/0x390

I don't see an __alloc_fresh_hugetlb_folio() in my tree. Only
alloc_fresh_hugetlb_folio(), which has this:

	if (hstate_is_gigantic(h))
		folio = alloc_gigantic_folio(h, gfp_mask, nid, nmask);
	else
		folio = alloc_buddy_hugetlb_folio(h, gfp_mask,
				nid, nmask, node_alloc_noretry);

where gigantic is defined as the order exceeding MAX_ORDER, which
should be the case for 1G pages on x86.

So the crashing stack must be from a 2M allocation, no?
I'm confused how that could happen with the above test case.
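
For reference, the two numeric arguments above (page type 5 being
MIGRATE_ISOLATE, and 1G pages being gigantic while 2M pages are not) can
be checked with a small userspace sketch. This is not kernel code. The
enum mirrors the kernel's migratetype numbering under the assumption
that CONFIG_CMA and CONFIG_MEMORY_ISOLATION are both enabled. The names
SKETCH_PAGE_SHIFT, SKETCH_MAX_ORDER, size_to_order() and
order_is_gigantic() are hypothetical, and the values 12 and 10 assume
4K base pages and the MAX_ORDER implied by the "order-10 page" wording
above.

```c
#include <assert.h>

/*
 * Userspace mirror of the kernel's enum migratetype, ASSUMING both
 * CONFIG_CMA and CONFIG_MEMORY_ISOLATION are enabled.
 */
enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_PCPTYPES,			/* 3 */
	MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,	/* 3 again: the double use */
	MIGRATE_CMA,
	MIGRATE_ISOLATE,
	MIGRATE_TYPES
};

/* Compile-time checks of the numbering discussed in the thread. */
static_assert(MIGRATE_MOVABLE == 1, "passed migratetype 1 is MOVABLE");
static_assert(MIGRATE_CMA == 4, "CMA is 4, not 5");
static_assert(MIGRATE_ISOLATE == 5, "page type 5 is ISOLATE");

/* ASSUMPTIONS: 4K base pages, largest buddy order is 10. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_MAX_ORDER	10

/* Smallest order whose block size covers 'bytes'. */
static unsigned int size_to_order(unsigned long long bytes)
{
	unsigned int order = 0;

	while ((1ULL << (SKETCH_PAGE_SHIFT + order)) < bytes)
		order++;
	return order;
}

/* "Gigantic" in the sense used above: the order exceeds MAX_ORDER. */
static int order_is_gigantic(unsigned int order)
{
	return order > SKETCH_MAX_ORDER;
}
```

Under these assumptions a 1G page is order 18 and takes the gigantic
branch, while a 2M page is order 9 and takes the buddy branch. Order 9
also matches the nr=512 in the warning and the __free_pages(folio, 9)
mentioned by Mike.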