From: David Hildenbrand
Organization: Red Hat
Date: Tue, 2 Nov 2021 10:24:43 +0100
To: Michal Hocko, Alexey Makhalov
Cc: linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Oscar Salvador
Subject: Re: [PATCH] mm: fix panic in __alloc_pages
References: <20211101201312.11589-1-amakhalov@vmware.com> <7136c959-63ff-b866-b8e4-f311e0454492@redhat.com>

>>> In add_memory_resource() we hotplug the new node if required and set it
>>> online. Memory might get onlined later, via online_pages().
>>
>> You are correct. In case of memory hot add, it is true. But in case of adding
>> CPU with memoryless node, try_node_online() will be called only during CPU
>> onlining, see cpu_up().
>>
>> Is there any reason why try_online_node() resides in cpu_up() and not in add_cpu()?
>> I think it would be correct to online node during the CPU hot add to align with
>> memory hot add.
>
> I am not familiar with cpu hotplug, but this doesn't seem to be anything
> new so how come this became problem only now?

So IIUC, the issue is that we have a node

a) that has no memory
b) that is offline

This node will get onlined when onlining the CPU, as Alexey says. Yet we
have some code that stumbles over the node and goes ahead trying to use
the pgdat -- that code is broken.

If we take a look at build_zonelists(), we indeed skip any
!node_online(node). Any other code should do the same. If the node is
not online, it shall be ignored because we might not even have a pgdat
yet -- see hotadd_new_pgdat(). Without node_online(), the pgdat might be
stale or non-existent.

The node onlining logic when onlining a CPU sounds bogus as well:

Let's take a look at try_offline_node(). It checks that:

1) no memory is *present*
2) no CPU is *present*

We should online the node when adding the CPU ("present"), not when
onlining the CPU.

-- 
Thanks,

David / dhildenb
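
For illustration, the rule argued above (ignore any node that is not online before touching its pgdat) can be sketched as a small user-space model in C. This is not kernel code: every `model_*` name, `struct pgdat_model`, and `MAX_NODES` is a hypothetical stand-in invented for this sketch, and the real `node_online()` and pgdat handling are considerably more involved.

```c
/* User-space model only. All names below are hypothetical stand-ins that
 * loosely mirror the kernel concepts discussed in this mail. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4

struct pgdat_model {
    int node_id;
    unsigned long present_pages;
};

/* Backing storage plus a per-node pointer that stays NULL until onlining,
 * modelling "we might not even have a pgdat yet". */
static struct pgdat_model pgdat_storage[MAX_NODES];
static struct pgdat_model *node_data_model[MAX_NODES];
static bool node_online_flag[MAX_NODES];

/* Stand-in for node_online(): true only after the node was set online. */
bool model_node_online(int nid)
{
    return nid >= 0 && nid < MAX_NODES && node_online_flag[nid];
}

/* Stand-in for onlining a node: publish the pgdat, then mark it online.
 * A memoryless node is simply onlined with present_pages == 0. */
void model_online_node(int nid, unsigned long present_pages)
{
    pgdat_storage[nid].node_id = nid;
    pgdat_storage[nid].present_pages = present_pages;
    node_data_model[nid] = &pgdat_storage[nid];
    node_online_flag[nid] = true;
}

/* Walk all possible nodes, skipping offline ones so that a NULL or stale
 * pgdat pointer is never dereferenced. */
unsigned long model_total_present_pages(void)
{
    unsigned long total = 0;

    for (int nid = 0; nid < MAX_NODES; nid++) {
        if (!model_node_online(nid))
            continue; /* offline: the pgdat may not exist yet */
        total += node_data_model[nid]->present_pages;
    }
    return total;
}
```

In this model, calling `model_total_present_pages()` before any node is onlined never dereferences a NULL pointer, because the offline check comes first. Dropping that check would crash on the first offline node, which is the class of bug described above.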