Date: Mon, 23 Oct 2023 10:53:08 +0800
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists
To: "Huang, Ying"
Cc: akpm@linux-foundation.org, rppt@kernel.org, david@redhat.com, vbabka@suse.cz, mhocko@suse.com, willy@infradead.org, mgorman@techsingularity.net, mingo@kernel.org, aneesh.kumar@linux.ibm.com, hannes@cmpxchg.org, osalvador@suse.de, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <9f1dbe7ee1301c7163b2770e32954ff5e3ecf2c4.1697711415.git.zhengqi.arch@bytedance.com> <87bkctg4f4.fsf@yhuang6-desk2.ccr.corp.intel.com> <4bfa007c-a20f-9e68-4a9f-935dacf43222@bytedance.com> <8734y2f868.fsf@yhuang6-desk2.ccr.corp.intel.com>
From: Qi Zheng
In-Reply-To: <8734y2f868.fsf@yhuang6-desk2.ccr.corp.intel.com>

Hi Ying,

On 2023/10/23 09:18, Huang, Ying wrote:
> Qi Zheng writes:
>
>> Hi Ying,
>>
>> On 2023/10/20 15:05, Huang, Ying wrote:
>>> Qi Zheng writes:
>>>
>>>> In offline_pages(), if a node becomes memoryless, we
>>>> will clear its N_MEMORY state by calling node_states_clear_node().
>>>> But we do this after rebuilding the zonelists by calling
>>>> build_all_zonelists(), which will cause this memoryless node to
>>>> still be in the fallback list of other nodes.
>>> For fallback list, do you mean pgdat->node_zonelists[]?  If so, in
>>>   build_all_zonelists
>>>     __build_all_zonelists
>>>       build_zonelists
>>>         build_zonelists_in_node_order
>>>           build_zonerefs_node
>>> populated_zone() will be checked before adding a zone into the zonelist.
>>> So, IIUC, we will not try to allocate from the memoryless node.
>>
>> Normally yes, but with the weird topology mentioned in [1], it is
>> possible to allocate memory from it: it is treated as a memoryless
>> node, but it also has memory.
>>
>> In addition to the above case, I think it's reasonable to remove the
>> memoryless node from node_order[] in advance. In this way it will
>> not be traversed in build_zonelists_in_node_order().
>>
>> [1]. https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
>
> Got it!  Thank you for the information.  I think that it may be good to
> include this in the patch description to avoid potential confusion in
> the future.

OK, maybe the commit message can be changed to the following:

```
In offline_pages(), if a node becomes memoryless, we
will clear its N_MEMORY state by calling node_states_clear_node().
But we do this after rebuilding the zonelists by calling
build_all_zonelists(), which will cause this memoryless node to
still be in the fallback nodes (node_order[]) of other nodes.

To drop memoryless nodes from fallback nodes in this case, just
call node_states_clear_node() before calling build_all_zonelists().

In this way, we will not try to allocate pages from the memoryless
node0, and the panic mentioned in [1] will also be fixed. Even though
this problem has been solved by dropping the NODE_MIN_SIZE constraint
in x86 [2], it would be better to fix it in the core MM as well.

[1]. https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
[2]. https://lore.kernel.org/all/20231017062215.171670-1-rppt@kernel.org/
```

Thanks,
Qi

>
> --
> Best Regards,
> Huang, Ying
>
>> Thanks,
>> Qi
>>
>>
>>> --
>>> Best Regards,
>>> Huang, Ying
>>>
>>>> This will incur
>>>> some runtime overhead.
>>>>
>>>> To drop memoryless node from fallback lists in this case, just
>>>> call node_states_clear_node() before calling build_all_zonelists().
>>>>
>>>> Signed-off-by: Qi Zheng
>>>> Acked-by: David Hildenbrand
>>> [snip]
>>> --
>>> Best Regards,
>>> Huang, Ying
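
The ordering problem described in the proposed commit message can be illustrated with a small toy model. This is only a sketch in Python, not kernel code: the helper names build_node_orders() and offline_last_memory() are hypothetical, a plain set stands in for the N_MEMORY node state, and abs(a - b) is a made-up substitute for the real NUMA distance. It only shows why clearing the state after the rebuild leaves the node in the other nodes' fallback order.

```python
# Toy model only: the names echo the kernel functions discussed in this
# thread, but the data structures and the distance metric are assumptions.

def build_node_orders(nodes, n_memory):
    """Stand-in for build_all_zonelists(): compute a node_order[] per node.

    Only nodes currently flagged in n_memory are considered, nearest first.
    abs(a - b) is a made-up substitute for the real NUMA distance.
    """
    orders = {}
    for nid in nodes:
        candidates = [n for n in nodes if n in n_memory]
        orders[nid] = sorted(candidates, key=lambda n: abs(n - nid))
    return orders


def offline_last_memory(nid, nodes, n_memory, clear_first):
    """Model offlining the last memory of node `nid`.

    clear_first=False: rebuild first, clear N_MEMORY afterwards (old order).
    clear_first=True:  clear N_MEMORY first, then rebuild (proposed order).
    """
    n_memory = set(n_memory)  # work on a copy, leave the caller's set alone
    if clear_first:
        n_memory.discard(nid)                        # node_states_clear_node()
        orders = build_node_orders(nodes, n_memory)  # rebuild sees fresh state
    else:
        orders = build_node_orders(nodes, n_memory)  # rebuild sees stale state
        n_memory.discard(nid)
    return orders


if __name__ == "__main__":
    nodes = [0, 1, 2]
    print(offline_last_memory(0, nodes, {0, 1, 2}, clear_first=False)[1])
    print(offline_last_memory(0, nodes, {0, 1, 2}, clear_first=True)[1])
```

With the old ordering, node 0 still shows up in node 1's fallback order after it has become memoryless. With the proposed ordering it is absent from every node's order.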