From: Michal Hocko
To: Mike Rapoport
Cc: Jonathan Corbet, Andrew Morton, Bagas Sanjaya, David Hildenbrand, Johannes Weiner, Lorenzo Stoakes, "Matthew Wilcox (Oracle)", Mel Gorman, Vlastimil Babka, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 2/2] docs/mm: Physical Memory: add structure, introduction and nodes description
Date: Tue, 10 Jan 2023 17:54:10 +0100
References: <20230110152358.2641910-1-rppt@kernel.org> <20230110152358.2641910-3-rppt@kernel.org>
In-Reply-To: <20230110152358.2641910-3-rppt@kernel.org>

On Tue 10-01-23 17:23:58, Mike Rapoport wrote:
[...]
> +* ``ZONE_DMA`` and ``ZONE_DMA32`` represent memory suitable for DMA by
> +  peripheral devices that cannot access all of the addressable memory.

I think it would be better not to keep the historical DMA-based meaning
and teach it to future developers. You could say something like:

ZONE_DMA and ZONE_DMA32 have historically been used for memory suitable
for DMA. For many years there have been better, more robust interfaces
for getting memory with DMA-specific requirements
(Documentation/core-api/dma-api.rst).

> +  Depending on the architecture, either of these zone types or even they both
> +  can be disabled at build time using ``CONFIG_ZONE_DMA`` and
> +  ``CONFIG_ZONE_DMA32`` configuration options. Some 64-bit platforms may need
> +  both zones as they support peripherals with different DMA addressing
> +  limitations.
> +
> +* ``ZONE_NORMAL`` is for normal memory that can be accessed by the kernel all
> +  the time. DMA operations can be performed on pages in this zone if the DMA
> +  devices support transfers to all addressable memory. ``ZONE_NORMAL`` is
> +  always enabled.
> +
> +* ``ZONE_HIGHMEM`` is the part of the physical memory that is not covered by a
> +  permanent mapping in the kernel page tables. The memory in this zone is only
> +  accessible to the kernel using temporary mappings. This zone is available
> +  only on some 32-bit architectures and is enabled with ``CONFIG_HIGHMEM``.
> +
> +* ``ZONE_MOVABLE`` is for normal accessible memory, just like ``ZONE_NORMAL``.
> +  The difference is that most pages in ``ZONE_MOVABLE`` are movable.

This is really confusing because those pages are not really movable. You
cannot move a page itself. I guess you meant to say something like:

The difference is that there are means to migrate memory via the
migrate_pages interface.
A typical example would be memory mapped to userspace, where the kernel
can relocate the underlying memory content and update the page tables so
that userspace doesn't notice that the physical data placement has
changed.

> That means
> +  that while virtual addresses of these pages do not change, their content may
> +  move between different physical pages. ``ZONE_MOVABLE`` is only enabled when
> +  one of ``kernelcore``, ``movablecore`` and ``movable_node`` parameters is
> +  present in the kernel command line. See :ref:`Page migration
> +  ` for additional details.

This is not really true. The movable zone can also be enabled by memory
hotplug. In fact, that is one of the more common use cases for the zone,
because memory hot-remove largely depends on memory being migratable for
offlining to succeed.

> +
> +* ``ZONE_DEVICE`` represents memory residing on devices such as PMEM and GPU.
> +  It has different characteristics than RAM zone types and it exists to provide
> +  :ref:`struct page ` and memory map services for device driver
> +  identified physical address ranges. ``ZONE_DEVICE`` is enabled with
> +  configuration option ``CONFIG_ZONE_DEVICE``.
> +
> +It is important to note that many kernel operations can only take place using
> +``ZONE_NORMAL`` so it is the most performance critical zone. Zones are
> +discussed further in Section :ref:`Zones `.
> +
> +The relation between node and zone extents is determined by the physical memory
> +map reported by the firmware, architectural constraints for memory addressing
> +and certain parameters in the kernel command line.
> +
> +For example, with 32-bit kernel on an x86 UMA machine with 2 Gbytes of RAM the
> +entire memory will be on node 0 and there will be three zones: ``ZONE_DMA``,
> +``ZONE_NORMAL`` and ``ZONE_HIGHMEM``::
> +
> +  0                                                            2G
> +  +-------------------------------------------------------------+
> +  |                            node 0                           |
> +  +-------------------------------------------------------------+
> +
> +  0         16M                    896M                        2G
> +  +----------+-----------------------+--------------------------+
> +  | ZONE_DMA |      ZONE_NORMAL      |       ZONE_HIGHMEM       |
> +  +----------+-----------------------+--------------------------+
> +
> +
> +With a kernel built with ``ZONE_DMA`` disabled and ``ZONE_DMA32`` enabled and
> +booted with ``movablecore=80%`` parameter on an arm64 machine with 16 Gbytes of
> +RAM equally split between two nodes, there will be ``ZONE_DMA32``,
> +``ZONE_NORMAL`` and ``ZONE_MOVABLE`` on node 0, and ``ZONE_NORMAL`` and
> +``ZONE_MOVABLE`` on node 1::
> +
> +
> +  1G                                9G                        17G
> +  +--------------------------------+ +--------------------------+
> +  |             node 0             | |          node 1          |
> +  +--------------------------------+ +--------------------------+
> +
> +  1G       4G        4200M          9G          9320M         17G
> +  +---------+----------+-----------+ +------------+-------------+
> +  |  DMA32  |  NORMAL  |  MOVABLE  | |   NORMAL   |   MOVABLE   |
> +  +---------+----------+-----------+ +------------+-------------+

I think it is useful to note that nodes and zones can overlap in the
physical address range. It is not uncommon for two nodes to be
interleaved, and it is also possible for memory holes to be hotplugged
into the MOVABLE zone at arbitrary places in the physical address range.

Other than that, this looks good to me. Thanks for taking care of
filling in these gaps! This is highly appreciated.
-- 
Michal Hocko
SUSE Labs
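
As a rough illustration of the overlap point above, the following Python
sketch models zone spans per node and checks two things: that the arm64
diagram quoted above really puts about 80% of RAM into ZONE_MOVABLE, and
that interleaved nodes produce overlapping node extents. This is a toy
model, not kernel code. The `Span` type, the helper names and the
interleaved layout are made up for illustration; only the arm64 numbers
are taken from the quoted diagram.

```python
from dataclasses import dataclass

GIB = 1024  # all addresses and sizes below are expressed in MiB


@dataclass(frozen=True)
class Span:
    """Hypothetical helper: one contiguous zone range on one node."""
    node: int
    zone: str
    start: int  # inclusive, MiB
    end: int    # exclusive, MiB

    @property
    def size(self) -> int:
        return self.end - self.start


# Layout copied from the arm64 diagram in the quoted patch.
ARM64_LAYOUT = [
    Span(0, "DMA32", 1 * GIB, 4 * GIB),
    Span(0, "NORMAL", 4 * GIB, 4200),
    Span(0, "MOVABLE", 4200, 9 * GIB),
    Span(1, "NORMAL", 9 * GIB, 9320),
    Span(1, "MOVABLE", 9320, 17 * GIB),
]

# Assumed layout, NOT from the patch: two nodes interleaved in 4G chunks.
INTERLEAVED_LAYOUT = [
    Span(0, "NORMAL", 0 * GIB, 4 * GIB),
    Span(1, "NORMAL", 4 * GIB, 8 * GIB),
    Span(0, "NORMAL", 8 * GIB, 12 * GIB),
    Span(1, "NORMAL", 12 * GIB, 16 * GIB),
]


def movable_share(layout):
    """Fraction of all modelled memory that sits in ZONE_MOVABLE."""
    total = sum(s.size for s in layout)
    movable = sum(s.size for s in layout if s.zone == "MOVABLE")
    return movable / total


def node_extents(layout):
    """Map each node to (lowest start, highest end) over its spans."""
    extents = {}
    for s in layout:
        lo, hi = extents.get(s.node, (s.start, s.end))
        extents[s.node] = (min(lo, s.start), max(hi, s.end))
    return extents


def nodes_overlap(layout):
    """True if the extents of any two nodes share physical addresses."""
    ext = sorted(node_extents(layout).values())
    return any(prev_end > next_start
               for (_, prev_end), (next_start, _) in zip(ext, ext[1:]))


if __name__ == "__main__":
    print(f"movable share: {movable_share(ARM64_LAYOUT):.1%}")
    print("arm64 nodes overlap:", nodes_overlap(ARM64_LAYOUT))
    print("interleaved nodes overlap:", nodes_overlap(INTERLEAVED_LAYOUT))
```

The extents here are the full range a node spans, holes included, which
is why the interleaved case overlaps even though no single address
belongs to both nodes.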