Date: Mon, 15 Mar 2021 00:07:38 +0000
From: Matthew Wilcox
To: Wxz76@protonmail.com
Cc: linux-mm@kvack.org, peter.weber@flapflap.eu
Subject: Re: Is anonymous memory part of the page cache on Linux?
Message-ID: <20210315000738.GR2577561@casper.infradead.org>

On Sun, Mar 14, 2021 at 11:06:12PM +0000, Wxz76@protonmail.com wrote:
> Hi Matthew and Peter,
>
> I had a few questions to clarify my understanding of the page/swap cache.

These are good questions to ask.
I see the confusion, and I'm not entirely sure how to sort it out for
you, but let me try.

> > There's a swap cache, but that's not the same thing as the page cache.
>
> My understanding of the swap cache comes from:
>
> 1) Understanding the Linux Kernel by Bovet and Cesati:
>
> “The swap cache is implemented by the page cache data structures and procedures” and
>
> “Pages in the swap cache are stored as every other page in the page cache, with the following special treatment:
>
> • The mapping field of the page descriptor is set to NULL.
>
> • The PG_swapcache flag of the page descriptor is set.
>
> • The private field stores the swapped-out page identifier associated with the page”

They're not wrong, but may be misleading.  The swap cache reuses (or
repurposes) many of the same data structures used by the page cache,
in particular the address_space.  It's still considered to be separate
from the page cache.

> 2) Understanding the Linux Virtual Memory Manager by Mel Gorman:
>
> “The swap cache is purely conceptual because it is simply a specialization of the page cache. The first principal difference between pages in the swap cache rather than the page cache is that pages in the swap cache always use swapper_space as their address space in page→mapping. The second difference is that pages are added to the swap cache with add_to_swap_cache(), shown in Figure 11.3, instead of add_to_page_cache().”
>
> I understand that those books are more than ten years old, but is what they write no longer the case?

That one, being a little more specific, is now a little more out of date.

> Is the swap cache mechanism not a specialization of the page cache, and, if not, how are they different?
>
> > Anonymous memory is not handled by the page cache.
> > Anonymous pages enter the storage stack via swap; they are
> > found in the page tables, sent to the swap cache and then written to
> > swap devices or swap files.
>
> This is for the case of swapping out anonymous memory, but what about anonymous memory that is allocated dynamically with malloc/mmap: where is this memory coming from?

When the kernel needs to allocate a page for stack or malloc (whether
malloc is implemented through mmap(MAP_PRIVATE) or brk()), it gets it
from its pool of free pages.  It sets up the process's page tables to
refer to that page, and it adds the page to the LRU list (so it can be
swapped out if the pool of free pages runs low).

> When mmap opens files, it maps a process address spaces to a region in the page cache for the file, does it not?

Yes.  This is the vm_area_struct, which records the mapping from the
process address space to the region of the file.

> Is this behavior not the same for allocating anonymous memory (minus dealing with a file)?

It's also a vm_area_struct, but several of the fields in the
vm_area_struct are used differently by an anonymous mapping than they
are by a file-based mapping.

The particularly interesting case that you didn't ask about is what
happens for mmap(MAP_PRIVATE) of a file.  In that case, we set up a
file-based mapping, but on a write fault, we allocate a new page, copy
from the page cache into the new page, set up the process page table to
point to this new page and put this new page into the LRU list, so it
can be swapped out if needed.

> I appreciate the help in clarifying this for me.

You're welcome!  I hope this is useful.
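
The mmap(MAP_PRIVATE) behaviour described above can be observed from
userspace.  What follows is only a minimal sketch, assuming Linux and
Python 3's standard mmap module; it is not kernel code, and the function
names are made up for illustration.  It shows the visible effect of the
write fault (the mapping sees the new bytes, the file does not), not the
page table or LRU handling, which cannot be seen from here.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_untouched():
    """Write through a MAP_PRIVATE file mapping; report what each view sees.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Fill exactly one page of the file with known contents.
        os.write(fd, b"page cache contents".ljust(mmap.PAGESIZE, b"."))

        # File-backed private mapping: reads come from the file's pages.
        m = mmap.mmap(fd, mmap.PAGESIZE,
                      flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            before = m[:4]
            # The first write triggers the write fault described above:
            # the process gets its own copy of the page.
            m[:4] = b"COPY"
            after = m[:4]
        finally:
            m.close()

        # Read the file directly, bypassing the (now closed) mapping.
        on_disk = os.pread(fd, 4, 0)
        return before, after, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


def anonymous_mapping_first_bytes(n=8):
    """Map one page of anonymous memory (no file) and return its first n bytes.

    Hypothetical helper for illustration only.
    """
    m = mmap.mmap(-1, mmap.PAGESIZE,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        return m[:n]
    finally:
        m.close()


if __name__ == "__main__":
    print(private_write_leaves_file_untouched())
    print(anonymous_mapping_first_bytes())
```

The first function returns the bytes seen through the mapping before and
after the write, plus the bytes still in the file; only the mapping's
view changes.  The second returns zero bytes, since a fresh anonymous
page has no file behind it to supply contents.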