From: Barry Song <21cnbao@gmail.com>
Date: Wed, 10 Jul 2024 16:44:25 +1200
Subject: Re: [PATCH v7] mm: shrink skip folio mapped by an exiting process
To: David Hildenbrand
Cc: akpm@linux-foundation.org, justinjiang@vivo.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, opensource.kernel@vivo.com, willy@infradead.org
References: <20240709142312.372b20d49c6a97ecd2cd9904@linux-foundation.org> <20240710033212.36497-1-21cnbao@gmail.com> <9d77dc44-f61c-4e52-938f-c268daf0e169@redhat.com>
In-Reply-To: <9d77dc44-f61c-4e52-938f-c268daf0e169@redhat.com>

On Wed, Jul 10, 2024 at 4:04 PM David Hildenbrand wrote:
>
> On 10.07.24 06:02, Barry Song wrote:
> > On Wed, Jul 10, 2024 at 3:59 PM David Hildenbrand wrote:
> >>
> >> On 10.07.24 05:32, Barry Song wrote:
> >>> On Wed, Jul 10, 2024 at 9:23 AM Andrew Morton wrote:
> >>>>
> >>>> On Tue, 9 Jul 2024 20:31:15 +0800 Zhiguo Jiang wrote:
> >>>>
> >>>>> The releasing process of the non-shared anonymous folio mapped solely by
> >>>>> an exiting process may go through two flows: 1) the anonymous folio is
> >>>>> first swapped out into swapspace and transformed into a swp_entry
> >>>>> in shrink_folio_list; 2) then the swp_entry is released in the process
> >>>>> exiting flow. This will result in the high cpu load of releasing a
> >>>>> non-shared anonymous folio mapped solely by an exiting process.
> >>>>>
> >>>>> When the low system memory and the exiting process exist at the same
> >>>>> time, it will be likely to happen, because the non-shared anonymous
> >>>>> folio mapped solely by an exiting process may be reclaimed by
> >>>>> shrink_folio_list.
> >>>>>
> >>>>> This patch is that shrink skips the non-shared anonymous folio solely
> >>>>> mapped by an exiting process and this folio is only released directly in
> >>>>> the process exiting flow, which will save swap-out time and alleviate
> >>>>> the load of the process exiting.
> >>>>
> >>>> It would be helpful to provide some before-and-after runtime
> >>>> measurements, please. It's a performance optimization so please let's
> >>>> see what effect it has.
> >>>
> >>> Hi Andrew,
> >>>
> >>> This was something I was curious about too, so I created a small test program
> >>> that allocates and continuously writes to 256MB of memory. Using QEMU, I set
> >>> up a small machine with only 300MB of RAM to trigger kswapd.
> >>>
> >>> qemu-system-aarch64 -M virt,gic-version=3,mte=off -nographic \
> >>>  -smp cpus=4 -cpu max \
> >>>  -m 300M -kernel arch/arm64/boot/Image
> >>>
> >>> The test program will be randomly terminated by its subprocess to trigger
> >>> the use case of this patch.
> >>>
> >>> #include <stdio.h>
> >>> #include <stdlib.h>
> >>> #include <string.h>
> >>> #include <unistd.h>
> >>> #include <signal.h>
> >>> #include <time.h>
> >>> #include <sys/types.h>
> >>> #include <sys/wait.h>
> >>>
> >>> #define MEMORY_SIZE (256 * 1024 * 1024)
> >>>
> >>> unsigned char *memory;
> >>>
> >>> void allocate_and_write_memory()
> >>> {
> >>>     memory = (unsigned char *)malloc(MEMORY_SIZE);
> >>>     if (memory == NULL) {
> >>>         perror("malloc");
> >>>         exit(EXIT_FAILURE);
> >>>     }
> >>>
> >>>     while (1)
> >>>         memset(memory, 0x11, MEMORY_SIZE);
> >>> }
> >>>
> >>> int main()
> >>> {
> >>>     pid_t pid;
> >>>     srand(time(NULL));
> >>>
> >>>     pid = fork();
> >>>
> >>>     if (pid < 0) {
> >>>         perror("fork");
> >>>         exit(EXIT_FAILURE);
> >>>     }
> >>>
> >>>     if (pid == 0) {
> >>>         int delay = (rand() % 10000) + 10000;
> >>>         usleep(delay * 1000);
> >>>
> >>>         /* kill parent when it is busy on swapping */
> >>>         kill(getppid(), SIGKILL);
> >>>         _exit(0);
> >>>     } else {
> >>>         allocate_and_write_memory();
> >>>
> >>>         wait(NULL);
> >>>
> >>>         free(memory);
> >>>     }
> >>>
> >>>     return 0;
> >>> }
> >>>
> >>> I tracked the number of folios that could be redundantly
> >>> swapped out by adding a simple counter as shown below:
> >>>
> >>> @@ -879,6 +880,9 @@ static bool folio_referenced_one(struct folio *folio,
> >>>                     check_stable_address_space(vma->vm_mm)) &&
> >>>                     folio_test_swapbacked(folio) &&
> >>>                     !folio_likely_mapped_shared(folio)) {
> >>> +                       static long i, size;
> >>> +                       size += folio_size(folio);
> >>> +                       pr_err("index: %ld skipped folio:%lx total size:%ld\n", i++, (unsigned long)folio, size);
> >>>                         pra->referenced = -1;
> >>>                         page_vma_mapped_walk_done(&pvmw);
> >>>                         return false;
> >>>
> >>>
> >>> This is what I have observed:
> >>>
> >>> / # /home/barry/develop/linux/skip_swap_out_test
> >>> [   82.925645] index: 0 skipped folio:fffffdffc0425400 total size:65536
> >>> [   82.925960] index: 1 skipped folio:fffffdffc0425800 total size:131072
> >>> [   82.927524] index: 2 skipped folio:fffffdffc0425c00 total size:196608
> >>> [   82.928649] index: 3 skipped folio:fffffdffc0426000 total size:262144
> >>> [   82.929383] index: 4 skipped folio:fffffdffc0426400 total size:327680
> >>> [   82.929995] index: 5 skipped folio:fffffdffc0426800 total size:393216
> >>> ...
> >>> [   88.469130] index: 6112 skipped folio:fffffdffc0390080 total size:97230848
> >>> [   88.469966] index: 6113 skipped folio:fffffdffc038d000 total size:97296384
> >>> [   89.023414] index: 6114 skipped folio:fffffdffc0366cc0 total size:97300480
> >>>
> >>> I observed that this patch effectively skipped 6114 folios (either 4KB or 64KB
> >>> mTHP), potentially reducing the swap-out by up to 92MB (97,300,480 bytes) during
> >>> the process exit.
> >>>
> >>> Despite the numerous mistakes Zhiguo made in sending this patch, it is still
> >>> quite valuable. Please consider pulling his v9 into the mm tree for testing.
> >>
> >> BTW, we dropped the folio_test_anon() check, but what about shmem? They
> >> also do __folio_set_swapbacked()?
> >
> > my point is that the purpose is skipping redundant swap-out, if shmem is single
> > mapped, they could be also skipped.
>
> But they won't get necessarily *freed* when unmapping them. They might
> just continue living in tmpfs? where some other process might just map
> them later?
>

You're correct. I overlooked this aspect, focusing on swap and thinking of
shmem solely in terms of swap.

> IMHO, there is a big difference here between anon and shmem. (well,
> anon_shmem would actually be different :) )

Even though anon_shmem behaves similarly to anonymous memory when
releasing memory, it doesn't seem worth the added complexity?

So unfortunately it seems Zhiguo still needs a v10 that brings
folio_test_anon() back? Sorry, my bad, Zhiguo.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
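
As a rough cross-check of the numbers in the quoted log, here is a minimal sketch (not part of the original test or patch) that splits the final byte total into 64KB and 4KB folios. It assumes every skipped folio is exactly one of those two sizes, as the quoted text says, and that the counter started at index 0, so a last index of 6114 means 6115 log entries. The names split_folios() and struct folio_split are hypothetical helpers made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: how a byte total splits across two folio sizes. */
struct folio_split {
    uint64_t large;  /* number of large (e.g. 64KB) folios */
    uint64_t small;  /* number of small (e.g. 4KB) folios */
};

/*
 * Solve for non-negative integers:
 *     large + small                       == count
 *     large * large_sz + small * small_sz == total_bytes
 * Returns false when the inputs admit no exact solution.
 */
static bool split_folios(uint64_t total_bytes, uint64_t count,
                         uint64_t large_sz, uint64_t small_sz,
                         struct folio_split *out)
{
    uint64_t floor_bytes, extra, step;

    if (small_sz == 0 || large_sz <= small_sz)
        return false;

    /* Total size if every folio were the small size. */
    floor_bytes = count * small_sz;
    if (total_bytes < floor_bytes)
        return false;

    /* Each large folio adds (large_sz - small_sz) bytes on top of that. */
    extra = total_bytes - floor_bytes;
    step = large_sz - small_sz;
    if (extra % step != 0)
        return false;

    out->large = extra / step;
    if (out->large > count)
        return false;
    out->small = count - out->large;
    return true;
}
```

Calling split_folios(97300480, 6115, 65536, 4096, &s) gives 1176 folios of 64KB and 4939 folios of 4KB. The same call with a count of 6114 has no exact solution, which suggests the log covers 6115 skipped folios (indices 0 through 6114), provided the two-size assumption holds.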