From: Pasha Tatashin
Date: Tue, 14 Feb 2023 13:52:16 -0500
Subject: Re: [PATCH v4 00/14] Introduce Copy-On-Write to Page Table
To: Chih-En Lin
Cc: David Hildenbrand, Andrew Morton, Qi Zheng, "Matthew Wilcox (Oracle)",
 Christophe Leroy, John Hubbard, Nadav Amit, Barry Song, Steven Rostedt,
 Masami Hiramatsu, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
 Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Yang Shi,
 Peter Xu, Vlastimil Babka, "Zach O'Keefe", Yun Zhou, Hugh Dickins,
 Suren Baghdasaryan, Yu Zhao, Juergen Gross, Tong Tiangen, Liu Shixin,
 Anshuman Khandual, Li kunyu, Minchan Kim, Miaohe Lin, Gautam Menghani,
 Catalin Marinas, Mark Brown, Will Deacon, Vincenzo Frascino,
 Thomas Gleixner, "Eric W. Biederman", Andy Lutomirski,
 Sebastian Andrzej Siewior, "Liam R. Howlett", Fenghua Yu, Andrei Vagin,
 Barret Rhoden, Michal Hocko, "Jason A. Donenfeld", Alexey Gladkov,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, Dinglan Peng, Pedro Fonseca, Jim Huang,
 Huichun Feng

On Tue, Feb 14, 2023 at 1:42 PM Chih-En Lin wrote:
>
> On Tue, Feb 14, 2023 at 11:30:26AM -0500, Pasha Tatashin wrote:
> > > > The thing with THP is, that during fork(), we always allocate a backup PTE
> > > > table, to be able to PTE-map the THP whenever we have to. Otherwise we'd
> > > > have to eventually fail some operations we don't want to fail -- similar to
> > > > the case where break_cow_pte() could fail now due to -ENOMEM although we
> > > > really don't want to fail (e.g., change_pte_range() ).
> > > >
> > > > I always considered that wasteful, because in many scenarios, we'll never
> > > > ever split a THP and possibly waste memory.
> > > >
> > > > Optimizing that for THP (e.g., don't always allocate backup THP, have some
> > > > global allocation backup pool for splits + refill when close-to-empty) might
> > > > provide similar fork() improvements, both in speed and memory consumption
> > > > when it comes to anonymous memory.
> > >
> > > When collapsing huge pages, do/can they reuse those PTEs for backup?
> > > So, we don't have to allocate the PTE or maintain the pool.
> >
> > It might not work for all pages, as collapsing pages might have had
> > holes in the user page table, and there were no PTE tables.
>
> So if there have holes in the user page table, after we doing the
> collapsing and then splitting. Do those holes be filled? Assume it is,
> then, I think it's the reason why it's not work for all the pages.
>
> But, after those operations, Will the user get the additional and
> unexpected memory (which is from the huge page filling)?

Yes, more memory is going to be allocated for the process in such a THP
collapse case. This is similar to madvise huge pages, where touching the
first byte may allocate 2M.

Pasha
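
For anyone who wants to see the madvise case from userspace, here is a
rough sketch, not taken from the patch series. It assumes Linux and a
2 MiB PMD-sized THP (as on x86-64 with 4 KiB base pages); HPAGE_SIZE and
resident_pages_after_one_byte() are names made up for this illustration.
It maps a 2 MiB-aligned anonymous region, hints MADV_HUGEPAGE, writes one
byte, and counts resident base pages with mincore(). Whether the fault is
actually served by a huge page depends on the THP settings under
/sys/kernel/mm/transparent_hugepage/ and on memory availability, so the
result can legitimately be anything from 1 page up to the full region.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: PMD-sized THP is 2 MiB, as on x86-64 with 4 KiB base pages. */
#define HPAGE_SIZE (2UL << 20)

/*
 * Hypothetical helper: touch one byte of a MADV_HUGEPAGE region and return
 * how many base pages of that 2 MiB region are resident afterwards, or -1
 * on error. One page means a normal small-page fault; the full region
 * being resident suggests the fault was served by a huge page.
 */
static long resident_pages_after_one_byte(void)
{
    static unsigned char vec[HPAGE_SIZE / 4096];
    long page_size = sysconf(_SC_PAGESIZE);
    size_t len = 2 * HPAGE_SIZE; /* over-allocate so we can align inside */
    size_t npages, i;
    long resident = 0;
    char *raw, *p;

    if (page_size < 4096)
        return -1;
    npages = HPAGE_SIZE / (size_t)page_size;

    raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return -1;

    /* Round up to the next 2 MiB boundary inside the mapping. */
    p = (char *)(((uintptr_t)raw + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1));
    assert(((uintptr_t)p & (HPAGE_SIZE - 1)) == 0);

    /* Only a hint; ignore failure (e.g. kernel built without THP). */
    (void)madvise(p, HPAGE_SIZE, MADV_HUGEPAGE);

    *(volatile char *)p = 1; /* touch the first byte only */

    if (mincore(p, HPAGE_SIZE, vec) != 0) {
        munmap(raw, len);
        return -1;
    }
    for (i = 0; i < npages; i++)
        resident += vec[i] & 1; /* low bit set: page is resident */

    munmap(raw, len);
    return resident;
}
```

Calling resident_pages_after_one_byte() from a small main() and printing
the result is enough to try it. With THP set to "always" or "madvise" the
count will typically cover the whole 2 MiB, which is the extra memory
referred to above; with THP disabled it should stay at a single page.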