From: Barry Song <21cnbao@gmail.com>
Date: Sat, 31 Aug 2024 21:55:09 +1200
Subject: Re: [PATCH RFC] mm: entirely reuse the whole anon mTHP in do_wp_page
To: David Hildenbrand
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Barry Song, Chuanhua Han, Baolin Wang, Ryan Roberts, Zi Yan, Chris Li, Kairui Song, Kalesh Singh, Suren Baghdasaryan
References: <20240831092339.66085-1-21cnbao@gmail.com>

On Sat, Aug 31, 2024 at 9:44 PM David Hildenbrand wrote:
>
> On 31.08.24 11:23, Barry Song wrote:
> > From: Barry Song
> >
> > On a physical phone, it's sometimes observed that deferred_split
> > mTHPs account for over 15% of the total mTHPs. Profiling by Chuanhua
> > indicates that the majority of these originate from the typical fork
> > scenario.
> > When the child process either execs or exits, the parent process should
> > ideally be able to reuse the entire mTHP. However, the current kernel
> > lacks this capability and instead places the mTHP into split_deferred,
> > performing a CoW (Copy-on-Write) on just a single subpage of the mTHP.
> >
> > main()
> > {
> > #define SIZE 1024 * 1024UL
> >         void *p = malloc(SIZE);
> >         memset(p, 0x11, SIZE);
> >         if (fork() == 0)
> >                 exec(....);
> >         /*
> >          * this will trigger cow one subpage from
> >          * mTHP and put mTHP into split_deferred
> >          * list
> >          */
> >         *(int *)(p + 10) = 10;
> >         printf("done\n");
> >         while(1);
> > }
> >
> > This leads to two significant issues:
> >
> > * Memory Waste: Before the mTHP is fully split by the shrinker,
> >   it wastes memory. In extreme cases, such as with a 64KB mTHP,
> >   the memory usage could be 64KB + 60KB until the last subpage
> >   is written, at which point the mTHP is freed.
> >
> > * Fragmentation and Performance Loss: It destroys large folios
> >   (negating the performance benefits of CONT-PTE) and fragments memory.
> >
> > To address this, we should aim to reuse the entire mTHP in such cases.
> >
> > Hi David,
> >
> > I’ve renamed wp_page_reuse() to wp_folio_reuse() and added an
> > entirely_reuse argument because I’m not sure if there are still cases
> > where we reuse a subpage within an mTHP. For now, I’m setting
> > entirely_reuse to true only for the newly supported case, while all
> > other cases still get false. Please let me know if this is incorrect; if
> > we don’t reuse subpages at all, we could remove the argument.
>
> See [1] I sent out this week, that is able to reuse even without
> scanning page tables. If we find that the folio is exclusive we could try
> processing surrounding PTEs that map the same folio.
>
> [1] https://lkml.kernel.org/r/20240829165627.2256514-1-david@redhat.com

Great! It looks like I missed your patch again.
Since you've implemented this in a better way, I’d prefer to use your patchset.

I’m curious about how you're handling ptep_set_access_flags_nr() or similar
things because I couldn’t find the related code in your patch 10/17:

[PATCH v1 10/17] mm: COW reuse support for PTE-mapped THP with CONFIG_MM_ID

Am I missing something?

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
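
The "64KB + 60KB" figure in the quoted commit message can be checked with a small model. This is only a sketch, not kernel code: the function name bytes_in_use() is hypothetical, and it assumes 4KB base pages (which the 60KB figure implies for a 64KB mTHP) and that the original folio stays fully allocated until every subpage has been CoW-copied, as the quoted text describes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not kernel code: bytes the parent holds for one
 * PTE-mapped mTHP after `written` distinct subpages have each been
 * copied by CoW instead of the folio being reused.
 *
 * Assumption: the original folio is kept whole until the last subpage
 * has been copied, and is freed at that point.
 */
size_t bytes_in_use(size_t folio_bytes, size_t page_bytes, size_t written)
{
    /* The folio must consist of a whole number of base pages. */
    assert(page_bytes != 0 && folio_bytes % page_bytes == 0);

    size_t nr_subpages = folio_bytes / page_bytes;

    /* Every subpage copied: the folio is gone, only the copies remain. */
    if (written >= nr_subpages)
        return nr_subpages * page_bytes;

    /* Otherwise the untouched folio and the copies coexist. */
    return folio_bytes + written * page_bytes;
}
```

Under these assumptions, bytes_in_use(64 * 1024, 4096, 15) gives 126976 bytes, i.e. 64KB + 60KB = 124KB for data that fits in 64KB, and the value drops back to 65536 only once the 16th subpage has been written. Reusing the whole folio on the first write fault would keep the figure at 64KB throughout.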