From: SeongJae Park <sj@kernel.org>
To: Shakeel Butt
Cc: SeongJae Park, "Liam R. Howlett", Andrew Morton, David Hildenbrand,
    Lorenzo Stoakes, Vlastimil Babka, kernel-team@meta.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 00/16] mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE
Date: Wed, 5 Mar 2025 14:58:02 -0800
Message-Id: <20250305225803.60171-1-sj@kernel.org>
On Wed, 5 Mar 2025 12:22:25 -0800 Shakeel Butt wrote:

> On Wed, Mar 05, 2025 at 10:15:55AM -0800, SeongJae Park wrote:
> > For MADV_DONTNEED[_LOCKED] or MADV_FREE madvise requests, tlb flushes
> > can happen for each vma of the given address ranges.  Because such
> > tlb flushes are for address ranges of the same process, doing those
> > in a batch is more efficient while still being safe.  Modify
> > madvise() and process_madvise() entry level code paths to do such
> > batched tlb flushes, while the internal unmap logic does only
> > gathering of the tlb entries to flush.
> >
> > In more detail, modify the entry functions to initialize an
> > mmu_gather object and pass it to the internal logic.  Also modify the
> > internal logic to do only gathering of the tlb entries to flush into
> > the received mmu_gather object.  After all internal function calls
> > are done, the entry functions finish the mmu_gather object to flush
> > the gathered tlb entries in one batch.
> >
> > Patches Sequence
> > ================
> >
> > The first four patches are minor cleanups of madvise.c for
> > readability.
> >
> > The following four patches (patches 5-8) define a new data structure
> > for managing the information required for batched tlb flushing
> > (mmu_gather and behavior), and update the internal
> > MADV_DONTNEED[_LOCKED] and MADV_FREE handling code paths to receive
> > it.
> >
> > Three patches (patches 9-11) for making the internal
> > MADV_DONTNEED[_LOCKED] and MADV_FREE handling logic ready for batched
> > tlb flushing follow.
>
> I think you forgot to complete the above sentence, or the 'follow' at
> the end seems weird.

Thank you for catching this.  I just wanted to say these three patches
come after the previous ones.  I will wordsmith this part in the next
version.

> > The patches keep support for the unbatched tlb flushes use case, for
> > fine-grained and safe transitions.
> >
> > The next three patches (patches 12-14) update the madvise() and
> > process_madvise() code to do the batched tlb flushes utilizing the
> > changes introduced by the previous patches.
> >
> > The final two patches (patches 15-16) clean up the internal logic's
> > support code for the unbatched tlb flushes use case, which is no
> > longer used.
> >
> > Test Results
> > ============
> >
> > I measured the time to apply MADV_DONTNEED advice to 256 MiB of
> > memory using multiple process_madvise() calls.  I apply the advice at
> > 4 KiB sized region granularity, with varying batch size (vlen) from 1
> > to 1024.  The source code for the measurement is available at
> > GitHub[1].
> >
> > The measurement results are as below.  The 'sz_batches' column shows
> > the batch size of the process_madvise() calls.  The 'before' and
> > 'after' columns are the measured times, in nanoseconds, to apply
> > MADV_DONTNEED to the 256 MiB memory buffer on kernels built without
> > and with the MADV_DONTNEED tlb flushes batching patches of this
> > series, respectively.  For the baseline, the mm-unstable tree of
> > 2025-03-04[2] has been used.  The 'after/before' column is the ratio
> > of 'after' to 'before'.  An 'after/before' value lower than 1.0 means
> > this patch series increased efficiency over the baseline, and a lower
> > value means better efficiency.
>
> I would recommend to replace the after/before column with a
> percentage, i.e., percentage improvement or degradation.

Thank you for the nice suggestion.  I will do so in the next version.
> >
> >     sz_batches   before       after        after/before
> >     1            102842895    106507398    1.03563204828102
> >     2            73364942     74529223     1.01586971880929
> >     4            58823633     51608504     0.877343022998937
> >     8            47532390     44820223     0.942940655834895
> >     16           43591587     36727177     0.842529018271347
> >     32           44207282     33946975     0.767904595446515
> >     64           41832437     26738286     0.639175910310939
> >     128          40278193     23262940     0.577556694263817
> >     256          41568533     22355103     0.537789077136785
> >     512          41626638     22822516     0.54826709762148
> >     1024         44440870     22676017     0.510251419470411
> >
> > For batch sizes <= 2, tlb flushes batching shows no big difference,
> > only a slight overhead.  I think that is within the error range of
> > this simple micro-benchmark, and can therefore be ignored.
>
> I would recommend to run the experiment multiple times and report
> averages and standard deviations, which will support your error range
> claim.

Again, a good suggestion.  I will do so.

> > Starting from batch size 4, however, tlb flushes batching shows a
> > clear efficiency gain.  The efficiency gain tends to be proportional
> > to the batch size, as expected.  It ranges from about 13 percent with
> > batch size 4 up to 49 percent with batch size 1,024.
> >
> > Please note that this is a very simple microbenchmark, so the real
> > efficiency gain on real workloads could be very different.
>
> I think you are running a single-threaded benchmark on a free machine.
> I expect this series to be much more beneficial on a loaded machine
> and for multi-threaded applications.

Your understanding of my test setup is correct, and I agree with your
expectation.

> No need to test that scenario, but if you have already done that then
> it would be good to report.

I don't have such test results, or plans for those with a specific
timeline, for now.  I will share those if I get a chance, of course.

Thanks,
SJ