From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 15 Jun 2024 05:02:33 +0100
From: Matthew Wilcox
To: Shivank Garg
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	bharata@amd.com, raghavendra.kodsarathimmappa@amd.com, Michael.Day@amd.com,
	dmaengine@vger.kernel.org, vkoul@kernel.org
Subject: Re: [RFC PATCH 0/5] Enhancements to Page Migration with Batch Offloading via DMA
References: <20240614221525.19170-1-shivankg@amd.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20240614221525.19170-1-shivankg@amd.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Sat, Jun 15, 2024 at 03:45:20AM +0530, Shivank Garg wrote:
> We conducted experiments to measure folio copy overheads for page
> migration from a remote node to a local NUMA node, modeling page
> promotions for different workload sizes (4KB, 2MB, 256MB and
> 1GB).
>
> Setup Information: AMD Zen 3 EPYC server (2-sockets, 32 cores, SMT
> Enabled), 1 NUMA node connected to each socket.
> Linux Kernel 6.8.0, DVFS set to Performance, and cpuinfo_cur_freq: 2 GHz.
> THP, compaction, numa_balancing are disabled to reduce interference.
>
> migrate_pages() { <- t1
>     ..
>     <- t2
>     folio_copy()
>     <- t3
>     ..
> } <- t4
>
> overheads Fraction, F= (t3-t2)/(t4-t1)
> Measurement: Mean ± SD is measured in cpu_cycles/page
> Generic Kernel
> 4KB:: migrate_pages:17799.00±4278.25 folio_copy:794±232.87 F:0.0478±0.0199
> 2MB:: migrate_pages:3478.42±94.93 folio_copy:493.84±28.21 F:0.1418±0.0050
> 256MB:: migrate_pages:3668.56±158.47 folio_copy:815.40±171.76 F:0.2206±0.0371
> 1GB:: migrate_pages:3769.98±55.79 folio_copy:804.68±60.07 F:0.2132±0.0134
>
> Results with patched kernel:
> 1. Offload disabled - folios batch-move using CPU
> 4KB:: migrate_pages:14941.60±2556.53 folio_copy:799.60±211.66 F:0.0554±0.0190
> 2MB:: migrate_pages:3448.44±83.74 folio_copy:533.34±37.81 F:0.1545±0.0085
> 256MB:: migrate_pages:3723.56±132.93 folio_copy:907.64±132.63 F:0.2427±0.0270
> 1GB:: migrate_pages:3788.20±46.65 folio_copy:888.46±49.50 F:0.2344±0.0107
>
> 2. Offload enabled - folios batch-move using DMAengine
> 4KB:: migrate_pages:46739.80±4827.15 folio_copy:32222.40±3543.42 F:0.6904±0.0423
> 2MB:: migrate_pages:13798.10±205.33 folio_copy:10971.60±202.50 F:0.7951±0.0033
> 256MB:: migrate_pages:13217.20±163.99 folio_copy:10431.20±167.25 F:0.7891±0.0029
> 1GB:: migrate_pages:13309.70±113.93 folio_copy:10410.00±117.77 F:0.7821±0.0023

You haven't measured the important thing though -- what's the cost
_to userspace_?  When the CPU does the copy, the data is now cache-hot
in that CPU's cache.  When the DMA engine does the copy, it's not
cache-hot in any CPU.

Now, this may not be a big problem.  I don't think we do anything to
ensure that the CPU that is going to access the folio in userspace is
the one which does the copy.

But your methodology is wrong.
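
For reference, the in-kernel numbers quoted above can be cross-checked
with a few lines of Python.  This is only a rough sketch: the dict
layout and the helper names (MEANS, copy_fraction, slowdown) are made up
for illustration and are not part of the patch series.  It works on the
quoted means only, so the ratio of means it prints may differ slightly
from the reported F (if F was averaged per run, which is an assumption;
the thread does not say).  It says nothing about the userspace cost,
which the table does not capture.

```python
# Mean cycles/page copied from the quoted table, as
# (migrate_pages, folio_copy). Layout and names are illustrative only.
MEANS = {
    "generic": {
        "4KB": (17799.00, 794.00),
        "2MB": (3478.42, 493.84),
        "256MB": (3668.56, 815.40),
        "1GB": (3769.98, 804.68),
    },
    "batch_cpu": {  # patched kernel, offload disabled
        "4KB": (14941.60, 799.60),
        "2MB": (3448.44, 533.34),
        "256MB": (3723.56, 907.64),
        "1GB": (3788.20, 888.46),
    },
    "batch_dma": {  # patched kernel, offload enabled
        "4KB": (46739.80, 32222.40),
        "2MB": (13798.10, 10971.60),
        "256MB": (13217.20, 10431.20),
        "1GB": (13309.70, 10410.00),
    },
}


def copy_fraction(kernel, size):
    """Ratio of mean folio_copy cycles to mean migrate_pages cycles."""
    total, copy = MEANS[kernel][size]
    return copy / total


def slowdown(size, base="batch_cpu", other="batch_dma"):
    """Factor by which `other` needs more migrate_pages cycles/page than `base`."""
    return MEANS[other][size][0] / MEANS[base][size][0]


if __name__ == "__main__":
    for size in ("4KB", "2MB", "256MB", "1GB"):
        print(f"{size:>5}: DMA copy fraction {copy_fraction('batch_dma', size):.3f}, "
              f"DMA/CPU migrate_pages {slowdown(size):.2f}x")
```

On these means, migrate_pages() with offload enabled costs roughly three
or more times the cycles per page of the CPU batch path at every size,
before any cache effects in userspace are counted.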