Date: Thu, 27 Feb 2025 05:48:02 +0000
From: Yosry Ahmed
To: Sergey Senozhatsky
Cc: Andrew Morton, Hillf Danton, Kairui Song, Sebastian Andrzej Siewior,
 Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 14/19] zsmalloc: introduce new object mapping API
References: <20250227043618.88380-1-senozhatsky@chromium.org>
 <20250227043618.88380-15-senozhatsky@chromium.org>
In-Reply-To: <20250227043618.88380-15-senozhatsky@chromium.org>

On Thu, Feb 27, 2025 at 01:35:32PM +0900, Sergey Senozhatsky wrote:
> Current object mapping API is a little cumbersome. First, it's
> inconsistent, sometimes it returns with page-faults disabled and
> sometimes with page-faults enabled. Second, and most importantly,
> it enforces atomicity restrictions on its users. zs_map_object()
> has to return a linear object address which is not always possible
> because some objects span multiple physical (non-contiguous)
> pages. For such objects zsmalloc uses a per-CPU buffer to which
> object's data is copied before a pointer to that per-CPU buffer
> is returned back to the caller. This leads to another, final,
> issue - extra memcpy(). Since the caller gets a pointer to
> per-CPU buffer it can memcpy() data only to that buffer, and
> during zs_unmap_object() zsmalloc will memcpy() from that per-CPU
> buffer to physical pages that object in question spans across.
>
> New API splits functions by access mode:
> - zs_obj_read_begin(handle, local_copy)
>   Returns a pointer to handle memory. For objects that span two
>   physical pages a local_copy buffer is used to store object's
>   data before the address is returned to the caller. Otherwise
>   the object's page is kmap_local mapped directly.
>
> - zs_obj_read_end(handle, buf)
>   Unmaps the page if it was kmap_local mapped by zs_obj_read_begin().
>
> - zs_obj_write(handle, buf, len)
>   Copies len-bytes from compression buffer to handle memory
>   (takes care of objects that span two pages). This does not
>   need any additional (e.g. per-CPU) buffers and writes the data
>   directly to zsmalloc pool pages.
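
For anyone following the thread, the read/write split described above can
be modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the zsmalloc implementation: struct sketch_obj,
PAGE_SZ and the sketch_*() names are hypothetical, the two "pages" are
plain buffers, and the kmap_local mapping is stood in for by returning a
pointer into the first buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096u	/* assumed page size for the model */

/*
 * Hypothetical stand-in for an object that may straddle two
 * non-contiguous pages. Not the kernel's data structures.
 */
struct sketch_obj {
	unsigned char *page0;	/* page holding the start of the object */
	unsigned char *page1;	/* page holding the tail, if it spans */
	size_t off;		/* offset of the object within page0 */
	size_t len;		/* object length, at most PAGE_SZ */
};

/* Read side: hand back a linear view of the object. */
static const void *sketch_read_begin(const struct sketch_obj *o,
				     void *local_copy)
{
	size_t first;

	assert(o->off < PAGE_SZ && o->len <= PAGE_SZ);

	/* Object fits in one page: models the direct-mapping case. */
	if (o->off + o->len <= PAGE_SZ)
		return o->page0 + o->off;

	/* Spanning object: stitch both parts into the caller's buffer. */
	first = PAGE_SZ - o->off;
	memcpy(local_copy, o->page0 + o->off, first);
	memcpy((unsigned char *)local_copy + first, o->page1, o->len - first);
	return local_copy;
}

/* Nothing to unmap in userspace; the model keeps the call for symmetry. */
static void sketch_read_end(const struct sketch_obj *o, const void *buf)
{
	(void)o;
	(void)buf;
}

/* Write side: copy straight into the pages, no bounce buffer. */
static void sketch_write(struct sketch_obj *o, const void *buf, size_t len)
{
	size_t first = PAGE_SZ - o->off;

	assert(o->off < PAGE_SZ && len <= o->len);

	if (len <= first) {
		memcpy(o->page0 + o->off, buf, len);
		return;
	}
	memcpy(o->page0 + o->off, buf, first);
	memcpy(o->page1, (const unsigned char *)buf + first, len - first);
}
```

The point of the model is that the write path never needs an intermediate
buffer, and the read path needs one only for spanning objects, with that
buffer supplied by the caller.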
>
> In terms of performance, on a synthetic and completely reproducible
> test that allocates fixed number of objects of fixed sizes and
> iterates over those objects, first mapping in RO then in RW mode:
>
> OLD API
> =======
>
> 3 first results out of 10
>
>   369,205,778  instructions  #    0.80  insn per cycle
>    40,467,926  branches      #  113.732 M/sec
>
>   369,002,122  instructions  #    0.62  insn per cycle
>    40,426,145  branches      #  189.361 M/sec
>
>   369,036,706  instructions  #    0.63  insn per cycle
>    40,430,860  branches      #  204.105 M/sec
>
> [..]
>
> NEW API
> =======
>
> 3 first results out of 10
>
>   265,799,293  instructions  #    0.51  insn per cycle
>    29,834,567  branches      #  170.281 M/sec
>
>   265,765,970  instructions  #    0.55  insn per cycle
>    29,829,019  branches      #  161.602 M/sec
>
>   265,764,702  instructions  #    0.51  insn per cycle
>    29,828,015  branches      #  189.677 M/sec
>
> [..]
>
> T-test on all 10 runs
> =====================
>
> Difference at 95.0% confidence
>     -1.03219e+08 +/- 55308.7
>     -27.9705% +/- 0.0149878%
>     (Student's t, pooled s = 58864.4)
>
> The old API will stay around until the remaining users switch
> to the new one. After that we'll also remove zsmalloc per-CPU
> buffer and CPU hotplug handling.
>
> The split of map(RO) and map(WO) into read_{begin/end}/write is
> suggested by Yosry Ahmed.
>
> Suggested-by: Yosry Ahmed
> Signed-off-by: Sergey Senozhatsky

I see my Reviewed-by was removed at some point. Did something change in
this patch (do I need to review it again) or was it just lost?