From: "Huang, Ying"
To: Yosry Ahmed
Cc: Chris Li, Zhongkun He, Andrew Morton, Johannes Weiner, Nhat Pham, Seth Jennings, Dan Streetman, Vitaly Wool, linux-mm, LKML
Subject: Re: [PATCH] mm:zswap: fix zswap entry reclamation failure in two scenarios
In-Reply-To: (Yosry Ahmed's message of "Mon, 20 Nov 2023 17:15:15 -0800")
References: <20231113130601.3350915-1-hezhongkun.hzk@bytedance.com> <8734x1cdtr.fsf@yhuang6-desk2.ccr.corp.intel.com> <87edgkapsz.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 21 Nov 2023 09:53:48 +0800
Message-ID: <875y1vc1n7.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yosry Ahmed writes:

> On Mon, Nov 20, 2023 at 4:57 PM Huang, Ying wrote:
>>
>> Yosry Ahmed writes:
>>
>> > On Sun, Nov 19, 2023 at 7:20 PM Huang, Ying wrote:
>> >>
>> >> Chris Li writes:
>> >>
>> >> > On Thu, Nov 16, 2023 at 12:19 PM Yosry Ahmed wrote:
>> >> >>
>> >> >> Not bypassing the swap slot cache, just make the callbacks to
>> >> >> invalidate the zswap entry, do memcg uncharging, etc. when the slot is
>> >> >> no longer used and is entering the swap slot cache (i.e. when
>> >> >> free_swap_slot() is called), instead of when draining the swap slot
>> >> >> cache (i.e. when swap_range_free() is called). For all parts of MM
>> >> >> outside of swap, the swap entry is freed when free_swap_slot() is
>> >> >> called. We don't free it immediately because of caching, but this
>> >> >> should be transparent to other parts of MM (e.g. zswap, memcg, etc.).
>> >> >
>> >> > That will cancel the batching effect on the swap slot free, making the
>> >> > common case for swapping faults take longer to complete, right?
>> >> > If I recall correctly, the uncharge is the expensive part of the swap
>> >> > slot free operation.
>> >> > I just want to figure out what we are trading off against. This is not
>> >> > one side wins all situations.
>> >>
>> >> Per my understanding, we don't batch memcg uncharging in
>> >> swap_entry_free() now. Although it's possible and may improve
>> >> performance.
>> >
>> > Yes. It actually causes a long tail in swapin fault latency as Chris
>> > discovered in our prod.
>> > I am wondering if doing the memcg uncharging
>> > outside the slots cache will actually amortize the cost instead.
>> >
>> > Regardless of memcg charging, which is more complicated, I think we
>> > should at least move the call to zswap_invalidate() before the slots
>> > cache. I would prefer that we move everything non-swapfile specific
>> > outside the slots cache layer (zswap_invalidate(),
>> > arch_swap_invalidate_page(), clear_shadow_from_swap_cache(),
>> > mem_cgroup_uncharge_swap(), ..). However, if some of those are
>> > controversial, we can move some of them for now.
>>
>> That makes sense to me.
>>
>> > When draining free swap slots from the cache, swap_range_free() is
>> > called with nr_entries == 1 anyway, so I can't see how any batching is
>> > going on. If anything it should help amortize the cost.
>>
>> In swapcache_free_entries(), the sis->lock will be held to free multiple
>> swap slots via swap_info_get_cont() if possible. This can reduce
>> sis->lock contention.
>
> Ah yes that's a good point. Since most of these callbacks don't
> actually access sis, but use the swap entry value itself, I am
> guessing the reason we need to hold the lock for all these callbacks
> is to prevent swapoff and swapon reusing the same swap entry on a
> different swap device, right?

In the call chain

swapcache_free_entries()
  swap_entry_free()
    swap_range_free()

quite a few sis fields are accessed.

--
Best Regards,
Huang, Ying
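
Illustration of the batching point above: the saving discussed for
swapcache_free_entries() comes from keeping one device's lock while
consecutive entries belong to the same device, rather than taking and
dropping it per slot. The following is only a userspace sketch of that
counting argument, not kernel code. struct toy_entry and both functions
are hypothetical names invented for this example, no real locking is
done, and the input is assumed to be already grouped by device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a swap entry: a device index plus a slot offset. */
struct toy_entry {
    int dev;            /* index of the (toy) swap device, assumed >= 0 */
    unsigned long off;  /* slot offset; not used by the counting logic */
};

/* Per-entry locking: one lock/unlock round for every freed slot. */
size_t lock_rounds_per_entry(const struct toy_entry *e, size_t n)
{
    assert(n == 0 || e != NULL);
    return n;
}

/*
 * Batched locking: keep holding the current device's lock while the next
 * entry is on the same device, and switch locks only when the device
 * changes. Returns how many lock acquisitions that takes.
 */
size_t lock_rounds_batched(const struct toy_entry *e, size_t n)
{
    size_t rounds = 0;
    int held = -1;      /* -1 means no lock is currently held */

    assert(n == 0 || e != NULL);
    for (size_t i = 0; i < n; i++) {
        if (e[i].dev != held) {
            /* A real implementation would unlock 'held' and lock e[i].dev. */
            held = e[i].dev;
            rounds++;
        }
        /* The slot would be freed here, under the lock for 'held'. */
    }
    return rounds;
}
```

For five entries with three on device 0 followed by two on device 1,
the per-entry variant takes the lock five times and the batched variant
twice. The sketch says nothing about the cost of the callbacks that run
while the lock is held, which is the other half of the trade-off
discussed in this thread.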