From: "Huang, Ying"
To: Yosry Ahmed
Cc: Andrew Morton, Johannes Weiner, Nhat Pham, Chris Li, Chengming Zhou, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] mm: swap: enforce updating inuse_pages at the end of swap_range_free()
In-Reply-To: <20240124045113.415378-2-yosryahmed@google.com> (Yosry Ahmed's message of "Wed, 24 Jan 2024 04:51:11 +0000")
References: <20240124045113.415378-1-yosryahmed@google.com> <20240124045113.415378-2-yosryahmed@google.com>
Date: Wed, 24 Jan 2024 13:20:02 +0800
Message-ID: <87v87js3y5.fsf@yhuang6-desk2.ccr.corp.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Yosry Ahmed writes:

> In swap_range_free(), we update inuse_pages then do some cleanups (arch
> invalidation, zswap invalidation, swap cache cleanups, etc). During
> swapoff, try_to_unuse() checks that inuse_pages is 0 to make sure all
> swap entries are freed. Make sure we only update inuse_pages after we
> are done with the cleanups in swap_range_free(), and use the proper
> memory barriers to enforce it. This makes sure that code following
> try_to_unuse() can safely assume that swap_range_free() ran for all
> entries in the swapfile (e.g. swap cache cleanup, zswap_swapoff()).
>
> In practice, this currently isn't a problem because swap_range_free() is
> called with the swap info lock held, and the swapoff code happens to
> spin for that after try_to_unuse(). However, this seems fragile and
> unintentional, so make it more reliable and future-proof. This also
> facilitates a following simplification of zswap_swapoff().
>
> Signed-off-by: Yosry Ahmed

LGTM, Thanks!
Reviewed-by: "Huang, Ying"

> ---
>  mm/swapfile.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index b11b6057d8b5f..0580bb3e34d77 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -737,8 +737,6 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
>  		if (was_full && (si->flags & SWP_WRITEOK))
>  			add_to_avail_list(si);
>  	}
> -	atomic_long_add(nr_entries, &nr_swap_pages);
> -	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
>  	if (si->flags & SWP_BLKDEV)
>  		swap_slot_free_notify =
>  			si->bdev->bd_disk->fops->swap_slot_free_notify;
> @@ -752,6 +750,14 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
>  		offset++;
>  	}
>  	clear_shadow_from_swap_cache(si->type, begin, end);
> +
> +	/*
> +	 * Make sure that try_to_unuse() observes si->inuse_pages reaching 0
> +	 * only after the above cleanups are done.
> +	 */
> +	smp_wmb();
> +	atomic_long_add(nr_entries, &nr_swap_pages);
> +	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
>  }
>
>  static void set_cluster_next(struct swap_info_struct *si, unsigned long next)
> @@ -2049,7 +2055,7 @@ static int try_to_unuse(unsigned int type)
>  	unsigned int i;
>
>  	if (!READ_ONCE(si->inuse_pages))
> -		return 0;
> +		goto success;
>
>  retry:
>  	retval = shmem_unuse(type);
> @@ -2130,6 +2136,12 @@ static int try_to_unuse(unsigned int type)
>  		return -EINTR;
>  	}
>
> +success:
> +	/*
> +	 * Make sure that further cleanups after try_to_unuse() returns happen
> +	 * after swap_range_free() reduces si->inuse_pages to 0.
> +	 */
> +	smp_mb();
>  	return 0;
>  }