From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5040C4828F for ; Thu, 8 Feb 2024 19:01:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6E7796B00A4; Thu, 8 Feb 2024 14:01:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 697A46B00A5; Thu, 8 Feb 2024 14:01:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 55F966B00A6; Thu, 8 Feb 2024 14:01:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 47C106B00A4 for ; Thu, 8 Feb 2024 14:01:43 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 13D49A0FC1 for ; Thu, 8 Feb 2024 19:01:43 +0000 (UTC) X-FDA: 81769555686.22.5F9D024 Received: from mail-lj1-f169.google.com (mail-lj1-f169.google.com [209.85.208.169]) by imf22.hostedemail.com (Postfix) with ESMTP id 15A34C0017 for ; Thu, 8 Feb 2024 19:01:40 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="QJ/18lvi"; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf22.hostedemail.com: domain of ryncsn@gmail.com designates 209.85.208.169 as permitted sender) smtp.mailfrom=ryncsn@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707418901; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=a8sPLE7pdODqfSJcbtUZGOPnrgiMMh49Z7XQoNEAkYw=; b=75wK3u58jhAohXGh0BSje9oTsqm2ayGiP06I558KhiX2N3Xfi/U4HcKXAJ2vgxmW2mykdS Ja1RQjq2cLe2qCQjY3OJtI/JG7mn0jrpLK/w/fsoTu5iE69WK1cVBSD2cQAkkN/YxnB6z9 6g7k4EwBYPRCAS3SFwT909WYNZUHCOk= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="QJ/18lvi"; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf22.hostedemail.com: domain of ryncsn@gmail.com designates 209.85.208.169 as permitted sender) smtp.mailfrom=ryncsn@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1707418901; a=rsa-sha256; cv=none; b=Yh1gIrexzJYmq7VIPQY4/M4P7ZuMFDtlY9t5XGd8u8nBkw3Sw+Jz3k/hmn3wdZMeDF04m1 a8EbqPztKxRvojNye9CYWchw78TIL39Be8kMBL5BYQkIDhE+qlZ5LjknrFf4SIGacDYWzH wjcz8HknbXaT3w3MfSFitULRDriqEDA= Received: by mail-lj1-f169.google.com with SMTP id 38308e7fff4ca-2d0c7e6b240so2634151fa.0 for ; Thu, 08 Feb 2024 11:01:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707418899; x=1708023699; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=a8sPLE7pdODqfSJcbtUZGOPnrgiMMh49Z7XQoNEAkYw=; b=QJ/18lvimF09AsJJC5yUN7vxYKlQoDIvqpjXNY9AT9nU4eLgKEa4iE9ASKm2GSs7fS Dh1tUwzVsyuee+M1NsrCQSAG+uNAWbJ9UX+Lc2CNG1EeH/Ak+eTSRf2uttjioulcfLYH PwDUPJ0RUIg7l91tqJ/kV1r0RE5eyld4BOBtK+b8JhHh3LOBWS3uj64sPnvNRHwTabQc /VGIwJgMxuAlVj8D75+wyqZ+slNISvciIOeLpzUacwKDGiG5kX0feTJ6bCskfJJpenRr 5oSB1bgJe5ORloWGzwdvMoilg+vO1Xuf4DxCfkE5ywu2SIRQ7zgioiacN9u/LTWc6peL Xazw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707418899; x=1708023699; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=a8sPLE7pdODqfSJcbtUZGOPnrgiMMh49Z7XQoNEAkYw=; b=Z4bM3YN0OHSd/837yI4uVgz1yf7OKHI3qr7tpWEZ91vIWVEWcESrIJp7rBq5ZouGkW cQgzyMCvLNE3IQ8oUgsmPUDqAqAGvIPNrrzpeSk/nGFKDxlztftGn8B03nS7W67QG3e6 yq+JN8R3LBZ7ivzWVNhrkxmygdSCUizko4dG95C69BIVIDhuhGf8rBkmCeBx1RIE77Cv w2Qxn5ALXUBstmUYUujVM3GSqLhuJnk1GpZ6oRpzXa87He6dAWc2xVVcAL2KCH7iENu8 PkEXtIXLhXHHJZqc/c9yVfTwDZbbYR4K7mpXyhnx2mKt+QwIHmDJMjjQuc6T+yAaLkRA otXw== X-Gm-Message-State: AOJu0Yx1zYT72zTkFrm1u2P97O6HpKaW/ugcfMMrVFVz3kiovXdpwaPc QQbHibzuFS5FBD2Wd46DU1zXwXn+Uo5Hi2ih8vZqVtH1QdGStNOZH113faHHhm7O1D2YEeHt4rY gLZTSAGBJD32g/kdwWnIHD5HKU+g= X-Google-Smtp-Source: AGHT+IGiCf1xJW70ovof+reQWmPl2n3Fs3lRMMDNhultrmK35DvsStw9uBA2U2cf4tCfU0543zggc7bJDTh6YzOtQtE= X-Received: by 2002:a05:651c:1505:b0:2d0:9a29:f849 with SMTP id e5-20020a05651c150500b002d09a29f849mr185898ljf.29.1707418898907; Thu, 08 Feb 2024 11:01:38 -0800 (PST) MIME-Version: 1.0 References: <20240206182559.32264-1-ryncsn@gmail.com> <87eddnxy47.fsf@yhuang6-desk2.ccr.corp.intel.com> In-Reply-To: <87eddnxy47.fsf@yhuang6-desk2.ccr.corp.intel.com> From: Kairui Song Date: Fri, 9 Feb 2024 03:01:20 +0800 Message-ID: Subject: Re: [PATCH v2] mm/swap: fix race when skipping swapcache To: "Huang, Ying" Cc: Minchan Kim , Chris Li , Barry Song <21cnbao@gmail.com>, linux-mm@kvack.org, Andrew Morton , Yu Zhao , Barry Song , SeongJae Park , Hugh Dickins , Johannes Weiner , Matthew Wilcox , Michal Hocko , Yosry Ahmed , David Hildenbrand , stable@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 15A34C0017 X-Rspam-User: X-Rspamd-Server: rspam02 X-Stat-Signature: eesh67iebd89cq6rr3d5dgwk8nz3zs6q X-HE-Tag: 1707418900-3401 X-HE-Meta: U2FsdGVkX18G+hunaugAvqh3AbfZv2PIZkTZHYFoth4mo1T+7eMpEVwRm74tlnZ8Kj8msOFO9WuUziphVdXXgUJ5yUNeRXte9LC+gS7W2OGtR5nWbRqMpVUkFlsNEYNKLT7Ck5gMvnYcbYjFH4NjOXWbUtlYkcyUgF7OXeQhf4OEuQkHko3/6XypvWpfDduGv59c+YCyvADfEaUVlMp2s/Ee6/iLceKFSLzpsTLRywW+08Dwm05MFgqtjMy8gqY1HbBxHV46BOpKtj9Ts1lTmJ4OWgqR9TpQ/1oQQZNbicxGRBOk2v/7XQmv1ZQ+8ARKymUzmg3KZC85OEm/wH11XB23VaayMzEJSue+3GXp3mrWhJS+Ling3kTBdTKMmohlNPsITkb9z/VosC8Y/cI12sw3AO0hlqqc/fEVoPHGR4S8Jv8n4lUL6m36AuSKd7bjxnfhoJnOHv09CRRh+fg+htwlyCnMkC1arCyZz0EZL5XcwRvDa+qdBPuqqe8I2wYZvmUH0DQKLpU05GmkhN4uy1kBsScJJggrbBpt/g8hY5GBINFVLSJTL6q01YrJz0qPyuzMgz92zzFkRp+G3a8Kmw3J+VyLMHyGEmJKOMV/jjFzaGidSJi9+eA5Jr7G6b276UA8+1l4V0YDuaKAwY4klgi41oP3uyxf+crVMSBv832jlgw9vGi6e36oUw0+pnknJOrldjxtrdIRbLDX7K6e75UMX6cwYCu6TEI7xJ1qszQ+5bMasCa2ytiw5K9qPmoN90OtZqwX2EuSxkkZzgDjff9AzBpriIfWF4P43lYEDR3gfQ7Lw+OHLIQSqDhKDgg33MDb9gsW0KEovf9Ybf1Jal3r5fmV4XJFdLRX1U6MzPfDkC/DFD/4K8nVej4a7FvSO7wvYquiW94kf9Cqdau04Fjvi9EG559aFKigW/x9Api730+RPfx73+bQJp8ruH8tKUOju/1AmHpPXr3eyVE nDgiBTf3 6svHvSaOl14TENrylaxCxlAwK63Zn8vprwX2R X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Feb 8, 2024 at 2:36=E2=80=AFPM Huang, Ying w= rote: > > Kairui Song writes: > > > On Thu, Feb 8, 2024 at 2:31=E2=80=AFAM Minchan Kim = wrote: > >> > >> On Wed, Feb 07, 2024 at 12:06:15PM +0800, Kairui Song wrote: > > [snip] > > >> > > >> > So I think the thing is, it's getting complex because this patch > >> > wanted to make it simple and just reuse the swap cache flags. > >> > >> I agree that a simple fix would be the important at this point. > >> > >> Considering your description, here's my understanding of the other ide= a: > >> Other method, such as increasing the swap count, haven't proven effect= ive > >> in your tests. The approach risk forcing racers to rely on the swap ca= che > >> again and the potential performance loss in race scenario. > >> > >> While I understand that simplicity is important, and performance loss > >> in this case may be infrequent, I believe swap_count approach could be= a > >> suitable solution. What do you think? > > > > Hi Minchan > > > > Yes, my main concern was about simplicity and performance. > > > > Increasing swap_count here will also race with another process from > > releasing swap_count to 0 (swapcache was able to sync callers in other > > call paths but we skipped swapcache here). > > What is the consequence of the race condition? Hi Ying, It will increase the swap count of an already freed entry, this race with multiple swap free/alloc logic that checks if count =3D=3D SWAP_HAS_CACHE or sets count to zero, or repeated free of an entry, all result in random corruption of the swap map. This happens a lot during stress testing. > > > So the right step is: 1. Lock the cluster/swap lock; 2. Check if still > > have swap_count =3D=3D 1, bail out if not; 3. Set it to 2; > > __swap_duplicate can be modified to support this, it's similar to > > existing logics for SWAP_HAS_CACHE. > > > > And swap freeing path will do more things, swapcache clean up needs to > > be handled even in the bypassing path since the racer may add it to > > swapcache. > > > > Reusing SWAP_HAS_CACHE seems to make it much simpler and avoided many > > overhead, so I used that way in this patch, the only issue is > > potentially repeated page faults now. > > > > I'm currently trying to add a SWAP_MAP_LOCK (or SWAP_MAP_SYNC, I'm bad > > at naming it) special value, so any racer can just spin on it to avoid > > all the problems, how do you think about this? > > Let's try some simpler method firstly. Another simpler idea is, add a schedule() or schedule_timeout_uninterruptible(1) in the swapcache_prepare failure path before goto out (just like __read_swap_cache_async). I think this should ensure in almost all cases, PTE is ready after it returns, also yields more CPU.