From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5467DCEACF3 for ; Wed, 2 Oct 2024 18:30:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E869E6B0107; Wed, 2 Oct 2024 14:30:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E36A66B010E; Wed, 2 Oct 2024 14:30:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D259C6B0112; Wed, 2 Oct 2024 14:30:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id A2BFD6B0107 for ; Wed, 2 Oct 2024 14:30:50 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 296B9160E3C for ; Wed, 2 Oct 2024 18:30:50 +0000 (UTC) X-FDA: 82629503460.02.67F94D6 Received: from mail-lj1-f182.google.com (mail-lj1-f182.google.com [209.85.208.182]) by imf19.hostedemail.com (Postfix) with ESMTP id 36AA71A0019 for ; Wed, 2 Oct 2024 18:30:47 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=EMvkC2DJ; spf=pass (imf19.hostedemail.com: domain of ryncsn@gmail.com designates 209.85.208.182 as permitted sender) smtp.mailfrom=ryncsn@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727893782; a=rsa-sha256; cv=none; b=rQ0xDVQ9RwqdfIHimB3zOKvmmacEgOcrBxue1tIeccCtff8daUzsaWO5lNAK9b/qBdb5Ym vWAusH4J1MA63IEhBg6iqNBN9pdkZA1gjzQlmYCSJg6V/x+XhSelZQx00nymgq02xcUlkB 6qnXQYsAGhkGry/nzeL1S/sKU/91gf8= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=EMvkC2DJ; spf=pass (imf19.hostedemail.com: domain of ryncsn@gmail.com designates 209.85.208.182 as permitted sender) smtp.mailfrom=ryncsn@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727893782; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nCJKQj92YUp9lYjcAYVWHDwb92zO+RIIleVkehLxHkk=; b=pX+Dhqu57C3XmGBSh7Pzp9RFD//89urVN8TNsz+/qRYEd+wCwfrFljZldLyGrOPyPbobqK TCAv1lUoG6bFkP6JOH6TWqyhrgcGBnNoVF319WSPaeXVyqHxD4jndi9kPyHhP2+7zNkVcg vzGK6zNnFMpzYewCt8+NGptRjx8Auk0= Received: by mail-lj1-f182.google.com with SMTP id 38308e7fff4ca-2fac6b3c220so1498141fa.2 for ; Wed, 02 Oct 2024 11:30:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727893846; x=1728498646; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=nCJKQj92YUp9lYjcAYVWHDwb92zO+RIIleVkehLxHkk=; b=EMvkC2DJirtbBZ8R/aazYBBIrh1nWFsiGtLVgfBpnqbYI3eYyqn1OiqiW2X2Fkytt3 WzK/+3W5iR45IeEgK5N3Fx2M8+lh9yoZhpGKwa93GshulSaDBe3mR3AMO7+MocNRuWyP wD2rfGQ2pew4Bby0cVD+r8ZDNUeCtpsFAt0ICh+oolefL/cKCc2d3aQf8+sXYC/gibWg +TMJCWSVgpiFwwTKiLH4d5q6CCAZOhhfSI8mKLShy0wtVCBL6FGaCfK3RUbKFMJHS5gR QszJCWvM5SUZY5nmLkSoFxkdcZcoTdRNW4H4pR1kv9vI6ixHKTnkVj12uHG9pqOmPfkp 6/Ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727893846; x=1728498646; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nCJKQj92YUp9lYjcAYVWHDwb92zO+RIIleVkehLxHkk=; b=i5mqteJnSolQo8sVFTTnRnABMr+6cJ/dKlUJXrt1tAVYNqHk2jgxWjuRH4KNw84AOx UQaWTORB+ow1ZQKWWzOaqRNPYd9yyLiXyT6n3ClSPDA11NlWQYZMAmqZ1LSe0N5XoED7 DOtU/RZXEXm1wUZf2wjckyRrv9Q2wLDGNYTPtD0wQR+clEKm14uF8v8gCacSU1XUOqyF lufKVFiTNb2UzNu+dGacEe2WLjnvpMMIpzXolLyvRJOG5a2yKPmIVrHjjaGmwEYozxuR xl6T82UjNHnW8hXglmQzJTMnacjYJTdS1SrXoh4x0MbMXAVyr2gPnysTwgP3rvJaYOqR xXkw== X-Forwarded-Encrypted: i=1; AJvYcCVjdCNCUfTth84M07P67eiGJBhaLt0pd/IU4r/lEjwZChIvOaE83ex2RB0bztK/SdBKGihw4fkVLw==@kvack.org X-Gm-Message-State: AOJu0YxluJGsqqKNP8ZNQBFg7NDsu1Ry3rikwz5BMeR7vMAD2nurskDA /pnlosPEul1EbSPrkr/kvmgVeQ4QA3jCaVAgS7GEY4w3FaY5iWhGZIYn5qoOtYXuNyWlVzkolEy gXYR8aKjQmkf92ao3Bq+MCoVh92Q= X-Google-Smtp-Source: AGHT+IGejH4I7ojpQ9eTpLEzvNUqv6TXekoWAianOs0i1fVMU1e8y3mYHlllAqf3VecKFG7Of/ZhNSRZbPNVMb8b5yo= X-Received: by 2002:a05:651c:2105:b0:2fa:c0c2:d311 with SMTP id 38308e7fff4ca-2fae10226e4mr37736171fa.5.1727893845890; Wed, 02 Oct 2024 11:30:45 -0700 (PDT) MIME-Version: 1.0 References: <87y137nxqs.fsf@yhuang6-desk2.ccr.corp.intel.com> <20241002015754.969-1-21cnbao@gmail.com> In-Reply-To: <20241002015754.969-1-21cnbao@gmail.com> From: Kairui Song Date: Thu, 3 Oct 2024 02:30:29 +0800 Message-ID: Subject: Re: [PATCH] mm: avoid unconditional one-tick sleep when swapcache_prepare fails To: Barry Song <21cnbao@gmail.com> Cc: ying.huang@intel.com, akpm@linux-foundation.org, chrisl@kernel.org, david@redhat.com, hannes@cmpxchg.org, hughd@google.com, kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, liyangouwen1@oppo.com, mhocko@suse.com, minchan@kernel.org, sj@kernel.org, stable@vger.kernel.org, surenb@google.com, v-songbaohua@oppo.com, willy@infradead.org, yosryahmed@google.com, yuzhao@google.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: ssigppi1o4uo6sf5msu8mfrug63xh63s X-Rspamd-Queue-Id: 36AA71A0019 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1727893847-385265 X-HE-Meta: U2FsdGVkX19QUQI4o4rI9d57PV1mALiM17hJv3SOfSPqx4oaz23UyRzlKn3tE6wa9w9kylEgBTuIVH/qDGFeoRitR34hxSYj1zN6AMxJpwz7mAs17J/5R4bNoSKFIKhXElR48tSuY7iRI5iGxTp9ZgXhoNb5FnJfkNqaI+gdyWaKmhl1co9jKJHyPX2J1aPHMx/4kBgvmuHQRUDxBP9c0Jn9OCTN98stQk3haQo9qtTtWgR2lgnjC2JIu0wCeti/AaCym4vQ5Quz/mvxZOMHwYJ16po/Sh4KdHXZY2gkUBRAXQhkW7MLx5k0em1fcJ3Vw5lv1auQXIiv76u9ZY+KRb/zfsavXz4iFttfpkfZ8cF5nW8mH0RvNrKMtQluVcd9qBfWoJTz5AzfI2Wbhy+jo0GFu6Yq1o6VYlDhU6YXb8dtXLDR/wuRtXCSQaG9XnpwWFBdarZhDRHEva2Tg8diihEkZajnQYFX+nyuLByGd68hwPyqQJQofYYAUFJb5dqPf6/TeXw6X8YmIreV4M0hsQbjzKpfx3O7YP9NkMfMur9WOR6unF1qA1l4vYkxt+2J1/CX2bgyyMWDBWTZzgGNB4EWJervMppVbcqoqYricJhHQwEfM+zx/8h9TIw6o6JtMamBzEOwswzEkOJQRPST7KTyOGAmgq9ryg3rR4f4x7pFcMTBtSSpM7aaIN8fPoX1BS0mWKeioe7c8Wq0uW6j4eBZ/V8Oo758SiF8nExT6zXU3K0lzXJlRGmDDUDxOvJnxgfC2rmP/WiXy4Y8TjTX0E5+oIHOaG4YkDEjhjiBGU4NdMAx4UD5DceSjPuMIyDBVziqlqA/vHp40z0Kicz9G3DdcN8O7AkSeMcIKmZF3Ztio+IJpze0jBL68T2W9kt897WsFlCFubI4onbOa3/WccznfoUNpa1I3OB+FsxPTq+TIDClRErDjlccKb0IuVEvnp2HiaGkpfcs2h+UNKS nftVBj3J sre/mrLL/5h3fycbDPojikT2hJxvGOhMzYhfbv021dCwfL6W6wQLYNIO6jo0jjUQzrfXc7x8AM20PKcDdvBFnjVrMrF5FGDwUgyZiGL2yP3B0cbR3dxtKfCwRpkTEtyJBIM2gnJwkPnPFIpEhznO7niCIrSBhIALV411qtXP7pQpyFUp44LvHXDWiPSdB5BBbKi/gvo8TyIi9M1seKZxnUhmdbuffzUqmDnokvQ2eMNaS1t1G6+1yzhbG808ZNstRFFnjtxYpGHvpcttauXDcEj0T/lTNdZTRXodD911wRn+XoGT9yxt5iv3JGCMQ8EScSv1Tx+Nj8grzuO/CnD+7fZ16pNxKX9abZ0XY3ewjF+/mwsxQk1HxXwPmuRJ9oRJQgysA2yQ/WX8o9ZHwlz1nNGUmoYkeFz0mQjK721yYA86a+9GQ/Ar+7VFRH+8xaOiv523ZrB47B6oO85MSWmWfjk2lR41c+6rXj3HfiPP3VDtGwvQCVvhDtt5IoE37fHLurUo4aVB83ORNG/k= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Oct 2, 2024 at 10:02=E2=80=AFAM Barry Song <21cnbao@gmail.com> wrot= e: > > On Wed, Oct 2, 2024 at 8:43=E2=80=AFAM Huang, Ying = wrote: > > > > Barry Song <21cnbao@gmail.com> writes: > > > > > On Tue, Oct 1, 2024 at 7:43=E2=80=AFAM Huang, Ying wrote: > > >> > > >> Barry Song <21cnbao@gmail.com> writes: > > >> > > >> > On Sun, Sep 29, 2024 at 3:43=E2=80=AFPM Huang, Ying wrote: > > >> >> > > >> >> Hi, Barry, > > >> >> > > >> >> Barry Song <21cnbao@gmail.com> writes: > > >> >> > > >> >> > From: Barry Song > > >> >> > > > >> >> > Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache= ") > > >> >> > introduced an unconditional one-tick sleep when `swapcache_prep= are()` > > >> >> > fails, which has led to reports of UI stuttering on latency-sen= sitive > > >> >> > Android devices. To address this, we can use a waitqueue to wak= e up > > >> >> > tasks that fail `swapcache_prepare()` sooner, instead of always > > >> >> > sleeping for a full tick. While tasks may occasionally be woken= by an > > >> >> > unrelated `do_swap_page()`, this method is preferable to two sc= enarios: > > >> >> > rapid re-entry into page faults, which can cause livelocks, and > > >> >> > multiple millisecond sleeps, which visibly degrade user experie= nce. > > >> >> > > >> >> In general, I think that this works. Why not extend the solution= to > > >> >> cover schedule_timeout_uninterruptible() in __read_swap_cache_asy= nc() > > >> >> too? We can call wake_up() when we clear SWAP_HAS_CACHE. To avo= id > > >> > > > >> > Hi Ying, > > >> > Thanks for your comments. > > >> > I feel extending the solution to __read_swap_cache_async() should = be done > > >> > in a separate patch. On phones, I've never encountered any issues = reported > > >> > on that path, so it might be better suited for an optimization rat= her than a > > >> > hotfix? Hi Barry and Ying, For the __read_swap_cache_async case, I'm not really against adding a similar workqueue, but if no one is really suffering from it, and if the workqueue do causes extra overhead, maybe we can ignore it for the __read_swap_cache_async case now, and I plan to resent the following patch: https://lore.kernel.org/linux-mm/20240326185032.72159-9-ryncsn@gmail.com/#r It removed all schedule_timeout_uninterruptible workaround and other similar things, and the performance will go even higher. > > >> > > >> Yes. It's fine to do that in another patch as optimization. > > > > > > Ok. I'll prepare a separate patch for optimizing that path. > > > > Thanks! > > > > >> > > >> >> overhead to call wake_up() when there's no task waiting, we can u= se an > > >> >> atomic to count waiting tasks. > > >> > > > >> > I'm not sure it's worth adding the complexity, as wake_up() on an = empty > > >> > waitqueue should have a very low cost on its own? > > >> > > >> wake_up() needs to call spin_lock_irqsave() unconditionally on a glo= bal > > >> shared lock. On systems with many CPUs (such servers), this may cau= se > > >> severe lock contention. Even the cache ping-pong may hurt performan= ce > > >> much. > > > > > > I understand that cache synchronization was a significant issue befor= e > > > qspinlock, but it seems to be less of a concern after its implementat= ion. > > > > Unfortunately, qspinlock cannot eliminate cache ping-pong issue, as > > discussed in the following thread. > > > > https://lore.kernel.org/lkml/20220510192708.GQ76023@worktop.programming= .kicks-ass.net/ > > > > > However, using a global atomic variable would still trigger cache bro= adcasts, > > > correct? > > > > We can only change the atomic variable to non-zero when > > swapcache_prepare() returns non-zero, and call wake_up() when the atomi= c > > variable is non-zero. Because swapcache_prepare() returns 0 most times= , > > the atomic variable is 0 most times. If we don't change the value of > > atomic variable, cache ping-pong will not be triggered. > > yes. this can be implemented by adding another atomic variable. > > > > > Hi, Kairui, > > > > Do you have some test cases to test parallel zram swap-in? If so, that > > can be used to verify whether cache ping-pong is an issue and whether i= t > > can be fixed via a global atomic variable. > > > > Yes, Kairui please run a test on your machine with lots of cores before > and after adding a global atomic variable as suggested by Ying. I am > sorry I don't have a server machine. I just had a try with the build kernel test which I used for the allocator patch series, with -j64, 1G memcg on my local branch: Without the patch: 2677.63user 9100.43system 3:33.15elapsed 5452%CPU (0avgtext+0avgdata 863284maxresident)k 2671.40user 8969.07system 3:33.67elapsed 5447%CPU (0avgtext+0avgdata 863316maxresident)k 2673.66user 8973.90system 3:33.18elapsed 5463%CPU (0avgtext+0avgdata 863284maxresident)k With the patch: 2655.05user 9134.21system 3:35.63elapsed 5467%CPU (0avgtext+0avgdata 863288maxresident)k 2652.57user 9104.87system 3:35.07elapsed 5466%CPU (0avgtext+0avgdata 863272maxresident)k 2665.44user 9155.97system 3:35.92elapsed 5474%CPU (0avgtext+0avgdata 863316maxresident)k Only three test runs, the main bottleneck for the test is still some other locks (list_lru lock, swap cgroup lock etc), but it does show the performance seems a bit lower. Could be considered a trivial amount of overhead so I think it's acceptable for the SYNC_IO path.