Date: Thu, 25 Nov 2021 08:02:38 +0000
From: Hao Lee
To: Matthew Wilcox
Cc: Michal Hocko, Linux MM, Johannes Weiner, vdavydov.dev@gmail.com, Shakeel Butt, cgroups@vger.kernel.org, LKML
Subject: Re: [PATCH] mm: reduce spinlock contention in release_pages()
Message-ID: <20211125080238.GA7356@haolee.io>
References: <20211124151915.GA6163@haolee.io>

On Thu, Nov 25, 2021 at 03:30:44AM +0000, Matthew Wilcox wrote:
> On Thu, Nov 25, 2021 at 11:24:02AM +0800, Hao Lee wrote:
> > On Thu, Nov 25, 2021 at 12:31 AM Michal Hocko wrote:
> > > We do batch currently so no single task should be
> > > able to monopolize the cpu for too long. Why this is not sufficient?
> >
> > uncharge and unref indeed take advantage of the batch process, but
> > del_from_lru needs more time to complete. Several tasks will contend
> > spinlock in the loop if nr is very large.
>
> Is SWAP_CLUSTER_MAX too large?  Or does your architecture's spinlock
> implementation need to be fixed?

My testing server is x86_64 running 5.16-rc2. The spinlock implementation
should be the normal one.

I think lock_batch is not the point. lock_batch only breaks the spinning
time into small parts; it doesn't reduce the total spinning time. Things
may even get worse if lock_batch is very small.

Here is an example of two tasks contending for a spinlock. Let's assume
each task needs a total of 4 seconds in the critical section to complete
its work.

Example 1: lock_batch = x

task A       task B
hold 4s      wait 4s
             hold 4s

total waiting time is 4s.

Example 2: lock_batch = x/2

task A       task B
hold 2s      wait 2s
wait 2s      hold 2s
hold 2s      wait 2s
             hold 2s

total waiting time is 6s.

The above theoretical example can also be observed in a test using usemem:

# ./usemem -j 4096 -n 20 10g -s 5

lock_batch=SWAP_CLUSTER_MAX       59.50%  native_queued_spin_lock_slowpath
lock_batch=SWAP_CLUSTER_MAX/4     69.95%  native_queued_spin_lock_slowpath
lock_batch=SWAP_CLUSTER_MAX/16    82.22%  native_queued_spin_lock_slowpath

Nonetheless, enlarging lock_batch doesn't noticeably improve performance,
though it won't make things worse, and it's not a good idea to hold an
irq-disabled spinlock for a long time.

If cond_resched() would break task fairness, the only way I can think of
is to do the uncharge and unref when the current task can't get the
spinlock. This reduces the wasted cpu cycles, although the performance
gain is still small (about 4%). However, this approach hurts batch
processing in uncharge(). Maybe there exists a better way...

diff --git a/mm/swap.c b/mm/swap.c
index e8c9dc6d0377..8a947f8d0aaa 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -960,8 +960,16 @@ void release_pages(struct page **pages, int nr)
 		if (PageLRU(page)) {
 			struct lruvec *prev_lruvec = lruvec;
 
-			lruvec = folio_lruvec_relock_irqsave(folio, lruvec,
+			lruvec = folio_lruvec_tryrelock_irqsave(folio, lruvec,
 									&flags);
+			if (!lruvec) {
+				mem_cgroup_uncharge_list(&pages_to_free);
+				free_unref_page_list(&pages_to_free);
+				INIT_LIST_HEAD(&pages_to_free);
+				lruvec = folio_lruvec_relock_irqsave(folio,
+							lruvec, &flags);
+			}
+
 			if (prev_lruvec != lruvec)
 				lock_batch = 0;
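
folio_lruvec_tryrelock_irqsave() above is not an existing helper, so the
snippet is only a sketch of the idea. Assuming it simply mirrors
folio_lruvec_relock_irqsave() but takes the new lruvec's lru_lock with
spin_trylock_irqsave() and returns NULL instead of spinning, it could
look roughly like:

/*
 * Sketch only, not a mainline helper: like folio_lruvec_relock_irqsave()
 * but gives up instead of spinning, so the caller can do useful work
 * (flush its free/uncharge batches) while the lock is contended.
 */
static inline struct lruvec *folio_lruvec_tryrelock_irqsave(struct folio *folio,
		struct lruvec *locked_lruvec, unsigned long *flags)
{
	struct lruvec *lruvec;

	if (locked_lruvec) {
		/* Same lruvec as last time: keep the lock we already hold. */
		if (folio_matches_lruvec(folio, locked_lruvec))
			return locked_lruvec;

		unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
	}

	lruvec = folio_lruvec(folio);
	if (!spin_trylock_irqsave(&lruvec->lru_lock, *flags))
		return NULL;	/* caller flushes batches, then relocks */

	return lruvec;
}

The point of returning NULL rather than spinning is that release_pages()
already has pending work queued on pages_to_free, so the cycles that would
otherwise be burned in the slowpath can be spent on
mem_cgroup_uncharge_list() and free_unref_page_list() before falling back
to the normal (spinning) relock.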