From: Yu Zhao
Date: Tue, 14 Jan 2025 21:31:18 -0700
Subject: Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
To: Yosry Ahmed
Cc: "Kirill A. Shutemov", Andrew Morton, "Matthew Wilcox (Oracle)",
 Jens Axboe, "Jason A. Donenfeld", Andi Shyti, Chengming Zhou,
 Christian Brauner, Christophe Leroy, Dan Carpenter, David Airlie,
 David Hildenbrand, Hao Ge, Jani Nikula, Johannes Weiner,
 Joonas Lahtinen, Josef Bacik, Masami Hiramatsu, Mathieu Desnoyers,
 Miklos Szeredi, Nhat Pham, Oscar Salvador, Ran Xiaokai, Rodrigo Vivi,
 Simona Vetter, Steven Rostedt, Tvrtko Ursulin, Vlastimil Babka,
 intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org
References: <20250113093453.1932083-1-kirill.shutemov@linux.intel.com>
 <20250113093453.1932083-5-kirill.shutemov@linux.intel.com>

On Tue, Jan 14, 2025 at 9:28 PM Yu Zhao wrote:
>
> On Tue, Jan 14, 2025 at 11:03 AM Yosry Ahmed wrote:
> >
> > On Tue, Jan 14, 2025 at 12:12 AM Kirill A. Shutemov wrote:
> > >
> > > On Mon, Jan 13, 2025 at 08:17:20AM -0800, Yosry Ahmed wrote:
> > > > On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov wrote:
> > > > >
> > > > > The recently introduced PG_dropbehind allows for freeing folios
> > > > > immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> > > > > to be involved to get the folio freed.
> > > > >
> > > > > Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> > > > > lru_deactivate_file().
> > > > >
> > > > > Signed-off-by: Kirill A. Shutemov
> > > > > ---
> > > > >  mm/swap.c | 8 +-------
> > > > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/mm/swap.c b/mm/swap.c
> > > > > index fc8281ef4241..4eb33b4804a8 100644
> > > > > --- a/mm/swap.c
> > > > > +++ b/mm/swap.c
> > > > > @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
> > > > >          folio_clear_referenced(folio);
> > > > >
> > > > >          if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> > > > > -                /*
> > > > > -                 * Setting the reclaim flag could race with
> > > > > -                 * folio_end_writeback() and confuse readahead.  But the
> > > > > -                 * race window is _really_ small and it's not a critical
> > > > > -                 * problem.
> > > > > -                 */
> > > > >                  lruvec_add_folio(lruvec, folio);
> > > > > -                folio_set_reclaim(folio);
> > > > > +                folio_set_dropbehind(folio);
> > > > >          } else {
> > > > >                  /*
> > > > >                   * The folio's writeback ended while it was in the batch.
> > > >
> > > > Now there's a difference in behavior here depending on whether or not
> > > > the folio is under writeback (or will be written back soon). If it is,
> > > > we set PG_dropbehind to get it freed right after, but if writeback has
> > > > already ended we put it on the tail of the LRU to be freed later.
> > > >
> > > > It's a bit counterintuitive to me that folios with pending writeback
> > > > get freed faster than folios that completed their writeback already.
> > > > Am I missing something?
> > >
> > > Yeah, it is strange.
> > >
> > > I think we can drop the writeback/dirty check. Set PG_dropbehind and put
> > > the page on the tail of LRU unconditionally. The check was required to
> > > avoid confusion with PG_readahead.
> > >
> > > Comment above the function is not valid anymore.
> >
> > My read is that we don't put dirty/writeback folios at the tail of the
> > LRU because they cannot be freed immediately and we want to give them
> > time to be written back before reclaim reaches them. So I don't think
> > we want to change that and always put the pages at the tail.
> >
> > >
> > > But the folio that is still dirty under writeback will be freed faster as
> > > we get rid of the folio just after writeback is done while clean page can
> > > dangle on LRU for a while.
> >
> > Yeah if we reuse PG_dropbehind then we cannot avoid
> > folio_end_writeback() freeing the folio faster than clean ones.
> >
> > >
> > > I don't think we have any convenient place to free clean dropbehind page
> > > other than shrink_folio_list(). Or do we?
> >
> > Not sure tbh.
> > FWIW I am not saying it's necessarily a bad thing to
> > free dirty/writeback folios before clean ones when deactivated, it's
> > just strange and a behavioral change from today that I wanted to point
> > out. Perhaps that's the best we can do for now.
> >
> > >
> > > Looking at shrink_folio_list(), I think we need to bypass page demotion
> > > for PG_dropbehind pages.
>
> I agree with Yosry. I don't think lru_deactivate_file() is still
> needed -- it was needed only because when truncation fails to free a
> dirty/writeback folio, page reclaim can do that quickly. For other
> conditions that mapping_evict_folio() returns 0, there isn't much page
> reclaim can do, and those conditions are not deactivate_file_folio()
> and lru_deactivate_file()'s intentions. So the following should be
> enough, and it's a lot cleaner:
>
> diff --git a/mm/truncate.c b/mm/truncate.c
> index e2e115adfbc5..12d2aa608517 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -486,7 +486,7 @@ unsigned long mapping_try_invalidate(struct address_space *mapping,
>                           * of interest and try to speed up its reclaim.
>                           */
>                          if (!ret) {
> -                                deactivate_file_folio(folio);
> +                                folio_set_dropbehind(folio);
>                                  /* Likely in the lru cache of a remote CPU */
>                                  if (nr_failed)
>                                          (*nr_failed)++;
>
> Then we can drop deactivate_file_folio() and lru_deactivate_file().

And with the above and list_move_tail() removed, we can also remove
lruvec_add_folio_tail().
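
To make the flow proposed above easier to follow, here is a minimal
userspace sketch in C. It is not kernel code and only models the idea
under stated assumptions: struct toy_folio, toy_evict(),
toy_try_invalidate() and toy_end_writeback() are hypothetical stand-ins
for struct folio, mapping_evict_folio(), mapping_try_invalidate() and
folio_end_writeback(), and the page flags are plain booleans. Eviction is
assumed to fail only for dirty or writeback folios; the other failure
conditions mentioned in the thread are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct folio: plain booleans instead of page flags. */
struct toy_folio {
	bool dirty;
	bool writeback;
	bool dropbehind;
	bool freed;
};

/*
 * Hypothetical model of mapping_evict_folio(): returns 1 if the folio
 * was freed, 0 if it is dirty or under writeback and had to be skipped.
 */
static int toy_evict(struct toy_folio *folio)
{
	assert(folio != NULL);
	if (folio->dirty || folio->writeback)
		return 0;
	folio->freed = true;
	return 1;
}

/*
 * Hypothetical model of the proposed mapping_try_invalidate() loop:
 * a folio that cannot be evicted now is only marked dropbehind, with
 * no LRU manipulation. Returns the number of folios freed; *nr_failed
 * (if non-NULL) is incremented for each folio that was not freed.
 */
static unsigned long toy_try_invalidate(struct toy_folio *folios, size_t n,
					unsigned long *nr_failed)
{
	unsigned long count = 0;

	for (size_t i = 0; i < n; i++) {
		int ret = toy_evict(&folios[i]);

		if (!ret) {
			folios[i].dropbehind = true;
			if (nr_failed)
				(*nr_failed)++;
		}
		count += ret;
	}
	return count;
}

/*
 * Hypothetical model of writeback completion: a dropbehind folio that
 * is no longer dirty is freed right away, without any LRU scan.
 */
static void toy_end_writeback(struct toy_folio *folio)
{
	assert(folio != NULL);
	folio->writeback = false;
	if (folio->dropbehind && !folio->dirty)
		folio->freed = true;
}
```

In this model, calling toy_try_invalidate() on one dirty, one writeback
and one clean folio frees only the clean one and marks the other two
dropbehind; a later toy_end_writeback() on the writeback folio frees it
immediately, while the dirty folio stays until it is written back.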