From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8DFE7C433DB for ; Tue, 26 Jan 2021 22:59:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1245520663 for ; Tue, 26 Jan 2021 22:59:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1245520663 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8CC126B0006; Tue, 26 Jan 2021 17:59:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 87C196B0007; Tue, 26 Jan 2021 17:59:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 746AE6B0008; Tue, 26 Jan 2021 17:59:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0214.hostedemail.com [216.40.44.214]) by kanga.kvack.org (Postfix) with ESMTP id 5F7746B0006 for ; Tue, 26 Jan 2021 17:59:22 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 243E7363B for ; Tue, 26 Jan 2021 22:59:22 +0000 (UTC) X-FDA: 77749444164.08.farm18_28027a127592 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin08.hostedemail.com (Postfix) with ESMTP id 085151819E76F for ; Tue, 26 Jan 2021 22:59:22 +0000 (UTC) X-HE-Tag: farm18_28027a127592 X-Filterd-Recvd-Size: 8484 Received: from mail-pj1-f43.google.com (mail-pj1-f43.google.com [209.85.216.43]) by imf07.hostedemail.com (Postfix) with ESMTP for ; Tue, 26 Jan 2021 22:59:21 +0000 (UTC) Received: by mail-pj1-f43.google.com with SMTP id u4so122099pjn.4 for ; Tue, 26 Jan 2021 14:59:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=eqwHYgU2iGZD+ybt4xP+dWXvyCnIIv7E02QKZG2/cJ0=; b=gIt3zYhf5IVme4+R3zL0WgYP/4OsnQK633hHDw7NJ13xKB+lx8A6X0VXB4Dh5wTFnq qhCdz0gqct/G+dbLDnJ1T8oG3eqFv+isG6ZLEt5qJQEWFU3kVsXDMjM4Vqe8K6Y28y3H UC3xcUYCXGaEjZPV+lwXC1takvnVJfPrt8zjhLYEBElVWOSadseRcTwPv6i0UFox4owq szbbrDb6hshAf5MQZGsyoymEap0uqJOD56gGachboF1WlAAGvKFL8UTdLZeFxY3/LtFx 4wu/MmlcFS3XYu4B5oRzHUNPcsGCAQa/grhlcLCK4qswP13l7MtQlEOrV63nGpYZKZUi HvxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to; bh=eqwHYgU2iGZD+ybt4xP+dWXvyCnIIv7E02QKZG2/cJ0=; b=NbvffmxJ6DuY9fuQ4cJ5WC2k5f/hlvWla68ZaZgk2ft/0XJv+jBa2xixAcAcrMSl71 jpPadidcG4Tuj0WFxksMIacBHHqTMrbHqBPl4DAiImHLH7Du1ysl9yxhmCVNhBrhgmaT nxHYbesOXlfYPU3IQZqTE1ppu3D/bMbUgFFVCC7Ms9hNKDV1ZdnP/wgntPMxP92Y+Pnq K8ODvsMgfHyNWLluEitrCbJ8f7aCrI+Fo1Ix/XNOE7DUJzIVjXs37DpexSqFINZguB/U oDmwolwm9AVebdivZKi+OL11xhOnUuErRNh6Ktr1fOcwpfVi/Iu7oiGaiUQXHMyr/ZjK 1IJw== X-Gm-Message-State: AOAM533STFAEyhJtjo8/WO1kg/7Wrtg0vKfZZl2sYDoliiUGPLfbdNkU +MLWP1ObheENR7hYH35S2/I= X-Google-Smtp-Source: ABdhPJwx2l2p+TwwpYcdXWMcWWMRT4N62FjAhb1NKb15Pz7PMGmJk1DkwwqJ0jQbdVIum8HDeUMI6Q== X-Received: by 2002:a17:90b:a48:: with SMTP id gw8mr2047840pjb.113.1611701960493; Tue, 26 Jan 2021 14:59:20 -0800 (PST) Received: from google.com ([2620:15c:211:201:9dd5:b47b:bb84:dede]) by smtp.gmail.com with ESMTPSA id 145sm63241pge.88.2021.01.26.14.59.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Jan 2021 14:59:19 -0800 (PST) Date: Tue, 26 Jan 2021 14:59:17 -0800 From: Minchan Kim To: Chris Goldsworthy Cc: viro@zeniv.linux.org.uk, Matthew Wilcox , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Laura Abbott Subject: Re: [PATCH v4] fs/buffer.c: Revoke LRU when trying to drop buffers Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Jan 25, 2021 at 10:58:30PM -0800, Chris Goldsworthy wrote: > From: Laura Abbott > > When a buffer is added to the LRU list, a reference is taken which is > not dropped until the buffer is evicted from the LRU list. This is the > correct behavior, however this LRU reference will prevent the buffer > from being dropped. This means that the buffer can't actually be dropped > until it is selected for eviction. There's no bound on the time spent > on the LRU list, which means that the buffer may be undroppable for > very long periods of time. Given that migration involves dropping > buffers, the associated page is now unmigratible for long periods of > time as well. CMA relies on being able to migrate a specific range > of pages, so these types of failures make CMA significantly > less reliable, especially under high filesystem usage. > > Rather than waiting for the LRU algorithm to eventually kick out > the buffer, explicitly remove the buffer from the LRU list when trying > to drop it. There is still the possibility that the buffer > could be added back on the list, but that indicates the buffer is > still in use and would probably have other 'in use' indicates to > prevent dropping. > > Note: a bug reported by "kernel test robot" lead to a switch from > using xas_for_each() to xa_for_each(). > > Signed-off-by: Laura Abbott > Signed-off-by: Chris Goldsworthy > Cc: Matthew Wilcox > Reported-by: kernel test robot Hi Chris, The release buffer_head in LRU is great improvement for migration point of view. A question: Can't we invalidate(e.g., invalidate_bh_lrus) bh_lru in migrate_prep or elsewhere when migration found the failure and is about to retry? Migration has done such a way for other per-cpu stuffs for a long time, which would be more consistent with others and might be faster sometimes with reducing IPI calls for page. > --- > fs/buffer.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 74 insertions(+), 5 deletions(-) > > diff --git a/fs/buffer.c b/fs/buffer.c > index 96c7604..27516a0 100644 > --- a/fs/buffer.c > +++ b/fs/buffer.c > @@ -48,6 +48,7 @@ > #include > #include > #include > +#include > > #include "internal.h" > > @@ -1471,12 +1472,55 @@ static bool has_bh_in_lru(int cpu, void *dummy) > return false; > } > > +static void __evict_bhs_lru(void *arg) > +{ > + struct bh_lru *b = &get_cpu_var(bh_lrus); > + struct xarray *busy_bhs = arg; > + struct buffer_head *bh; > + unsigned long i, xarray_index; > + > + xa_for_each(busy_bhs, xarray_index, bh) { > + for (i = 0; i < BH_LRU_SIZE; i++) { > + if (b->bhs[i] == bh) { > + brelse(b->bhs[i]); > + b->bhs[i] = NULL; > + break; > + } > + } > + } > + > + put_cpu_var(bh_lrus); > +} > + > +static bool page_has_bhs_in_lru(int cpu, void *arg) > +{ > + struct bh_lru *b = per_cpu_ptr(&bh_lrus, cpu); > + struct xarray *busy_bhs = arg; > + struct buffer_head *bh; > + unsigned long i, xarray_index; > + > + xa_for_each(busy_bhs, xarray_index, bh) { > + for (i = 0; i < BH_LRU_SIZE; i++) { > + if (b->bhs[i] == bh) > + return true; > + } > + } > + > + return false; > + > +} > void invalidate_bh_lrus(void) > { > on_each_cpu_cond(has_bh_in_lru, invalidate_bh_lru, NULL, 1); > } > EXPORT_SYMBOL_GPL(invalidate_bh_lrus); > > +static void evict_bh_lrus(struct xarray *busy_bhs) > +{ > + on_each_cpu_cond(page_has_bhs_in_lru, __evict_bhs_lru, > + busy_bhs, 1); > +} > + > void set_bh_page(struct buffer_head *bh, > struct page *page, unsigned long offset) > { > @@ -3242,14 +3286,38 @@ drop_buffers(struct page *page, struct buffer_head **buffers_to_free) > { > struct buffer_head *head = page_buffers(page); > struct buffer_head *bh; > + struct xarray busy_bhs; > + int bh_count = 0; > + int xa_ret, ret = 0; > + > + xa_init(&busy_bhs); > > bh = head; > do { > - if (buffer_busy(bh)) > - goto failed; > + if (buffer_busy(bh)) { > + xa_ret = xa_err(xa_store(&busy_bhs, bh_count++, > + bh, GFP_ATOMIC)); > + if (xa_ret) > + goto out; > + } > bh = bh->b_this_page; > } while (bh != head); > > + if (bh_count) { > + /* > + * Check if the busy failure was due to an outstanding > + * LRU reference > + */ > + evict_bh_lrus(&busy_bhs); > + do { > + if (buffer_busy(bh)) > + goto out; > + > + bh = bh->b_this_page; > + } while (bh != head); > + } > + > + ret = 1; > do { > struct buffer_head *next = bh->b_this_page; > > @@ -3259,9 +3327,10 @@ drop_buffers(struct page *page, struct buffer_head **buffers_to_free) > } while (bh != head); > *buffers_to_free = head; > detach_page_private(page); > - return 1; > -failed: > - return 0; > +out: > + xa_destroy(&busy_bhs); > + > + return ret; > } > > int try_to_free_buffers(struct page *page) > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project > >