From: Yafang Shao
Date: Wed, 23 Sep 2020 18:05:27 +0800
Subject: Re: [PATCH] mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED
To: Mel Gorman
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Linux MM
References: <20200921014317.73915-1-laoar.shao@gmail.com> <20200921223430.GI3117@suse.de> <20200922072324.GJ3117@suse.de>
In-Reply-To: <20200922072324.GJ3117@suse.de>

On Tue, Sep 22, 2020 at 3:23 PM Mel Gorman wrote:
>
> On Tue, Sep 22, 2020 at 10:12:31AM +0800, Yafang Shao wrote:
> > On Tue, Sep 22, 2020 at 6:34 AM Mel Gorman wrote:
> > >
> > > On Mon, Sep 21, 2020 at 09:43:17AM +0800, Yafang Shao wrote:
> > > > Our users reported random latency spikes while their RT process
> > > > is running. We eventually found that the latency spikes are caused
> > > > by FADV_DONTNEED, which may call lru_add_drain_all() to drain the
> > > > LRU cache on remote CPUs and then wait for the per-cpu work to
> > > > complete. The wait time is unpredictable and may be tens of
> > > > milliseconds.
> > > > That behavior is unreasonable, because this process is bound to a
> > > > specific CPU and the file is only accessed by itself, IOW, there should
> > > > be no pagecache pages on a per-cpu pagevec of a remote CPU. That
> > > > unreasonable behavior is partially caused by the wrong comparison of
> > > > the number of invalidated pages and the number of the target. For
> > > > example,
> > > >         if (count < (end_index - start_index + 1))
> > > > The count above is how many pages were invalidated in the local CPU, and
> > > > (end_index - start_index + 1) is how many pages should be invalidated.
> > > > The usage of (end_index - start_index + 1) is incorrect, because they
> > > > are virtual addresses, which may not be mapped to pages. We'd better
> > > > use inode->i_data.nrpages as the target.
> > > >
> > >
> > > How does that work if the invalidation is for a subset of the file?
> > >
> >
> > I realized it as well. There are some solutions to improve it.
> >
> > Option 1, take the min as the target.
> > -       if (count < (end_index - start_index + 1)) {
> > +       target = min_t(unsigned long, inode->i_data.nrpages,
> > +                      end_index - start_index + 1);
> > +       if (count < target) {
> >                 lru_add_drain_all();
> >
> > Option 2, change the prototype of invalidate_mapping_pages and then
> > check how many pages were skipped.
> >
> > + struct invalidate_stat {
> > +       unsigned long skipped;     // how many pages were skipped
> > +       unsigned long invalidated; // how many pages were invalidated
> > +};
> >
> > - unsigned long invalidate_mapping_pages(struct address_space *mapping,
> > +unsigned long invalidate_mapping_pages(struct address_space *mapping,
> >                                         struct invalidate_stat *stat,
> >
>
> That would involve updating each caller and the struct is
> unnecessarily heavy. Create one that returns via **nr_lruvec. For
> invalidate_mapping_pages, pass in NULL as nr_lruvec. Create a new helper
> for fadvise that accepts nr_lruvec. In the common helper, account for pages
> that are likely on an LRU and count them in nr_lruvec if !NULL. Update
> fadvise to drain only if pages were skipped that were on the lruvec. That
> should also deal with the case where holes have been punched between
> start and end.
>

Good suggestion, thanks Mel. I will send v2.

--
Thanks
Yafang
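
For readers following the thread, the shape Mel suggests can be sketched in plain userspace C. This is only a toy model, not the v2 patch and not kernel code: `struct toy_page`, `toy_invalidate_common()`, `toy_invalidate_mapping_pages()`, `toy_fadvise_dontneed()`, `toy_drain_all()` and `toy_drain_calls` are all hypothetical names, and the `in_pagevec` flag is an assumed stand-in for "skipped because a per-CPU LRU batch probably still holds the page". The only name taken from the thread is the `nr_lruvec` out-parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page-cache slot; NOT the kernel's struct page. */
struct toy_page {
	bool present;     /* false models a hole in the requested range */
	bool dirty;       /* skipped, and draining would not help */
	bool in_pagevec;  /* skipped because a per-CPU batch still holds it */
};

/* Counts how often the expensive all-CPU drain was simulated. */
static unsigned long toy_drain_calls;

/*
 * Common helper (hypothetical): invalidate pages[start..end] and return how
 * many were dropped. If nr_lruvec is non-NULL, also count the skipped pages
 * that a drain could release.
 */
static unsigned long toy_invalidate_common(struct toy_page *pages,
					   size_t start, size_t end,
					   unsigned long *nr_lruvec)
{
	unsigned long count = 0;

	assert(start <= end);
	for (size_t i = start; i <= end; i++) {
		struct toy_page *p = &pages[i];

		if (!p->present)
			continue;	/* hole: nothing to invalidate */
		if (p->in_pagevec) {
			if (nr_lruvec)
				(*nr_lruvec)++;
			continue;	/* skipped, but a drain would help */
		}
		if (p->dirty)
			continue;	/* skipped, a drain would not help */
		p->present = false;
		count++;
	}
	return count;
}

/* Existing-style entry point: callers that do not care pass NULL. */
static unsigned long toy_invalidate_mapping_pages(struct toy_page *pages,
						  size_t start, size_t end)
{
	return toy_invalidate_common(pages, start, end, NULL);
}

/* Simulated drain of every per-CPU batch (the expensive step). */
static void toy_drain_all(struct toy_page *pages, size_t start, size_t end)
{
	toy_drain_calls++;
	for (size_t i = start; i <= end; i++)
		pages[i].in_pagevec = false;
}

/* fadvise-side helper: drain and retry only if a drain can help. */
static unsigned long toy_fadvise_dontneed(struct toy_page *pages,
					  size_t start, size_t end)
{
	unsigned long nr_lruvec = 0;
	unsigned long count;

	count = toy_invalidate_common(pages, start, end, &nr_lruvec);
	if (nr_lruvec) {
		toy_drain_all(pages, start, end);
		count += toy_invalidate_mapping_pages(pages, start, end);
	}
	return count;
}
```

In this model a range that contains only holes and dirty pages never triggers the drain, because the decision depends on what was actually skipped rather than on comparing the invalidated count with (end_index - start_index + 1).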