From: Yafang Shao
Date: Mon, 3 Aug 2020 21:20:44 +0800
Subject: Re: [PATCH] mm, memcg: do full scan initially in force_empty
To: Michal Hocko
Cc: Johannes Weiner, Andrew Morton, Linux MM
References: <20200728074032.1555-1-laoar.shao@gmail.com> <20200730112620.GH18727@dhcp22.suse.cz> <20200803101226.GH5174@dhcp22.suse.cz>
In-Reply-To: <20200803101226.GH5174@dhcp22.suse.cz>

On Mon, Aug 3, 2020 at 6:12 PM Michal Hocko wrote:
>
> On Fri 31-07-20 09:50:04, Yafang Shao wrote:
> > On Thu, Jul 30, 2020 at 7:26 PM Michal Hocko wrote:
> > >
> > > On Tue 28-07-20 03:40:32, Yafang Shao wrote:
> > > > Sometimes we use memory.force_empty to drop pages in a memcg to
> > > > work around memory pressure issues. When we use force_empty, we
> > > > want the pages to be reclaimed ASAP, but force_empty reclaims
> > > > pages as a regular reclaimer: it scans the page cache LRUs
> > > > starting from DEF_PRIORITY and only drops to priority 0, the full
> > > > scan, at the end. That is a waste of time; we had better do the
> > > > full scan initially in force_empty.
> > >
> > > Do you have any numbers please?
> > >
> >
> > Unfortunately the numbers don't improve noticeably; the elapsed time
> > is directly proportional to the total number of pages to be scanned.
>
> Your changelog claims an optimization, and that should be backed by
> some numbers. It is true that reclaim at a higher priority behaves
> slightly and subtly differently, but that calls for even more detail
> in the changelog.
>

With the additional change below (nr_to_scan is changed as well), the
elapsed time of force_empty can be reduced by about 10%.

@@ -3208,6 +3211,7 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
 static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 {
        int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+       unsigned long size;

        /* we call try-to-free pages for make this cgroup empty */
        lru_add_drain_all();
@@ -3215,14 +3219,15 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
        drain_all_stock(memcg);

        /* try to free all pages in this cgroup */
-       while (nr_retries && page_counter_read(&memcg->memory)) {
+       while (nr_retries && (size = page_counter_read(&memcg->memory))) {
                int progress;

                if (signal_pending(current))
                        return -EINTR;

-               progress = try_to_free_mem_cgroup_pages(memcg, 1,
-                                                       GFP_KERNEL, true);
+               progress = try_to_free_mem_cgroup_pages(memcg, size,
+                                                       GFP_KERNEL, true,
+                                                       0);

Below are the numbers for a 16GB memcg holding only clean page cache.

Without these changes:

$ time echo 1 > /sys/fs/cgroup/memory/foo/memory.force_empty
real    0m2.247s
user    0m0.000s
sys     0m1.722s

With these changes:

$ time echo 1 > /sys/fs/cgroup/memory/foo/memory.force_empty
real    0m2.053s
user    0m0.000s
sys     0m1.529s

But I'm not sure whether we should make this improvement, because
force_empty is not a critical path.
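To make the priority discussion above concrete: the reclaim priority
scales the per-iteration scan window, which is why starting at
DEF_PRIORITY wastes passes when the goal is to empty the whole memcg.
A minimal sketch of the relationship, simplified from what
get_scan_count() in mm/vmscan.c does (the real code also accounts for
swappiness, memory protection, and rounding, all omitted here):

static unsigned long scan_window(unsigned long lru_size, int priority)
{
        /*
         * DEF_PRIORITY is 12, so the first pass scans roughly
         * lru_size >> 12 pages, and each subsequent pass halves the
         * divisor.  At priority 0 the whole LRU is scanned in a
         * single pass, which is what doing the "full scan initially"
         * in force_empty amounts to.
         */
        return lru_size >> priority;
}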
> > But then I noticed that force_empty will try to write dirty pages,
> > which is not what we expect, because this behavior may be dangerous
> > in a production environment.
>
> I do not understand your claim here. Direct reclaim doesn't write
> dirty page cache pages directly.

It will write dirty pages once sc->priority drops to a very low number:

        if (sc->priority < DEF_PRIORITY - 2)
                sc->may_writepage = 1;

> And it is even less clear why that would be
> dangerous if it did.
>

It will generate a lot of I/O, which may block other workloads.

> > What do you think about introducing a per-memcg drop_caches?
>
> I do not like the global drop_caches, and per-memcg is not very much
> different. This all shouldn't really be necessary because we do have
> means to reclaim memory in a memcg.
> --

We once hit an issue where there were many negative dentries in some
memcgs. These negative dentries were introduced by a specific workload
in those memcgs, and we wanted to drop them as soon as possible. But
unfortunately there is no good way to drop them except force_empty or
the global drop_caches. force_empty also drops the page cache pages,
which is not what we want, and the global drop_caches doesn't work
either because it drops slabs in the other memcgs as well. That is why
I want to introduce a per-memcg drop_caches.

--
Thanks
Yafang
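As a concrete illustration of the per-memcg drop_caches idea floated
above, a control-file handler could plausibly be wired up the same way
memory.force_empty is. The sketch below is entirely hypothetical: the
file name, the handler, and the reuse of try_to_free_mem_cgroup_pages()
with its upstream four-argument signature are assumptions for
illustration, and, as the thread notes, it would still lack a way to
target only slab objects such as negative dentries:

/*
 * Hypothetical sketch only: memory.drop_caches does not exist
 * upstream.  This mirrors how memory.force_empty's handler is
 * registered via cftype in mm/memcontrol.c.
 */
static ssize_t mem_cgroup_drop_caches_write(struct kernfs_open_file *of,
                                            char *buf, size_t nbytes,
                                            loff_t off)
{
        struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));

        /*
         * Reclaim with may_swap=false so anon pages are left alone.
         * Note this still targets LRU pages (dropping page cache too),
         * since there is no dedicated per-memcg entry point for
         * reclaiming only slab objects -- the gap discussed above.
         */
        try_to_free_mem_cgroup_pages(memcg,
                                     page_counter_read(&memcg->memory),
                                     GFP_KERNEL, false);
        return nbytes;
}

/*
 * Would be registered alongside the other memcg v1 files, e.g.:
 *      { .name = "drop_caches", .write = mem_cgroup_drop_caches_write },
 */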