From: Yang Shi
Date: Mon, 11 Jan 2021 10:57:19 -0800
Subject: Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation
To: Kirill Tkhai
Cc: Roman Gushchin, Shakeel Butt, Dave Chinner, Johannes Weiner,
 Michal Hocko, Andrew Morton, Linux MM,
 Linux FS-devel Mailing List, Linux Kernel Mailing List
In-Reply-To: <35543012-882c-2e1e-f23b-d25a6fa41e67@virtuozzo.com>
References: <20210105225817.1036378-1-shy828301@gmail.com>
 <20210105225817.1036378-4-shy828301@gmail.com>
 <56d26993-1577-3747-2d89-1275d92f7a15@virtuozzo.com>
 <35543012-882c-2e1e-f23b-d25a6fa41e67@virtuozzo.com>

On Mon, Jan 11, 2021 at 9:34 AM Kirill Tkhai wrote:
>
> On 11.01.2021 20:08, Yang Shi wrote:
> > On Wed, Jan 6, 2021 at 1:55 AM Kirill Tkhai wrote:
> >>
> >> On 06.01.2021 01:58, Yang Shi wrote:
> >>> Since memcg_shrinker_map_size can only be changed while holding shrinker_rwsem
> >>> exclusively, the read side can be protected by holding the read lock, so it sounds
> >>> superfluous to have a dedicated mutex. This should not exacerbate contention
> >>> on shrinker_rwsem since just one read-side critical section is added.
> >>>
> >>> Signed-off-by: Yang Shi
> >>> ---
> >>>  mm/vmscan.c | 16 ++++++----------
> >>>  1 file changed, 6 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>> index 9db7b4d6d0ae..ddb9f972f856 100644
> >>> --- a/mm/vmscan.c
> >>> +++ b/mm/vmscan.c
> >>> @@ -187,7 +187,6 @@ static DECLARE_RWSEM(shrinker_rwsem);
> >>>  #ifdef CONFIG_MEMCG
> >>>
> >>>  static int memcg_shrinker_map_size;
> >>> -static DEFINE_MUTEX(memcg_shrinker_map_mutex);
> >>>
> >>>  static void memcg_free_shrinker_map_rcu(struct rcu_head *head)
> >>>  {
> >>> @@ -200,8 +199,6 @@ static int memcg_expand_one_shrinker_map(struct mem_cgroup *memcg,
> >>>  	struct memcg_shrinker_map *new, *old;
> >>>  	int nid;
> >>>
> >>> -	lockdep_assert_held(&memcg_shrinker_map_mutex);
> >>> -
> >>>  	for_each_node(nid) {
> >>>  		old = rcu_dereference_protected(
> >>>  			mem_cgroup_nodeinfo(memcg, nid)->shrinker_map, true);
> >>> @@ -250,7 +247,7 @@ int memcg_alloc_shrinker_maps(struct mem_cgroup *memcg)
> >>>  	if (mem_cgroup_is_root(memcg))
> >>>  		return 0;
> >>>
> >>> -	mutex_lock(&memcg_shrinker_map_mutex);
> >>> +	down_read(&shrinker_rwsem);
> >>>  	size = memcg_shrinker_map_size;
> >>>  	for_each_node(nid) {
> >>>  		map = kvzalloc(sizeof(*map) + size, GFP_KERNEL);
> >>> @@ -261,7 +258,7 @@ int memcg_alloc_shrinker_maps(struct mem_cgroup *memcg)
> >>>  		}
> >>>  		rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_map, map);
> >>
> >> Here we do a STORE operation, and since we want the assignment to be visible
> >> to shrink_slab_memcg() under down_read(), we have to use down_write()
> >> in memcg_alloc_shrinker_maps().
> >
> > I apologize for the late reply, these emails went to my SPAM again.
>
> This is the second time this problem has appeared. Just add my email address to
> your allow list, and it won't happen again.
Yes, I thought clicking "not spam" would add your email address to the allow
list automatically, but it turns out that is not the case.

> > Before this patch it was not serialized by any lock either, right? Do
> > we have to serialize it? As Johannes mentioned, if shrinker_maps has
> > not been initialized yet, it means the memcg is a newborn, so there
> > should not be a significant amount of reclaimable slab caches, and it
> > is fine to skip it. The point makes some sense to me.
> >
> > So, the read lock seems good enough.
>
> No, this is not so.
>
> Patch "[v3 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred" adds
> new assignments:
>
> +	info->map = (unsigned long *)((unsigned long)info + sizeof(*info));
> +	info->nr_deferred = (atomic_long_t *)((unsigned long)info +
> +					sizeof(*info) + m_size);
>
> info->map and info->nr_deferred are not visible under the READ lock in
> shrink_slab_memcg(), unless you use the WRITE lock in memcg_alloc_shrinker_maps().

However, map and nr_deferred are assigned before
rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, new), and
shrink_slab_memcg() checks the shrinker_info pointer. But that order might
not be guaranteed, so it seems a memory barrier before rcu_assign_pointer
should be good enough, right?

> Nowhere in your patchset do you convert the READ lock to a WRITE lock in
> memcg_alloc_shrinker_maps().
>
> So, just use the right lock in this patch from the start.

> >>
> >>>  	}
> >>> -	mutex_unlock(&memcg_shrinker_map_mutex);
> >>> +	up_read(&shrinker_rwsem);
> >>>
> >>>  	return ret;
> >>>  }
> >>> @@ -276,9 +273,8 @@ static int memcg_expand_shrinker_maps(int new_id)
> >>>  	if (size <= old_size)
> >>>  		return 0;
> >>>
> >>> -	mutex_lock(&memcg_shrinker_map_mutex);
> >>>  	if (!root_mem_cgroup)
> >>> -		goto unlock;
> >>> +		goto out;
> >>>
> >>>  	memcg = mem_cgroup_iter(NULL, NULL, NULL);
> >>>  	do {
> >>> @@ -287,13 +283,13 @@ static int memcg_expand_shrinker_maps(int new_id)
> >>>  		ret = memcg_expand_one_shrinker_map(memcg, size, old_size);
> >>>  		if (ret) {
> >>>  			mem_cgroup_iter_break(NULL, memcg);
> >>> -			goto unlock;
> >>> +			goto out;
> >>>  		}
> >>>  	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> >>> -unlock:
> >>> +out:
> >>>  	if (!ret)
> >>>  		memcg_shrinker_map_size = size;
> >>> -	mutex_unlock(&memcg_shrinker_map_mutex);
> >>> +
> >>>  	return ret;
> >>>  }
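To make the ordering argument above concrete, here is a minimal kernel-style
C sketch of the publication pattern being discussed. The struct and function
names below are simplified, hypothetical stand-ins, not the actual patch
code. The key fact is that rcu_assign_pointer() is built on
smp_store_release() and therefore has release semantics: the stores to
info->map and info->nr_deferred cannot be reordered past the publication of
the pointer, so a reader that observes a non-NULL pointer through
rcu_dereference() also observes the initialized fields, with no extra
explicit barrier needed on the writer side.

    #include <linux/atomic.h>
    #include <linux/mm.h>
    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    /* Simplified, hypothetical types -- illustration only. */
    struct shrinker_info {
    	struct rcu_head rcu;
    	atomic_long_t *nr_deferred;
    	unsigned long *map;
    };

    struct shrinker_info_node {
    	struct shrinker_info __rcu *info;
    };

    /* Writer side: initialize the fields, then publish the pointer. */
    static int publish_shrinker_info(struct shrinker_info_node *pn,
    				 int map_size, int defer_size)
    {
    	struct shrinker_info *new;

    	new = kvzalloc(sizeof(*new) + map_size + defer_size, GFP_KERNEL);
    	if (!new)
    		return -ENOMEM;

    	new->map = (unsigned long *)((unsigned long)new + sizeof(*new));
    	new->nr_deferred = (atomic_long_t *)((unsigned long)new +
    					     sizeof(*new) + map_size);
    	/*
    	 * rcu_assign_pointer() implies a release barrier: the two stores
    	 * above are ordered before the pointer becomes visible, so no
    	 * additional smp_wmb()/smp_mb() is required here.
    	 */
    	rcu_assign_pointer(pn->info, new);
    	return 0;
    }

    /* Reader side: the check that shrink_slab_memcg() effectively relies on. */
    static bool shrinker_info_ready(struct shrinker_info_node *pn)
    {
    	struct shrinker_info *info;
    	bool ready;

    	rcu_read_lock();
    	/*
    	 * rcu_dereference() pairs with the release in the writer: if we
    	 * observe a non-NULL pointer, ->map and ->nr_deferred are valid.
    	 */
    	info = rcu_dereference(pn->info);
    	ready = info != NULL;
    	rcu_read_unlock();
    	return ready;
    }

On this reading, the down_read() vs. down_write() choice in
memcg_alloc_shrinker_maps() matters for serializing against a concurrent
memcg_expand_shrinker_maps() resize, not for making the freshly allocated
fields visible to readers.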