From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1, USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EAB8C33CB3 for ; Thu, 16 Jan 2020 22:02:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1256520728 for ; Thu, 16 Jan 2020 22:02:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="pZIi5sE/" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1256520728 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 845E08E0092; Thu, 16 Jan 2020 17:02:03 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7FE2E8E0089; Thu, 16 Jan 2020 17:02:03 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 70C4C8E0092; Thu, 16 Jan 2020 17:02:03 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 587FD8E0089 for ; Thu, 16 Jan 2020 17:02:03 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id E26D2441A for ; Thu, 16 Jan 2020 22:02:02 +0000 (UTC) X-FDA: 76384870884.19.hour12_11d20205c192c X-HE-Tag: hour12_11d20205c192c X-Filterd-Recvd-Size: 6947 Received: from mail-pj1-f67.google.com (mail-pj1-f67.google.com [209.85.216.67]) by imf25.hostedemail.com (Postfix) with ESMTP for ; Thu, 16 Jan 2020 22:02:02 +0000 (UTC) Received: by mail-pj1-f67.google.com with SMTP id kx11so2198755pjb.4 for ; Thu, 16 Jan 2020 14:02:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=3l1bLmmeM9s291uP94O9U/aPjNKzd/SwoB7qzxFa1R0=; b=pZIi5sE/9TH17JjoL4l5dJZZrLpm7nc9kqlJkmnnzJuXTFlpU/YTZtduls0CTZLNQm E6ll5YpO8ByoE0nF7TJYBSt49NmIhu3tapebJ0GHzTHlrn11c9dfl4pg11YGNTtRTxRR g6ct6R0KdTbp94x6U595KvssLi8FgrCtpq3PPgsq7ZGsbVAGcsoE8WV0gRTaMptR7kLu Avzv6iRXqtNfuUucdsVE8Dx8aZKT07u4EELR1eoXNKnKFbnCN1FSuHa3xuv9FxuzLVhY UCe+3l+fC3JBlJNBgau4SWMziDCNmnvE5ZBSRsIY9lzxnRJqtiLV5b10eh8jS8HfF/jk VwwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=3l1bLmmeM9s291uP94O9U/aPjNKzd/SwoB7qzxFa1R0=; b=GMpWJGhPZNrP5GG2zbTGv/T2496ANZ7hPDxmh0yy6CbVnmnFPAoD48COqJhzHMJu/X BvUkW3g+GwU5G1mRHXM9QojWQ2Q8TZTQ7OIAz3ic+QuP6hfTB+Tr62mrnUNwua++6TEf 5S8s0/qqaGGzizocPIQ2VJ32Ro6LlM2Q30toeH80L6c3Tayc2cdjcgBYTelaUh5OxnBR OdYacH04TgvyKaO5ozmnO/8Xq/I0vaugVoCf5NAUrnBkgY5XzwOivB1mfYJiX/NEISW+ K0cPlzgmcSS3GAFqSzyMEqjOPfbNKD7jH6TRDRonfr1LQh0vW8FHgIDUBr7xP7mSwqx2 t39w== X-Gm-Message-State: APjAAAXPshvYeiFoU33SyGkwBDss85eejbSiKez0ISxjlsEn0+AI6VVa RxCGgOll5nAtGYa77gZMXgr6og== X-Google-Smtp-Source: APXvYqzz/Iq1AFC0H0qPvPoBp1sq2GZeTe9qAtImzX2gAXb3xw+Gl7BruUV5FjWVCYBAKEETDQCciw== X-Received: by 2002:a17:90a:238b:: with SMTP id g11mr1756480pje.128.1579212121109; Thu, 16 Jan 2020 14:02:01 -0800 (PST) Received: from [2620:15c:17:3:3a5:23a7:5e32:4598] ([2620:15c:17:3:3a5:23a7:5e32:4598]) by smtp.gmail.com with ESMTPSA id v4sm26425893pff.174.2020.01.16.14.02.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 16 Jan 2020 14:02:00 -0800 (PST) Date: Thu, 16 Jan 2020 14:01:59 -0800 (PST) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Kirill Tkhai cc: Wei Yang , hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, akpm@linux-foundation.org, kirill.shutemov@linux.intel.com, yang.shi@linux.alibaba.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, alexander.duyck@gmail.com, stable@vger.kernel.org Subject: Re: [Patch v3] mm: thp: grab the lock before manipulation defer list In-Reply-To: <0bb34c4a-97c7-0b3c-cf43-8af6cf9c4396@virtuozzo.com> Message-ID: References: <20200116013100.7679-1-richardw.yang@linux.intel.com> <0bb34c4a-97c7-0b3c-cf43-8af6cf9c4396@virtuozzo.com> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, 16 Jan 2020, Kirill Tkhai wrote: > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index c5b5f74cfd4d..6450bbe394e2 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -5360,10 +5360,12 @@ static int mem_cgroup_move_account(struct page *page, > > } > > > > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > > - if (compound && !list_empty(page_deferred_list(page))) { > > + if (compound) { > > spin_lock(&from->deferred_split_queue.split_queue_lock); > > - list_del_init(page_deferred_list(page)); > > - from->deferred_split_queue.split_queue_len--; > > + if (!list_empty(page_deferred_list(page))) { > > + list_del_init(page_deferred_list(page)); > > + from->deferred_split_queue.split_queue_len--; > > + } > > spin_unlock(&from->deferred_split_queue.split_queue_lock); > > } > > #endif > > @@ -5377,11 +5379,13 @@ static int mem_cgroup_move_account(struct page *page, > > page->mem_cgroup = to; > > > > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > > - if (compound && list_empty(page_deferred_list(page))) { > > + if (compound) { > > spin_lock(&to->deferred_split_queue.split_queue_lock); > > - list_add_tail(page_deferred_list(page), > > - &to->deferred_split_queue.split_queue); > > - to->deferred_split_queue.split_queue_len++; > > + if (list_empty(page_deferred_list(page))) { > > + list_add_tail(page_deferred_list(page), > > + &to->deferred_split_queue.split_queue); > > + to->deferred_split_queue.split_queue_len++; > > + } > > spin_unlock(&to->deferred_split_queue.split_queue_lock); > > } > > #endif > > The patch looks OK for me. But there is another question. I forget, why we unconditionally > add a page with empty deferred list to deferred_split_queue. Shouldn't we also check that > it was initially in the list? Something like: > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index d4394ae4e5be..0be0136adaa6 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -5289,6 +5289,7 @@ static int mem_cgroup_move_account(struct page *page, > struct pglist_data *pgdat; > unsigned long flags; > unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1; > + bool split = false; > int ret; > bool anon; > > @@ -5346,6 +5347,7 @@ static int mem_cgroup_move_account(struct page *page, > if (!list_empty(page_deferred_list(page))) { > list_del_init(page_deferred_list(page)); > from->deferred_split_queue.split_queue_len--; > + split = true; > } > spin_unlock(&from->deferred_split_queue.split_queue_lock); > } > @@ -5360,7 +5362,7 @@ static int mem_cgroup_move_account(struct page *page, > page->mem_cgroup = to; > > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > - if (compound) { > + if (compound && split) { > spin_lock(&to->deferred_split_queue.split_queue_lock); > if (list_empty(page_deferred_list(page))) { > list_add_tail(page_deferred_list(page), > I think that's a good point, especially considering that the current code appears to unconditionally place any compound page on the deferred split queue of the destination memcg. The correct list that it should appear on, I believe, depends on whether the pmd has been split for the process being moved: note the MC_TARGET_PAGE caveat in mem_cgroup_move_charge_pte_range() that does not move the charge for compound pages with split pmds. So when mem_cgroup_move_account() is called with compound == true, we're moving the charge of the entire compound page: why would it appear on that memcg's deferred split queue?