Date: Wed, 26 Aug 2020 23:00:33 -0700 (PDT)
From: Hugh Dickins
To: Alex Shi
cc: Hugh Dickins, Michal Hocko, Qian Cai, akpm@linux-foundation.org,
    Johannes Weiner, Vladimir Davydov, cgroups@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    nao.horiguchi@gmail.com, osalvador@suse.de, mike.kravetz@oracle.com
Subject: Re: [Resend PATCH 1/6] mm/memcg: warning on !memcg after readahead page charged
References: <1597144232-11370-1-git-send-email-alex.shi@linux.alibaba.com>
    <20200820145850.GA4622@lca.pw> <20200821080127.GD32537@dhcp22.suse.cz>
    <20200821123934.GA4314@lca.pw> <20200821134842.GF32537@dhcp22.suse.cz>
    <20200824151013.GB3415@dhcp22.suse.cz>
    <12425e06-38ce-7ff4-28ce-b0418353fc67@linux.alibaba.com>

On Mon, 24 Aug 2020, Hugh Dickins wrote:
> On Tue, 25 Aug 2020, Alex Shi wrote:
> > reproduce using our linux-mm random bug collection on NUMA systems.
> > >>
> > >> OK, I must have missed that this was on ppc. The order makes more sense
> > >> now. I will have a look at this next week.
> > >
> > > OK, so I've had a look and I know what's going on there. The
> > > move_pages12 is migrating hugetlb pages. Those are not charged to any
> > > memcg. We have completely missed this case. There are two ways going
> > > around that.
> > > Drop the warning and update the comment so that we do not
> > > forget about that or special case hugetlb pages.
> > >
> > > I think the first option is better.
> > >
> >
> >
> > Hi Michal,
> >
> > Compare to ignore the warning which is designed to give, seems addressing
> > the hugetlb out of charge issue is a better solution, otherwise the memcg
> > memory usage is out of control on hugetlb, is that right?

I agree: it seems that hugetlb is not participating in memcg and lrus,
so it should not even be calling mem_cgroup_migrate(). That happens
because hugetlb finds the rest of migrate_page_states() useful, but
maybe there just needs to be an "if (!PageHuge(page))" or
"if (!PageHuge(newpage))" before its call to mem_cgroup_migrate() -
but I have not yet checked whether either of those actually works.
The same could be done inside mem_cgroup_migrate() instead, but it
just seems wrong for hugetlb to be getting that far, if it has no
other reason to enter mm/memcontrol.c.

>
> Please don't suppose that this is peculiar to hugetlb: I'm not
> testing hugetlb at all (sorry), but I see the VM_WARN_ON_ONCE from
> mem_cgroup_page_lruvec(), and from mem_cgroup_migrate(), and from
> mem_cgroup_swapout().
>
> In all cases seen on a PageAnon page (well, in one case PageKsm).
> And not related to THP either: seen also on machine incapable of THP.
>
> Maybe there's an independent change in 5.9-rc that's defeating
> expectations here, or maybe they were never valid. Worth
> investigating, even though the patch is currently removed,
> to find out why expectations were wrong.

It was very well worth investigating. And at the time of writing
the above, I thought it was coming up very quickly on all machines,
but in fact it only came up quickly on the one exercising KSM; on
the other machines it took about an hour to appear, so no wonder
that you and others had not already seen it.
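
A minimal userspace sketch of the "if (!PageHuge(page))" guard
suggested earlier for migrate_page_states(). Everything in it is a
hypothetical stand-in (struct mock_page, mock_page_huge(),
mock_mem_cgroup_migrate(), mock_migrate_page_states(), and the
uncharged_warnings counter), not kernel code; it only illustrates the
control flow, and which of page or newpage should be tested remains
unchecked, as said above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct mock_page {
	bool is_hugetlb;	/* stands in for PageHuge(page) */
	int memcg_id;		/* 0 means "not charged to any memcg" */
};

/* Counts how often the "uncharged page" check fired (assumption:
 * a plain counter replaces the kernel's VM_WARN_ON_ONCE). */
static int uncharged_warnings;

static bool mock_page_huge(const struct mock_page *page)
{
	return page->is_hugetlb;
}

/* Stand-in for mem_cgroup_migrate(): warn if the old page carries no
 * charge, otherwise carry the charge over to the new page. */
static void mock_mem_cgroup_migrate(struct mock_page *oldpage,
				    struct mock_page *newpage)
{
	if (oldpage->memcg_id == 0) {
		uncharged_warnings++;
		return;
	}
	newpage->memcg_id = oldpage->memcg_id;
}

/* Stand-in for the tail of migrate_page_states(), with the guard. */
static void mock_migrate_page_states(struct mock_page *newpage,
				     struct mock_page *page)
{
	assert(newpage && page);

	/* ... state copying that hugetlb does find useful is elided ... */

	/* hugetlb pages are never charged: keep them out of memcg code */
	if (!mock_page_huge(page))
		mock_mem_cgroup_migrate(page, newpage);
}
```

In this mock, migrating an uncharged hugetlb page leaves
uncharged_warnings untouched, a charged non-hugetlb page still has its
memcg_id carried over to the new page, and an uncharged non-hugetlb
page still trips the check.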

While I'd prefer to spring the answer on you all in the patch that
fixes it, there's something more there that I don't fully understand
yet, and want to sort out before posting; so I'd better not keep you
in suspense... we broke the memcg charging of ksm_might_need_to_copy()
pages a couple of releases ago, and did not notice until your warning.

What's surprising is that the same bug can affect PageAnon pages too,
even when there's been no KSM involved whatsoever. I put in the KSM
fix, set all the machines running, expecting to get more info on the
PageAnon instances, but all of them turned out to be fixed.

>
> You'll ask me for more info, stacktraces etc, and I'll say sorry,
> no time today. Please try the swapping tests I sent before.
>
> And may I say, the comment
> /* Readahead page is charged too, to see if other page uncharged */
> is nonsensical to me, and much better deleted: maybe it would make
> some sense if the reader could see the comment it replaces - as
> they can in the patch - but not in the resulting source file.

I stand by that remark; but otherwise, I think this was a helpful
commit that helped to identify a bug, just as it was intended to do.
(I say "helped to" because its warnings alerted, but did not point to
the culprit: I had to add another in lru_cache_add() to find it.)

Hugh