From: Yang Shi
Date: Mon, 27 Sep 2021 10:28:10 -0700
Subject: Re: [BUG] The usage of memory cgroup is not consistent with processes when using THP
To: 台运方
Cc: Johannes Weiner, Hugh Dickins, Tejun Heo, vdavydov@parallels.com, Cgroups, Linux MM

On Sun, Sep 26, 2021 at 12:35 AM 台运方 wrote:
>
> Hi folks，
> We found that the usage counter of containers with memory cgroup v1 is
> not consistent with the memory usage of processes when using THP.
>
> It is introduced in upstream 0a31bc97c80 patch and still exists in
> Linux 5.14.5.
> The root cause is that mem_cgroup_uncharge is moved to the final
> put_page(). When freeing parts of huge pages in THP, the memory usage
> of process is updated when pte unmapped and the usage counter of
> memory cgroup is updated when splitting huge pages in
> deferred_split_scan. This causes the inconsistencies and we could find
> more than 30GB memory difference in our daily usage.

IMHO this is not a bug. The disparity reflects the difference in how
the page life cycle is viewed by the process and by the cgroup. The
usage of a process comes from the rss_counter of its mm. It tracks the
per-process mapped memory, so it is updated as soon as the page is
zapped.

But from the cgroup's point of view, the page is charged when it is
allocated and uncharged when it is freed. The page may be zapped by one
process, but other users may still pin the page and prevent it from
being freed. The pin may be very transient or may be indefinite. THP is
one of the pins. It is gone when the THP is split, but the split may
happen a long time after the page is zapped due to deferred split.

>
> It is reproduced with the following program and script.
> The program named "eat_memory_release" allocates every 8 MB memory and
> releases the last 1 MB memory using madvise.
> The script "test_thp.sh" creates a memory cgroup, runs
> "eat_memory_release 500" in it and loops the proceed by 10 times. The
> output shows the changing of memory, which should be about 500M memory
> less in theory.
> The outputs are varying randomly when using THP, while adding
> "echo 2 > /proc/sys/vm/drop_caches" before accounting can avoid this.
>
> Are there any patches to fix it or is it normal by design?
>
> Thanks,
> Yunfang Tai
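
A toy model of the two counters may make the accounting difference
concrete. It is only a sketch, not kernel code, and it rests on several
assumptions that are not in the report: huge pages are 2 MB, every 8 MB
chunk is huge-page aligned and fully backed by THP, and whole huge
pages inside the released range are freed immediately. The names
simulate, THP_SIZE and split_done are hypothetical.

```python
MB = 1 << 20
THP_SIZE = 2 * MB  # assumption: 2 MB huge pages, as on x86-64


def simulate(n_chunks, chunk=8 * MB, released=1 * MB, split_done=False):
    """Toy model: return (process_rss, cgroup_charge) in bytes."""
    if not 0 <= released <= chunk:
        raise ValueError("released must be between 0 and chunk")
    # Tail bytes that cover only part of a huge page; whole huge pages in
    # the released range are assumed to be freed immediately.
    partial = released % THP_SIZE
    rss = charge = 0
    for _ in range(n_chunks):
        rss += chunk       # pages faulted in: mapped ...
        charge += chunk    # ... and charged to the cgroup
        rss -= released    # PTEs zapped: rss_counter drops at once
        charge -= released - partial  # fully released huge pages are freed
        if split_done:
            charge -= partial  # deferred split frees the unmapped subpages
    return rss, charge


if __name__ == "__main__":
    for done in (False, True):
        rss, charge = simulate(4, split_done=done)
        print(f"split_done={done}: rss={rss // MB} MB, charge={charge // MB} MB")
```

With four chunks the model reports an rss of 28 MB against a charge of
32 MB while the split is still pending, and 28 MB for both once the
split has run. The gap is the partially released huge page in each
chunk, which stays charged until it is split.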