Date: Fri, 17 Jul 2020 12:37:57 -0700 (PDT)
From: David Rientjes
To: Chris Down
Cc: Andrew Morton, Yang Shi, Michal Hocko, Shakeel Butt, Yang Shi,
    Roman Gushchin, Greg Thelen, Johannes Weiner, Vladimir Davydov,
    cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, memcg: provide a stat to describe reclaimable memory
In-Reply-To: <20200717121750.GA367633@chrisdown.name>
References: <20200715131048.GA176092@chrisdown.name>
    <20200717121750.GA367633@chrisdown.name>

On Fri, 17 Jul 2020, Chris Down wrote:

> > With the proposed anon_reclaimable, do you have any reliability concerns?
> > This would be the amount of lazy freeable memory and memory that can be
> > uncharged if compound pages from the deferred split queue are split under
> > memory pressure. It seems to be a very precise value (as slab_reclaimable
> > already in memory.stat is), so I'm not sure why there is a reliability
> > concern. Maybe you can elaborate?
>
> Ability to reclaim a page is largely about context at the time of reclaim. For
> example, if you are running at the edge of swap, at a metric that truly
> describes "reclaimable memory" will contain vastly different numbers from one
> second to the next as cluster and page availability increases and decreases.
> We may also have to do things like look for youngness at reclaim time, so I'm
> not convinced metrics like this makes sense in the general case.

...

> Again, I'm curious why this can't be solved by artificial workingset
> pressurisation and monitoring. Generally, the most reliable reclaim metrics
> come from operating reclaim itself.
>

Perhaps this is best discussed in the context I gave in the earlier
thread: imagine a THP-backed heap of 64MB and then a malloc implementation
doing MADV_DONTNEED over all but one page in every one of these
pageblocks.

On a 4.3 kernel, for example, memory.current for the heap segment is now
(64MB / 2MB) * 4KB = 128KB because we have synchronous splitting and
uncharging of the underlying hugepage. On a 4.15 kernel, for example,
memory.current is still 64MB because the underlying hugepages are still
charged to the memcg due to deferred split queues.

For any application that monitors this, pressurization is not going to
help: the memory will be reclaimed under memcg pressure, but we aren't
facing that pressure yet. Userspace could identify this as a memory leak
unless we describe what anon memory is actually reclaimable in this
context (including on systems without swap).

For any entity that uses this information to infer whether new work can be
scheduled in this memcg (the reason MemAvailable exists in /proc/meminfo
at the system level), this is now dramatically skewed. At worst, on a
swapless system, this memory is seen from userspace as unreclaimable
because it's charged anon.

Do you have other suggestions for how userspace can understand what anon
is reclaimable in this context before encountering memory pressure? If
so, it may be a great alternative to this: I haven't been able to think of
such a way other than an anon_reclaimable stat.
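
For reference, the memory.current arithmetic in the 4.3 vs. 4.15 comparison above can be written out as a small model. This is only a sketch of the numbers in this mail, not kernel code: the function names and the deferred_split flag are hypothetical, and it assumes 2MB hugepages, 4KB base pages, and exactly one base page left in use per hugepage.

```python
# Illustrative model of the example in this mail; these names are
# hypothetical and are not kernel or cgroup interfaces.
KB = 1024
MB = 1024 * KB


def charged_bytes(heap_bytes, hugepage_bytes=2 * MB, base_page_bytes=4 * KB,
                  deferred_split=False):
    """Estimate the memcg charge for the heap after MADV_DONTNEED has been
    applied to all but one base page in every hugepage."""
    if heap_bytes % hugepage_bytes:
        raise ValueError("heap size must be a multiple of the hugepage size")
    hugepages = heap_bytes // hugepage_bytes
    if deferred_split:
        # Deferred split queue: each hugepage stays fully charged until
        # it is split under memory pressure.
        return hugepages * hugepage_bytes
    # Synchronous split: only the one remaining base page per hugepage
    # stays charged.
    return hugepages * base_page_bytes


def deferred_split_gap(heap_bytes):
    """Charge that is still accounted but could be uncharged by splitting."""
    return (charged_bytes(heap_bytes, deferred_split=True)
            - charged_bytes(heap_bytes, deferred_split=False))


heap = 64 * MB
print(charged_bytes(heap) // KB, "KB charged with synchronous split")
print(charged_bytes(heap, deferred_split=True) // MB, "MB charged with deferred split")
print(deferred_split_gap(heap) // KB, "KB charged but reclaimable by splitting")
```

The last value is the deferred-split part of what an anon_reclaimable stat would have to describe for this heap. Lazy freeable memory is left out of the model.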