Subject: Re: [RFC 0/6] Reclaim zero subpages of thp to avoid memory bloat
To: Michal Hocko
Cc: linux-mm@kvack.org, Andrew Morton, Johannes Weiner, Vladimir Davydov, Yu Zhao
References: <1635422215-99394-1-git-send-email-ningzhang@linux.alibaba.com>
From: ning zhang
Date: Sat, 30 Oct 2021 00:12:53 +0800

On 2021/10/29 9:38 PM, Michal Hocko wrote:
> On Thu 28-10-21 19:56:49, Ning Zhang wrote:
>> As we know, THP may lead to memory bloat, which may cause OOM.
>> Through testing with some apps, we found that the cause of the
>> memory bloat is that a huge page may contain some zero subpages
>> (whether accessed or not). We also found that most zero subpages
>> are concentrated in a few huge pages.
>>
>> Following is a text_classification_rnn case for tensorflow:
>>
>>   zero_subpages   huge_pages    waste
>>   [  0,   1)         186        0.00%
>>   [  1,   2)          23        0.01%
>>   [  2,   4)          36        0.02%
>>   [  4,   8)          67        0.08%
>>   [  8,  16)          80        0.23%
>>   [ 16,  32)         109        0.61%
>>   [ 32,  64)          44        0.49%
>>   [ 64, 128)          12        0.30%
>>   [128, 256)          28        1.54%
>>   [256, 513)         159       18.03%
>>
>> In this case, there are 187 huge pages (25% of the total huge pages)
>> which contain at least 128 zero subpages. These huge pages
>> account for 19.57% of the total RSS as waste. It means we can reclaim
>> 19.57% of memory by splitting the 187 huge pages and reclaiming the
>> zero subpages.
> What is the THP policy configuration in your testing? I assume you are
> using defaults right? That would be always for THP and madvise for
> defrag. Would it make more sense to use madvise mode for THP for your
> workload? The THP code is rather complex and just by looking at the
> diffstat this adds quite a lot on top. Is this really worth it?

The THP configuration is always.

Madvise mode requires users to set MADV_HUGEPAGE themselves if they
want to use huge pages, but many users don't set it, and they can't
control it well.

Take Java as an example: users can set the heap and metaspace to use
huge pages with madvise, but there is still memory bloat. Users still
need to test whether their app can accept the waste.

For the case above, if we set the THP configuration to madvise, all the
pages it uses will be 4K pages.

Memory bloat is one of the most important reasons that users disable
THP. We are doing this work to make it practical to enable THP by
default.
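
For anyone who wants to re-check the numbers in the histogram quoted
above, here is a minimal Python sketch that recomputes the summary from
the table. The bucket data is copied from the quoted report; the names
BUCKETS and summarize are made up for this illustration and are not
part of the patch set.

```python
# Histogram copied from the quoted report. Each entry is:
#   (zero-subpage range [low, high), number of huge pages, % of RSS wasted)
BUCKETS = [
    ((0, 1), 186, 0.00),
    ((1, 2), 23, 0.01),
    ((2, 4), 36, 0.02),
    ((4, 8), 67, 0.08),
    ((8, 16), 80, 0.23),
    ((16, 32), 109, 0.61),
    ((32, 64), 44, 0.49),
    ((64, 128), 12, 0.30),
    ((128, 256), 28, 1.54),
    ((256, 513), 159, 18.03),
]


def summarize(buckets, threshold):
    """Return (total huge pages, huge pages in buckets starting at or
    above `threshold` zero subpages, summed waste % of those buckets)."""
    total = sum(count for _, count, _ in buckets)
    selected = [(count, waste)
                for (low, _), count, waste in buckets
                if low >= threshold]
    pages = sum(count for count, _ in selected)
    waste = round(sum(waste for _, waste in selected), 2)
    return total, pages, waste


total, pages, waste = summarize(BUCKETS, 128)
share = round(100 * pages / total, 1)  # share of all huge pages, in percent
print(total, pages, share, waste)
```

With a threshold of 128 zero subpages this selects the last two
buckets, which is where the 187 huge pages and the 19.57% figure in the
report come from.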