From: Liu Shixin
To: Yosry Ahmed, Michal Hocko
CC: Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton
Subject: Re: [PATCH v2] mm: vmscan: reclaim anon pages if there are swapcache pages
Date: Thu, 24 Aug 2023 11:39:10 +0800
Message-ID: <14e15f31-f3d3-4169-8ed9-fb36e57cf578@huawei.com>
References: <20230822024901.2412520-1-liushixin2@huawei.com> <50c49baf-d04a-f1e3-0d0e-7bb8e22c3889@huawei.com>

On 2023/8/23 23:29, Yosry Ahmed wrote:
> On Wed, Aug 23, 2023 at 6:12 AM Michal Hocko wrote:
>> On Wed 23-08-23 10:00:58, Liu Shixin wrote:
>>>
>>> On 2023/8/23 0:35, Yosry Ahmed wrote:
>>>> On Mon, Aug 21, 2023 at 6:54 PM Liu Shixin wrote:
>>>>> When the space of swap devices is exhausted, only file pages can be reclaimed.
>>>>> But there are still some swapcache pages on the anon LRU list. This can lead
>>>>> to a premature out-of-memory.
>>>>>
>>>>> This problem can be fixed by checking the number of swapcache pages in
>>>>> can_reclaim_anon_pages(). For memcg v2, there is a swapcache stat that can
>>>>> be used directly. For memcg v1, use total_swapcache_pages() instead, which
>>>>> may not be accurate but can solve the problem.
>>>> Interesting find. I wonder if we really don't have any handling of
>>>> this situation.
>>> I have already tested this and can confirm that it is a real problem.
>>> With 9MB swap space and a 10MB mem_cgroup limit, when allocating 15MB of
>>> memory, there is a probability that OOM occurs.
>> Could you be more specific about the test and the oom report?
> I actually couldn't reproduce it using 9MB of zram and a cgroup with a
> 10MB limit trying to allocate 15MB of tmpfs, no matter how many
> repetitions I do.

The following is the output of the testcase I used. The probability of
triggering this problem is actually very low. You can adjust the size of
the swap space to make it more likely to reproduce.

10148+0 records in
10148+0 records out
10391552 bytes (10 MB, 9.9 MiB) copied, 0.0390954 s, 266 MB/s
mkswap: /home/swapfile: insecure permissions 0644, 0600 suggested.
Setting up swapspace version 1, size = 9.9 MiB (10387456 bytes)
no label, UUID=9219cb2a-55d7-46b6-9dcd-bb491095225d
mkswap return is 0
swapon: /home/swapfile: insecure permissions 0644, 0600 suggested.
swapon return is 0
swapoff success
10148+0 records in
10148+0 records out
10391552 bytes (10 MB, 9.9 MiB) copied, 0.0389205 s, 267 MB/s
mkswap: /home/swapfile: insecure permissions 0644, 0600 suggested.
Setting up swapspace version 1, size = 9.9 MiB (10387456 bytes)
no label, UUID=614b967a-bd87-430d-b867-6e09a8b77b27
mkswap return is 0
swapon: /home/swapfile: insecure permissions 0644, 0600 suggested.
swapon return is 0
---- in do_test---
=========orig_mem is 10428416, orig_sw_mem is 17059840
SwapCached: 3428 kB
SwapTotal: 10144 kB
SwapFree: 240 kB
rss is 7572, swap is 0
check pass
memcg_swap.sh: line 79: 6596 Killed cgexec -g "memory:test" ./malloc 16777216
swapoff success
10148+0 records in
10148+0 records out
10391552 bytes (10 MB, 9.9 MiB) copied, 0.0404156 s, 257 MB/s
mkswap: /home/swapfile: insecure permissions 0644, 0600 suggested.
Setting up swapspace version 1, size = 9.9 MiB (10387456 bytes)
no label, UUID=a228e988-47c1-44d5-9ce1-cd7b66e97721
mkswap return is 0
swapon: /home/swapfile: insecure permissions 0644, 0600 suggested.
swapon return is 0
---- in do_test---
=========orig_mem is 10485760, orig_sw_mem is 16834560
SwapCached: 3944 kB
SwapTotal: 10144 kB
SwapFree: 0 kB
rss is 7112, swap is 0
check fail
memcg_swap.sh: line 79: 6633 Killed cgexec -g "memory:test" ./malloc 16777216

This is my testcase:

memcg_swap.sh:
#!/bin/bash

_mkswap()
{
    size=${1:-10}
    swapfile="/home/swapfile"

    # clear swap
    swapoff -a
    expect_size=$(free -b | grep 'Swap' | awk '{print $2}')
    if [ $expect_size -ne 0 ]; then
        echo "$expect_size"
        echo "swapoff fail"
        return 1
    fi
    echo "swapoff success"

    rm -rf $swapfile
    dd if=/dev/zero of=$swapfile bs=1k count=10148
    mkswap $swapfile
    echo "mkswap return is $?"
    swapon $swapfile
    echo "swapon return is $?"
}

echo "----in do_pre----"
cgdelete -r "memory:test"
cgcreate -g "memory:test"
_mkswap 10

while true
do
    _mkswap 10
    echo "---- in do_test---"
    cgcreate -g "memory:test"
    echo 1 > /sys/fs/cgroup/memory/test/memory.oom_control
    echo 10M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    cgexec -g "memory:test" ./malloc 16777216 &
    pid=$!
    sleep 3
    orig_mem=$(cat /sys/fs/cgroup/memory/test/memory.usage_in_bytes)
    orig_sw_mem=$(cat /sys/fs/cgroup/memory/test/memory.memsw.usage_in_bytes)
    echo "=========orig_mem is $orig_mem, orig_sw_mem is $orig_sw_mem"
    cat /proc/meminfo | grep Swap
    swapfree=$(cat /proc/meminfo | grep SwapFree | awk '{print $2}')
    swapcache=$(cat /proc/meminfo | grep SwapCached | awk '{print $2}')
    # record the swapcache size whenever swap is completely full
    if [ $swapfree -eq 0 ]; then
        echo "==========" >> /root/free.txt
        echo "swapfree is $swapfree" >> /root/free.txt
        echo "swapcache is $swapcache" >> /root/free.txt
        echo "==========" >> /root/free.txt
    fi
    # Rss and Swap are picked by line number; the positions depend on the
    # kernel's smaps_rollup layout
    rss=$(cat /proc/$pid/smaps_rollup | sed -n 2p | awk '{print $2}')
    swap=$(cat /proc/$pid/smaps_rollup | sed -n 19p | awk '{print $2}')
    echo "rss is $rss, swap is $swap"
    echo "test data==================" >> /root/data.txt
    echo "rss is $rss" >> /root/data.txt
    echo "swap is $swap" >> /root/data.txt
    echo "test end===================" >> /root/data.txt
    # pass while usage stays within the 10M limit and the task is still running
    if [ $orig_mem -le 10485760 -a \
         "$(cat /proc/$pid/status | grep State | awk '{print $2}')" == "R" ]; then
        echo "check pass"
    else
        echo "check fail"
        kill -9 $pid
        cgdelete -r "memory:test"
        break
    fi
    kill -9 $pid
    cgdelete -r "memory:test"
done

malloc.c:
#include <stdlib.h>  /* malloc, atoi */
#include <string.h>  /* memset */

int main(int argc, char **argv)
{
    char *p;

    /* allocate argv[1] bytes and touch every page */
    p = malloc(atoi(argv[1]));
    memset(p, 1, atoi(argv[1]));
    return 0;
}

>
>> --
>> Michal Hocko
>> SUSE Labs
> .
>
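
The state seen in the failing run (SwapFree at 0 while SwapCached is still
non-zero) can also be checked by field name instead of by reading the grep
output. A minimal sketch, assuming /proc/meminfo-style text on stdin; the
function name swap_exhausted_with_cache is hypothetical and not part of the
testcase above:

```shell
#!/bin/bash

# Hypothetical helper (not part of memcg_swap.sh): reads /proc/meminfo-style
# text on stdin and returns 0 only when SwapFree is 0 and SwapCached is
# greater than 0, i.e. swap is full but swapcache pages still exist.
swap_exhausted_with_cache() {
    awk '
        $1 == "SwapFree:"   { free = $2; have_free = 1 }
        $1 == "SwapCached:" { cached = $2 }
        END { exit !(have_free && free == 0 && cached > 0) }
    '
}
```

It could be called as "swap_exhausted_with_cache < /proc/meminfo && echo hit"
in place of the "[ $swapfree -eq 0 ]" test, so that only runs with leftover
swapcache are logged.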