From: Yafang Shao
Date: Fri, 29 May 2020 09:50:33 +0800
Subject: Re: mm: mkfs.ext4 invoked oom-killer on i386 - pagecache_get_page
To: Chris Down
Cc: Naresh Kamboju, Michal Hocko, Anders Roxell,
    "Linux F2FS DEV, Mailing List", linux-ext4, linux-block,
    Andrew Morton, open list, Linux-Next Mailing List, linux-mm,
    Arnd Bergmann, Andreas Dilger, Jaegeuk Kim, "Theodore Ts'o",
    Chao Yu, Hugh Dickins, Andrea Arcangeli, Matthew Wilcox, Chao Yu,
    lkft-triage@lists.linaro.org, Johannes Weiner, Roman Gushchin,
    Cgroups
In-Reply-To: <20200528164121.GA839178@chrisdown.name>
References: <20200519084535.GG32497@dhcp22.suse.cz>
    <20200520190906.GA558281@chrisdown.name>
    <20200521095515.GK6462@dhcp22.suse.cz>
    <20200521163450.GV6462@dhcp22.suse.cz>
    <20200528150310.GG27484@dhcp22.suse.cz>
    <20200528164121.GA839178@chrisdown.name>

On Fri, May 29, 2020 at 12:41 AM Chris Down wrote:
>
> Naresh Kamboju writes:
> >On Thu, 28 May 2020 at 20:33, Michal Hocko wrote:
> >>
> >> On Fri 22-05-20 02:23:09, Naresh Kamboju wrote:
> >> > My apology !
> >> > As per the test results history this problem started happening from
> >> > Bad : next-20200430 (still reproducible on next-20200519)
> >> > Good : next-20200429
> >> >
> >> > The git tree / tag used for testing is from linux next-20200430 tag and reverted
> >> > following three patches and oom-killer problem fixed.
> >> >
> >> > Revert "mm, memcg: avoid stale protection values when cgroup is above
> >> > protection"
> >> > Revert "mm, memcg: decouple e{low,min} state mutations from protection checks"
> >> > Revert "mm-memcg-decouple-elowmin-state-mutations-from-protection-checks-fix"
> >>
> >> The discussion has fragmented and I got lost TBH.
> >> In http://lkml.kernel.org/r/CA+G9fYuDWGZx50UpD+WcsDeHX9vi3hpksvBAWbMgRZadb0Pkww@mail.gmail.com
> >> you have said that none of the added tracing output has triggered. Does
> >> this still hold? Because I still have a hard time to understand how
> >> those three patches could have the observed effects.
> >
> >On the other email thread [1] this issue is concluded.
> >
> >Yafang wrote on May 22 2020,
> >
> >Regarding the root cause, my guess is it makes a similar mistake that
> >I tried to fix in the previous patch that the direct reclaimer read a
> >stale protection value. But I don't think it is worth to add another
> >fix. The best way is to revert this commit.
>
> This isn't a conclusion, just a guess (and one I think is unlikely). For this
> to reliably happen, it implies that the same race happens the same way each
> time.

Hi Chris,

If you look at this patch [1] carefully, you will find that it
introduces the same issue that I tried to fix in another patch [2].
Sadder still, these two patches are in the same patchset. Although this
issue isn't related to the issue found by Naresh, we have to ask
ourselves why we keep making the same mistake.

One possible answer is that we always forget the lifecycle of
memory.emin before we read it. memory.emin doesn't have the same
lifecycle as the memcg; it has the same lifecycle as the reclaimer.
IOW, once a reclaimer begins, the protection value should be set to 0;
after we traverse the memcg tree, we calculate a protection value for
this reclaimer; finally, it disappears after the reclaimer stops.
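
The lifecycle described above can be illustrated with a minimal
user-space sketch. This is not kernel code: memcg_model,
scan_control_model, its protection field, calc_protection() and
reclaim_pass() are all hypothetical names made up for illustration, and
the protection calculation is reduced to min(usage, min), whereas the
real calculation is hierarchical. The sketch also assumes the value is
computed as each memcg is visited during the walk. The only point it
tries to show is that the value lives in per-reclaimer state, so one
reclaimer can never read a value left behind by another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a memcg. */
struct memcg_model {
	unsigned long usage;	/* pages currently charged */
	unsigned long min;	/* configured memory.min, in pages */
};

/* Hypothetical stand-in for scan_control with a per-reclaimer field. */
struct scan_control_model {
	unsigned long protection;	/* valid only during this reclaim pass */
};

/* Assumption: effective protection is simply min(usage, min). */
static unsigned long calc_protection(const struct memcg_model *memcg)
{
	return memcg->usage < memcg->min ? memcg->usage : memcg->min;
}

/*
 * One reclaim pass over a flat array standing in for the memcg tree.
 * Returns the number of pages that are not protected.
 */
static unsigned long reclaim_pass(const struct memcg_model *memcgs, size_t n)
{
	/* The reclaimer begins: protection starts at 0. */
	struct scan_control_model sc = { .protection = 0 };
	unsigned long reclaimable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Recomputed for this reclaimer; nothing is stored in the memcg. */
		sc.protection = calc_protection(&memcgs[i]);
		assert(sc.protection <= memcgs[i].usage);
		reclaimable += memcgs[i].usage - sc.protection;
	}

	/* The reclaimer stops: sc, and the protection value with it, goes away. */
	return reclaimable;
}
```

With two groups of (usage=100, min=30) and (usage=20, min=50),
reclaim_pass() reports 70 unprotected pages: 70 from the first group
and none from the second.
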
That is why I strongly suggested adding a new protection member to
scan_control before.

[1]. https://lore.kernel.org/linux-mm/20200505084127.12923-3-laoar.shao@gmail.com/
[2]. https://lore.kernel.org/linux-mm/20200505084127.12923-2-laoar.shao@gmail.com/

--
Thanks
Yafang