Date: Tue, 17 Dec 2019 11:54:22 -0500
From: Johannes Weiner
To: Yafang Shao
Cc: Michal Hocko, Vladimir Davydov, Andrew Morton, Dave Chinner, Al Viro, Linux MM, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/4] memcg, inode: protect page cache from freeing inode
Message-ID: <20191217165422.GA213613@cmpxchg.org>
References: <1576582159-5198-1-git-send-email-laoar.shao@gmail.com> <20191217115603.GA10016@dhcp22.suse.cz>

CCing Dave

On Tue, Dec 17, 2019 at 08:19:08PM +0800, Yafang Shao wrote:
> On Tue, Dec 17, 2019 at 7:56 PM Michal Hocko wrote:
> > What do you mean by this exactly. Are those inodes reclaimed by the
> > regular memory reclaim or by other means? Because shrink_node does
> > exclude shrinking slab for protected memcgs.
>
> By the regular memory reclaim, kswapd, direct reclaimer or memcg reclaimer.
> IOW, the current->reclaim_state is set.
>
> Take an example for you.
>
> kswapd
>     balance_pgdat
>         shrink_node_memcgs
>             switch (mem_cgroup_protected)  <<<< memory.current = 1024M
>                                                 memory.min = 512M
>                                                 a file has 800M page caches
>                 case MEMCG_PROT_NONE:      <<<< hard limit is not reached.
>                     break;
>             shrink_lruvec
>             shrink_slab  <<< it may free the inode and then free all its
>                              page caches (800M)

This problem exists independent of cgroup protection.

The inode shrinker may take down an inode that's still holding a ton
of (potentially active) page cache pages when the inode hasn't been
referenced recently.

IMO we shouldn't be dropping data that the VM still considers hot
compared to other data, just because the inode object hasn't been used
as recently as other inode objects (e.g. drowned in a stream of
one-off inode accesses).

I've carried the below patch in my private tree for testing cache
aging decisions that the shrinker interfered with. (It would be nicer
if page cache pages could pin the inode of course, but reclaim cannot
easily participate in the inode refcounting scheme.)

Thoughts?

diff --git a/fs/inode.c b/fs/inode.c
index fef457a42882..bfcaaaf6314f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -753,7 +753,13 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
 		return LRU_ROTATE;
 	}
 
-	if (inode_has_buffers(inode) || inode->i_data.nrpages) {
+	/* Leave the pages to page reclaim */
+	if (inode->i_data.nrpages) {
+		spin_unlock(&inode->i_lock);
+		return LRU_ROTATE;
+	}
+
+	if (inode_has_buffers(inode)) {
 		__iget(inode);
 		spin_unlock(&inode->i_lock);
 		spin_unlock(lru_lock);
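
For anyone who wants to poke at the behavioural difference without
building a kernel, here is a rough user-space model of just the check
the patch touches. It is only a sketch: everything named toy_* is made
up for illustration, locking and refcounting are left out entirely, and
the three verdicts are a simplification rather than the kernel's
enum lru_status. The 204800-page figure assumes 4 KiB pages, to match
the 800M file in the example quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two inode properties the check reads. */
struct toy_inode {
	unsigned long nrpages;	/* models inode->i_data.nrpages */
	bool has_buffers;	/* models inode_has_buffers(inode) */
};

/* Simplified outcomes for this model only. */
enum toy_verdict {
	TOY_ROTATE,	/* leave the inode alone, move it along the LRU */
	TOY_STRIP,	/* shrinker tears down what is attached to the inode */
	TOY_FREE,	/* nothing attached, inode itself can go */
};

/* The check as it reads before the patch. */
static enum toy_verdict toy_isolate_old(const struct toy_inode *inode)
{
	if (inode->has_buffers || inode->nrpages)
		return TOY_STRIP;
	return TOY_FREE;
}

/* The check as it reads with the patch applied. */
static enum toy_verdict toy_isolate_new(const struct toy_inode *inode)
{
	/* Leave the pages to page reclaim */
	if (inode->nrpages)
		return TOY_ROTATE;
	if (inode->has_buffers)
		return TOY_STRIP;
	return TOY_FREE;
}

/* Walk the three interesting cases; aborts if the model disagrees. */
static void toy_selfcheck(void)
{
	struct toy_inode cached  = { .nrpages = 204800, .has_buffers = false };
	struct toy_inode buffers = { .nrpages = 0, .has_buffers = true };
	struct toy_inode empty   = { .nrpages = 0, .has_buffers = false };

	/* Only the inode that still holds page cache changes verdict. */
	assert(toy_isolate_old(&cached) == TOY_STRIP);
	assert(toy_isolate_new(&cached) == TOY_ROTATE);

	assert(toy_isolate_old(&buffers) == TOY_STRIP);
	assert(toy_isolate_new(&buffers) == TOY_STRIP);

	assert(toy_isolate_old(&empty) == TOY_FREE);
	assert(toy_isolate_new(&empty) == TOY_FREE);
}
```

Calling toy_selfcheck() from a small main() is enough to run it. The
point of the model is that only the "still has page cache" case changes:
such an inode is rotated instead of being stripped, so the decision on
its pages stays with page reclaim.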