From: Shakeel Butt
Date: Thu, 28 May 2020 14:02:24 -0700
Subject: Re: [PATCH] mm, memcg: reclaim more aggressively before high allocator throttling
To: Johannes Weiner
Cc: Chris Down, Andrew Morton, Tejun Heo, Michal Hocko, Linux MM, Cgroups, LKML, Kernel Team
In-Reply-To: <20200528202944.GA76514@cmpxchg.org>
References: <20200520143712.GA749486@chrisdown.name> <20200528194831.GA2017@chrisdown.name> <20200528202944.GA76514@cmpxchg.org>

On Thu, May 28, 2020 at 1:30 PM Johannes Weiner wrote:
>
> On Thu, May 28, 2020 at 08:48:31PM +0100, Chris Down wrote:
> > Shakeel Butt writes:
> > > What was the initial reason to have different behavior in the first place?
> >
> > This differing behaviour is simply a mistake; it was never intended to
> > deviate this much from what happens elsewhere. To that extent this patch
> > is as much a bug fix as it is an improvement.
>
> Yes, it was an oversight.
>
> > > > static void high_work_func(struct work_struct *work)
> > > > @@ -2378,16 +2384,20 @@ void mem_cgroup_handle_over_high(void)
> > > > {
> > > > 	unsigned long penalty_jiffies;
> > > > 	unsigned long pflags;
> > > > +	unsigned long nr_reclaimed;
> > > > 	unsigned int nr_pages = current->memcg_nr_pages_over_high;
> > >
> > > Is there any benefit to keep current->memcg_nr_pages_over_high after
> > > this change? Why not just use SWAP_CLUSTER_MAX?
>
> It's there for the same reason why try_to_free_pages() takes a reclaim
> argument in the first place: we want to make the thread allocating the
> most also do the most reclaim work. Consider a thread allocating THPs
> in a loop with another thread allocating regular pages.
>
> Remember that all callers loop. They could theoretically all just ask
> for SWAP_CLUSTER_MAX pages over and over again.
>
> The idea is to have fairness in most cases, and avoid allocation
> failure, premature OOM, and containment failure in the edge cases that
> are caused by the inherent raciness of page reclaim.
>

Thanks for the explanation.

> > I don't feel strongly either way, but current->memcg_nr_pages_over_high
> > can be very large for large allocations.
> >
> > That said, maybe we should just reclaim `max(SWAP_CLUSTER_MAX, current -
> > high)` for each loop? I agree that with this design it looks like
> > perhaps we don't need it any more.
> >
> > Johannes, what do you think?
>
> How about this:
>
> Reclaim memcg_nr_pages_over_high in the first iteration, then switch
> to SWAP_CLUSTER_MAX in the retries.
>
> This acknowledges that while the page allocator and memory.max reclaim
> every time an allocation is made, memory.high is currently batched and
> can have larger targets. We want the allocating thread to reclaim at
> least the batch size, but beyond that only what's necessary to prevent
> premature OOM or failing containment.
>
> Add a comment stating as much.
>
> Once we reclaim memory.high synchronously instead of batched, this
> exceptional handling is no longer needed and can be deleted again.
>
> Does that sound reasonable?

SGTM. It does not seem controversial to me to let the task do the work
to resolve the condition for which it is being throttled.
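
Just to double-check I am reading the proposal right, a rough, untested
sketch of the kind of thing you mean (the in_retry flag and the
retry_reclaim label are only illustrative names here, and I am assuming
reclaim_high() is made to return the number of reclaimed pages as in
Chris's patch):

	unsigned int nr_pages = current->memcg_nr_pages_over_high;
	unsigned long nr_reclaimed;
	bool in_retry = false;
	...
retry_reclaim:
	/*
	 * memory.high is batched, so the allocating task should reclaim
	 * at least its own batch on the first pass. On retries, only
	 * reclaim SWAP_CLUSTER_MAX: enough to keep making progress and
	 * avoid premature OOM or failing containment, without punishing
	 * this task for everyone else's allocations.
	 */
	nr_reclaimed = reclaim_high(memcg,
				    in_retry ? SWAP_CLUSTER_MAX : nr_pages,
				    GFP_KERNEL);
	...
	/*
	 * Still over high but reclaim made progress: go around again
	 * with the smaller per-iteration target rather than the full
	 * batch.
	 */
	in_retry = true;
	goto retry_reclaim;

If that matches what you have in mind, no objections from my side.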