From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0211FC433E0 for ; Tue, 12 Jan 2021 19:48:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 65DF523117 for ; Tue, 12 Jan 2021 19:48:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 65DF523117 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cmpxchg.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7951A6B00B2; Tue, 12 Jan 2021 14:48:11 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 71E716B00B4; Tue, 12 Jan 2021 14:48:11 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5E60F6B00B5; Tue, 12 Jan 2021 14:48:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0059.hostedemail.com [216.40.44.59]) by kanga.kvack.org (Postfix) with ESMTP id 453E36B00B2 for ; Tue, 12 Jan 2021 14:48:11 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 07795362E for ; Tue, 12 Jan 2021 19:48:11 +0000 (UTC) X-FDA: 77698159182.24.stamp58_010111127518 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin24.hostedemail.com (Postfix) with ESMTP id DF6761A4A5 for ; Tue, 12 Jan 2021 19:48:10 +0000 (UTC) X-HE-Tag: stamp58_010111127518 X-Filterd-Recvd-Size: 5830 Received: from mail-qv1-f46.google.com (mail-qv1-f46.google.com [209.85.219.46]) by imf08.hostedemail.com (Postfix) with ESMTP for ; Tue, 12 Jan 2021 19:48:09 +0000 (UTC) Received: by mail-qv1-f46.google.com with SMTP id h16so1464367qvu.8 for ; Tue, 12 Jan 2021 11:48:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=uREbXCwcyfpXNTx7ee7MudJLRW7UlYdjL+1cYZJdpQ8=; b=VBloEoCXCTkXhKEoW6IimofV0vg//Z0tXPoFnD6d5P94UevYTSdCMdq3LpxHSjRomX ProBumOzdn1ADj+XaPlwbnPOszyLly/YhmI79DIYyxrOnUUuUEIrgwOtFG/dBOIWi9eF nuu6nW0Kp+46ypeKukLzOpH4mqYWonf7ji/d8eKn2HKwUDNi3LvuEUx7bZ5qsL0ufKE9 kSKZSUtMg7qij3kLitiVEy4DBY9dlt1HQH5VoIJpNulRRXo3lS6k1zM7hv/mprTcH1yc /cR1A7M9TdQkuMaB97kg9iAXvPsf4HOydPsIsA0bpAvS0JcziMziwUg/xSDNVJ1CSY9l 6pMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=uREbXCwcyfpXNTx7ee7MudJLRW7UlYdjL+1cYZJdpQ8=; b=fdD2W/XwE51S6O0bI2r+2k2EC04Vhh/SAo3DRlv+wO4AnrJg0WCThaINF21H/ZZBfb bXmUw2oLAyM6qR/REPI3JfiDszv5iLIrFbeG54KRGcDnwFB4gvzQdj58EztSt9EoUMcA +cBAKP1uGU4WEaGqFW1vHVqEnPQj8AuPq+LxrZRhoFf39fve80Gipuqtoy1LDHf1X/dY zms25ey5H3RhXk1Cg2cXC6Bt53QQPS8ZL/DOv13/+QeExAljof/VDNvh/DWDG+u+tYIg mAscfyY/5OM47U69xwk1cmOobjzbLKEkTY5vS7Rqb68cf8s5gNaySF1ewjKHSapGfJw1 ricA== X-Gm-Message-State: AOAM532QsayMI1a77EppPE896JLRP0u0HFGm9wI81FOLu+7ILR0dZOcN 1vA26iJqAsYuqmXUbwjTtwqj4Q== X-Google-Smtp-Source: ABdhPJz+8gfNtuW49ZpsrzT6yhcw5t3/fcuKUGjyGq53djXPi27K1yuRkXL9Tq4gN1F/yyzwMxEOcg== X-Received: by 2002:a0c:ac43:: with SMTP id m3mr1099634qvb.37.1610480889261; Tue, 12 Jan 2021 11:48:09 -0800 (PST) Received: from localhost ([2620:10d:c091:480::1:1fb4]) by smtp.gmail.com with ESMTPSA id y9sm1709017qtm.96.2021.01.12.11.48.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Jan 2021 11:48:08 -0800 (PST) Date: Tue, 12 Jan 2021 14:45:43 -0500 From: Johannes Weiner To: Roman Gushchin Cc: Andrew Morton , Tejun Heo , Michal Hocko , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: Re: [PATCH] mm: memcontrol: prevent starvation when writing memory.high Message-ID: References: <20210112163011.127833-1-hannes@cmpxchg.org> <20210112170322.GA99586@carbon.dhcp.thefacebook.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210112170322.GA99586@carbon.dhcp.thefacebook.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jan 12, 2021 at 09:03:22AM -0800, Roman Gushchin wrote: > On Tue, Jan 12, 2021 at 11:30:11AM -0500, Johannes Weiner wrote: > > When a value is written to a cgroup's memory.high control file, the > > write() context first tries to reclaim the cgroup to size before > > putting the limit in place for the workload. Concurrent charges from > > the workload can keep such a write() looping in reclaim indefinitely. > > > > In the past, a write to memory.high would first put the limit in place > > for the workload, then do targeted reclaim until the new limit has > > been met - similar to how we do it for memory.max. This wasn't prone > > to the described starvation issue. However, this sequence could cause > > excessive latencies in the workload, when allocating threads could be > > put into long penalty sleeps on the sudden memory.high overage created > > by the write(), before that had a chance to work it off. > > > > Now that memory_high_write() performs reclaim before enforcing the new > > limit, reflect that the cgroup may well fail to converge due to > > concurrent workload activity. Bail out of the loop after a few tries. > > > > Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high") > > Cc: # 5.8+ > > Reported-by: Tejun Heo > > Signed-off-by: Johannes Weiner > > --- > > mm/memcontrol.c | 7 +++---- > > 1 file changed, 3 insertions(+), 4 deletions(-) > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index 605f671203ef..63a8d47c1cd3 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -6275,7 +6275,6 @@ static ssize_t memory_high_write(struct kernfs_open_file *of, > > > > for (;;) { > > unsigned long nr_pages = page_counter_read(&memcg->memory); > > - unsigned long reclaimed; > > > > if (nr_pages <= high) > > break; > > @@ -6289,10 +6288,10 @@ static ssize_t memory_high_write(struct kernfs_open_file *of, > > continue; > > } > > > > - reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high, > > - GFP_KERNEL, true); > > + try_to_free_mem_cgroup_pages(memcg, nr_pages - high, > > + GFP_KERNEL, true); > > > > - if (!reclaimed && !nr_retries--) > > + if (!nr_retries--) > > Shouldn't it be (!reclaimed || !nr_retries) instead? > > If reclaimed == 0, it probably doesn't make much sense to retry. We usually allow nr_retries worth of no-progress reclaim cycles to make up for intermittent reclaim failures. The difference to OOMs/memory.max is that we don't want to loop indefinitely on forward progress, but we should allow the usual number of no-progress loops.