From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE46CC7618D for ; Thu, 6 Apr 2023 20:02:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 355446B0071; Thu, 6 Apr 2023 16:02:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 304C66B0074; Thu, 6 Apr 2023 16:02:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A7556B0075; Thu, 6 Apr 2023 16:02:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 087C36B0071 for ; Thu, 6 Apr 2023 16:02:18 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id B391C1A12D5 for ; Thu, 6 Apr 2023 20:02:17 +0000 (UTC) X-FDA: 80652037914.05.02A7E12 Received: from mail-yb1-f176.google.com (mail-yb1-f176.google.com [209.85.219.176]) by imf10.hostedemail.com (Postfix) with ESMTP id 811F8C001F for ; Thu, 6 Apr 2023 20:02:15 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=cmpxchg-org.20210112.gappssmtp.com header.s=20210112 header.b=LRWGk470; spf=pass (imf10.hostedemail.com: domain of hannes@cmpxchg.org designates 209.85.219.176 as permitted sender) smtp.mailfrom=hannes@cmpxchg.org; dmarc=pass (policy=none) header.from=cmpxchg.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680811335; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=IWRsZ2R0gmFY3LChOkHfgbf4Qvq3P9/fH9lrLjafpto=; b=Q+Hwfjv7c7MNJ0iwOiKoQGcz0xZwooiiq+B9LS/8OEuouQq/utnaBtULqVB2T92p+HlWTo PiU/E8inyNgkT04ZhvQpIzUzvgqLb6Lx6GctlOZJcMPneTnmBOCJwrfJ0Y9wgpB1eIq0Nr Fl3yuLeutOOpnJtIzhhvH16mi9j42Og= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=cmpxchg-org.20210112.gappssmtp.com header.s=20210112 header.b=LRWGk470; spf=pass (imf10.hostedemail.com: domain of hannes@cmpxchg.org designates 209.85.219.176 as permitted sender) smtp.mailfrom=hannes@cmpxchg.org; dmarc=pass (policy=none) header.from=cmpxchg.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680811335; a=rsa-sha256; cv=none; b=MoflXODu4j1pm3k7/msrSsVTt+9YC6E64VRm+xKb9LXE/c8gVwF1ESj/jUmrvNQJAGnsfc k2RWRUgaFNI8/kyqsTm6AoQo5NGCrAeG5AtoZCwo97NByIBNHYuLld/X3hc1M9Cxz87/0+ unTHtu3m1nFYM0xkYd3HNTDXroXw2oE= Received: by mail-yb1-f176.google.com with SMTP id p15so47362896ybl.9 for ; Thu, 06 Apr 2023 13:02:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20210112.gappssmtp.com; s=20210112; t=1680811334; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=IWRsZ2R0gmFY3LChOkHfgbf4Qvq3P9/fH9lrLjafpto=; b=LRWGk470kfpNXd38RtVM9BvlKwlboaqS0YzuDz+unzuhb1VgqEfXQDgStpvVWVCzgX VH16NMWxrxFTvAfw3FVoQ0rOhQSc5tmKdo/EOd2squdilqYlNBi+rlhKgfvkpU/VYOyn 0ORjZMVrSDULajTv0yUfw2Iw75priMiuroLB1ZGBsRYwH3UcLlGZmybfBSFjhFsJKR7t A6bmiwV1v3/CcTEnrZ5FPmdt33zoF8C8dUQnYqBTjAQhB8Ulkbko8scqpUt5qDIfzCYM 1zd2++jLRIDtzOinnBgeT78PuwJraxIJ57yWc2GpzccQu71smzFoBvjzN+YVtvZ24kB+ slXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680811334; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=IWRsZ2R0gmFY3LChOkHfgbf4Qvq3P9/fH9lrLjafpto=; b=NWpEMv+ToXBakyVh5GPjmieeUuYVh6Y1aE1EfX/uvXgDMjG7WZEsIGLxEB81fT4o5T OkcJaErv7+A7iyx0q8QLXgoqiFE6lGnU1jn+Umo/onOs4/UeFaj5Z1dgS+G08A6KMkIP B9jI7g0Ic2CD1mGG/u9uK9bLENHDnWMS9OS0+0fYDUKhaFqNdTVb6artTTDi79Elec3l CA/kn23pCUHQ0FF8BqGEGdZDhorROi0aJZ9VelatJstdEYI8l2Gc8Ttzo9ECGDiVRWbg 2NvgnnqHJEcsRoJGJi4Ty6LGYG5YJXvLo+IKvHS6G/QGwc3oqKkgXoLOQhGceLvyRDaU g4bw== X-Gm-Message-State: AAQBX9c+TdhF/kX9/ByOYLUKoiMknVnkZawImBevZ7lH8hXgEnNoxWqH RlSL0Ja4yJjojJeYIGMyH1h23g== X-Google-Smtp-Source: AKy350b9wAurodpwuCRJ9Fk3uJ1oRAYP1n5ZbGSkYWbHlgU5KLi61/Dy8/2GyUpu2dz0P1F1Eu6hDg== X-Received: by 2002:a25:6802:0:b0:934:83b7:3f36 with SMTP id d2-20020a256802000000b0093483b73f36mr448758ybc.9.1680811334504; Thu, 06 Apr 2023 13:02:14 -0700 (PDT) Received: from localhost (2603-7000-0c01-2716-8f57-5681-ccd3-4a2e.res6.spectrum.com. [2603:7000:c01:2716:8f57:5681:ccd3:4a2e]) by smtp.gmail.com with ESMTPSA id i4-20020a05620a248400b007469c93ac2dsm722424qkn.31.2023.04.06.13.02.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Apr 2023 13:02:14 -0700 (PDT) Date: Thu, 6 Apr 2023 16:02:13 -0400 From: Johannes Weiner To: Yu Zhao Cc: Yosry Ahmed , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Shakeel Butt , Michal Hocko Subject: Re: global_reclaim() and code documentation (was: Re: [PATCH v3 3/3] mm: vmscan: ignore non-LRU-based reclaim in memcg reclaim Message-ID: <20230406200213.GA50192@cmpxchg.org> References: <20230405200150.GA35884@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: dkwh8b99n6x1qettob1ftbnh31zphnba X-Rspamd-Queue-Id: 811F8C001F X-HE-Tag: 1680811335-371723 X-HE-Meta: U2FsdGVkX1/GgqiF9l2tg4oiRz9oKIu3nQ3Pm5V75PPKjY/pgpojoSj/nhE5bp6opE3/FE2KKxpxPcK87W1QW42lPjswkKG2774hDv3Dgc7cSdIw4xQIaczShXCLXU4iZYKrTRaXrfC8maGV0L656HAqJiyb9EXiLwTpQbeaxnwpVdkLRPRyExe2UWJmXLg6lOIPulI3SANK5BP/POpRRgPwhTl9gS1KiTWWGK93AHNiO9cW6pZg/dehAhqo0//TrPhY90EDpaKTemh5HrA5ugVdHkMXWZuR2XX3ZNI95Yvw6ivQB91VNpn7fpn5hQjQQ8lp8mozeIU8oFDGd5DptS9l/yUaALekt+NYdTS+dvMTOXo4DCnyTaZzhmXQvZngV7IpR+Zroupg0yV45N6Q4JqI4OiqNZ3WxgvNGL5XTWnuUEcHSePJCC4cgYHisV7T3T12Yz7EvS5bRiBFW4cclbActRRdmXFUOD7m7HDFzofR8gel/IW3SKT2DKjm/D0R+Uh7j+9JShqLTeMwM2OhmZIiq7jq5mmJF0z06d4kgh0eOz/xBvB8E1yUOw19PPEcsB93DRObOEWDk5Dve0J3ePOOyxxgVLO5JQbGT+XsLX5dggAfZA3BCUPGIlk8lG17XDbBTm1kf2I6IJkdurCmyZpuL4CsPDRdvtqTNpGyagBraeosnwsOdubNtAyxP2dy46NoOyAZ8Q2I660SXgYEibruAJYwlKaRW00nkkxTqrQSw+j3/JxyUkmVG8iy/qkGzzEU2mDG+XdHwn01umvkh766ELEzQna9WR4WKaO2B11gakBExaErGM+cIQad1ygJ0jyEyY02ZsS/+AeS8rkxNq7gfrFm+k3wbKRDgYghuTsrSVoRBmxiPAPGgLKaMcn9GZPHWAAtCLJ9K0NJuSaoij2eE7MG8Fakx7mObl1W/zSp/gQxiM3C9SL41OaKKP6jwFZL+Jz3PlN9Ztl9fq2 cDn0qKej PVGuplLEueOcN3bArYxrsnLJyXWdoVGTH6o6dHM8yiq4Ww8FRCw5s39bYnHTVc5l4dBWHwhAulC+hybkdLxE7PbEs09lQsnGLWhQA/3FXmmC2f6LOwe3yKLff0kzm0BvIJ8VARL3iC4w4K0a4qsOG38dEgGO3LeU/D/GATg/DogBN7t0h2U4IBre5Pe4kdNHfUp0dZRvQsX3NSu+JqK3QyZKc6AieHVDxdknbthkuZ8kZyQEXbk1lX6574K7nvM0SHAedgDnPXkCGcggLuIEfxZaoITjcqXBqybnUX4gJ4gc3FTU2UWCmWe+Lubcm2Tmdx88yH4R7tmJvHha2t81Lr8nxhnj6vdWIvzDaPDoux+UalsjF8TAb+TOCWbvA7D30XYMavGFXHdnY1hDJk1JE44FufLSIeyddEE7PMrk5NDpXCwE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Apr 05, 2023 at 03:09:27PM -0600, Yu Zhao wrote: > On Wed, Apr 5, 2023 at 2:01 PM Johannes Weiner wrote: > > static bool cgroup_reclaim(struct scan_control *sc) > > { > > return sc->target_mem_cgroup; > > } > > > > static bool global_reclaim(struct scan_control *sc) > > { > > return !sc->target_mem_cgroup || mem_cgroup_is_root(sc->target_mem_cgroup); > > } > > > > The name suggests it's the same thing twice, with opposite > > polarity. But of course they're subtly different, and not documented. > > > > When do you use which? > > The problem I saw is that target_mem_cgroup is set when writing to the > root memory.reclaim. And for this case, a few places might prefer > global_reclaim(), e.g., in shrink_lruvec(), in addition to where it's > being used. > > If this makes sense, we could 1) document it (or rename it) and apply > it to those places, or 2) just unset target_mem_cgroup for root and > delete global_reclaim(). Option 2 might break ABI but still be > acceptable. Ah, cgroup_reclaim() tests whether it's limit/proactive reclaim or allocator reclaim. global_reclaim() tests whether it's root reclaim (which could be from either after memory.reclaim). I suppose we didn't clarify when introducing memory.reclaim what the semantics should be on the root cgroup: - We currently exclude PGSCAN and PGSTEAL stats from /proc/vmstat for limit reclaim to tell cgroup constraints from physical pressure. We currently exclude root memory.reclaim activity as well. Seems okay. - The file_is_tiny heuristic is for allocator reclaim near OOM. It's currently excluded for root memory.reclaim, which seems okay too. - Like limit reclaim, root memory.reclaim currently NEVER swaps when global swappiness is 0. The whole cgroup-specific swappiness=0 semantic is kind of quirky. But I suppose we can keep it as-is. - Proportional reclaim is disabled for everybody but direct reclaim from the allocator at initial priority. Effectively disabled for all situations where it might matter, including root memory.reclaim. We should also keep this as-is. - Writeback throttling is interesting. Historically, kswapd set the congestion state when running into lots of PG_reclaim pages, and clear it when the node is balanced. This throttles direct reclaim. Cgroup limit reclaim would set and clear congestion on non-root only to do local limit-reclaim throttling. But now root memory.reclaim might clear a bit set by kswapd before the node is balanced, and release direct reclaimers from throttling prematurely. This seems wrong. However, the alternative is throttling memory.reclaim on subgroup congestion but not root, which seems also wrong. - Root memory.reclaim is exempted from the compaction_ready() bail condition as well as soft limit reclaim. But they'd both be no-ops anyway if we changed cgroup_reclaim() semantics. IMO we should either figure out what we want to do in those cases above, at least for writeback throttling. Are you guys using root-level proactive reclaim? > If we don't want to decide right now, I can rename global_reclaim() to > root_reclaim() and move it within #ifdef CONFIG_LRU_GEN and probably > come back and revisit later. So conventional vmscan treats root-level memory.reclaim the same as any other cgroup reclaim. And the cgroup_reclaim() checks are mostly reserved for (questionable) allocator reclaim-specific heuristics. Is there a chance lrugen could do the same, and you'd be fine with using cgroup_reclaim()? Or is it a data structure problem? If so, I agree it could be better to move it to CONFIG_LRU_GEN and rename it for the time being. root_reclaim() SGTM.