From: Doug Anderson
Date: Tue, 2 May 2023 14:08:10 -0700
Subject: Re: [PATCH v3] migrate_pages: Avoid blocking for IO in MIGRATE_SYNC_LIGHT
To: Hillf Danton
Cc: Andrew Morton, Mel Gorman, Alexander Viro, Christian Brauner,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Matthew Wilcox,
 Yu Zhao, Johannes Weiner
References: <20230428135414.v3.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid>
 <20230429101345.2769-1-hdanton@sina.com>
In-Reply-To: <20230429101345.2769-1-hdanton@sina.com>

Hi,

On Sat, Apr 29, 2023 at 3:14 AM Hillf Danton wrote:
>
> On 28 Apr 2023 13:54:38 -0700 Douglas Anderson
> > The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
> > finish quickly but not for things that will take a long time. Exactly
> > how long is too long is not well defined, but waits of tens of
> > milliseconds is likely non-ideal.
> >
> > When putting a Chromebook under memory pressure (opening over 90 tabs
> > on a 4GB machine) it was fairly easy to see delays waiting for some
> > locks in the kcompactd code path of > 100 ms. While the laptop wasn't
> > amazingly usable in this state, it was still limping along and this
> > state isn't something artificial. Sometimes we simply end up with a
> > lot of memory pressure.
>
> Was kcompactd waken up for PAGE_ALLOC_COSTLY_ORDER?

I put some more traces in and reproduced it again. I saw something
that looked like this:

1. balance_pgdat() called wakeup_kcompactd() with order=10 and that
caused us to get all the way to the end and wakeup kcompactd (there
were previous calls to wakeup_kcompactd() that returned early).

2. kcompactd started and completed kcompactd_do_work() without blocking.

3. kcompactd called proactive_compact_node() and there blocked for
~92ms in one case, ~120ms in another case, ~131ms in another case.

> > Putting the same Chromebook under memory pressure while it was running
> > Android apps (though not stressing them) showed a much worse result
> > (NOTE: this was on a older kernel but the codepaths here are similar).
> > Android apps on ChromeOS currently run from a 128K-block,
> > zlib-compressed, loopback-mounted squashfs disk. If we get a page
> > fault from something backed by the squashfs filesystem we could end up
> > holding a folio lock while reading enough from disk to decompress 128K
> > (and then decompressing it using the somewhat slow zlib algorithms).
> > That reading goes through the ext4 subsystem (because it's a loopback
> > mount) before eventually ending up in the block subsystem. This extra
> > jaunt adds extra overhead. Without much work I could see cases where
> > we ended up blocked on a folio lock for over a second. With more
> > extreme memory pressure I could see up to 25 seconds.
>
> In the same kcompactd code path above?

It was definitely in kcompactd. I can go back and trace through this
too, if it's useful, but I suspect it's the same.

> > We considered adding a timeout in the case of MIGRATE_SYNC_LIGHT for
> > the two locks that were seen to be slow [1] and that generated much
> > discussion. After discussion, it was decided that we should avoid
> > waiting for the two locks during MIGRATE_SYNC_LIGHT if they were being
> > held for IO. We'll continue with the unbounded wait for the more full
> > SYNC modes.
> >
> > With this change, I couldn't see any slow waits on these locks with my
> > previous testcases.
>
> Well this is the upside after this change, but given the win, what is
> the lose/cost paid? For example the changes in compact fail and success [1].
>
> [1] https://lore.kernel.org/lkml/20230418191313.268131-1-hannes@cmpxchg.org/

That looks like an interesting series. Obviously it would need to be
tested, but my hunch is that ${SUBJECT} patch would work well with
that series. Specifically with Johannes's series it seems more
important for the kcompactd thread to be working fruitfully. Having it
blocked for a long time when there is other useful work it could be
doing still seems wrong. With ${SUBJECT} patch it's not that we'll
never come back and try again, but we'll just wait until a future
iteration when (hopefully) the locks are easier to acquire. In the
meantime, we're looking for other pages to migrate.

-Doug
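
For readers following the thread, the locking policy discussed above (skip a contended lock in MIGRATE_SYNC_LIGHT when it is held for IO, keep the unbounded wait in the full SYNC mode) can be sketched as a small userspace model. This is only an illustration and not the patch itself: `folio_model`, `held_for_io`, `decide_folio_lock()` and `count_deferred()` are hypothetical names invented here, and how the kernel actually detects "held for IO" is not described in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the kernel's types. */
enum migrate_mode_model { MODEL_ASYNC, MODEL_SYNC_LIGHT, MODEL_SYNC };

struct folio_model {
    bool locked;      /* another context currently holds the folio lock */
    bool held_for_io; /* assumption: the holder is waiting on disk IO */
};

enum lock_decision { LOCK_TAKEN, LOCK_SKIP, LOCK_WAIT };

/* Decide what a migration pass does when it wants the folio lock. */
static enum lock_decision decide_folio_lock(const struct folio_model *f,
                                            enum migrate_mode_model mode)
{
    if (!f->locked)
        return LOCK_TAKEN; /* a trylock would succeed right away */

    switch (mode) {
    case MODEL_ASYNC:
        return LOCK_SKIP;  /* never block */
    case MODEL_SYNC_LIGHT:
        /* Short waits are fine, waiting for IO is not. */
        return f->held_for_io ? LOCK_SKIP : LOCK_WAIT;
    case MODEL_SYNC:
        return LOCK_WAIT;  /* unbounded wait is acceptable */
    }
    assert(!"unknown migrate mode");
    return LOCK_SKIP;
}

/* Count the folios a pass would skip and revisit on a later iteration. */
static int count_deferred(const struct folio_model *folios, int n,
                          enum migrate_mode_model mode)
{
    int deferred = 0;

    for (int i = 0; i < n; i++)
        if (decide_folio_lock(&folios[i], mode) == LOCK_SKIP)
            deferred++;
    return deferred;
}
```

In this model a skipped folio is not abandoned. It is simply left for a future iteration while the pass moves on to other candidates, which matches the behaviour described in the last paragraph of the reply.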