From: Doug Anderson
Date: Tue, 18 Apr 2023 09:19:21 -0700
Subject: Re: [RFC PATCH] migrate_pages: Never block waiting for the page lock
To: "Huang, Ying"
Cc: Andrew Morton, Yu Zhao, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Vlastimil Babka, Mel Gorman
In-Reply-To: <87edohvpzk.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20230413182313.RFC.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid> <87v8hz17o9.fsf@yhuang6-desk2.ccr.corp.intel.com> <87ildvwbr5.fsf@yhuang6-desk2.ccr.corp.intel.com> <87edohvpzk.fsf@yhuang6-desk2.ccr.corp.intel.com>

Hi,

On Mon, Apr 17, 2023 at 8:18 PM Huang, Ying wrote:
>
> Doug Anderson writes:
>
> > Hi,
> >
> > On Sun, Apr 16, 2023 at 6:15 PM Huang, Ying wrote:
> >>
> >> Doug Anderson writes:
> >>
> >> > Hi,
> >> >
> >> > On Thu, Apr 13, 2023 at 8:10 PM Huang, Ying wrote:
> >> >>
> >> >> Douglas Anderson writes:
> >> >>
> >> >> > Currently when we try to do page
> >> >> > migration and we're in "synchronous"
> >> >> > mode (and not doing direct compaction) then we'll wait an infinite
> >> >> > amount of time for a page lock. This does not appear to be a great
> >> >> > idea.
> >> >> >
> >> >> > One issue can be seen when I put a device under extreme memory
> >> >> > pressure. I took a sc7180-trogdor Chromebook (4GB RAM, 8GB zram
> >> >> > swap). I ran the browser along with Android (which runs from a
> >> >> > loopback mounted 128K block-size squashfs "disk"). I then manually ran
> >> >> > the mmm_donut memory pressure tool [1]. The system is completely
> >> >> > unusable both with and without this patch since there are 8 processes
> >> >> > completely thrashing memory, but it was still interesting to look at
> >> >> > how migration was behaving. I put some timing code in and I could see
> >> >> > that we sometimes waited over 25 seconds (in the context of
> >> >> > kcompactd0) for a page lock to become available. Although the 25
> >> >> > seconds was the high mark, it was easy to see tens, hundreds, or
> >> >> > thousands of milliseconds spent waiting on the lock.
> >> >> >
> >> >> > Instead of waiting, if I bailed out right away (as this patch does), I
> >> >> > could see kcompactd0 move forward to successfully to migrate other
> >> >> > pages instead. This seems like a better use of kcompactd's time.
> >> >> >
> >> >> > Thus, even though this didn't make the system any more usable in my
> >> >> > absurd test case, it still seemed to make migration behave better and
> >> >> > that feels like a win. It also makes the code simpler since we have
> >> >> > one fewer special case.
> >> >>
> >> >> TBH, the test case is too extreme for me.
> >> >
> >> > That's fair. That being said, I guess the point I was trying to make
> >> > is that waiting for this lock could take an unbounded amount of time.
> >> > Other parts of the system sometimes hold a page lock and then do a
> >> > blocking operation.
> >> > At least in the case of kcompactd there are better
> >> > uses of its time than waiting for any given page.
> >> >
> >> >> And, we have multiple "sync" mode to deal with latency requirement, for
> >> >> example, we use MIGRATE_SYNC_LIGHT for compaction to avoid too long
> >> >> latency. If you have latency requirement for some users, you may
> >> >> consider to add new "sync" mode.
> >> >
> >> > Sure. kcompactd_do_work() is currently using MIGRATE_SYNC_LIGHT. I
> >> > guess my first thought would be to avoid adding a new mode and make
> >> > MIGRATE_SYNC_LIGHT not block here. Then anyone that truly needs to
> >> > wait for all the pages to be migrated can use the heavier sync modes.
> >> > It seems to me like the current users of MIGRATE_SYNC_LIGHT would not
> >> > want to block for an unbounded amount of time here. What do you think?
> >>
> >> It appears that you can just use MIGRATE_ASYNC if you think the correct
> >> behavior is "NOT block at all". I found that there are more
> >> fine-grained controls on this in compaction code, please take a look at
> >> "enum compact_priority" and its comments.
> >
> > Actually, the more I think about it the more I think the right answer
> > is to keep kcompactd as using MIGRATE_SYNC_LIGHT and make
> > MIGRATE_SYNC_LIGHT not block on the folio lock.
>
> Then, what is the difference between MIGRATE_SYNC_LIGHT and
> MIGRATE_ASYNC?

Aren't there still some differences even if we remove blocking this
one lock? ...or maybe your point is that maybe the other differences
have similar properties?

OK, so let's think about just using MIGRATE_ASYNC and either leaving
MIGRATE_SYNC_LIGHT alone or deleting it (if there are no users left).
The nice thing is that the only users of MIGRATE_SYNC_LIGHT are in
"compaction.c" and there are only 3 places where it's specified.

1. kcompactd_do_work() - This is what I was analyzing and where I
argued that indefinite blocking is less useful than simply trying to
compact a different page.
So sure, moving this to MIGRATE_ASYNC seems
like it would be OK?

2. proactive_compact_node() - Just like kcompactd_do_work(), this is
called from kcompactd and thus probably should have the same mode.

3. compact_zone_order() - This explicitly chooses between
MIGRATE_SYNC_LIGHT and MIGRATE_ASYNC, so I guess I'd keep
MIGRATE_SYNC_LIGHT just for this use case. It looks as if
compact_zone_order() is called for direct compaction and thus making
it synchronous can make sense, especially since it seems to go to the
synchronous case after it failed with the async case. Ironically,
though, the exact lock I was proposing to not wait on _isn't_ ever
waited on in direct reclaim (see the comment in migrate_folio_unmap()
about deadlock), but the other differences between SYNC_LIGHT and
ASYNC come into play.

If the above sounds correct then I'm OK w/ moving #1 and #2 to
MIGRATE_ASYNC and leaving #3 as the sole user of MIGRATE_SYNC_LIGHT.

> > kcompactd can accept some blocking but we don't want long / unbounded
> > blocking. Reading the comments for MIGRATE_SYNC_LIGHT, this also seems
> > like it fits pretty well. MIGRATE_SYNC_LIGHT says that the stall time
> > of writepage() is too much. It's entirely plausible that someone else
> > holding the lock is doing something as slow as writepage() and thus
> > waiting on the lock can be just as bad for latency.
>
> IIUC, during writepage(), the page/folio will be unlocked.
>
> But, during page reading, the page/folio will be locked. I don't really
> understand why we can wait for page reading but cannot wait for page
> writeback.

I'm not sure I totally got your point here. It sorta sounds as if
you're making the same point that I was? IIUC by waiting on the lock
we may be implicitly waiting for someone to finish reading, which seems
as bad as waiting for writing.
That was why I was arguing that with
MIGRATE_SYNC_LIGHT (which says that waiting for the write was too
slow) we shouldn't wait for the lock either (since the lock holder may
itself be blocked on a read).

-Doug
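
For reference, the lock decision being argued about can be written down
as a small userspace model. This is only a sketch under stated
assumptions, not the kernel code: the enum names mirror the kernel's
migrate modes, but `enum lock_action`, `page_lock_action()` and the
`sync_light_blocks` switch are hypothetical names made up for
illustration. Setting `sync_light_blocks` to false models the idea of
making MIGRATE_SYNC_LIGHT skip a contended page instead of waiting.

```c
#include <stdbool.h>

/* Names mirror the kernel's migrate modes; the values are illustrative. */
enum migrate_mode {
    MIGRATE_ASYNC,
    MIGRATE_SYNC_LIGHT,
    MIGRATE_SYNC,
};

/* Hypothetical result type: what the migration path does about the lock. */
enum lock_action {
    LOCK_ACQUIRED,  /* the non-blocking attempt succeeded */
    LOCK_SKIP_PAGE, /* give up on this page and try another one */
    LOCK_BLOCK,     /* sleep until the current holder unlocks */
};

/*
 * Hypothetical helper modelling the decision described in the thread.
 *
 * trylock_ok:        result of a non-blocking lock attempt
 * direct_compaction: caller is in the direct compaction path, which
 *                    never waits on this lock (deadlock risk)
 * sync_light_blocks: true models the behaviour being complained about,
 *                    false models the proposed change
 */
static enum lock_action page_lock_action(enum migrate_mode mode,
                                         bool trylock_ok,
                                         bool direct_compaction,
                                         bool sync_light_blocks)
{
    if (trylock_ok)
        return LOCK_ACQUIRED;

    /* Async callers and direct compaction never wait for the lock. */
    if (mode == MIGRATE_ASYNC || direct_compaction)
        return LOCK_SKIP_PAGE;

    /* The proposal: SYNC_LIGHT moves on rather than waiting unbounded. */
    if (mode == MIGRATE_SYNC_LIGHT && !sync_light_blocks)
        return LOCK_SKIP_PAGE;

    /* Remaining synchronous callers wait, possibly for a long time. */
    return LOCK_BLOCK;
}
```

With `sync_light_blocks` set to true, a contended lock under
MIGRATE_SYNC_LIGHT outside direct compaction yields LOCK_BLOCK, which is
the kcompactd case that showed the long waits. With it set to false the
same call yields LOCK_SKIP_PAGE, and on this one lock SYNC_LIGHT then
behaves like ASYNC.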