From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 134FEC77B73 for ; Wed, 19 Apr 2023 19:31:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1BA1A6B0071; Wed, 19 Apr 2023 15:31:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 143326B0072; Wed, 19 Apr 2023 15:31:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 009EF900002; Wed, 19 Apr 2023 15:31:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id E64F26B0071 for ; Wed, 19 Apr 2023 15:31:12 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 877281603BC for ; Wed, 19 Apr 2023 19:31:12 +0000 (UTC) X-FDA: 80699133984.20.75C76B5 Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) by imf19.hostedemail.com (Postfix) with ESMTP id 572071A0029 for ; Wed, 19 Apr 2023 19:31:10 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=GNz+FvJo; dmarc=pass (policy=none) header.from=chromium.org; spf=pass (imf19.hostedemail.com: domain of dianders@chromium.org designates 209.85.128.180 as permitted sender) smtp.mailfrom=dianders@chromium.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681932670; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=tHVwt8ouo1WrczMVZ0u9DPFDA9SmLrHyYVpeivZ/wGA=; b=2IvqjwoYHR0R0PaJZyepr7DgwJuy3r9bogGMzpAVbEddYChkuIq4UzCBbI8SGx8+tlAz9O MUHn19wirddHWACN5hTB2t7fLTne4ZcTZBPDQDuUZqh6vWA9zdZNrfK5DuV0U+PaYg71/n aGXR2nR+WdzW1r6cn0t5PYlCjpY+11Y= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=GNz+FvJo; dmarc=pass (policy=none) header.from=chromium.org; spf=pass (imf19.hostedemail.com: domain of dianders@chromium.org designates 209.85.128.180 as permitted sender) smtp.mailfrom=dianders@chromium.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681932670; a=rsa-sha256; cv=none; b=oOy3h2CfuukldAp+O6f64EeEqR83yQM8gchm8/sRX6ZflPllfUzlZ36N3oOW7aWRxWEqSM T5Ssh/FsB2ZHPhTUWs7zWckrOtziLcQ05Y/IJegYJMcTt3gLucsRV6vB02w45MkK/OSlqO yZ5R+clibW9EVNuefdjuEQU6CpuhSI4= Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-54c12009c30so13862907b3.9 for ; Wed, 19 Apr 2023 12:31:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1681932668; x=1684524668; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=tHVwt8ouo1WrczMVZ0u9DPFDA9SmLrHyYVpeivZ/wGA=; b=GNz+FvJo5e+qc8WQalDUQexCP8WwoPEShICw/eWfka7BOVOnW2lEE9YZqPG779CCtJ Kx38jHAEoLR9gGJ0BLzIglyvlLcqdbvXrdDThPoSWm9BF9mPoN12M0b4eTxgQGR++b/F fmKrP9uxmBeQXaNU3llZSyFf9CY3k9hhMgWz8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681932668; x=1684524668; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tHVwt8ouo1WrczMVZ0u9DPFDA9SmLrHyYVpeivZ/wGA=; b=MdASscvZUY+AvSbfzhkO4FPHdbSETv6ujcZ0CVaFP60nMzTta+bU2Ln4RRmLMKlEvN fyNdPz5ftRMjh9V+MQIFhK7sBP1dsWqc2UWZ9ghnZXxHGEopPZ1J1EnV8VoycvHkCjOz SwoXELLKvuXrkF4wJORYm/kMjxrrehlbs+gtKmr1rjFwXGSAT7lQwwpZvczRX2EcgCTw vC65Tx/OHUZ/e3PijBKM6u86n8XacqKeGbX1MbOvXaBysa43suMrhJjsSG3lNJziWROj 51HPzRIJFA+31m2dgSpkiweNWGy5sXqO1tyv4zQLQlqKC6Vsop562aRq8mM1ZqY34yHX P2qQ== X-Gm-Message-State: AAQBX9fUli74fkzvgEf50v9me1K1xCaBxGMLD9RXT/ZS7fowrvQxWsVl L6yFtEVBa19Ad5+sMLNb0T0+AT51B8XTqpfk8Ww/7Q== X-Google-Smtp-Source: AKy350bsiAi9XUuMNYKV8tri/4cBfibs9tVNCJ/dy0YJL/n1Is9Ay73AWSE08BMJtQ+TX5LPfgGHig== X-Received: by 2002:a0d:d4c3:0:b0:54f:8636:2152 with SMTP id w186-20020a0dd4c3000000b0054f86362152mr3797465ywd.15.1681932668126; Wed, 19 Apr 2023 12:31:08 -0700 (PDT) Received: from mail-yb1-f178.google.com (mail-yb1-f178.google.com. [209.85.219.178]) by smtp.gmail.com with ESMTPSA id 184-20020a810ac1000000b00545a081848bsm4663409ywk.27.2023.04.19.12.31.07 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 19 Apr 2023 12:31:07 -0700 (PDT) Received: by mail-yb1-f178.google.com with SMTP id m14so627879ybk.4 for ; Wed, 19 Apr 2023 12:31:07 -0700 (PDT) X-Received: by 2002:a25:cad6:0:b0:b92:380b:caeb with SMTP id a205-20020a25cad6000000b00b92380bcaebmr573391ybg.0.1681932666601; Wed, 19 Apr 2023 12:31:06 -0700 (PDT) MIME-Version: 1.0 References: <20230413182313.RFC.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid> <87v8hz17o9.fsf@yhuang6-desk2.ccr.corp.intel.com> <87ildvwbr5.fsf@yhuang6-desk2.ccr.corp.intel.com> <87edohvpzk.fsf@yhuang6-desk2.ccr.corp.intel.com> <87wn28u2wy.fsf@yhuang6-desk2.ccr.corp.intel.com> In-Reply-To: <87wn28u2wy.fsf@yhuang6-desk2.ccr.corp.intel.com> From: Doug Anderson Date: Wed, 19 Apr 2023 12:30:55 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [RFC PATCH] migrate_pages: Never block waiting for the page lock To: "Huang, Ying" Cc: Andrew Morton , Yu Zhao , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Vlastimil Babka , Mel Gorman Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 572071A0029 X-Stat-Signature: q71dyg5ggkax7ommkeq3qshsfhr9bc7n X-HE-Tag: 1681932670-762843 X-HE-Meta: U2FsdGVkX1+zU6Ws2Vr8rBkrPynL/7BxxUFFMVPsQkIZlwEOAcsd/MzNYa2q2eE5AGHFRspwJg8mmUUU8tgJfgWmn0NZmVDc4D5WJlGfrgAFc702OiMfdTAW9GBHGIhutmwvXneeLW90YoOEZpe/1QQV8+d/kDc8jLsFWLY8uRNiqQafBVZFzkU29TM6fHFgUALkTR9+SESgCpAMFeTpQPM+QU5aelWmnIbW0QPxm7LkdiPjNvgWJ39c1jW7GprlnAAExPngi/KIfQ65SdyYxfPorgY07mYIqFDmmSESdPYv0aoAbRzknGgnYTid4/AcWVAnSriUnJzjkoMaNuHGWfwlrnoqxDxHhFKqHAQxLkEoF3DV4MCsnNYvVdEVT9fx+ulPZhyjp9aVqxdRTCOuk9CTJw0xS6iJ9B3vL+qD0C5tOsoMPMm4Fj+4Tv8sPvS73/+xJp9106U5fK065AOAGlVhQ4P0XvRTPV+9fZwYLYUhgS8jC+FLHFjPvHOiAw4h83tKmO1tpeFqHdOrvWTWT0ZroF/yd9UWSUlE2cqWA4gbDdxpSlvDwgOa72x3zpsB4KdTnlcvWc2hilLyuL04iQ7V0W5lEXw6yV0ufAkhsSvfUdu2+vvyDyTALlLi6U+HsIiHVC54sVrnEPMJIZ4V/L93wyteKA46+YIftMEpe03z7h3EMV2jxRF2bjMrRSVZuDD+l+MiT12Y5V6GkRUQoPn0PfmkigtBFrF0n71jqNHNbjN6kXfhGLFA9PpNpG1Fv15SfTAE1oBRPvDGVaZAJCle7az+z/TawffbC6CVnJFL3Z8aitq1rYJM0/nXsoK7btxmz4Qsdg8ECDAIMmFuQwivoDb+cWn5ZCr8MnmGAXRXSHJRjuiA1HgXidVzxS9fah2dZ+WjsKE25mLCRvsl7DJiBoE3VoUai/9qlfno1r8xF6QWoHVkFXLu+kmAKA+qf4CKmaG7NCEUZ7FYHMg 10M1uh1V KMX2lN4/5ylLw5z9Sg/inlzA1C1QWTvLoxAAs+QiX3DZMvbTtW+2pW2lV/xnJHcys8tGFt5RX9M2FE1GGnHa75767kUBgQ7kADzdcwHckBi8MbVLdv/L6GsFht+CnLaXXGsPb0qBFnd6hWzkmN3fo7bNZfsq+xzlqHWTOmf64B+C496PNeaIC4+j5LYvgX8YhnC9v308N1SGfJnYCymXPhJErddS9HVJYCeIg1PoKhA5IiIpIV1ONuwv9nzHSXYTuCpolY0bGczyyyOR1UeSbSaA0G2EuEEZONzzdSWcH5FGTC3BmcvOV9QYwoJQkFJYri2S27E7Tnwit99BjO0IWsV3MkHFlKeP0LX46w30VjKh+6WiGeYPxQJsgIIpgxWq8c+D0RuzYT0withs= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi, On Tue, Apr 18, 2023 at 5:34=E2=80=AFPM Huang, Ying = wrote: > > >> >> >> TBH, the test case is too extreme for me. > >> >> > > >> >> > That's fair. That being said, I guess the point I was trying to m= ake > >> >> > is that waiting for this lock could take an unbounded amount of t= ime. > >> >> > Other parts of the system sometimes hold a page lock and then do = a > >> >> > blocking operation. At least in the case of kcompactd there are b= etter > >> >> > uses of its time than waiting for any given page. > >> >> > > >> >> >> And, we have multiple "sync" mode to deal with latency requireme= nt, for > >> >> >> example, we use MIGRATE_SYNC_LIGHT for compaction to avoid too l= ong > >> >> >> latency. If you have latency requirement for some users, you ma= y > >> >> >> consider to add new "sync" mode. > >> >> > > >> >> > Sure. kcompactd_do_work() is currently using MIGRATE_SYNC_LIGHT. = I > >> >> > guess my first thought would be to avoid adding a new mode and ma= ke > >> >> > MIGRATE_SYNC_LIGHT not block here. Then anyone that truly needs t= o > >> >> > wait for all the pages to be migrated can use the heavier sync mo= des. > >> >> > It seems to me like the current users of MIGRATE_SYNC_LIGHT would= not > >> >> > want to block for an unbounded amount of time here. What do you t= hink? > >> >> > >> >> It appears that you can just use MIGRATE_ASYNC if you think the cor= rect > >> >> behavior is "NOT block at all". I found that there are more > >> >> fine-grained controls on this in compaction code, please take a loo= k at > >> >> "enum compact_priority" and its comments. > >> > > >> > Actually, the more I think about it the more I think the right answe= r > >> > is to keep kcompactd as using MIGRATE_SYNC_LIGHT and make > >> > MIGRATE_SYNC_LIGHT not block on the folio lock. > >> > >> Then, what is the difference between MIGRATE_SYNC_LIGHT and > >> MIGRATE_ASYNC? > > > > Aren't there still some differences even if we remove blocking this > > one lock? ...or maybe your point is that maybe the other differences > > have similar properties? > > Sorry for confusing words. Here, I asked you to list the implementation > difference between MIGRATE_ASYNC and MIGRATE_SYNC_LIGHT after your > proposed changes. Which are waited in MIGRATE_SYNC_LIGHT but not in > MIGRATE_ASYNC? Ah, got it! It's not always the easiest to follow all the code paths, but let's see what I can find. I guess to start with, though, I will assert that someone seems to have believed that there was an important difference between MIGRATE_ASYNC and MIGRATE_SYNC_LIGHT besides waiting on the lock in migrate_folio_unmapt() since (as I found in my previous digging) the "direct reclaim" path never grabs this lock but explicitly sometimes chooses MIGRATE_ASYNC some times and MIGRATE_SYNC_LIGHT other times. OK, so looking at mainline Linux and comparing differences in behavior between SYNC_LIGHT and ASYNC and thoughts about which one should be used for kcompactd. Note that I won't go _too_ deep into all the differences... -- In nfs.c: 1. We will wait for the fscache if SYNC_LIGHT but not ASYNC. No idea what would be the most ideal for calls from kcompactd. In compaction.c: 2. We will update the non-async "compact_cached_migrate_pfn" for SYNC_LIGHT but not ASYNC since we keep track of sync and async progress separately. 3. compact_lock_irqsave() note contentions for ASYNC but not SYNC_LIGHT and cause an earlier abort. Seems like kcompactd would want the SYNC_LIGHT behavior since this isn't about things indefinitely blocking. 4. isolate_migratepages_block() will bail if too_many_isolated() for ASYNC but not SYNC_LIGHT. My hunch is that kcompactd wants the SYNC_LIGHT behavior for kcompact. 5. If in direct compaction, isolate_migratepages_block() sets "skip_on_failure" for ASYNC but not SYNC_LIGHT. My hunch is that kcompactd wants the SYNC_LIGHT behavior for kcompact. 6. suitable_migration_source() considers more things suitable migration sources when SYNC_LIGHT but not (ASYNC+direct_compaction). Doesn't matter since kcompactd isn't direct compaction and non-direct compaction is the same. 7. fast_isolate_around() does less scanning when SYNC_LIGHT but not (ASYNC+direct_compaction). Again, it doesn't matter for kcompactd. 8. isolate_freepages() uses a different stride with SYNC_LIGHT vs. ASYNC. My hunch is that kcompactd wants the SYNC_LIGHT behavior for kcompact. In migrate.c: 9. buffer_migrate_lock_buffers() will block waiting to lock buffers with SYNC_LIGHT but not ASYNC. I don't know for sure, but this feels like something we _wouldn't_ want to block on in kcompactd and instead should look for easier pickings. 10. migrate_folio_unmap() has the case we've already talked about 11. migrate_folio_unmap() does batch flushes for async because it doesn't need to worry about a class of deadlock. Somewhat recent code actually ends up running this code first even for sync modes to get the batch. 12. We'll retry a few more times for SYNC_LIGHT than ASYNC. Seems like the more retries won't really hurt for kcompactd. -- So from looking at all the above, I'll say that kcompactd should stick with SYNC_LIGHT and we should fix #10. In other words, like my original patch except that we keep blocking on the lock in the full SYNC modes. It's possible that we should also change case #9 I listed above. Do you know if locking buffers is likely to block on something as slow as page reading/writing? -Doug