Date: Mon, 22 Nov 2021 19:15:27 +0100
From: David Hildenbrand <david@redhat.com>
Organization: Red Hat
To: Alistair Popple, akpm@linux-foundation.org
Cc: willy@infradead.org, dhowells@redhat.com, hughd@google.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, jglisse@redhat.com,
    jgg@nvidia.com, rcampbell@nvidia.com, jhubbard@nvidia.com
Subject: Re: [PATCH v4] mm/migrate.c: Rework migration_entry_wait() to not take a pageref
In-Reply-To: <20211118020754.954425-1-apopple@nvidia.com>
References: <20211118020754.954425-1-apopple@nvidia.com>

On 18.11.21 03:07, Alistair Popple wrote:
> This fixes the FIXME in migrate_vma_check_page().
>
> Before migrating a page, migration code will take a reference and check
> there are no unexpected page references, failing the migration if there
> are.
> When a thread faults on a migration entry it will take a temporary
> reference to the page to wait for the page to become unlocked,
> signifying that the migration entry has been removed.
>
> This reference is dropped just prior to waiting on the page lock;
> however, the extra reference can cause migration failures, so it is
> desirable to avoid taking it.
>
> As migration code already has a reference to the migrating page, an
> extra reference to wait on PG_locked is unnecessary so long as the
> reference can't be dropped whilst setting up the wait.
>
> When faulting on a migration entry the ptl is taken to check the
> migration entry. Removing a migration entry also requires the ptl, and
> migration code won't drop its page reference until after the migration
> entry has been removed. Therefore retaining the ptl of a migration entry
> is sufficient to ensure the page has a reference. Reworking
> migration_entry_wait() to hold the ptl until the wait setup is complete
> means the extra page reference is no longer needed.
>

I really like this, thanks for this work!

[...]

> +#ifdef CONFIG_MIGRATION
> +/**
> + * migration_entry_wait_on_locked - Wait for a migration entry to be removed
> + * @folio: folio referenced by the migration entry.
> + * @ptep: mapped pte pointer. This function will return with the ptep unmapped.
> + * @ptl: already locked ptl. This function will drop the lock.
> + *
> + * Wait for a migration entry referencing the given page to be removed. This is
> + * equivalent to put_and_wait_on_page_locked(page, TASK_UNINTERRUPTIBLE) except
> + * this can be called without taking a reference on the page. Instead this
> + * should be called while holding the ptl for the migration entry referencing
> + * the page.
> + *
> + * Returns after unmapping and unlocking the pte/ptl with pte_unmap_unlock().

You could maybe make it clear that callers have to pass the ptep only
for PTE migration entries. For a PMD migration entry, pass NULL.
> + *
> + * This follows the same logic as wait_on_page_bit_common() so see the comments

s/wait_on_page_bit_common/folio_wait_bit_common/ ?

> + * there.
> + */
> +void migration_entry_wait_on_locked(struct folio *folio, pte_t *ptep,
> +				spinlock_t *ptl)
> +{
> +	struct wait_page_queue wait_page;
> +	wait_queue_entry_t *wait = &wait_page.wait;
> +	bool thrashing = false;
> +	bool delayacct = false;
> +	unsigned long pflags;
> +	wait_queue_head_t *q;
> +
> +	q = folio_waitqueue(folio);
> +	if (!folio_test_uptodate(folio) && folio_test_workingset(folio)) {
> +		if (!folio_test_swapbacked(folio)) {
> +			delayacct_thrashing_start();
> +			delayacct = true;
> +		}
> +		psi_memstall_enter(&pflags);
> +		thrashing = true;
> +	}
> +
> +	init_wait(wait);
> +	wait->func = wake_page_function;
> +	wait_page.folio = folio;
> +	wait_page.bit_nr = PG_locked;
> +	wait->flags = 0;
> +
> +	spin_lock_irq(&q->lock);
> +	folio_set_waiters(folio);
> +	if (!folio_trylock_flag(folio, PG_locked, wait))
> +		__add_wait_queue_entry_tail(q, wait);
> +	spin_unlock_irq(&q->lock);
> +
> +	/*
> +	 * If a migration entry exists for the page the migration path must hold
> +	 * a valid reference to the page, and it must take the ptl to remove the
> +	 * migration entry. So the page is valid until the ptl is dropped.
> +	 */
> +	if (ptep)
> +		pte_unmap_unlock(ptep, ptl);
> +	else
> +		spin_unlock(ptl);
> +
> +	for (;;) {
> +		unsigned int flags;
> +
> +		set_current_state(TASK_UNINTERRUPTIBLE);
> +
> +		/* Loop until we've been woken or interrupted */
> +		flags = smp_load_acquire(&wait->flags);
> +		if (!(flags & WQ_FLAG_WOKEN)) {
> +			if (signal_pending_state(TASK_UNINTERRUPTIBLE, current))
> +				break;
> +
> +			io_schedule();
> +			continue;
> +		}
> +		break;
> +	}
> +
> +	finish_wait(q, wait);
> +
> +	if (thrashing) {
> +		if (delayacct)
> +			delayacct_thrashing_end();
> +		psi_memstall_leave(&pflags);
> +	}
> +}
> +#endif
> +

I'm fairly new to the gory details of core migration entry and page bit
waiting code, but it makes sense to me, and removing the temporary extra
references is very nice!

Feel free to add my

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb