From: "Huang, Ying"
To: Hugh Dickins
Cc: Andrew Morton, Andi Kleen, Christoph Lameter, Matthew Wilcox, Mike Kravetz, David Hildenbrand, Suren Baghdasaryan, Yang Shi, Sidhartha Kumar, Vishal Moola, Kefeng Wang, Greg Kroah-Hartman, Tejun Heo, Mel Gorman, Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 03/12] mempolicy: fix migrate_pages(2) syscall return nr_failed
References: <9a6b0b9-3bb-dbef-8adf-efab4397b8d@google.com>
Date: Sat, 07 Oct 2023 15:27:23 +0800
In-Reply-To: <9a6b0b9-3bb-dbef-8adf-efab4397b8d@google.com> (Hugh Dickins's message of "Tue, 3 Oct 2023 02:17:43 -0700 (PDT)")
Message-ID: <87o7halwo4.fsf@yhuang6-desk2.ccr.corp.intel.com>
Hugh Dickins writes:

> "man 2 migrate_pages" says "On success migrate_pages() returns the number
> of pages that could not be moved".  Although 5.3 and 5.4 commits fixed
> mbind(MPOL_MF_STRICT|MPOL_MF_MOVE*) to fail with EIO when not all pages
> could be moved (because some could not be isolated for migration),
> migrate_pages(2) was left still reporting only those pages failing at the
> migration stage, forgetting those failing at the earlier isolation stage.
>
> Fix that by accumulating a long nr_failed count in struct queue_pages,
> returned by queue_pages_range() when it's not returning an error, for
> adding on to the nr_failed count from migrate_pages() in mm/migrate.c.
> A count of pages?  It's more a count of folios, but changing it to pages
> would entail more work (also in mm/migrate.c): that does not seem justified.
>
> queue_pages_range() itself should only return -EIO in the "strictly
> unmovable" case (STRICT without any MOVEs): in that case it's best to
> break out as soon as nr_failed gets set; but otherwise it should continue
> to isolate pages for MOVing even when nr_failed is set - as the mbind(2)
> manpage promises.
>
> There's a case when nr_failed should be incremented but it was missed:
> queue_folios_pte_range() and queue_folios_hugetlb() count the transient
> migration entries, like queue_folios_pmd() already did.
> And there's a case when nr_failed should not be incremented when it would
> have been: in meeting later PTEs of the same large folio, which can only
> be isolated once: fixed by recording the current large folio in struct
> queue_pages.
>
> Clean up the affected functions, fixing or updating many comments.  Bool
> migrate_folio_add(), without -EIO: true if adding, or if skipping shared
> (but its arguable folio_estimated_sharers() heuristic is left unchanged).
> Use MPOL_MF_WRLOCK flag to queue_pages_range(), instead of bool lock_vma.
> Use explicit STRICT|MOVE* flags where queue_pages_test_walk() checks for
> skipping, instead of hiding them behind MPOL_MF_VALID.
>
> Signed-off-by: Hugh Dickins
> Reviewed-by: Matthew Wilcox (Oracle)

Thanks!  Feel free to add

Reviewed-by: "Huang, Ying"

--
Best Regards,
Huang, Ying