From: Yang Shi
Date: Thu, 27 Jun 2024 12:03:38 -0700
Subject: Re: [PATCH] mm/gup: Use try_grab_page() instead of try_grab_folio() in gup slow path
To: Peter Xu
Cc: yangge1116@126.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, liuzixing@hygon.cn, David Hildenbrand
References: <1719478388-31917-1-git-send-email-yangge1116@126.com>

On Thu, Jun 27, 2024 at 11:55 AM Peter Xu wrote:
>
> On Thu, Jun 27, 2024 at 04:53:08PM +0800, yangge1116@126.com wrote:
> > From: yangge
> >
> > If a large amount of CMA memory is configured in the system (for
> > example, the CMA memory accounts for 50% of the system memory),
> > starting a SEV virtual machine will fail. During startup of the SEV
> > virtual machine, pin_user_pages_fast(..., FOLL_LONGTERM, ...) is
> > called to pin memory. Normally, if a page is present and in a CMA area,
> > pin_user_pages_fast() will first call __get_user_pages_locked() to
> > pin the page in the CMA area, and then call
> > check_and_migrate_movable_pages() to migrate the page from the CMA
> > area to a non-CMA area. But with the current code,
> > __get_user_pages_locked() fails, because it calls try_grab_folio() to
> > pin the page in the gup slow path.
> >
> > Commit 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages
> > != NULL"") uses try_grab_folio() in the gup slow path, which seems to be
> > problematic because try_grab_folio() checks whether the page can be
> > longterm pinned. This check may fail and cause
> > __get_user_pages_locked() to fail. However, these checks are not
> > required in the gup slow path, so it seems we can use try_grab_page()
> > instead of try_grab_folio(). In addition, in the current code,
> > try_grab_page() can only add 1 to the page's refcount. We extend this
> > function so that the page's refcount can be increased according to the
> > parameters passed in.
> >
> > The following log reveals it:
> >
> > [ 464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
> > [ 464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
> > [ 464.325477] RIP: 0010:__get_user_pages+0x423/0x520
> > [ 464.325515] Call Trace:
> > [ 464.325520]
> > [ 464.325523] ? __get_user_pages+0x423/0x520
> > [ 464.325528] ? __warn+0x81/0x130
> > [ 464.325536] ? __get_user_pages+0x423/0x520
> > [ 464.325541] ? report_bug+0x171/0x1a0
> > [ 464.325549] ? handle_bug+0x3c/0x70
> > [ 464.325554] ? exc_invalid_op+0x17/0x70
> > [ 464.325558] ? asm_exc_invalid_op+0x1a/0x20
> > [ 464.325567] ? __get_user_pages+0x423/0x520
> > [ 464.325575] __gup_longterm_locked+0x212/0x7a0
> > [ 464.325583] internal_get_user_pages_fast+0xfb/0x190
> > [ 464.325590] pin_user_pages_fast+0x47/0x60
> > [ 464.325598] sev_pin_memory+0xca/0x170 [kvm_amd]
> > [ 464.325616] sev_mem_enc_register_region+0x81/0x130 [kvm_amd]
> >
> > Fixes: 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"")
> > Cc:
> > Signed-off-by: yangge
>
> Thanks for the report and the fix proposed. This is unfortunate..
>
> It's just that I worry this may not be enough, as thp slow gup isn't the
> only one using try_grab_folio(). There're also hugepd and memfd pinning
> (which just got queued, again).
>
> I suspect both of them can also hit a cma chunk here, and fail whenever
> they shouldn't have.
>
> The slight complexity resides in the hugepd path where it right now shares
> with fast-gup. So we may potentially need something similar to what Yang
> used to introduce in this patch:
>
> https://lore.kernel.org/r/20240604234858.948986-2-yang@os.amperecomputing.com
>
> So as to identify whether the hugepd gup is slow or fast, and we should
> only let the fast gup fail on those.
>
> Let me also loop them in on the other relevant discussion.

Thanks, Peter. I was actually typing the same thing...

Yes, I agree my patch should be able to solve the problem. At the
beginning I thought it was just a pure cleanup patch, but it seems like
it is more useful.

I'm going to port my patch to the latest mm-unstable, then post it.

>
> Thanks,
>
> --
> Peter Xu
>
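
For readers following the thread, the slow/fast distinction discussed
above can be sketched as a small userspace model in C. This is not
kernel code and not the patch under discussion: struct model_page,
model_grab(), model_pin_longterm(), enum model_gup_path and
MODEL_FOLL_LONGTERM are hypothetical names made up for illustration,
and the migration step is reduced to clearing a flag. It only models
the idea that a fast-path grab may refuse a longterm pin on a CMA page,
while a slow-path grab takes the references and leaves moving the page
out of CMA to a later step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag for this model only; not the kernel's FOLL_LONGTERM value. */
#define MODEL_FOLL_LONGTERM 0x1u

/* Hypothetical stand-in for a page: a refcount plus "lives in CMA". */
struct model_page {
    int refcount;
    bool in_cma;
};

/* Which gup path is asking for the reference (assumption of this model). */
enum model_gup_path { MODEL_GUP_FAST, MODEL_GUP_SLOW };

/*
 * Take `refs` references on `page`.
 * Returns 0 on success, -1 if the grab is refused.
 * Only the fast path refuses a longterm pin of a CMA page; the slow
 * path always takes the references.
 */
int model_grab(struct model_page *page, int refs, unsigned int flags,
               enum model_gup_path path)
{
    assert(refs > 0);

    if (path == MODEL_GUP_FAST && (flags & MODEL_FOLL_LONGTERM) &&
        page->in_cma)
        return -1; /* fast path gives up; caller falls back to slow path */

    page->refcount += refs;
    return 0;
}

/*
 * Longterm pin in this model: try the fast path, fall back to the slow
 * path, then "migrate" the page out of CMA. Real migration is far more
 * involved; clearing the flag is a deliberate simplification.
 */
int model_pin_longterm(struct model_page *page, int refs)
{
    if (model_grab(page, refs, MODEL_FOLL_LONGTERM, MODEL_GUP_FAST) == 0)
        return 0;

    if (model_grab(page, refs, MODEL_FOLL_LONGTERM, MODEL_GUP_SLOW) != 0)
        return -1;

    page->in_cma = false; /* stand-in for migrating to a non-CMA area */
    return 0;
}
```

In this model, calling model_pin_longterm() on a page with in_cma set
succeeds through the slow-path fallback, whereas a direct
model_grab(..., MODEL_GUP_FAST) on the same page is refused and leaves
the refcount untouched. The failure in the report corresponds to the
slow path applying the fast path's refusal as well.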