From: Barry Song <21cnbao@gmail.com>
Date: Thu, 25 Jul 2024 18:22:08 +0800
Subject: Re: [RFC PATCH v2] mm/vmalloc: fix incorrect __vmap_pages_range_noflush() if vm_area_alloc_pages() from high order fallback to order0
To: Hailong Liu
Cc: Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, Vlastimil Babka, Michal Hocko, Baoquan He, Matthew Wilcox, "Tangquan . Zheng", linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240725035318.471-1-hailong.liu@oppo.com> <20240725091703.tsjpgltwgu3jwy5e@oppo.com> <20240725095826.pqt4shyiw6odgcem@oppo.com>
In-Reply-To: <20240725095826.pqt4shyiw6odgcem@oppo.com>

On Thu, Jul 25, 2024 at
5:58 PM Hailong Liu wrote:
>
> On Thu, 25. Jul 21:34, Barry Song wrote:
> > On Thu, Jul 25, 2024 at 9:17 PM Hailong Liu wrote:
> > >
> > > On Thu, 25. Jul 18:21, Barry Song wrote:
> > > > On Thu, Jul 25, 2024 at 3:53 PM wrote:
> > > [snip]
> > > >
> > > > This is still incorrect because it undoes Michal's work. We also need to break
> > > > the loop if (!nofail), which you're currently omitting.
> > >
> > > IIUC, the original issue was to fix kvcalloc with __GFP_NOFAIL returning NULL.
> > > https://lore.kernel.org/all/ZAXynvdNqcI0f6Us@dhcp22.suse.cz/T/#u
> > > If we disable the huge flag in kvmalloc_node, the issue will be fixed.
> >
> > No, this just bypasses kvmalloc and doesn't solve the underlying issue. Problems
> > can still be triggered by vmalloc_huge() even after the bypass. Once we
> > reorganize vmap_huge to support the combination of PMD and PTE
> > mapping, we should re-enable HUGE_VMAP for kvmalloc.
> Totally agree. This will take some time to support. As in [1], I plan to fix it
> with an offset in page_private to indicate the location of the fallback.
>
> >
> > I would consider dropping VM_ALLOW_HUGE_VMAP for kvmalloc as
> > a short-term "optimization" to save memory rather than a long-term fix. This
> > "optimization" is only valid until we reorganize HUGE_VMAP in a way
> > similar to THP. I mean, for a 2.1MB kvmalloc, we can map 2MB as PMD
> > and 0.1MB as PTEs.
> However, this only fixes kvmalloc_node; for others who call
> vmalloc_huge(), the issue still exists. So I removed Michal's code. Sorry for this.

My proposal was to fall back to order-0 for __GFP_NOFAIL even before
vm_area_alloc_pages() as a short-term quick "fix". We need to meet
three conditions to do HUGE_VMAP:

1. vmap_allow_huge
2. vm_flags & VM_ALLOW_HUGE_VMAP
3. !(gfp_flags & __GFP_NOFAIL)

This is because if we fall back within vm_area_alloc_pages(), the caller
still expects vm_area_alloc_pages() to return contiguous 2MB memory.
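
A minimal sketch of the three-condition check listed above, written as
standalone userspace C rather than kernel code. The sketch_* names and the
flag values are placeholders invented for illustration; they are not the
kernel's definitions, and only the shape of the test is meant to carry over.

```c
#include <stdbool.h>

/*
 * Placeholder flag bits for illustration only. These are NOT taken from
 * the kernel headers; any distinct non-zero bits would do here.
 */
#define SKETCH_VM_ALLOW_HUGE_VMAP 0x400UL
#define SKETCH_GFP_NOFAIL         0x8000U

/*
 * Hypothetical helper: huge mappings are attempted only when all three
 * conditions hold (global switch on, area opted in, caller not NOFAIL).
 */
bool sketch_may_use_huge_vmap(bool vmap_allow_huge,
			      unsigned long vm_flags,
			      unsigned int gfp_flags)
{
	return vmap_allow_huge &&
	       (vm_flags & SKETCH_VM_ALLOW_HUGE_VMAP) &&
	       !(gfp_flags & SKETCH_GFP_NOFAIL);
}

/*
 * Hypothetical helper: choose the page order before any allocation is
 * attempted, so later stages never assume large contiguous pages when
 * the request carries the NOFAIL bit.
 */
unsigned int sketch_pick_page_order(bool vmap_allow_huge,
				    unsigned long vm_flags,
				    unsigned int gfp_flags,
				    unsigned int huge_order)
{
	if (sketch_may_use_huge_vmap(vmap_allow_huge, vm_flags, gfp_flags))
		return huge_order;
	return 0;
}
```

Note the logical && between the flag test and the negated NOFAIL test. The
result of !x is always 0 or 1, so combining it with a bitwise & against a
flag bit other than bit 0 would always evaluate to 0.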
By removing this assumption up front, the caller knows that
vm_area_alloc_pages() is returning small pages.

That means the vm_area gets 0 as page_order from the very beginning
if we have __GFP_NOFAIL in gfp_flags.

Other fixes appear to require significant changes to the source code
and can't be done quickly.

> > >
> > > >
> > > > To avoid reverting Michal's work, the simplest "fix" would be,
> > > >
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index caf032f0bd69..0011ca30df1c 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -3775,7 +3775,7 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> > > >  		return NULL;
> > > >  	}
> > > >
> > > > -	if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP)) {
> > > > +	if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP) && !(gfp_mask & __GFP_NOFAIL)) {
> > > >  		unsigned long size_per_node;
> > > >
> > > >  		/*
> > > > >
> > > > > [1] https://lore.kernel.org/lkml/20240724182827.nlgdckimtg2gwns5@oppo.com/
> > > > > 2.34.1
> > > >
> > > > Thanks
> > > > Barry
> > >
> > > --
> > > help you, help me,
> > > Hailong.
>
> --
> help you, help me,
> Hailong.