From: Barry Song <21cnbao@gmail.com>
Date: Mon, 19 Aug 2024 22:10:35 +1200
Subject: Re: [PATCH v3 4/4] mm: prohibit NULL deference exposed for unsupported non-blockable __GFP_NOFAIL
To: Yafang Shao
Cc: Michal Hocko, akpm@linux-foundation.org, linux-mm@kvack.org,
 42.hyeyoo@gmail.com, cl@linux.com, hailong.liu@oppo.com, hch@infradead.org,
 iamjoonsoo.kim@lge.com, penberg@kernel.org, rientjes@google.com,
 roman.gushchin@linux.dev, torvalds@linux-foundation.org, urezki@gmail.com,
 v-songbaohua@oppo.com, vbabka@suse.cz, virtualization@lists.linux.dev,
 Lorenzo Stoakes, Kees Cook, Eugenio Pérez, Jason Wang, Maxime Coquelin,
 "Michael S. Tsirkin", Xuan Zhuo
References: <20240817062449.21164-1-21cnbao@gmail.com> <20240817062449.21164-5-21cnbao@gmail.com>

On Mon, Aug 19, 2024 at 9:46 PM Yafang Shao wrote:
>
> On Mon, Aug 19, 2024 at 5:39 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Mon, Aug 19, 2024 at 9:25 PM Yafang Shao wrote:
> > >
> > > On Mon, Aug 19, 2024 at 3:50 PM Michal Hocko wrote:
> > > >
> > > > On Sun 18-08-24 10:55:09, Yafang Shao wrote:
> > > > > On Sat, Aug 17, 2024 at 2:25 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > > >
> > > > > > From: Barry Song
> > > > > >
> > > > > > When users allocate memory with the __GFP_NOFAIL flag, they might
> > > > > > incorrectly use it alongside GFP_ATOMIC, GFP_NOWAIT, etc. This kind of
> > > > > > non-blockable __GFP_NOFAIL is not supported and is pointless. If we
> > > > > > attempt and still fail to allocate memory for these users, we have two
> > > > > > choices:
> > > > > >
> > > > > >     1. We could busy-loop and hope that some other direct reclamation or
> > > > > >     kswapd rescues the current process. However, this is unreliable
> > > > > >     and could ultimately lead to hard or soft lockups,
> > > > >
> > > > > That can occur even if we set both __GFP_NOFAIL and
> > > > > __GFP_DIRECT_RECLAIM, right?
> > > >
> > > > No, it cannot! With __GFP_DIRECT_RECLAIM the allocator might take a long
> > > > time to satisfy the allocation but it will reclaim to get the memory, it
> > > > will sleep if necessary and it will trigger the OOM killer if there is
> > > > no other option. __GFP_DIRECT_RECLAIM is a completely different story
> > > > than without it, which means _no_sleeping_ is allowed and therefore only
> > > > a busy loop waiting for the allocation to proceed is allowed.
> > >
> > > That could be a livelock.
> > > From the user's perspective, there's no noticeable difference between
> > > a livelock, soft lockup, or hard lockup.
> >
> > This is certainly different. A lockup occurs when tasks can't be scheduled,
> > causing the entire system to stop functioning.
>
> When a livelock occurs, your only options are to migrate your
> applications to other servers or reboot the system; there's no other
> resolution (except for using oomd, which is difficult for users
> without cgroup2 or swap).
>
> So, there's effectively no difference.

Could you state your preferred option more clearly? I am guessing there
are two possibilities:

1. Entirely drop __GFP_NOFAIL and require all users who are using
   __GFP_NOFAIL to add error handlers instead?
2. Even in an unsupported case such as GFP_ATOMIC | __GFP_NOFAIL,
   always loop until a soft or hard lockup occurs?

> > >
> > > >
> > > > > So, I don't believe the issue is related
> > > > > to setting __GFP_DIRECT_RECLAIM; rather, it stems from the flawed
> > > > > design of __GFP_NOFAIL itself.
> > > >
> > > > Care to elaborate?
> > >
> > > I've read the documentation explaining why the busy loop is embedded
> > > within the page allocation process instead of letting users implement
> > > it based on their needs. However, the complexity and numerous issues
> > > suggest that this design might be fundamentally flawed.
> >
> > I don't see "numerous issues", only two issues:
> >
> > 1. allocation size overflow with __GFP_NOFAIL
> > 2. unsupported case: GFP_NOWAIT/GFP_ATOMIC | __GFP_NOFAIL.
> >
> > For 1, it has been a BUG to require an overflowed size to always succeed.
> >
> > For 2, it is an unsupported case. We just need to hide __GFP_NOFAIL
> > and only expose GFP_NOFAIL (which definitely includes blockable) so
> > any unsupported case like vdpa will no longer occur. I would greatly
> > appreciate it if you or someone else could take over this task, as I am
> > currently extremely busy.
> >
> > >
> > > --
> > > Regards
> > > Yafang
> >
> > Thanks
> > Barry
> >
>
> --
> Regards
> Yafang

Thanks
Barry
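
To make the distinction debated above concrete, here is a minimal
userspace sketch in C. It is not kernel code: the SK_* names and bit
values are hypothetical stand-ins for the real gfp flags, and
sk_nofail_supported() is an illustrative helper that assumes "blockable"
simply means the direct-reclaim bit is set. The composite SK_GFP_NOFAIL
only models the idea of exposing a blockable-only GFP_NOFAIL; it does not
reproduce any actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; the values are made up. */
#define SK_DIRECT_RECLAIM 0x1u /* caller may sleep and reclaim directly */
#define SK_KSWAPD_RECLAIM 0x2u /* caller may only wake background reclaim */
#define SK_HIGH           0x4u /* caller may dip into reserves */
#define SK_NOFAIL         0x8u /* caller has no failure path */

/* Composite masks, loosely modeled on GFP_KERNEL, GFP_ATOMIC, GFP_NOWAIT. */
#define SK_KERNEL (SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)
#define SK_ATOMIC (SK_HIGH | SK_KSWAPD_RECLAIM)
#define SK_NOWAIT (SK_KSWAPD_RECLAIM)

/* Assumed shape of a blockable-only composite: it carries the reclaim bit. */
#define SK_GFP_NOFAIL (SK_NOFAIL | SK_DIRECT_RECLAIM)

/* A request is blockable only if direct reclaim is permitted. */
bool sk_can_block(unsigned int gfp)
{
    return (gfp & SK_DIRECT_RECLAIM) != 0;
}

/*
 * A no-fail request is only meaningful when the allocator may sleep,
 * reclaim, and fall back to the OOM killer. Without that, all it could
 * do is busy-loop. Requests without SK_NOFAIL are always acceptable.
 */
bool sk_nofail_supported(unsigned int gfp)
{
    if (!(gfp & SK_NOFAIL))
        return true;
    return sk_can_block(gfp);
}

/* Usage example: the combinations discussed in this thread. */
void sk_self_check(void)
{
    assert(sk_nofail_supported(SK_KERNEL | SK_NOFAIL));  /* blockable */
    assert(!sk_nofail_supported(SK_ATOMIC | SK_NOFAIL)); /* unsupported */
    assert(!sk_nofail_supported(SK_NOWAIT | SK_NOFAIL)); /* unsupported */
    assert(sk_nofail_supported(SK_ATOMIC));              /* may fail */
    /* The composite carries the reclaim bit, so it is always blockable. */
    assert(sk_nofail_supported(SK_ATOMIC | SK_GFP_NOFAIL));
}
```

In this model the raw SK_NOFAIL bit can be paired with a non-blocking
mask, which is the unsupported case; a caller that can only reach the
composite SK_GFP_NOFAIL cannot construct that combination at all.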