References: <20250401073046.51121-1-laoar.shao@gmail.com> <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>
In-Reply-To: <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>
From: Yafang Shao
Date: Tue, 1 Apr 2025 22:50:45 +0800
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when reading proc files
To: Kees Cook
Cc: joel.granados@kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Josef Bacik, linux-mm@kvack.org

On Tue, Apr 1, 2025 at 10:01 PM Kees Cook wrote:
>
>
>
> On April 1, 2025 12:30:46 AM PDT, Yafang Shao wrote:
> >While investigating a kcompactd 100% CPU utilization issue in production, I
> >observed frequent costly high-order (order-6) page allocations triggered by
> >proc file reads from monitoring tools.
> >This can be reproduced with a simple
> >test case:
> >
> >  fd = open(PROC_FILE, O_RDONLY);
> >  size = read(fd, buff, 256KB);
> >  close(fd);
> >
> >Although we should modify the monitoring tools to use smaller buffer sizes,
> >we should also enhance the kernel to prevent these expensive high-order
> >allocations.
> >
> >Signed-off-by: Yafang Shao
> >Cc: Josef Bacik
> >---
> > fs/proc/proc_sysctl.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> >diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> >index cc9d74a06ff0..c53ba733bda5 100644
> >--- a/fs/proc/proc_sysctl.c
> >+++ b/fs/proc/proc_sysctl.c
> >@@ -581,7 +581,15 @@ static ssize_t proc_sys_call_handler(struct kiocb *iocb, struct iov_iter *iter,
> >        error = -ENOMEM;
> >        if (count >= KMALLOC_MAX_SIZE)
> >                goto out;
> >-       kbuf = kvzalloc(count + 1, GFP_KERNEL);
> >+
> >+       /*
> >+        * Use vmalloc if the count is too large to avoid costly high-order page
> >+        * allocations.
> >+        */
> >+       if (count < (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> >+               kbuf = kvzalloc(count + 1, GFP_KERNEL);
>
> Why not move this check into kvmalloc family?

good suggestion.

>
> >+       else
> >+               kbuf = vmalloc(count + 1);
>
> You dropped the zeroing. This must be vzalloc.

Nice catch.

>
> >        if (!kbuf)
> >                goto out;
> >
>
> Alternatively, why not force count to be PAGE_SIZE writes in proc/sys?

This would break backward compatibility with existing tools, so we
cannot enforce this restriction.

-- 
Regards
Yafang
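
For illustration only, the size check discussed above can be modeled in
userspace. This is a minimal sketch, not kernel code: it assumes 4 KiB pages
and a costly-order threshold of 3 as stand-ins for PAGE_SIZE and
PAGE_ALLOC_COSTLY_ORDER, and every model_* name and the alloc_path enum are
hypothetical helpers introduced here. The large-request branch is labeled
vzalloc to reflect the review comment about zeroing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions: 4 KiB pages and a costly-order threshold of 3. These stand in
 * for the kernel's PAGE_SIZE and PAGE_ALLOC_COSTLY_ORDER in this model. */
#define MODEL_PAGE_SIZE    ((size_t)4096)
#define MODEL_COSTLY_ORDER 3u

/* Hypothetical labels for the two branches in the quoted diff. */
enum alloc_path {
    ALLOC_KVZALLOC, /* small request: kvzalloc(count + 1, GFP_KERNEL) */
    ALLOC_VZALLOC   /* large request: zeroed vmalloc, per the review */
};

/* Smallest order such that (MODEL_PAGE_SIZE << order) >= size. */
static unsigned int model_get_order(size_t size)
{
    unsigned int order = 0;
    size_t span = MODEL_PAGE_SIZE;

    /* Stop doubling before span could overflow size_t. */
    while (span < size && span <= SIZE_MAX / 2) {
        span <<= 1;
        order++;
    }
    return order;
}

/* True when a physically contiguous buffer of this size would need an
 * order above the costly threshold. */
static bool model_is_costly(size_t size)
{
    return model_get_order(size) > MODEL_COSTLY_ORDER;
}

/* Mirrors the comparison in the quoted diff: requests below
 * PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER keep using kvzalloc. */
static enum alloc_path model_choose_path(size_t count)
{
    if (count < (MODEL_PAGE_SIZE << MODEL_COSTLY_ORDER))
        return ALLOC_KVZALLOC;
    return ALLOC_VZALLOC;
}
```

Under these assumptions the threshold is 32 KiB, so a 256 KiB read buffer
lands on the vzalloc branch while a single-page read stays on kvzalloc.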