From: Alexander Potapenko
Date: Fri, 1 Sep 2023 14:55:17 +0200
Subject: Re: [PATCH] mm: make __GFP_SKIP_ZERO visible to skip zero operation
To: Zhaoyang Huang
Cc: Matthew Wilcox, "zhaoyang.huang", Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, ke.wang@unisoc.com, Marco Elver, Dmitry Vyukov, Dave Hansen, Michal Hocko, Kees Cook, Eric Biggers, Mateusz Guzik, Linus Torvalds, Al Viro
References: <20230831105252.1385911-1-zhaoyang.huang@unisoc.com>

On Fri, Sep 1, 2023 at 12:29 PM Zhaoyang Huang wrote:
>
> loop alex

(adding more people who took part in the previous discussions)

>
> On Thu, Aug 31, 2023 at 8:16 PM Matthew Wilcox wrote:
> >
> > On Thu, Aug 31, 2023 at 06:52:52PM +0800, zhaoyang.huang wrote:
> > > From: Zhaoyang Huang
> > >
> > > There is no explicit gfp flags to let the allocation skip zero
> > > operation when CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y. I would like to make
> > > __GFP_SKIP_ZERO be visible even if kasan is not configured.

Hi all,

This is a recurring question, as people keep encountering performance
problems on systems with init_on_alloc=1
(https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1862822 being
one of the examples).
Brad Spengler has also pointed out
(https://twitter.com/spendergrsec/status/1296461651659694082) that
there are cases where there is no security vs. performance tradeoff
(e.g. kmemdup() and kstrdup()).

An opt-out flag was included in the initial init_on_alloc series, but
back then Michal Hocko noted that it might easily get out of control:
https://patchwork.kernel.org/project/linux-hardening/patch/20190418154208.131118-2-glider@google.com/#22600229.

Now that init_on_alloc is actually being used by people who may have
different preferences wrt. security and performance (in the cases
where this tradeoff exists), we must be very careful with the opt-out
GFP flag.
Not initializing a particular allocation site in the upstream kernel
will affect every downstream user, and some may consider this a
security regression.
Another problematic case is an OS vendor mandating init_on_alloc
everywhere, but a third party driver vendor doing kmalloc(...,
__GFP_SKIP_ZERO) for their allocations.

So I think a working opt-out scheme for the heap initialization should
be two-step:
1. The code owner may decide that a particular allocation site needs
an opt-out, and make the upstream code change;
2. The OS vendor has the ability to override that decision for the
kernel they ship without the need to patch the source.

Let me quote the idea briefly outlined at
https://lore.kernel.org/lkml/CAG_fn=UQEuvJ9WXou_sW3moHcVQZJ9NvJ5McNcsYE8xw_WEYGw@mail.gmail.com/
(unfortunately the discussion got derailed a bit):

"""
1. Add a ___GFP_SKIP_ZERO flag that is not intended to be used
explicitly (e.g. disallow passing it to kmalloc(), add a checkpatch.pl
warning). Explicitly passing an opt-out flag to allocation functions
was considered harmful previously:
https://lore.kernel.org/kernel-hardening/20190423083148.GF25106@dhcp22.suse.cz/

2. Define new allocation API that will allow opt-out:

  struct page *alloc_pages_uninit(gfp_t gfp, unsigned int order, const char *key);
  void *kmalloc_uninit(size_t size, gfp_t flags, const char *key);
  void *kmem_cache_alloc_uninit(struct kmem_cache *, gfp_t flags, const char *key);

, where @key is an arbitrary string that identifies a single
allocation or a group of allocations.

3. Provide a boot-time flag that holds a comma-separated list of
opt-out keys that actually take effect:

  init_on_alloc.skip="xyz-camera-driver,some_big_buffer".
"""

A draft implementation at
https://github.com/ramosian-glider/linux/commit/00791be14eb1113eae615c74b652f94b5cc3c336
(which probably does not apply anymore) may give some insight into how
this is supposed to work.

There's plenty of room for bikeshedding here (does the command-line flag
opt-in or opt-out? should we use function names instead of some "keys"?
can we allow overriding every allocation site without the need for
alloc_pages_uninit()?), but if the overall scheme is viable I can
probably proceed with an RFC.

> >
> > This bypasses a security feature so you're going to have to do a little
> > better than "I want it".
> Thanks for pointing this out. What I want to do is to give the user a
> way to exempt some types of pages from being zeroed, which could help
> on performance issues.
> Could we have the most safety concern admin
> use INIT_ON_FREE while the less concerned use INIT_ON_ALLOC &
> __GFP_SKIP_ZERO as a light version method?

init_on_free has a more significant performance impact, and might be
more problematic for production use (even more opt-outs would've been
needed).

As a side note, I don't think we should repurpose the same
__GFP_SKIP_ZERO flag used by KASAN, because the semantics of the flags
may not match 1:1.
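
To make the key-based scheme quoted above a bit more concrete, here is a
rough user-space model of it. This is only a sketch and not the draft
implementation: skip_list, init_on_alloc_enabled, key_in_list(),
want_zero() and model_kmalloc_uninit() are all hypothetical names, gfp
flags are left out, and malloc()/memset() stand in for the real
allocator. It assumes the "opt-out keys that actually take effect"
semantics, i.e. an allocation skips zeroing only if its key is listed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the init_on_alloc.skip= boot parameter. */
static const char *skip_list = "";

/* Hypothetical stand-in for the global init_on_alloc switch. */
static bool init_on_alloc_enabled = true;

/* Return true if @key is a whole entry of the comma-separated @list. */
static bool key_in_list(const char *list, const char *key)
{
	size_t key_len;

	if (!list || !key || !*key)
		return false;
	key_len = strlen(key);

	while (*list) {
		const char *end = strchr(list, ',');
		size_t len = end ? (size_t)(end - list) : strlen(list);

		/* Compare whole entries only, so "xyz" does not match "xyz-foo". */
		if (len == key_len && memcmp(list, key, len) == 0)
			return true;
		if (!end)
			break;
		list = end + 1;
	}
	return false;
}

/*
 * Step 1: the code owner asks for an opt-out by using the _uninit variant
 * with a key. Step 2: the request is honoured only if the vendor listed
 * that key; otherwise the allocation is zeroed as usual.
 */
static bool want_zero(const char *key)
{
	if (!init_on_alloc_enabled)
		return false;
	return !key_in_list(skip_list, key);
}

/* User-space model of kmalloc_uninit(); the gfp argument is omitted. */
static void *model_kmalloc_uninit(size_t size, const char *key)
{
	void *p = malloc(size);

	if (p && want_zero(key))
		memset(p, 0, size);
	return p;
}
```

With an empty skip_list every allocation is still zeroed, even if the
driver calls model_kmalloc_uninit(). Only after the vendor sets
skip_list to something like "xyz-camera-driver" does an allocation with
that key skip the memset().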