From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BC3EEE021C for ; Wed, 11 Sep 2024 08:46:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7F9E68D0100; Wed, 11 Sep 2024 04:46:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7AA3A8D0056; Wed, 11 Sep 2024 04:46:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 67B558D0100; Wed, 11 Sep 2024 04:46:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 4459D8D0056 for ; Wed, 11 Sep 2024 04:46:22 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id E4234AA050 for ; Wed, 11 Sep 2024 08:46:21 +0000 (UTC) X-FDA: 82551825762.10.D86AC91 Received: from mail-40133.protonmail.ch (mail-40133.protonmail.ch [185.70.40.133]) by imf22.hostedemail.com (Postfix) with ESMTP id 0CA2AC0003 for ; Wed, 11 Sep 2024 08:46:19 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=proton.me header.s=protonmail header.b=RNIEY2pl; spf=pass (imf22.hostedemail.com: domain of benno.lossin@proton.me designates 185.70.40.133 as permitted sender) smtp.mailfrom=benno.lossin@proton.me; dmarc=pass (policy=quarantine) header.from=proton.me ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1726044327; a=rsa-sha256; cv=none; b=07Kwe1gG+tQA4ELUNBBOmNumCffRR7NpJcZn8WvwVYmQHqAWligY5O8DDawKGF+b/ht3qp 46JKnNDiGTimlQXzXhoXpu5GYk2A0zzR3FOl+SSC6MN8nRpP4uNsbqoaNmo64qo44h4IIq WOlr7TFDETiTniQXDde7VuOgaQbzFOM= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=proton.me header.s=protonmail header.b=RNIEY2pl; spf=pass (imf22.hostedemail.com: domain of benno.lossin@proton.me designates 185.70.40.133 as permitted sender) smtp.mailfrom=benno.lossin@proton.me; dmarc=pass (policy=quarantine) header.from=proton.me ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1726044327; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=34zMZyYnOjGwEsDtBreqI075Rfehln50WRz0u2TgMuY=; b=FGsaqZowZV8cixZkfEkgMNn99zIkD0OjvkovQjeuNwIeOVQuguPSEdogrtvA9Lzz51ISAP 31P8MfAqx3qSsOs8mZdIcY7TLnmeL5E/aax0WL0fF5Acd52Rx3odBUiPf0e2WNGUu9wbzV vGjJtDUgm3Hp4m6+6GMxZk5fLNN6Sus= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=proton.me; s=protonmail; t=1726044377; x=1726303577; bh=34zMZyYnOjGwEsDtBreqI075Rfehln50WRz0u2TgMuY=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector; b=RNIEY2pliTlOKVOROFGu4sfCi6Kn5721NaUGagNhLWbKOYt4WJEAGyDOAc7GqnL6K 1WyPMy/YVll2/Vr1ZhnoD9jzByjKzzEvmnyqdscHergtPaXVc9NFQl2jQVsIzCA7ux 3XLQHDsP+crndeDDrY5dg9WscKiyrlfVwudU/AXFAK1kMGwr0e+xLF7Oq/dS2i91Qw nvITOijrD0TmWPbLkO2MWHCMhB3h3s7lZ7kYCgDqnLWYWJJQYnwizkj9mGjz6sqSF/ k3nskkkWsGaIN4Aq7DH5sLvDumzBq6aASAqiFheYfOf7nITVTvJ0XezpbKiTssh7rj Bxl+FotNE+RoQ== Date: Wed, 11 Sep 2024 08:46:11 +0000 To: Danilo Krummrich From: Benno Lossin Cc: ojeda@kernel.org, alex.gaynor@gmail.com, wedsonaf@gmail.com, boqun.feng@gmail.com, gary@garyguo.net, bjorn3_gh@protonmail.com, a.hindborg@samsung.com, aliceryhl@google.com, akpm@linux-foundation.org, daniel.almeida@collabora.com, faith.ekstrand@collabora.com, boris.brezillon@collabora.com, lina@asahilina.net, mcanal@igalia.com, zhiw@nvidia.com, cjia@nvidia.com, jhubbard@nvidia.com, airlied@redhat.com, ajanulgu@redhat.com, lyude@redhat.com, linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH v6 13/26] rust: alloc: implement kernel `Vec` type Message-ID: <1ecebc7f-a1a5-488b-832c-40471fa3e61e@proton.me> In-Reply-To: References: <20240816001216.26575-1-dakr@kernel.org> <20240816001216.26575-14-dakr@kernel.org> Feedback-ID: 71780778:user:proton X-Pm-Message-ID: 6a9244cdd4ff465f6a6d470cfb1f6a82be4ea609 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Stat-Signature: coc8irjwf9f619wxxmp7ih7kja61ydgs X-Rspamd-Queue-Id: 0CA2AC0003 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1726044379-250914 X-HE-Meta: U2FsdGVkX18XEpqLER8ysmX2Ow1ud+DwOST1XY/urfEdHuwvnzhDH9Hnsyv6ZdLEuyhUkmZnitvNsgHxUJHpk89UcE0AfgIgaOKYGJF4CHjaKjFSZmd/cXBxkYmxyyAyYg2rdribPXYJ/i08zYz7VAMfR7uqSL4x92r4keCt2vpa5+hIeuiF4Q71iWESri56gnrkmfB6EZERMP8BHwXv8Qhd9pfG655qvZhGvW4c+QP4WcGTJzX6V/UyzrTC0xjHk/WqO2wl22rIsUS8f68W7IFf6X9Xvc6OsG0gc55QyLrGDC3VyAL2+i2SFnezTgo5IXglCBcV6y6AWMp7KO5qlWTopVSNaFL8+sboUjnojQAze/lSAD0SiRS/o9IkurO7gPxxR796rL4ob89b+7+bZEnhhh2m+0oGXuo3vTV7EZjzRcqytMKxxj2cpqEEObGxRUHNO5dN/03xJ/R42TYcNyvNLPaITaXgd/b0kykivtAuqn/u/g2QX2ozDcLEk1KQhHt/Q6cmHiQ8Sd10KaGnnZI0bzPqfOY4HoWeJ8iObBj7phQdyZpycLoHGCUVqq1VW9CRG+d+oqCrV0vvjNyd3jP3txZVJ2387KJomqDNyDD6KUtUeyxc6kg7RD1e3gRUhJvjHFf5mXEBvCio+re+dHtyJkdS1jlZQfjhrK8Tvh0AmwtQhj1wrIJ3XlBPARYGsH70NIcJcXmXJ9hvPJ0N9CeYLV+2kJxT1JJj+x6GsdSfZxfzudQg/Ecf/8xo/e6SWS7KaW5cIgxEtQSkV5iMtEogtAbjfomtpKExgNMfSakz9MRkH3GkMw+18gHLdAX3U0uFLGMyfsx82okRr9KGo7UzTkS1kMWTVGeuLcroiSDAWVaBUqW0COs9PgdupTibDOOoJLzHnnad8SxJtKiyvKevcElRMNWy+epVa9PzAMx2VjjCIfbTnCH7SUCn7zsdOM9W/sSodv29YG4rBSQ LeDC3X7+ sMr+ckbnjQ1VxOHpFm1hdtutbtgL9AsYvwuiz5f7tEBcMiauJArDPqsqYdCQAfCKUJdSqCXQDMspwab1AvPVQIUQQIaL/cHuCStPSCuuAK1r4OcV7y44qoQPRFcuRUqPT93iOJE5b60IqrLwwDgTOMPcJaE/peEjn0cVnfXTJ0IDeinF7Otr6F/M50gCppdNh8pv49aigc8OHaPRp0TWegqbu0Az+BmpANFvQ0K2lftVbCjvaYB2hfx07KP9hQJZhVGq+hS0ReNalqE38EvaT3iu9rm7FpPpGi3hPGOIES0omAVYPsWwATyTjLEv7t9fJlslP3PlsnELyEAeL5MsWO4qY/ebW8s1YI9jkDFgsnrCafgBkrUUOYZPOxbFCELfpI/J4yGp6+OmfW1LeVBiG/06zx7s4XJTIycjxc7ZuUTytcguTdmXU1DUyLsMipDn5l7kg X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 11.09.24 02:18, Danilo Krummrich wrote: > On Tue, Sep 10, 2024 at 07:32:06PM +0000, Benno Lossin wrote: >> On 16.08.24 02:10, Danilo Krummrich wrote: >>> + ($elem:expr; $n:expr) =3D> ( >>> + { >>> + $crate::alloc::KVec::from_elem($elem, $n, GFP_KERNEL) >> >> Hmm, could it be that one might want to use `kvec!` without >> `GFP_KERNEL`? >> Or add additional flags, eg __GFP_ZERO? >=20 > Pretty unlikely, I'd say. __GFP_ZERO in particular wouldn't make much sen= se, > since the macro enforces initialization anyways. >=20 > Maybe something like GFP_ATOMIC, but I expect this to be uncommon enough = to not > consider this for this macro for now. SGTM >> If you think that supporting this is not immediately necessary or if >> this is too much for this patchset, then we can also postpone it (maybe >> it could be a good-first-issue). Do you keep a list of future >> improvements for the new allocator API somewhere? If not, then I think >> we should create one (best place would be the issue tracker on GH). >=20 > I would only add it if it turns out to be a common need. As mentioned, I = don't > expect it to be. >=20 > I'd rather keep issues in the source tree. For instance, DRM has them in > '/Documentation/gpu/todo.rst'. But then you need to submit patches to change them... That sounds like a hassle and we already have the precedent to use the github issue tracker. It is also much better for discoverability for people outside of the kernel. >>> + } >>> + ); >>> + ($($x:expr),+ $(,)?) =3D> ( >>> + { >>> + match $crate::alloc::KBox::new_uninit(GFP_KERNEL) { >>> + Ok(b) =3D> Ok($crate::alloc::KVec::from($crate::alloc:= :KBox::write(b, [$($x),+]))), >>> + Err(e) =3D> Err(e), >>> + } >>> + } >>> + ); >>> +} >>> + >>> +/// The kernel's [`Vec`] type. >>> +/// >>> +/// A contiguous growable array type with contents allocated with the = kernel's allocators (e.g. >>> +/// `Kmalloc`, `Vmalloc` or `KVmalloc`), written `Vec`. >> >> New folks might get confused as to which allocator they should choose. >> Do we have a sensible default for `Vec`? >=20 > Then they should read the documentation about the kernel's allocators. We > already link them in rust/kernel/alloc/allocator.rs. >=20 >> If yes, then we at least should document that here. We might also want >> to make it the default value for the generic parameter. >> This is also a good idea for `Box`. >=20 > If we really want a default it should be `Kmalloc`, but I really think we= should > force people to make an explicit decision and think about it and don't ju= st go > with whatever default. >=20 > It makes it also easier to grep for things. :) SGTM >>> +/// >>> +/// For non-zero-sized values, a [`Vec`] will use the given allocator = `A` for its allocation. For >>> +/// the most common allocators the type aliases `KVec`, `VVec` and `KV= Vec` exist. >>> +/// >>> +/// For zero-sized types the [`Vec`]'s pointer must be `dangling_mut::= `; no memory is allocated. >>> +/// >>> +/// Generally, [`Vec`] consists of a pointer that represents the vecto= r's backing buffer, the >>> +/// capacity of the vector (the number of elements that currently fit = into the vector), it's length >>> +/// (the number of elements that are currently stored in the vector) a= nd the `Allocator` type used >>> +/// to allocate (and free) the backing buffer. >>> +/// >>> +/// A [`Vec`] can be deconstructed into and (re-)constructed from it's= previously named raw parts >>> +/// and manually modified. >>> +/// >>> +/// [`Vec`]'s backing buffer gets, if required, automatically increase= d (re-allocated) when elements >>> +/// are added to the vector. >>> +/// >>> +/// # Invariants >>> +/// >>> +/// The [`Vec`] backing buffer's pointer is always properly aligned an= d either points to memory >>> +/// allocated with `A` or, for zero-sized types, is a dangling pointer= . >> >> Just use `self.ptr` instead of "backing buffer". >> >>> +/// >>> +/// The length of the vector always represents the exact number of ele= ments stored in the vector. >> >> Same here, `self.len`. >> >>> +/// >>> +/// The capacity of the vector always represents the absolute number o= f elements that can be stored >> >> Ditto. >> >>> +/// within the vector without re-allocation. However, it is legal for = the backing buffer to be >>> +/// larger than `size_of` times the capacity. >>> +/// >>> +/// The `Allocator` type `A` of the vector is the exact same `Allocato= r` type the backing buffer was >>> +/// allocated with (and must be freed with). >> >> Please turn this into a bullet-point list. >=20 > Is this a rule? Do we want to make it one? I am trying to make it one with my safety standard. >>> +pub struct Vec { >>> + ptr: NonNull, >>> + /// Represents the actual buffer size as `cap` times `size_of::= ` bytes. >>> + /// >>> + /// Note: This isn't quite the same as `Self::capacity`, which in = contrast returns the number of >>> + /// elements we can still store without reallocating. >>> + /// >>> + /// # Invariants >>> + /// >>> + /// `cap` must be in the `0..=3Disize::MAX` range. >>> + cap: usize, >>> + len: usize, >>> + _p: PhantomData, >>> +} >>> + /// Creates a new, empty Vec. >>> + /// >>> + /// This method does not allocate by itself. >>> + #[inline] >>> + pub const fn new() -> Self { >>> + Self { >>> + ptr: NonNull::dangling(), >>> + cap: 0, >>> + len: 0, >>> + _p: PhantomData::, >>> + } >>> + } >>> + >>> + /// Returns a slice of `MaybeUninit` for the remaining spare ca= pacity of the vector. >> >> Returns >=20 > Hm? Forgot to finish that sentence, since I couldn't really pinpoint what exactly I wanted to change. Just ignore it. >>> + pub fn spare_capacity_mut(&mut self) -> &mut [MaybeUninit] { >>> + // SAFETY: The memory between `self.len` and `self.capacity` i= s guaranteed to be allocated >>> + // and valid, but uninitialized. >>> + unsafe { >>> + slice::from_raw_parts_mut( >>> + self.as_mut_ptr().add(self.len) as *mut MaybeUninit= , >>> + self.capacity() - self.len, >>> + ) >>> + } >>> + /// >>> + /// Otherwise: >>> + /// >>> + /// - `ptr` must have been allocated with the allocator `A`. >>> + /// - `ptr` must satisfy or exceed the alignment requirements of `= T`. >>> + /// - `ptr` must point to memory with a size of at least `size_of:= :` times the `capacity` >>> + /// bytes. >> >> Just write "`size_of::() * capacity` bytes". >> >>> + /// - The allocated size in bytes must not be larger than `isize::= MAX`. >> >> Should we make this a safety requirement of `Allocator`? I think this is >> generally the maximum size other allocated things can have anyways. >=20 >=20 >=20 >> >>> + /// - `length` must be less than or equal to `capacity`. >>> + /// - The first `length` elements must be initialized values of ty= pe `T`. >>> + /// >>> + /// It is also valid to create an empty `Vec` passing a dangling p= ointer for `ptr` and zero for >>> + /// `cap` and `len`. >> >> Can you make this last sentence part of the `if` chain that you have >> above (ie the one started with "If `T` is a ZST:"). >=20 > But `T` isn't necessarily a ZST, but it may be. I originally meant adding an "else if" part that checks for empty capacity. But you could just add that to the if at the top ie "If `capacity =3D=3D 0` or `T` is a ZST". >> Just to experiment with the suggestion from Kangrejos to use Rust as the >> language for safety documentation, here is what it could look like (we >> should discuss it more before we start using it): >> >> /// # Safety >> /// >> /// ```ignore >> /// assert!(ptr.is_aligned_to(align_of::())); >> /// assert!(!ptr.is_null()); >> /// assert!(length <=3D capacity); >> /// if capacity !=3D 0 && size_of::() !=3D 0 { >> /// assert!(A::did_allocate(ptr)); >> /// assert!(size_of::() * capacity <=3D A::layout_of(ptr).len= ()); >> /// assert!(is_initialized(ptr::slice_from_raw_parts(ptr, length= ))); >> /// } >> /// ``` >> >> I really like how this looks! We might want to add labels/names to each >> of the conditions and then one could use those in the justifications. A >> tool could then read those and match them to the requirements of the >> unsafe operation. >=20 > I need to think about this a bit more, but at a first glance I think I li= ke it. >=20 > The tool would ideally be the compiler itself... Yes! There is the contracts draft PR that might add support for this: https://github.com/rust-lang/rust/pull/128045 --- Cheers, Benno