Date: Thu, 27 Feb 2025 14:55:22 +0100
Subject: Re: C aggregate passing (Rust kernel policy)
From: Ralf Jung
To: David Laight
Cc: Ventura Jack, Kent Overstreet, Miguel Ojeda, Gary Guo,
    torvalds@linux-foundation.org, airlied@gmail.com, boqun.feng@gmail.com,
    ej@inai.de, gregkh@linuxfoundation.org, hch@infradead.org, hpa@zytor.com,
    ksummit@lists.linux.dev, linux-kernel@vger.kernel.org,
    rust-for-linux@vger.kernel.org
In-Reply-To: <20250226230816.2c7bbc16@pumpkin>
X-Mailing-List: ksummit@lists.linux.dev

Hi all,

> ...
>>> Unions in C, C++ and Rust (not Rust "enum"/tagged union) are
>>> generally sharp. In Rust, it requires unsafe Rust to read from
>>> a union.
>>
>> Definitely sharp. At least in Rust we have a very clear specification though,
>> since we do allow arbitrary type punning -- you "just" reinterpret whatever
>> bytes are stored in the union, at whatever type you are reading things. There is
>> also no "active variant" or anything like that, you can use any variant at any
>> time, as long as the bytes are "valid" for the variant you are using. (So for
>> instance if you are trying to read a value 0x03 at type `bool`, that is UB.)
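To make the quoted union rule concrete, here is a minimal sketch. The names
`Bits` and `read_bool` are hypothetical, picked only for illustration. It reads
the same byte at type `u8` (always fine for an initialized byte) and at type
`bool` only after checking that the byte is `0x00` or `0x01`:

```rust
// Hypothetical illustration type: one byte, viewable as `u8` or as `bool`.
#[repr(C)]
union Bits {
    byte: u8,
    flag: bool,
}

/// Returns the stored byte as a `bool` only if it is a valid `bool` value.
fn read_bool(b: Bits) -> Option<bool> {
    // SAFETY: the union is always constructed with an initialized byte,
    // and every initialized byte is valid at type `u8`.
    let raw = unsafe { b.byte };
    match raw {
        // SAFETY: the byte is 0x00 or 0x01, so it is valid at type `bool`.
        0x00 | 0x01 => Some(unsafe { b.flag }),
        // Reading `b.flag` here would be undefined behavior, so we do not.
        _ => None,
    }
}

fn main() {
    // There is no "active variant": writing `byte` and reading `flag` is fine
    // as long as the bytes are valid for `bool`.
    assert_eq!(read_bool(Bits { byte: 0x01 }), Some(true));
    assert_eq!(read_bool(Bits { flag: false }), Some(false));
    // 0x03 is a perfectly good `u8`, but not a valid `bool`.
    assert_eq!(read_bool(Bits { byte: 0x03 }), None);
    println!("all checks passed");
}
```

The check on `raw` is what keeps the `bool` read defined; code that needs to
carry arbitrary byte values should simply stay at type `u8`.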
>
> That is actually a big f***ing problem.
> The language has to define the exact behaviour when 'bool' doesn't contain
> 0 or 1.

No, it really does not. If you want a variable that can hold all values in
0..256, use `u8`. The entire point of the `bool` type is to represent values
that can only ever be `true` or `false`. So the language requires that when you
do type-unsafe manipulation of raw bytes, and when you then make the choice of
the `bool` type for that code (which you are not forced to!), then you must
indeed uphold the guarantees of `bool`: the data must be `0x00` or `0x01`.

> Much the same as the function call interface defines whether it is the caller
> or called code is responsible for masking the high bits of a register that
> contains a 'char' type.
>
> Now the answer could be that 'and' is (or may be) a bit-wise operation.
> But that isn't UB, just an undefined/unexpected result.
>
> I've actually no idea if/when current gcc 'sanitises' bool values.
> A very old version used to generate really crap code (and I mean REALLY)
> because it repeatedly sanitised the values.
> But IMHO bool just shouldn't exist, it isn't a hardware type and is actually
> expensive to get right.
> If you use 'int' with zero meaning false there is pretty much no ambiguity.

We have many types in Rust that are not hardware types. Users can even define
them themselves:

    enum MyBool { MyFalse, MyTrue }

This is, in fact, one of the entire points of higher-level languages like Rust:
to let users define types that represent concepts that are more abstract than
what exists in hardware. Hardware would also tell us that `&i32` and
`*const i32` are basically the same thing, and yet of course there's a world of
a difference between those types in Rust.

Kind regards,
Ralf