From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05211E77188 for ; Wed, 8 Jan 2025 20:04:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3615A6B007B; Wed, 8 Jan 2025 15:04:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 310BC6B0083; Wed, 8 Jan 2025 15:04:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1B1D56B0085; Wed, 8 Jan 2025 15:04:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id EF1816B007B for ; Wed, 8 Jan 2025 15:04:49 -0500 (EST) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 65F221A1575 for ; Wed, 8 Jan 2025 20:04:49 +0000 (UTC) X-FDA: 82985362698.25.925F699 Received: from mail-pl1-f175.google.com (mail-pl1-f175.google.com [209.85.214.175]) by imf27.hostedemail.com (Postfix) with ESMTP id 4FF0940003 for ; Wed, 8 Jan 2025 20:04:47 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Ul10iMwA; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf27.hostedemail.com: domain of isaacmanjarres@google.com designates 209.85.214.175 as permitted sender) smtp.mailfrom=isaacmanjarres@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736366687; a=rsa-sha256; cv=none; b=Qz6Ao7Pw2wCNbwShiG/EtLYVD6zRfZAIZdfDVe/oJxsbdrLfVM2m915Z15N+9hkWLzzP+b cxtrsAqRs4KF9z8hfsz5fMMEOHcaDq5C9TPHzVts5Jfw+DWPdCGiTZLG/eItNiaVskkm2Y OlkB0+KElYRnz0/lt1Raje7bQwwXeAo= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Ul10iMwA; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf27.hostedemail.com: domain of isaacmanjarres@google.com designates 209.85.214.175 as permitted sender) smtp.mailfrom=isaacmanjarres@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736366687; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=aVlyK/6DL+VXJZJFgnmJBFKbZKjrUopLWCGmwQTzfUs=; b=wHwwQASCXlI5O9LvCjDoxrETkVyF7cFZAmGUuqCn4RhEJtpTQAzz92ZhcWK9XCoCvRI6gf OQkbCeLESE5z1U2umVVH3jd9U/PRGMsKqi8lrN43BPPNMNjZ/T+q2XxezR2qBFKF2RFotE N8D2YnI5W4HzY9sZs7WG3y7W2ICuN+k= Received: by mail-pl1-f175.google.com with SMTP id d9443c01a7336-219f6ca9a81so21195ad.1 for ; Wed, 08 Jan 2025 12:04:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736366686; x=1736971486; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=aVlyK/6DL+VXJZJFgnmJBFKbZKjrUopLWCGmwQTzfUs=; b=Ul10iMwAi/iqx4ci5EKGsJx/JS57VqSZM6A/b7oRpuZPNb2FRcfha5fg0H1FL9KsfQ Gqmj0twd4CG5SPbIWXsXMh7abhCTOByfq7fK+RtnFfxxx3OB3w9F0DZ+sp7q4a/DSc15 7B+VFpHhSNcOfeAvfDGlz+kOwrRP5hxM+ZQqnijSa6xsYs1wm8xu+o1mYkriEDzrphrk czXzamwnq8Trf+/dvCTthDAUBNXM4E10oC4lcGzBE6wLaDa9g5ylByPPBUmZCI33acWd n0T6PUkcn7Rg80/yVpuc3tEUbpTHsk2orWNumNjvyyu+WSwAunbrFg41lZ4japbvvDpp 1uDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736366686; x=1736971486; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=aVlyK/6DL+VXJZJFgnmJBFKbZKjrUopLWCGmwQTzfUs=; b=fqq4nQqOQMogNvLZs4SA0Ot9Ia+m2qIxQFVLxzwx929YvCtMaHQcuoWEUoSDmb65S0 IJJL8nfap/cKIH/WPX2u4h/Szn707YUno6iQbTPfwHrdSw/Hzu3izmwB6KP2NQsDSeed W4S+h+xzVA0S3xKv10vjK03RxrqLTTIo9Fcy90RZsU8nAR7s25wanU5Sf263wD78DDFe oivcXwCnkv2FPz2IH9lgBbHIprKszXK/Yv+cLHK/sf4VzOuwV8BeZ89N0XMgPQ4sHyGt VH7emI0U61lP+BQ1SZCn+sVkqJ1FP8LwbSo4fQA5qG+ONj8YDYVle0FOCd6tUel+BQz+ cDMw== X-Forwarded-Encrypted: i=1; AJvYcCVsKTITGGo1oXxSkcVEch/cqJQs4U2kQh7eBeDAJ/R8k0MtAHEs5R/MhSy3MOX17JX/zpdok4pSLQ==@kvack.org X-Gm-Message-State: AOJu0YyxxoaG40MV0gY6abEo/2dr7CMhhwhws9ocu0Y3butL3BEva0Wc 8Et9pbPLYotWuE4sSCaOnAtST7yJu9pOip4wQ5rUVuY28/vsYZ1dEgjbo6payA== X-Gm-Gg: ASbGncsnSd1rWBVKBNdjXCiaNCuur89/RbTua8TMYvvxwLlhfXj+voLlbgeOZHMJMUq Ygj0mCVzmQk4TsyghklSrGx5IRSZ8tDvjFmQZlg0A+wLDD9KIu6TU/kU0cFIAdkmPUXAKMF5r5l ZDo/Imf3l5IrNxlq84NmNLKMjfzkeK33eFyxO1+t3quds4zyBS9QYzUbZeY+N/ESI+NjN5h18KY opcE3G6s5ZpGtb/5yRpzNH0JBn0C9NprdPRkqNX6fmpoEkPfFMQaU2t X-Google-Smtp-Source: AGHT+IEAKajAEylR0mGRGCcfWMj81RRcSa2jLphCgnxCX1qv4wQt2sf7OwWRtM+OdXn18xDYThSv7w== X-Received: by 2002:a17:903:2581:b0:215:9327:5aed with SMTP id d9443c01a7336-21a8ed4b11fmr273255ad.20.1736366685694; Wed, 08 Jan 2025 12:04:45 -0800 (PST) Received: from google.com ([2620:15c:2d:3:e514:4b22:2740:5ec1]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-72aad8dbb42sm35544518b3a.115.2025.01.08.12.04.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 08 Jan 2025 12:04:45 -0800 (PST) Date: Wed, 8 Jan 2025 12:04:40 -0800 From: Isaac Manjarres To: Lorenzo Stoakes Cc: Andrew Morton , kaleshsingh@google.com, jstultz@google.com, aliceryhl@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 1/2] mm/memfd: Refactor and cleanup the logic in memfd_create() Message-ID: References: <20250107184804.4074147-1-isaacmanjarres@google.com> <20250107184804.4074147-2-isaacmanjarres@google.com> <8bcbd7af-1cd2-4043-ab10-978d568a9b89@lucifer.local> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8bcbd7af-1cd2-4043-ab10-978d568a9b89@lucifer.local> X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 4FF0940003 X-Stat-Signature: u7x5hitbsgqsub6iun7mu93n96jzbi3p X-Rspam-User: X-HE-Tag: 1736366687-78435 X-HE-Meta: U2FsdGVkX1/9SRJwhXEmlv6X45B9yBlkfzSgFx4DVlhdiVQrTMhrif9/A5rTLVNpVMrV3+/UlSmGDjbdURI2E1R80Ca8tY1sgjKc8YSwNlnxWAGbsazfglpUiCzslYCUkSF8crz3A4tEGXnRCFIb9z/iMGaV0qFUaRJuguTWHvDm7TIboLHs8wR9WgOQZkJUhd4evVPq8reGFGkO0koVOfN7drxBEPdI8Z74W95GpizHLeLOhiIDwUAGC57m8+tOKNvJtUUnQTgsg9x3SwwfulcdKFevw7rcWKNAQpOmr5l9WwQfXfdcuZXpF2ALeY+vWAqVspIy7MPuJhKELalWJ+AEJ52DLK0IhYWBSS3GdPfDtBAF1oTv6ipE5QVh/H+37GmJJntVunbbcht/404CnNavP+IH0YqrjT/5xbCCW/WjTXlbhSN1DnnuVhTJEIb4h9AhZXZYOsVgJZxnnYQHv1I0vJW4HvBtrCymo1MwguYOsSsqFbVieMa4ti5XASBIg70rS+Oq0Ih2aQOX+HbdQCCbfoioGGFX53X9eZkHymelj7TqXFbZN4Qei0BxEm3nNXV7sJfRAoYj5TRiop/vsC9Q3LHghzvs/BYAUaHYN+QLPhNidh5C4sEa70D5gzRXw1mVcrL+b1T5ljERYGjjJALFAZ1nOYMyqu5QJwM8t4NoQPQTimkTwTLPaJROjo0AUSe+eASh6WwTMRTxZbCGUJQZrgjTE2b9NxhCxabjCCyhKb/TVNF4pwMS+SiktSYq6fuaJ0Yv4fyDrFo9d+cpw8HnVEGbxXjsp6xaN4ZTNeBjAdVxL4zFG0G6dSNzTsOpiWyPRME4LTDfJfcufkP2Irds/Z4WMRBa0+SB7ynpsOD1wAptD8SR031j1C2Hmliq+dk5Vaxf+PMl9zWyC/HERRIvNe9f6Q84HC5HgcGq4GDjuLtuBdc21siF4jKxONO9nR+JVBxiUpTqkAAGhg/ bWLxLRD5 l34i03rUOtDTOCFmGJTZYMCaMIb7VeHDOdPoBfJXW74fFkOyiTkSKAPkqQ+bvmucPgvr2iZW2H9WV9M+rK9Vh5dYobEDajTwSdZPIcagT3BCEAJHsS1UKpNFZTAIzK1FNe0XC+e3rRRAIkT1OgcFOLcd26BBMMUMxYof87N1qiY0+3VBl+tx4yRBv8D37hMlSYJpiN5/8ETP2PudfcKV1rnY7IegDgwmca3FPwZdz451MliM973Cc3t9JwEVPQG/JIJaqnLubi5gophwGrxxvf6+AqsAxuzL0V5uV93T3xbD4LW0aFa6VViLOSk1wjrhebjFAB3W+mFD8jw/DwFlelZK6KtA9ioGfrr1Ci7vbBNt1c06HclFL78RP0bMdV1mPmPUfvLZFKz6+pgRp1NBYgQNGfw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000101, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Jan 08, 2025 at 06:30:54PM +0000, Lorenzo Stoakes wrote: > On Tue, Jan 07, 2025 at 10:48:01AM -0800, Isaac J. Manjarres wrote: > > memfd_create() is a pretty busy function that could be easier to read > > if some of the logic was split out into helper functions. > > > > Therefore, split the flags check, name creation, and file creation into > > their own helper functions, and create the file structure before > > creating the memfd. This allows for simplifying the error handling path > > in memfd_create(). > > I do like the intent of this change, but I think this needs some tweaking. > > I wish the diff algorithm would do a little better here, because it's quite > hard to follow. In no way your fault that :) difftastic hopefully can help > me here... > > > > > No functional change. > > > > Signed-off-by: Isaac J. Manjarres > > --- > > mm/memfd.c | 87 +++++++++++++++++++++++++++++++++++------------------- > > 1 file changed, 56 insertions(+), 31 deletions(-) > > > > diff --git a/mm/memfd.c b/mm/memfd.c > > index 5f5a23c9051d..a9430090bb20 100644 > > --- a/mm/memfd.c > > +++ b/mm/memfd.c > > @@ -369,16 +369,8 @@ int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr) > > return err; > > } > > > > -SYSCALL_DEFINE2(memfd_create, > > - const char __user *, uname, > > - unsigned int, flags) > > +static int memfd_validate_flags(unsigned int flags) > > For static functions the memfd_ prefix is redundant, please strip them. We > know we're in mm/memfd.c which is context enough for these internal > helpers! > Thanks for taking a look at this patch! Understood, I'll go ahead and strip those prefixes. > > { > > - unsigned int *file_seals; > > - struct file *file; > > - int fd, error; > > - char *name; > > - long len; > > - > > if (!(flags & MFD_HUGETLB)) { > > if (flags & ~(unsigned int)MFD_ALL_FLAGS) > > return -EINVAL; > > @@ -393,20 +385,25 @@ SYSCALL_DEFINE2(memfd_create, > > if ((flags & MFD_EXEC) && (flags & MFD_NOEXEC_SEAL)) > > return -EINVAL; > > > > - error = check_sysctl_memfd_noexec(&flags); > > - if (error < 0) > > - return error; > > + return check_sysctl_memfd_noexec(&flags); > > More importantly - this is broken... > > The check_sysctl_memfd_noexec() function is _changing_ flags, which you now > discard. > > This also renders 'validate' in the name a little inaccurate (hey naming is > hard :), perhaps 'sanitise_flags()'? > > Anyway you should pass flags as a pointer (even if that's yuck) and rename. > Thanks for catching that. I will go ahead and use a pointer to the flags and rename the function as well. > > +} > > + > > +static char *memfd_create_name(const char __user *uname) > > Again, strip memfd_ prefix please. > > Also I don't know what 'create' means here. Given the function you're > interacting with is memfd_create() it's rendered a little vague. > > I'd say 'alloc_name()' would be better. Acknowledged, I will go ahead and do that in v3 of the patch. > > +{ > > + int error; > > + char *name; > > + long len; > > > > /* length includes terminating zero */ > > len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1); > > if (len <= 0) > > - return -EFAULT; > > + return ERR_PTR(-EFAULT); > > Not sure if this is necessary, I mean I guess technically... but feels like > it's adding a bunch of noise. > > I know you refactor this whole thing in the next commit so maybe to reduce > the size of this commit you could drop these changes here and keep the bare > minimum before you change the function again? > To make sure I understood correctly, do you mean that I should combine introducing alloc_name() and changing the handling for how the name is allocated in the next patch instead and just leave this logic alone in this patch? > > if (len > MFD_NAME_MAX_LEN + 1) > > - return -EINVAL; > > + return ERR_PTR(-EINVAL); > > > > name = kmalloc(len + MFD_NAME_PREFIX_LEN, GFP_KERNEL); > > if (!name) > > - return -ENOMEM; > > + return ERR_PTR(-ENOMEM); > > > > strcpy(name, MFD_NAME_PREFIX); > > if (copy_from_user(&name[MFD_NAME_PREFIX_LEN], uname, len)) { > > @@ -420,11 +417,22 @@ SYSCALL_DEFINE2(memfd_create, > > goto err_name; > > } > > > > > - fd = get_unused_fd_flags((flags & MFD_CLOEXEC) ? O_CLOEXEC : 0); > > - if (fd < 0) { > > - error = fd; > > - goto err_name; > > - } > > + return name; > > + > > +err_name: > > + kfree(name); > > + return ERR_PTR(error); > > +} > > + > > +static struct file *memfd_file_create(const char *name, unsigned int flags) > > I really am not a great fan of this name, memfd_ prefix obviously has to > go, but 'file_create' when the actual system call is 'memfd_create". > > Again, naming eh? It is hard :) > > alloc_file() probably works best as you are in fact allocating memory for > this. Yes, that makes sense to me. > > Then, as mentioned below, restore the original ordering of fd assignment in > the syscall, do so before file allocation, and install the file into the fd > afterwards. > > > +{ > > + unsigned int *file_seals; > > + struct file *file; > > + int error; > > + > > + error = memfd_validate_flags(flags); > > + if (error < 0) > > + return ERR_PTR(error); > > I'm not actually sure why you put this here, it seems quite > arbitrary. Let's invoke this from the syscall so we neatly divide the logic > into each part rather than dividing into different parts one of which is > invoked by another. I put the flags validation here since the flags are mostly used in this function, so it made sense to me embed the flags validation into here. However, I do understand how splitting this out as it was before makes more sense now, especially since the flags can change. It seems odd for alloc_file() to have a side effect of changing the flags, so I will place the flags sanitization back to where it was. > > And obviously make sure the original ordering is restored. > > > > > if (flags & MFD_HUGETLB) { > > file = hugetlb_file_setup(name, 0, VM_NORESERVE, > > @@ -433,10 +441,8 @@ SYSCALL_DEFINE2(memfd_create, > > MFD_HUGE_MASK); > > } else > > file = shmem_file_setup(name, 0, VM_NORESERVE); > > I know not to do with you, and strictly this is probably in line with > kernel code style, but this dangling else _kills_ me. > > Could we put { } around it? Risking invoking the ire of the strict > adherents of the coding style perhaps here :P > Absolutely! > > - if (IS_ERR(file)) { > > - error = PTR_ERR(file); > > - goto err_fd; > > - } > > + if (IS_ERR(file)) > > + return file; > > file->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE; > > file->f_flags |= O_LARGEFILE; > > > > @@ -456,13 +462,32 @@ SYSCALL_DEFINE2(memfd_create, > > *file_seals &= ~F_SEAL_SEAL; > > } > > > > - fd_install(fd, file); > > - kfree(name); > > - return fd; > > + return file; > > +} > > > > -err_fd: > > - put_unused_fd(fd); > > -err_name: > > +SYSCALL_DEFINE2(memfd_create, > > + const char __user *, uname, > > + unsigned int, flags) > > +{ > > + struct file *file; > > + int fd; > > + char *name; > > + > > + name = memfd_create_name(uname); > > + if (IS_ERR(name)) > > + return PTR_ERR(name); > > You're changing the ordering of checks which is user-visible. Previously > the flags would be validated first, now you're 'creating' the name (not > sure this is a great name - naming is hard obviously but this doesn't > really tell me what you intend here, I'll highlight in that bit of code I > guess). > > Please try to keep the order of checks the same and validate the flags first. > Will do. > > + > > + file = memfd_file_create(name, flags); > > + /* name is not needed beyond this point. */ > > This is nice to highlight! Though it absolutely SUCKS to kmalloc() some > memory then instantly discard it like this. But I suppose we have no choice > here. > What if instead, we allocate a buffer on the stack and change alloc_name() to populate_name() and have it take in a pointer to that buffer and join the memfd name prefix and copy the name from userspace into that buffer? That avoids having to dynamically allocate a buffer that gets freed right away, and also gets rid of the cleanup, since that memory will be deallocated when we return from the function. > > kfree(name); > > - return error; > > + if (IS_ERR(file)) > > + return PTR_ERR(file); > > + > > + fd = get_unused_fd_flags((flags & MFD_CLOEXEC) ? O_CLOEXEC : 0); > > + if (fd >= 0) > > + fd_install(fd, file); > > + else > > + fput(file); > > You've changed the ordering of this again, and you're not doing anything to > free file if something goes wrong with fd allocation? That's a leak no? > file only has one reference to it at this point, and in the else branch, or failure case, fput() is invoked to drop the reference on file, which will cause it to be freed, so I don't think it's a leak, unless I missed something? > Please reinstate the original ordering. I'm happy to reinstate the original ordering, but for my knowledge, is there a reason as to why fd allocation before file structure allocation is preferred instead of the other way around? Thanks, Isaac > It's strange to open code this when we don't open code other things... but > perhaps for a few lines it's ok. > > > + > > + return fd; > > } > > -- > > 2.47.1.613.gc27f4b7a9f-goog > >