From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39A61C0218D for ; Thu, 30 Jan 2025 00:20:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3DD472800AA; Wed, 29 Jan 2025 19:20:02 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 38BF12800A8; Wed, 29 Jan 2025 19:20:02 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 22CD02800AA; Wed, 29 Jan 2025 19:20:02 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id F41092800A8 for ; Wed, 29 Jan 2025 19:20:01 -0500 (EST) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id AD8A444D69 for ; Thu, 30 Jan 2025 00:20:01 +0000 (UTC) X-FDA: 83062210602.28.2D208CF Received: from mail-pl1-f176.google.com (mail-pl1-f176.google.com [209.85.214.176]) by imf16.hostedemail.com (Postfix) with ESMTP id C9626180009 for ; Thu, 30 Jan 2025 00:19:59 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=Blq+Mttg; spf=pass (imf16.hostedemail.com: domain of andrii.nakryiko@gmail.com designates 209.85.214.176 as permitted sender) smtp.mailfrom=andrii.nakryiko@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738196399; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6AMh6nFsScnlaxd8YC2LQqnUdnaSzHiT4jf6slLXshc=; b=D8ODb7QgiA0Ru462OIeNX9DfdXFU3IPHpxsnC+oayqFy1knGJcMWppw67dVOiPaMN2/7Hd ZojG5B1FL0h1NKMyzwVTJz81Z6DXk5D21EhxthlIGnsRAmRfJILvlGmh943FBopObqAexU +02ZP/HU2S+CepNRR+/J2NRRn2a3tUg= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=Blq+Mttg; spf=pass (imf16.hostedemail.com: domain of andrii.nakryiko@gmail.com designates 209.85.214.176 as permitted sender) smtp.mailfrom=andrii.nakryiko@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738196399; a=rsa-sha256; cv=none; b=K1Nn50Z8tcjOxZwrAxs5g7JSUk4muQ0vlDp3rOvfkpC3pta0CSl/OakDEtWX5VpZ8Y+huV P+Klb5rWWIDvLUihrKeO9imVH5FBDvBXF+hUG3oip5NPI/U+84gwOAON8YBzaeBw0+o3BE fmQlquvhWjn8vE8kMQv9atx2UGNoj94= Received: by mail-pl1-f176.google.com with SMTP id d9443c01a7336-2163dc5155fso3522985ad.0 for ; Wed, 29 Jan 2025 16:19:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738196398; x=1738801198; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=6AMh6nFsScnlaxd8YC2LQqnUdnaSzHiT4jf6slLXshc=; b=Blq+MttgT+wV+2/CEXSGZpo+1F83h6HNPl3Qwll6o3vor6AjUflM064uiW4VgadEet jueyyTHP8DnYUYCRqSk2YRfrAK+bKCdOZOywg1j1U7ZobFFi7w+I54+uAZPRkNy6Bwvt ohy4GNWx1cdzmxRwPhEbh2O/CbALnpdYktXEt9tE8/vJYlgQ9vB4hr0eU/AmGwFQV+2q B1CXSXn/yEBNA9rdirYkpo+eVQUnCsQkxRutAA1lmbtyjcDRTsF1gFjBZXVcGQE8wFEv GFEd983vUD3LxhrpyuMiUO9iiGN8XP91nW7eZXQl03YviPIlDZ70w6Ad697qx4ZTYE3K fn2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738196398; x=1738801198; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6AMh6nFsScnlaxd8YC2LQqnUdnaSzHiT4jf6slLXshc=; b=D1Sq6Jtn0I9aZIiQeBJMvOFltnA8um9i5iZIMy5snRHEclZPZVRbzJQ70xe2IERIKy oOd7mGWxDIf2UI5P2KBVclqSWMjjo2Cq9RjtgHrwvNLadfnxCH3WDFuX/ACPe9nkSNu1 IUY9vZQTd0Ucsaaf3SIpNio+g7QIrUOorpTsecmair9M3ehPQlCROjdMV6yDro9p9c7h HQ2FdCckNA5YBQ6D6O/7kGJcnsZE/dQshReWNeT1o5+t8tLxuAX+GW4XcPHKP/A/rU0N AT6QgcTzgis2Av4MHBbu/IZrKBr4tpQU83s2N/42K4dc7lfsNfiFl/pKyvqM1Y4Jjzy8 sVKw== X-Forwarded-Encrypted: i=1; AJvYcCVWNj/AfR/XDbqGTygjab/vQ99ig7qrmgY5Nqzq9zxCkCEI7XenmNjpZt+BIpbacF6X3jiPmI8vCA==@kvack.org X-Gm-Message-State: AOJu0YxaBhJt7QnbbFacxBxif3b6VlEthCyLAexrDVocyCIha/Y7gyZe 7IdCUu2Lvh1GovNiH2gyy6BZsdYCk4FQaDHWC6QtOQLckzadkhqReY8j0IdpWhDyLr37oCvRI6n 6/vywec+Zf/xWo6g5J8PlUSrcAFA= X-Gm-Gg: ASbGncshdqzP6ssjrZOwmk9z56j13z7Iv6ptalTpV3VUhmSUzBaLYdjQb9R7wXl7HsA D7S0PNRCvm8iUTRyBEdhnyWdt1oxXz3kgX2Tau4bh4EXxH5PC+HCny+B+TluZa3HOPpaJxQnz5Y sO5MaFuVwNBARg X-Google-Smtp-Source: AGHT+IEWbSlMkZ5Mw89iU2qXlIjzZXpHahbdfaQoRbtPhTaFUQtOSWvv+SrxUpHVnmQ3LYEDqKbO08pevESaNYb+5/8= X-Received: by 2002:a05:6a20:c91c:b0:1e1:f281:8d07 with SMTP id adf61e73a8af0-1ed7a5efb7fmr7402768637.10.1738196398239; Wed, 29 Jan 2025 16:19:58 -0800 (PST) MIME-Version: 1.0 References: <20250128224352.3808460-1-linux@jordanrome.com> In-Reply-To: <20250128224352.3808460-1-linux@jordanrome.com> From: Andrii Nakryiko Date: Wed, 29 Jan 2025 16:19:44 -0800 X-Gm-Features: AWEUYZnKIr_x5jBe8dUdIMZTkUYqxsK0hZXP8wHi6oSpzQDU6UMn2eP25IZoNrg Message-ID: Subject: Re: [bpf-next v6 1/3] mm: add copy_remote_vm_str To: Jordan Rome Cc: bpf@vger.kernel.org, linux-mm@kvack.org, Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Andrew Morton , Shakeel Butt , Alexander Potapenko Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: C9626180009 X-Stat-Signature: fbnsrs3n4of57m6t14kw61azmssefdxa X-Rspam-User: X-HE-Tag: 1738196399-406477 X-HE-Meta: U2FsdGVkX1/BZVC7NhfsZFEgMZ0ejE6jghkWFNlZaaQ9LFOz0kgoQgW96pWHgOgsNs1M42oRrmVv1GI8VVzvPRDb+reSJaMAz8NB0w3rKEN2G8Y2gKntutylYJmpQRHQYzBgeY/7EB4+nQ1t4HgsuxqtnkJAD7vYWsK1F8bnynClT0UcmlCDhvshD9WwvQkwcOVL8RmRdBaQ+pJsE8ooOWGOIll4KpT9Yru1ROOnBwcS2IXgs1YFy6ZOxUOzxjpCMG6LgDgPtDBc901S1P7wiERfdEc5Pa5gAKITqQC6uRSkaYsBZS40byJRB2KwmzbP57jkO0Zk8qgxs+1onJ4siT7/QXj7XyLsEH3NV+0942WPdTU1qlEcOgZOSPCG+jBwfRok3uH8tmts31W+zZQKafLZRm4HtKIzcz84k6rjTGZmc9L0ugwBv7tuHe7RqD2lN5mKOmt90J9ydhLrhltTPh6GHxk7BLvfEGGe2VMXUSZIGIp+t6LyHRiBjSrSiObEfYlWnJHT0pv3EphA0vwEhvBS0ttuU2S8l3aKzjhBaXaSqcddrUid4V0puvhimlYTsOkPtb/HEAUdRSrGXRmnb+5+oESX9dtyp0F1WDJOOHmbzhDDMKrwrMnI7lipxhSsWyjMiuRQLN2sG4Z9bgi5MvCs29OSEMAqyQeEcI1mU4v5oK9iKCjdYiwW1kFLK7ispFdDhortoohgcieRRa08a4h45P+qsbt6tbng/tv1ZZrEeL/ukgR6BtytP+eND7tErIcetHhCyefo85UyxsXWQ4/y2ni81RMq0Qy0RIDZsbqFWgFTi7TQf3TTdgwKvGaac1zCUG4fNMSmQ0FIGacH9JAacJEfH55FXQYL79PN/fkX0izP1N66Os/aEI0tcLM/XDLB02Sv1o7FHh7CZQRNwrhMOhQ9anRdl1H7A5ZzH90MxjY0PoMSda5aeEQpjA2Rar6RIYyHJyIiJXrjK+B mY+XjFVo e5k88mIIAZyb23VV7L8eSJ/9tDVwHNmbWMMqlNiEu1LU4518gztPq/KDVFU/S8/+2HC0OvkymcuIiLNEV96puisNYlgQeanKvJWpkNZG5oVCidYmqUniOMzIAKKlXm6k2v2iHJiO5ZFQi2TfHqFR7NZ8StgaGLMTkCpb1RafziaDcVot2aTPibVrbL4DEn4sFUyNqneLTLcoYgQsCRR8Wdp+ELzy0Vq21Jrc2CXD9d1MT3S4jEwW+4ddHoZhK+jPdmYeb+2n0k6PBWaxnY2bPeVr5LEvbSjn6AKr6TFI7U/LDCfH4o9qasYIFU1yNLHV2Oyt+mdr3K+awKRibO5kGOOBz78jwkBCMZ+YvOACoP08pnL/LWxpoTdB0lxPRVPt1+4e09RTflzaT4m6c5WOvfNhVtt9RZynZHatZ/ahq503QR8OtXd3UAJBudhlVP6VNxxUjhY/oiOtkz2auVWX7o3xyhekPfiOOkDsf4Ao6B61dWXM= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Jan 28, 2025 at 2:44=E2=80=AFPM Jordan Rome = wrote: > > Similar to `access_process_vm` but specific to strings. > Also chunks reads by page and utilizes `strscpy` > for handling null termination. > > Signed-off-by: Jordan Rome > --- > include/linux/mm.h | 3 ++ > mm/memory.c | 119 +++++++++++++++++++++++++++++++++++++++++++++ > mm/nommu.c | 74 ++++++++++++++++++++++++++++ > 3 files changed, 196 insertions(+) > The logic looks good, but I have a bunch of stylistic nits below. It would be nice for someone from mm side to take a look as well. > diff --git a/include/linux/mm.h b/include/linux/mm.h > index f02925447e59..f3a05b3eb2f2 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2485,6 +2485,9 @@ extern int access_process_vm(struct task_struct *ts= k, unsigned long addr, > extern int access_remote_vm(struct mm_struct *mm, unsigned long addr, > void *buf, int len, unsigned int gup_flags); > > +extern int copy_remote_vm_str(struct task_struct *tsk, unsigned long add= r, > + void *buf, int len, unsigned int gup_flags); > + > long get_user_pages_remote(struct mm_struct *mm, > unsigned long start, unsigned long nr_pages, > unsigned int gup_flags, struct page **pages, > diff --git a/mm/memory.c b/mm/memory.c > index 398c031be9ba..7f6e74a99984 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -6714,6 +6714,125 @@ int access_process_vm(struct task_struct *tsk, un= signed long addr, > } > EXPORT_SYMBOL_GPL(access_process_vm); > > +/* > + * Copy a string from another process's address space as given in mm. > + * If there is any error return -EFAULT. > + */ > +static int __copy_remote_vm_str(struct mm_struct *mm, unsigned long addr= , > + void *buf, int len, unsigned int gup_flags) > +{ > + void *old_buf =3D buf; > + int err =3D 0; empty line between variables and statements > + ((char *)buf)[0] =3D '\0'; nit: this would be probably a bit more "canonical": *(char *)buf =3D '\0'; > + > + if (mmap_read_lock_killable(mm)) > + return -EFAULT; > + > + /* Untag the address before looking up the VMA */ > + addr =3D untagged_addr_remote(mm, addr); > + > + /* Avoid triggering the temporary warning in __get_user_pages */ > + if (!vma_lookup(mm, addr)) { > + err =3D -EFAULT; > + goto out; > + } > + > + while (len) { > + int bytes, offset, retval; > + void *maddr; > + struct page *page; > + struct vm_area_struct *vma =3D NULL; > + > + page =3D get_user_page_vma_remote(mm, addr, gup_flags, &v= ma); > + > + if (IS_ERR(page)) { > + /* > + * Treat as a total failure for now until we deci= de how > + * to handle the CONFIG_HAVE_IOREMAP_PROT case an= d > + * stack expansion. > + */ > + ((char *)buf)[0] =3D '\0'; > + err =3D -EFAULT; > + goto out; > + } > + > + bytes =3D len; > + offset =3D addr & (PAGE_SIZE - 1); > + if (bytes > PAGE_SIZE - offset) > + bytes =3D PAGE_SIZE - offset; > + > + maddr =3D kmap_local_page(page); > + retval =3D strscpy(buf, maddr + offset, bytes); > + > + if (retval < 0) { > + buf +=3D (bytes - 1); nit: unnecessary () another nit: you could have had `addr +=3D bytes - 1;` here, to keep addr and buf adjustment code close > + /* > + * Because strscpy always NUL terminates we need = to > + * copy the last byte in the page if we are going= to > + * load more pages > + */ > + if (bytes !=3D len) { > + addr +=3D (bytes - 1); > + copy_from_user_page(vma, page, addr, buf, > + maddr + (PAGE_SIZE - 1), = 1); > + > + buf +=3D 1; > + addr +=3D 1; > + } > + len -=3D bytes; > + } > + > + unmap_and_put_page(page, maddr); > + > + if (retval >=3D 0) { > + /* Found the end of the string */ > + buf +=3D retval; > + goto out; > + } it's not incorrect, but it would be nice not to have to re-check retval twice. Why not this structure: ret =3D strscpy(...) if (retval >=3D 0) { unmap_and_put_page(page, maddr); buf +=3D retval; break; } /* this is -E2BIG case */ buf +=3D bytes - 1; addr +=3D bytes - 1; if (bytes !=3D len) { copy, buf +=3D 1, addr +=3D 1 } unmap_and_put_page(page, maddr); Note that you don't need goto, break is fine. And yes, I don't think duplicating unmap_and_put_page() is a problem. > + } > + > +out: > + mmap_read_unlock(mm); > + if (err) > + return err; > + > + return buf - old_buf; > +} > + > +/** > + * copy_remote_vm_str - copy a string from another process's address spa= ce. > + * @tsk: the task of the target address space > + * @addr: start address to read from > + * @buf: destination buffer > + * @len: number of bytes to copy > + * @gup_flags: flags modifying lookup behaviour > + * > + * The caller must hold a reference on @mm. > + * > + * Return: number of bytes copied from @addr (source) to @buf (destinati= on); > + * not including the trailing NUL. Always guaranteed to leave NUL-termin= ated > + * buffer. On any error, return -EFAULT. > + */ > +int copy_remote_vm_str(struct task_struct *tsk, unsigned long addr, > + void *buf, int len, unsigned int gup_flags) > +{ > + struct mm_struct *mm; > + int ret; > + > + mm =3D get_task_mm(tsk); > + if (!mm) { > + ((char *)buf)[0] =3D '\0'; > + return -EFAULT; > + } > + > + ret =3D __copy_remote_vm_str(mm, addr, buf, len, gup_flags); > + > + mmput(mm); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(copy_remote_vm_str); > + > /* > * Print the name of a VMA. > */ > diff --git a/mm/nommu.c b/mm/nommu.c > index 9cb6e99215e2..4d83d0813eb8 100644 > --- a/mm/nommu.c > +++ b/mm/nommu.c > @@ -1701,6 +1701,80 @@ int access_process_vm(struct task_struct *tsk, uns= igned long addr, void *buf, in > } > EXPORT_SYMBOL_GPL(access_process_vm); > > +/* > + * Copy a string from another process's address space as given in mm. > + * If there is any error return -EFAULT. > + */ > +static int __copy_remote_vm_str(struct mm_struct *mm, unsigned long addr= , > + void *buf, int len) > +{ > + uint64_t tmp; s/uint64_t/unsigned long/ also tmp -> addr_end ? > + struct vm_area_struct *vma; > + nit: no empty line here, why? > + int ret =3D -EFAULT; > + > + ((char *)buf)[0] =3D '\0'; > + > + if (mmap_read_lock_killable(mm)) > + return ret; > + > + /* the access must start within one of the target process's mappi= ngs */ > + vma =3D find_vma(mm, addr); > + if (!vma) > + goto out; > + > + if (check_add_overflow(addr, len, &tmp)) > + goto out; > + /* don't overrun this mapping */ > + if (tmp >=3D vma->vm_end) nit: strictly speaking only `tmp > vma->vm_end` needs special handling > + len =3D vma->vm_end - addr; > + > + /* only read mappings where it is permitted */ > + if (vma->vm_flags & VM_MAYREAD) { > + ret =3D strscpy(buf, (char *)addr, len); > + if (ret < 0) > + ret =3D len - 1; > + } > + > +out: > + mmap_read_unlock(mm); > + return ret; > +} > + > +/** > + * copy_remote_vm_str - copy a string from another process's address spa= ce. > + * @tsk: the task of the target address space > + * @addr: start address to read from > + * @buf: destination buffer > + * @len: number of bytes to copy > + * @gup_flags: flags modifying lookup behaviour (unused) > + * > + * The caller must hold a reference on @mm. > + * > + * Return: number of bytes copied from @addr (source) to @buf (destinati= on); > + * not including the trailing NUL. Always guaranteed to leave NUL-termin= ated > + * buffer. On any error, return -EFAULT. > + */ > +int copy_remote_vm_str(struct task_struct *tsk, unsigned long addr, > + void *buf, int len, unsigned int gup_flags) > +{ > + struct mm_struct *mm; > + int ret; > + > + mm =3D get_task_mm(tsk); > + if (!mm) { > + ((char *)buf)[0] =3D '\0'; > + return -EFAULT; > + } > + > + ret =3D __copy_remote_vm_str(mm, addr, buf, len); > + > + mmput(mm); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(copy_remote_vm_str); > + > /** > * nommu_shrink_inode_mappings - Shrink the shared mappings on an inode > * @inode: The inode to check > -- > 2.43.5 >