From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 436B6C0218C for ; Sat, 25 Jan 2025 14:02:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3023A2800A2; Sat, 25 Jan 2025 09:02:16 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2B1A46B01F2; Sat, 25 Jan 2025 09:02:16 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 17AAC2800A2; Sat, 25 Jan 2025 09:02:16 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id EC8CE6B01F1 for ; Sat, 25 Jan 2025 09:02:15 -0500 (EST) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 43DE4C1E70 for ; Sat, 25 Jan 2025 14:02:15 +0000 (UTC) X-FDA: 83046138630.16.F550C3C Received: from mout.perfora.net (mout.perfora.net [74.208.4.194]) by imf19.hostedemail.com (Postfix) with ESMTP id 8A3FE1A001E for ; Sat, 25 Jan 2025 14:02:12 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=jordanrome.com header.s=s1-ionos header.b=13DxjDJC; spf=pass (imf19.hostedemail.com: domain of linux@jordanrome.com designates 74.208.4.194 as permitted sender) smtp.mailfrom=linux@jordanrome.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737813733; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=rYf1KXUNG30BcbFQzSpNBwFqJgAn9teGb0jTJqlGBfg=; b=1ihpxgHKagUKkPv5CEU0r30Tzgws23UnwvcYr2H8+gZ0bTlrk9Mp3dH1Lm/LnMyM9DqIZL 1MRZZzc5PlepdAmUIQf2lFRnkWK27fs/CwgMbPRb/H09lbNOaXO4tenCxE4QmCZ9chWH4L rTILKlzVTcRM1SBCHBgpKC/xBpc+ZtQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737813733; a=rsa-sha256; cv=none; b=rm1d1SLUcB0AWIMN9IH7nZmGmi1p9C7JWQO0eMK9knkFtw4aSpeHaJV2Hots9yh2nkkSjk NVjdxyBL5bUC2+eP9kisrBr0l4mgCspxGGpzpIrzlXbX4F0hB4QmYXvXyyRcdDsVyZxPVR 3qKU4A3sR4UMFfWiD8tGCzN8PkZ8mBM= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=jordanrome.com header.s=s1-ionos header.b=13DxjDJC; spf=pass (imf19.hostedemail.com: domain of linux@jordanrome.com designates 74.208.4.194 as permitted sender) smtp.mailfrom=linux@jordanrome.com; dmarc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jordanrome.com; s=s1-ionos; t=1737813731; x=1738418531; i=linux@jordanrome.com; bh=rYf1KXUNG30BcbFQzSpNBwFqJgAn9teGb0jTJqlGBfg=; h=X-UI-Sender-Class:MIME-Version:References:In-Reply-To:From:Date: Message-ID:Subject:To:Cc:Content-Type:Content-Transfer-Encoding: cc:content-transfer-encoding:content-type:date:from:message-id: mime-version:reply-to:subject:to; b=13DxjDJCYipi/k6vbTRJG7A7xWgcd56z8C9a2WbWegtxouvQpCKoq/+wnU2v6HBq VzimlDbspMfh2E/54VWPP7qCQ7pZ1pDeietwLIOLdepcTSpbjoxCSeEI5Lj0d0VRJ t0wms8B4BFxKI6Iq/ebtgi78t652877j4jpoXs+7BflYtkKShbcW/I/GmfApeWxOu +ioAhj+WeIMMsQakJCdXkvj0QPtIZ9IyjsWER5YxJoYlUP7W3JAi71uzVLEO5A+Mp /IlQNq1Ji56J8OXVwAp8wBH/dcLlj6MXbR4ZNz/Lsv1HqRZ3h5Ow4ie99pwicVT4Y 7R+vC6KXcDZf4PIOFQ== X-UI-Sender-Class: 55c96926-9e95-11ee-ae09-1f7a4046a0f6 Received: from mail-il1-f177.google.com ([209.85.166.177]) by mrelay.perfora.net (mreueus004 [74.208.5.2]) with ESMTPSA (Nemesis) id 1MybjN-1tGrZW18Rx-00uXLw for ; Sat, 25 Jan 2025 15:02:11 +0100 Received: by mail-il1-f177.google.com with SMTP id e9e14a558f8ab-3cfae81ab24so8324715ab.3 for ; Sat, 25 Jan 2025 06:02:11 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCUBIRj7vbu5jTnJuzNzXpCX2Cu1aX1gGnO1pJm2cauiWrTDSf3KdxHKt6K5FI01CuEEnjjrW9DsWw==@kvack.org X-Gm-Message-State: AOJu0Yx5lK34Ixh9FRRBBSj5Xat59eVeMGg+4KWSvQAdE47i+wf4Puqx vnGWabQTG56TPzwYgQ1eIN1LDObCTtCm/qWpNaay++YwD/y14ful2SrMxRtRCpV3V1z991nUz52 LRm+4B5iW1skwfL+i4V2//GTB/jQ= X-Google-Smtp-Source: AGHT+IE5GrCHOzylD7RLBeNyEdIR9FJzPs5p/pyF2/z1RwtKgiEmMbCj6K0TFEd/nTQODRRPSQ5UH2wseoHxEnWuA6c= X-Received: by 2002:a05:6e02:1d88:b0:3a7:e452:db5 with SMTP id e9e14a558f8ab-3cf744788cfmr290655085ab.15.1737813730652; Sat, 25 Jan 2025 06:02:10 -0800 (PST) MIME-Version: 1.0 References: <20250124181602.1872142-1-linux@jordanrome.com> In-Reply-To: From: Jordan Rome Date: Sat, 25 Jan 2025 09:01:59 -0500 X-Gmail-Original-Message-ID: X-Gm-Features: AWEUYZlmuiSI9JYAQZJlAfxa8d14eIKfXKn-YrZ4ah-wQ5bmO4uRGJZWzsMzhFA Message-ID: Subject: Re: [bpf-next v3 1/3] mm: add copy_remote_vm_str To: Andrii Nakryiko Cc: bpf@vger.kernel.org, linux-mm@kvack.org, Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Andrew Morton , Shakeel Butt , Alexander Potapenko Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Provags-ID: V03:K1:oEfx/UstvIoT23s75UkHEa+LAc7lpHINNTF0MIso0JY1AbNe7/p /DPRM4Erx/ixy6nE1oHYjfFOoEAHO6d8OKBKoSncoAPy2o9P78wlKUf5KwySBjyArEodDZw 1HIUzvIBbDP3KTw6d2zgvf9IhejwWURixbNH8FAPv696rt0I7tf2otSYSQzTi5AKqzoCybh XWodrqqjfig2oOdbMpW3Q== UI-OutboundReport: notjunk:1;M01:P0:MlsIENSHH28=;ymd9/uUNqkQqRVRWRednKmLJD1u EoTo7Jmrz1KTRjMR94fkE/teadAnECGVkP/iDMgnQE7H5eEV7r9GVRpqN5hjgPfXyxAQAlp7Z H8gCAWClo5MnWnPvfCsA8mQdIEEZ7uME0wT7GnTAO3pt2YeEtTMkvewIUO9Ebgyzix63W3y3J gv+KI8RnuGLkynY8HwgSBOOShzYjr2Ya9mawx7GuMHhp02hF37K3e+Jzu0+AT7jH283EbJ1HF ulis3hYGLEAWlLOtoU/ju1lC1Bwelfkkk59il+moPMbCjvCtpA991VE101f1DKN5YrHbmpjJq zXkH2iYsUK9JswMRYcdrqBXwlgi+A/Liil/faHZAAfwBMiMah3Q/PaZJeFXEeSWhKeDbMaZgu zZS64Xt2xhv8hjLyhdhGNlsr7EBUMOvkZFN9h0ERpYy4fOd7RRwiPVXsWOolhIPyEIeC8qEGK pMW+nUI2LHbIf6onGtBVy/TY9ls2BxjVBsbcjYo8j4EPkA9URVA0VjTdMz7awOmv8aXCIeyZp 0+eyy7WIsfPkR10o9O+tap2QjrtzrYqyhJ1YyudvY1+7edmZRGcidZG/Pyy3sms+/DkFlpsow 0WSYCy2Wlr1Cu4RnqtyBCpGmk+k9+tm3qWVym2fZLvZYoCy6D+oEi50mOJ+9VNVAs5RIw3F4h teXbr1IOlzP8BslzQJ9E3WQ84wa2vRdFf7VRIep4Dm327/8YQzf+RGWB3+GPg5nd8XTeEiN7e MPS74A7cCcBn3tsOYoPtmi69rpFqbRxdl0LjQUPeGUZdcLdF+d7bYtUGVQCDMhTwnGAXoSFO/ jzBl7vBNRxRJInewB9Lq74jczvj71+lUdaNBM6He9XlyikoNFqJwDIGQ6M0opKZw79QdLpxAj WA1Sy2SQFyYBKFTfx6v5Ilh68DxG+NKyVWVLZQMf2E2iuwoGV2+Y1SuLZa9PJ8n8JZY99SS39 PzLcbNO01tPC1+XqlV655j1UzNfocyKXvu/V9RiqQy5vohY6f4HoInOIGXkNMuhNc++EdyAqL vcR9p9g+pYI+5KKjt1SgAeZEagjqGskgdMsZxYLU1i1LYN+NLDf8OiW7eRQTY13p60/C3fW9e IQNsvFnh2U9M6BAvlakl/BMGNW2KPE/BC/oXje6F7yFSoysX/0xHd04C0Yh/1Pc9BrOg2sZ7g = X-Stat-Signature: gpwwfxaai5uf1njg7mqxcnp116mxbhn6 X-Rspam-User: X-Rspamd-Queue-Id: 8A3FE1A001E X-Rspamd-Server: rspam03 X-HE-Tag: 1737813732-889541 X-HE-Meta: U2FsdGVkX19mZAUWpVgxHI+ULP9i2SQC8z7yz9hJcPKFCzYaKtilL5Dxgj38iST0rKSvu+j/VxRC+MV+p+8L/bhORCJ3tZKEntscj1dmuuUBxxV/TnFISAZtvfsKDCc1mrvMzeff6Qno4VdMKGJOiJL5e2bhY5GQkGwcnb6owsElnNe/JMIA26iM+7+B2dw6T/nT9E4otKCknvMt8MAd46UGmlsWSYXYGgBLbbESkwprpBze0QaKM9Hoe8h4iNIyCJILkQqD6+/eRLME635C03rgcBf/JVYT5iZi8T7fda/TeFUk+eQ/kEDIdjmDzj9FGQAKJ9xm1OIvpmAacX6bgSC7ITYRE3S2GNZZtaJjaxxa4sLr3RbtSIhj9u7kpvnXpo3Z02HS4wwdI7kJ65tidHqWXGorPt+hX/OGlqSJBto/T48byV4EjtxKS9p4kr6zFaXh84dQE1yWc79KpFXJDQxrZ/4rD4/elUFRwbk37fhp6Vv+p9NULRfK2qKTzI14fSUTItSw0r97z9TNdOndATjylJU5yEpLx6OsPCGX01bkxolizkmu8rQ7t5IMXNOEGCqqhahd6on+TqDGr9gaI4f4BiYdKxx46Wer7D/nGBB/lgeUw1FsunxaFb6fjhCE28WvZrjKHfVFg2Wrv77sEHrtM0heu8m2Bq5wERVJdWDM9lpCULcE1nSbKlqtZfVWkLLuBVHtluv2duV0fZU+yvHJRlJhDv0iANpeNN772AUVqPaPoQyTrVjoSwbBmTqpdPxf9AWkpMqalT/GQQYY51Bwfxdl5ZxfddqELEAL2O36Io2MvcbASGSUFKGVxz8nknb4cFcRMTJ6TXlcZOwJNEShXTgi6yXZlX2MWn/rpprSnUB5Ct8wVLA/aCGvdT6K6IM9KGfBg9si/RnBNe/2jQdrUSP+dIRgWSrbFbjf/up7hoWMYhDazF6SW2Rq8fZE7TrF0+6biwXJ61TByN+ lU6/c+Z+ cbsIizAsC0HaUlWrOlyABo5E73ISaGmW3WggE0J0yBBqmjUXDsDWty1SYAGmNfMOI/2Mc/QqlPU2LHHZXTzL43J64G4zJ7A5KCZNMLQnJZdNykLv/thEZ1IpJrrxzz4VAX/RwhUk/uPHpy7zkwSnELEZTikPjrR4hL1bfDr7lZsISmYvGaAWW5nxWpBkZPrPqw41oMu9DHzCjPiICiciK/+k2MVp8Ss9Xme7aORbHcRoKzT0MoWsE+h16X6fbO1waZGsmvNySwlg7eHCHMM9Ag6xioeCVpKDH2U4EF/VGMR0Ouy54bZO8pHxI1zEp4cBnHhXNiP8ioG16qY1D0LmFAEbHnJ1u+7hQUNXj7+w7UjYSLpntrmCIs0vzO5D+QeEJii5a8v4lhF6BVkGG6uA3Y4MAeNakQuyi+NHBeQPk5iFGbTk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Jan 24, 2025 at 7:09=E2=80=AFPM Andrii Nakryiko wrote: > > On Fri, Jan 24, 2025 at 10:22=E2=80=AFAM Jordan Rome wrote: > > > > Similar to `access_process_vm` but specific to strings. > > Also chunks reads by page and utilizes `strscpy` > > for handling null termination. > > > > Signed-off-by: Jordan Rome > > --- > > include/linux/mm.h | 3 ++ > > mm/memory.c | 119 +++++++++++++++++++++++++++++++++++++++++++++ > > mm/nommu.c | 68 ++++++++++++++++++++++++++ > > 3 files changed, 190 insertions(+) > > > > [...] > > > + maddr =3D kmap_local_page(page); > > + retval =3D strscpy(buf, maddr + offset, bytes); > > + unmap_and_put_page(page, maddr); > > + > > + if (retval > -1 && retval < bytes) { > > + /* found the end of the string */ > > + buf +=3D retval; > > + goto out; > > + } > > + > > + if (retval =3D=3D -E2BIG) { > > nit: strscpy() can't return any other error, so I'd structure result > handling as: > > if (retval < 0) { > /* that annoying last byte copy */ > retval =3D bytes; > } > if (retval < bytes) { > /* "we are done" handling */ > } > > /* common len, buf, addr adjustment logic stays here */ > Ack. Actually, I thought of a way to make this cleaner and correct. > > but also here's the question. If we get E2BIG, while bytes is exactly > how many bytes we have left in the buffer, the last byte should be > zero, no? So this should be cleanly handled, right? Or do we have a > test for that and it works already? > Ok, I found an inconsistency that gets handled in the BPF helper `bpf_copy_from_user_task_str`, which I'm going to fix in the next version of this patch. Let me explain how this function is SUPPOSED to work and enumerate some different test cases (a lot of which are in commit 3). Note1: all of the target strings are null terminated (if you try to copy a string that's not null terminated you may accidentally copy junk). Note2: strscopy returns E2BIG if the len requested isn't as long as the string INCLUDING the nul terminator. So if you want to copy "test_data\0" you need to request 10 bytes not 9. And if you request 10 or anything greater it returns 9. Note3: This function, as opposed to the bpf helper calling this function, returns the number copied NOT including the nul terminator. So... the target string is "test_data". Request 10 bytes, return 9. Request 2 bytes, return 1. Request 20 bytes, return 9. Now, let's say this string falls across a page boundary where "_" is the last character in the page. Request 10 bytes, which becomes a request for 5 bytes for the first page. strscopy returns E2BIG and copies "test\0" into the buffer. We copy the last bytes of the page into the buffer, which is the "_", and then request 5 more bytes on the next page, copying "data\0" and strscopy returns 4. Return 9. Now let's say the last "a" is the last character on the page. Request 10 bytes, which becomes a request for 9 bytes. strscopy returns E2BIG and copies "test_dat\0" into the buffer. Once again we copy the last byte of the page into the buffer, which is "a" and we request 1 more byte of the next page, which is the nul terminator. strscopy returns 0 and this function returns 9. Again note, that this version of the code has a bug that is "handled" by the bpf helper and I'm going to fix. > > + retval =3D bytes; > > + /* > > + * Because strscpy always null terminates we ne= ed to > > + * copy the last byte in the page if we are goi= ng to > > + * load more pages > > + */ > > + if (bytes < len) { > > + end =3D bytes - 1; > > + copy_from_user_page(vma, > > + page, > > + addr + end, > > + buf + end, > > you don't need the `end` variable, just use `bytes - 1` twice? > > > + maddr + (PAGE_SIZE - 1)= , > > + 1); > > + } > > + } > > + > > + len -=3D retval; > > + buf +=3D retval; > > + addr +=3D retval; > > + } > > + > > +out: > > + mmap_read_unlock(mm); > > + if (err) > > + return err; > > + > > + return buf - old_buf; > > +} > > + > > +/** > > + * copy_remote_vm_str - copy a string from another process's address s= pace. > > + * @tsk: the task of the target address space > > + * @addr: start address to read from > > + * @buf: destination buffer > > + * @len: number of bytes to transfer > > + * @gup_flags: flags modifying lookup behaviour > > + * > > + * The caller must hold a reference on @mm. > > + * > > + * Return: number of bytes copied from @addr (source) to @buf (destina= tion). > > + * If the source string is shorter than @len then return the length of= the > > + * source string. If the source string is longer than @len, return @le= n. > > + * On any error, return -EFAULT. > > strncpy_from_user_nofault() doc says: > > On success, returns the length of the string INCLUDING the trailing NUL > > Is this the case with copy_remote_vm_str() as well? I.e., if the > source string is 5 bytes + NUL, dst buf is 10. Will we get 5 or 6 > returned? We should be very careful with all this +/- 1 business in > corner cases, too easy to mess this up. > Explained above. > > + */ > > +int copy_remote_vm_str(struct task_struct *tsk, unsigned long addr, > > + void *buf, int len, unsigned int gup_flags) > > +{ > > + struct mm_struct *mm; > > + int ret; > > + > > + mm =3D get_task_mm(tsk); > > + if (!mm) > > + return -EFAULT; > > + > > + ret =3D __copy_remote_vm_str(mm, addr, buf, len, gup_flags); > > + > > + mmput(mm); > > + > > + return ret; > > +} > > +EXPORT_SYMBOL_GPL(copy_remote_vm_str); > > + > > [...]