Subject: Re: Memory keys and io_uring.
To: "Aneesh Kumar K.V", Dave Hansen, Michael Ellerman
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <877dndzs8c.fsf@linux.ibm.com> <4ed6cbf6-b850-dac5-88c6-03e58dfc6631@linux.ibm.com>
From: Jens Axboe
Message-ID: <0ec1943b-4004-66bd-5a8f-2daf86de3349@kernel.dk>
Date: Fri, 12 Feb 2021 08:37:57 -0700
In-Reply-To: <4ed6cbf6-b850-dac5-88c6-03e58dfc6631@linux.ibm.com>

On 2/12/21 8:33 AM, Aneesh Kumar K.V wrote:
> On 2/12/21 8:45 PM, Jens Axboe wrote:
>> On 2/11/21 11:59 PM, Aneesh Kumar K.V wrote:
>>>
>>> Hi,
>>>
>>> I am trying to establish the behaviour we should expect when passing a
>>> buffer with memory keys attached to io_uring syscalls.
>>> As shown in the test below:
>>>
>>> /*
>>>  * gcc -Wall -O2 -D_GNU_SOURCE -o pkey_uring pkey_uring.c -luring
>>>  */
>>> #include <stdio.h>
>>> #include <fcntl.h>
>>> #include <string.h>
>>> #include <stdlib.h>
>>> #include <unistd.h>
>>> #include <sys/mman.h>
>>> #include "liburing.h"
>>>
>>> #define PAGE_SIZE	(64 << 10)
>>>
>>> int main(int argc, char *argv[])
>>> {
>>> 	int fd, ret, pkey;
>>> 	struct io_uring ring;
>>> 	struct io_uring_sqe *sqe;
>>> 	struct io_uring_cqe *cqe;
>>> 	struct iovec iovec;
>>> 	void *buf;
>>>
>>> 	if (argc < 2) {
>>> 		printf("%s: file\n", argv[0]);
>>> 		return 1;
>>> 	}
>>>
>>> 	ret = io_uring_queue_init(1, &ring, IORING_SETUP_SQPOLL);
>>> 	if (ret < 0) {
>>> 		fprintf(stderr, "queue_init: %s\n", strerror(-ret));
>>> 		return 1;
>>> 	}
>>>
>>> 	fd = open(argv[1], O_RDONLY | O_DIRECT);
>>> 	if (fd < 0) {
>>> 		perror("open");
>>> 		return 1;
>>> 	}
>>>
>>> 	if (posix_memalign(&buf, PAGE_SIZE, PAGE_SIZE))
>>> 		return 1;
>>> 	iovec.iov_base = buf;
>>> 	iovec.iov_len = PAGE_SIZE;
>>>
>>> 	//mprotect(buf, PAGE_SIZE, PROT_NONE);
>>> 	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
>>> 	pkey_mprotect(buf, PAGE_SIZE, PROT_READ | PROT_WRITE, pkey);
>>>
>>> 	sqe = io_uring_get_sqe(&ring);
>>> 	if (!sqe) {
>>> 		perror("io_uring_get_sqe");
>>> 		return 1;
>>> 	}
>>> 	io_uring_prep_readv(sqe, fd, &iovec, 1, 0);
>>>
>>> 	ret = io_uring_submit(&ring);
>>> 	if (ret != 1) {
>>> 		fprintf(stderr, "io_uring_submit: %s\n", strerror(-ret));
>>> 		return 1;
>>> 	}
>>>
>>> 	ret = io_uring_wait_cqe(&ring, &cqe);
>>>
>>> 	if (cqe->res < 0)
>>> 		fprintf(stderr, "iouring submit failed %s\n", strerror(-cqe->res));
>>> 	else
>>> 		fprintf(stderr, "iouring submit success\n");
>>>
>>> 	io_uring_cqe_seen(&ring, cqe);
>>>
>>> 	/*
>>> 	 * let's access this via a read syscall
>>> 	 */
>>> 	ret = read(fd, buf, PAGE_SIZE);
>>> 	if (ret < 0)
>>> 		fprintf(stderr, "read failed : %s\n", strerror(errno));
>>>
>>> 	close(fd);
>>> 	io_uring_queue_exit(&ring);
>>>
>>> 	return 0;
>>> }
>>>
>>> A read syscall does fail with EFAULT. But we allow the read via io_uring
>>> syscalls. Is that ok?
>>> Considering memory keys are thread-specific, we could debate that the
>>> kernel thread can be considered to be the one that has all access
>>> allowed via keys, or we could change things so that access is denied
>>> via the kernel thread for any key value other than the default key
>>> (key 0). The other option is to inherit the memory key restrictions
>>> when doing io_uring_submit() and use the same when accessing userspace
>>> from the kernel thread.
>>>
>>> Any thoughts here with respect to what the behaviour should be?
>>
>> Is this a powerpc thing? I get -EFAULT on x86 for both reads, io_uring
>> and regular syscall. That includes SQPOLL, not using SQPOLL, or
>> explicitly setting IOSQE_ASYNC on the sqe.
>>
>
> Interesting, I didn't check x86 because I don't have hardware that
> supports memory keys. I am trying to make ppc64 behavior compatible with
> other archs here.
>
> IIUC, in your test the io_wqe/sqe kernel thread did hit an access fault
> when touching the buffer on x86? That is different from what Dave
> explained earlier.

Yes, all four methods (task inline, task_work, SQPOLL, io-wq offload)
return -EFAULT for me on x86.

> With the patch 8c511eff1827 ("powerpc/kuap: Allow kernel thread to
> access userspace after kthread_use_mm") I now have key 0 access allowed
> but all other keys denied with ppc64. I was planning to change that to
> allow all key access based on the reply from Dave. I would be curious to
> understand what made x86 deny the access and how the kthread inherited
> the key details.

I'm not very familiar with the memory protection for pkeys and how it's
done on various archs, so not going to be of much help there... But
io_uring assumes the right mm for any of these accesses, so if it's tied
to that, then it should work as it does on x86.

--
Jens Axboe
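
For reference, the three options Aneesh lists for kernel-thread access can be written down as a small user-space model. This is only a sketch for comparing the outcomes: the enum, the function name and the MODEL_* macros are made up for illustration and are not kernel code, and the model says nothing about how x86 or ppc64 actually arrive at their result. The only values taken from the real interface are the PKEY_DISABLE_ACCESS (0x1) and PKEY_DISABLE_WRITE (0x2) bits.

```c
#include <stdbool.h>

/* Same bit values as the uapi PKEY_DISABLE_ACCESS / PKEY_DISABLE_WRITE;
 * redefined under hypothetical names so the model builds without
 * <sys/mman.h>. */
#define MODEL_PKEY_DISABLE_ACCESS	0x1u
#define MODEL_PKEY_DISABLE_WRITE	0x2u

/* Hypothetical names for the three options raised in the thread. */
enum kthread_pkey_policy {
	POLICY_ALLOW_ALL_KEYS,		/* kernel thread ignores key restrictions */
	POLICY_DEFAULT_KEY_ONLY,	/* only pages tagged with key 0 are accessible */
	POLICY_INHERIT_SUBMITTER,	/* apply the rights captured at submit time */
};

/*
 * Return true if the modelled kernel thread may touch a user page tagged
 * with 'pkey'. 'submitter_rights' holds the submitting thread's
 * MODEL_PKEY_DISABLE_* bits for that key. 'is_write' is true when the
 * kernel stores into the page, which is what reading a file into the
 * buffer does.
 */
static bool kthread_access_allowed(enum kthread_pkey_policy policy, int pkey,
				   unsigned int submitter_rights, bool is_write)
{
	switch (policy) {
	case POLICY_ALLOW_ALL_KEYS:
		return true;
	case POLICY_DEFAULT_KEY_ONLY:
		return pkey == 0;
	case POLICY_INHERIT_SUBMITTER:
		if (submitter_rights & MODEL_PKEY_DISABLE_ACCESS)
			return false;
		if (is_write && (submitter_rights & MODEL_PKEY_DISABLE_WRITE))
			return false;
		return true;
	}
	return false;
}
```

For the test case in this thread (a non-zero key allocated with PKEY_DISABLE_WRITE, and the kernel writing into the buffer), the model allows the access only under POLICY_ALLOW_ALL_KEYS. The other two policies deny it, which is the same outcome as the -EFAULT from read().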