References:
 <20200622192900.22757-1-minchan@kernel.org>
 <20200622192900.22757-4-minchan@kernel.org>
 <9c339413-68c7-344e-dd01-327cb988d385@kernel.dk>
In-Reply-To: <9c339413-68c7-344e-dd01-327cb988d385@kernel.dk>
From: Christian Brauner
Date: Fri, 28 Aug 2020 20:25:34 +0200
Subject: Re: [PATCH v8 3/4] mm/madvise: introduce process_madvise() syscall: an external memory hinting API
To: Jens Axboe
Cc: Arnd Bergmann, Minchan Kim, Andrew Morton, LKML, linux-mm, Linux API,
 Oleksandr Natalenko, Suren Baghdasaryan, Tim Murray, Sandeep Patil,
 Sonny Rao, Brian Geffon, Michal Hocko, Johannes Weiner, Shakeel Butt,
 John Dias, Joel Fernandes, Jann Horn, alexander.h.duyck@linux.intel.com,
 SeongJae Park, David Rientjes, Arjun Roy, Vlastimil Babka,
 Daniel Colascione, Kirill Tkhai, SeongJae Park, linux-man

On Fri, Aug 28, 2020 at 8:24 PM Jens Axboe wrote:
>
> On 8/28/20 11:40 AM, Arnd Bergmann wrote:
> > On Mon, Jun 22, 2020 at 9:29 PM Minchan Kim wrote:
> >> So finally, the API is as follows,
> >>
> >>     ssize_t process_madvise(int pidfd, const struct iovec *iovec,
> >>               unsigned long vlen, int advice, unsigned int flags);
> >
> > I had not followed the discussion earlier and only now came across
> > the syscall in linux-next, sorry for stirring things up this late.
> >
> >> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> >> index 94bf4958d114..8f959d90338a 100644
> >> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> >> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> >> @@ -364,6 +364,7 @@
> >>  440    common  watch_mount             sys_watch_mount
> >>  441    common  watch_sb                sys_watch_sb
> >>  442    common  fsinfo                  sys_fsinfo
> >> +443    64      process_madvise         sys_process_madvise
> >>
> >>  #
> >>  # x32-specific system call numbers start at 512 to avoid cache impact
> >> @@ -407,3 +408,4 @@
> >>  545    x32     execveat                compat_sys_execveat
> >>  546    x32     preadv2                 compat_sys_preadv64v2
> >>  547    x32     pwritev2                compat_sys_pwritev64v2
> >> +548    x32     process_madvise         compat_sys_process_madvise
> >
> > I think we should not add any new x32-specific syscalls. Instead I think
> > the compat_sys_process_madvise/sys_process_madvise can be
> > merged into one.
> >
> >> +       mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> >> +       if (IS_ERR_OR_NULL(mm)) {
> >> +               ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> >> +               goto release_task;
> >> +       }
> >
> > Minor point: Having to use IS_ERR_OR_NULL() tends to be fragile,
> > and I would try to avoid that. Can mm_access() be changed to
> > itself return PTR_ERR(-ESRCH) instead of NULL to improve its
> > calling conventions? I see there are only three other callers.
> >
> >
> >> +       ret = import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
> >> +       if (ret >= 0) {
> >> +               ret = do_process_madvise(pidfd, &iter, behavior, flags);
> >> +               kfree(iov);
> >> +       }
> >> +       return ret;
> >> +}
> >> +
> >> +#ifdef CONFIG_COMPAT
> > ...
> >> +
> >> +       ret = compat_import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack),
> >> +                               &iov, &iter);
> >> +       if (ret >= 0) {
> >> +               ret = do_process_madvise(pidfd, &iter, behavior, flags);
> >> +               kfree(iov);
> >> +       }
> >
> > Every syscall that passes an iovec seems to do this.
> > If we make import_iovec() handle both cases directly, this syscall
> > and a number of others can be simplified, and you avoid the x32
> > entry point I mentioned above
> >
> > Something like (untested)
> >
> > index dad8d0cfaaf7..0de4ddff24c1 100644
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -1683,8 +1683,13 @@ ssize_t import_iovec(int type, const struct
> > iovec __user * uvector,
> >  {
> >         ssize_t n;
> >         struct iovec *p;
> > -       n = rw_copy_check_uvector(type, uvector, nr_segs, fast_segs,
> > -                                 *iov, &p);
> > +
> > +       if (in_compat_syscall())

I suggested the exact same solution roughly 1.5 weeks ago. :)
Fun, when I saw you mentioning this in BBB I knew exactly what you were
referring to. :)

Christian