Subject: Re: [LSF/MM/BPF TOPIC] Do not pin pages for various direct-io scheme
To: Jerome Glisse
Cc: Michal Hocko, lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Benjamin LaHaise
References: <20200122023100.75226-1-jglisse@redhat.com> <20200122045723.GC76712@redhat.com> <20200122115926.GW29276@dhcp22.suse.cz> <015647b0-360c-c9ac-ac20-405ae0ec4512@kernel.dk> <20200122165427.GA6009@redhat.com>
From: Jens Axboe
Message-ID: <66027259-81c3-0bc4-a70b-74069e746058@kernel.dk>
Date: Wed, 22 Jan 2020 10:04:44 -0700
In-Reply-To: <20200122165427.GA6009@redhat.com>

On 1/22/20 9:54 AM, Jerome Glisse wrote:
> On Wed, Jan 22, 2020 at 08:12:51AM -0700, Jens Axboe wrote:
>> On 1/22/20 4:59 AM, Michal Hocko wrote:
>>> On Tue 21-01-20 20:57:23, Jerome Glisse wrote:
>>>> We can also discuss what kind of knobs we want to expose so that
>>>> people can decide to choose the tradeoff themselves (i.e. from I want low
>>>> latency io-uring and I don't care whether mm cannot do its business; to
>>>> I want mm to never be impeded in its business and I accept the extra
>>>> latency burst I might face in io operations).
>>>
>>> I do not think it is a good idea to make this configurable. How can
>>> people sensibly choose between the two without deep understanding of
>>> internals?
>>
>> Fully agree, we can't just punt this to a knob and call it good, that's
>> a typical fallacy of core changes. And there is only one mode for
>> io_uring, and that's consistent low latency.
>> If this change introduces weird reclaim, compaction or migration
>> latencies, then that's a non-starter as far as I'm concerned.
>>
>> And what do those two settings even mean? I don't even know, and a user
>> sure as hell doesn't either.
>>
>> io_uring pins two types of pages - registered buffers, these are used
>> for actual IO, and the rings themselves. The rings are not used for IO,
>> just used to communicate between the application and the kernel.
>
> So, do we still want to solve file-backed page writeback if pages in
> the ubuffer are from a file?

That's not currently a concern for io_uring, as it disallows file-backed
pages for the IO buffers that are being registered.

> Also we can introduce a flag when registering a buffer that allows
> registering it without pinning and thus avoids the RLIMIT_MEMLOCK at
> the cost of possible latency spikes. Then the user registering the
> buffer knows what he gets.

That may be fine for other users, but I don't think it'll apply to
io_uring. I can't see anyone selecting that flag, unless you're doing
something funky where you're registering a substantial amount of the
system memory for IO buffers. And I don't think that's going to be a
super valid use case...

> Maybe it would be good to test, it might stay in the noise, then it
> might be a good thing to do. Also there are strategies to avoid latency
> spikes, for instance we can block/force skip mm invalidation if a buffer
> has pending/running io in the ring, i.e. only have buffer invalidation
> happen when there is no pending/running submission entry.

Would that really work? The buffer could very well be idle right when
you check, but be needed for IO the instant you decide you can do
background work on it. Additionally, that would require accounting on
when the buffers are inflight, which is exactly the kind of overhead
we're trying to avoid to begin with.

> We can also pick what kind of invalidation we allow (compaction,
> migration, ...)
> and thus limit the scope and likelihood of invalidation.

I think it'd be useful to try and understand the use case first. If
we're pinning a small percentage of the system memory, do we really care
at all? Isn't it completely fine to just ignore it?

-- 
Jens Axboe