From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=aYM/=2C=kvack.org=owner-linux-mm@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-7.0 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED,
	HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,
	SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 4CB26C00454
	for <linux-mm@archiver.kernel.org>; Thu, 12 Dec 2019 22:15:40 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by mail.kernel.org (Postfix) with ESMTP id F046C2173E
	for <linux-mm@archiver.kernel.org>; Thu, 12 Dec 2019 22:15:39 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key) header.d=kernel-dk.20150623.gappssmtp.com header.i=@kernel-dk.20150623.gappssmtp.com header.b="wsU6P+eG"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F046C2173E
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix)
	id 915728E0005; Thu, 12 Dec 2019 17:15:39 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id 89F538E0001; Thu, 12 Dec 2019 17:15:39 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 767A78E0005; Thu, 12 Dec 2019 17:15:39 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0211.hostedemail.com [216.40.44.211])
	by kanga.kvack.org (Postfix) with ESMTP id 59F848E0001
	for <linux-mm@kvack.org>; Thu, 12 Dec 2019 17:15:39 -0500 (EST)
Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251])
	by forelay01.hostedemail.com (Postfix) with SMTP id D4473180AD811
	for <linux-mm@kvack.org>; Thu, 12 Dec 2019 22:15:38 +0000 (UTC)
X-FDA: 76257897156.20.crowd32_888b71155da2c
X-HE-Tag: crowd32_888b71155da2c
X-Filterd-Recvd-Size: 6894
Received: from mail-pj1-f67.google.com (mail-pj1-f67.google.com [209.85.216.67])
	by imf05.hostedemail.com (Postfix) with ESMTP
	for <linux-mm@kvack.org>; Thu, 12 Dec 2019 22:15:38 +0000 (UTC)
Received: by mail-pj1-f67.google.com with SMTP id n96so148437pjc.3
        for <linux-mm@kvack.org>; Thu, 12 Dec 2019 14:15:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=kernel-dk.20150623.gappssmtp.com; s=20150623;
        h=subject:to:cc:references:from:message-id:date:user-agent
         :mime-version:in-reply-to:content-language:content-transfer-encoding;
        bh=Y5oT9uWtY0LeQYVivfgqNjCA9SpoRNFmIvY21McRdiQ=;
        b=wsU6P+eGgFAPh2UXy913mTpywcdlCws2X+H53i+IG5zcE96sWW7rNeLsv1WyKaZz5L
         qrASayNaoInK194CB5qqkkVvHfL5ccYL80KewoESrHyqY1P1Xc41Uf8mqjwr17sSFNJj
         kxQYg94vo9gyl/9M42tS2R77CXLFQpzW37MfbLumD0vWwh3ATPMEpq5Gek38um9PwFkp
         IYtrqbxhkgKVS6Sn6oo72BEoFUs6t5WkKgb4GuKSX6xZyoeQHld57dMsa5fy3wKZ++Ob
         6RB+8/IXZi+euSOGMMOgHi6VrzlX2vrr1pATmjI7S0LrKA0dy/TSO/03MN9XRdmOopME
         yJJg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:subject:to:cc:references:from:message-id:date
         :user-agent:mime-version:in-reply-to:content-language
         :content-transfer-encoding;
        bh=Y5oT9uWtY0LeQYVivfgqNjCA9SpoRNFmIvY21McRdiQ=;
        b=IXvzRCwtExILr3EqvDeL3RZGt/Q9qQUYzd0zBgnnenQHnN++JRRyd3kdo4cnEgHBll
         UOfC3rrvYArFgt4zX/5zdF2cGvgYYnP9kNS4D77rK2b/MBRW2Xt5yoOj4hY+OXE8DFjH
         1GDkAW4BBDs6SrWT5ZFP0t+Rn/d7xtAjXp7VpCboWE7YR97gMPLy++fMF/qyy4lxwziL
         EQxws+g7RkgRyr/kcjY27iGAZE/KBHyaq3GK61ApH1HOYl9j9uFKC9XdUgOBM0EyJShF
         11NzFCRkvaexqGzSSmlGKGwQOYzsl4zUOanDc+pBGvTExIgL4PgcfsE455Bo8tewpFeE
         h/vg==
X-Gm-Message-State: APjAAAWw8AYA7PqutXUeGvk9k6cye9bHeBEZi629D4NJwRrdb+jdWGoX
	9zytittWzioYVYwfKe+dCQbNZA==
X-Google-Smtp-Source: APXvYqz1otoGGvMZpzx90SoE/R6gEpiVj/8VFSGUpjMUZqD8uBZ74Uk+1vPN8q8O9/LDHyCDLgmZMg==
X-Received: by 2002:a17:902:fe09:: with SMTP id g9mr11632167plj.162.1576188936703;
        Thu, 12 Dec 2019 14:15:36 -0800 (PST)
Received: from [192.168.1.188] ([66.219.217.145])
        by smtp.gmail.com with ESMTPSA id l66sm7878610pga.30.2019.12.12.14.15.34
        (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
        Thu, 12 Dec 2019 14:15:35 -0800 (PST)
Subject: Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED
To: Martin Steigerwald <martin@lichtvoll.de>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-block@vger.kernel.org, willy@infradead.org, clm@fb.com,
 torvalds@linux-foundation.org, david@fromorbit.com
References: <20191211152943.2933-1-axboe@kernel.dk>
 <63049728.ylUViGSH3C@merkaba>
 <7bf74660-874e-6fd7-7a41-f908ccab694e@kernel.dk> <2091494.0NDvsO6yje@merkaba>
From: Jens Axboe <axboe@kernel.dk>
Message-ID: <05adab5c-1405-f4a3-b14f-3242fa5ce8fc@kernel.dk>
Date: Thu, 12 Dec 2019 15:15:33 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.2.2
MIME-Version: 1.0
In-Reply-To: <2091494.0NDvsO6yje@merkaba>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On 12/12/19 2:45 PM, Martin Steigerwald wrote:
> Jens Axboe - 12.12.19, 16:16:31 CET:
>> On 12/12/19 3:44 AM, Martin Steigerwald wrote:
>>> Jens Axboe - 11.12.19, 16:29:38 CET:
>>>> Recently someone asked me how io_uring buffered IO compares to
>>>> mmaped
>>>> IO in terms of performance. So I ran some tests with buffered IO,
>>>> and
>>>> found the experience to be somewhat painful. The test case is
>>>> pretty
>>>> basic, random reads over a dataset that's 10x the size of RAM.
>>>> Performance starts out fine, and then the page cache fills up and
>>>> we
>>>> hit a throughput cliff. CPU usage of the IO threads goes up, and we
>>>> have kswapd spending 100% of a core trying to keep up. Seeing
>>>> that, I was reminded of the many complaints I hear about buffered
>>>> IO, and the fact that most of the folks complaining will
>>>> ultimately bite the bullet and move to O_DIRECT to just get the
>>>> kernel out of the way.
>>>>
>>>> But I don't think it needs to be like that. Switching to O_DIRECT
>>>> isn't always easily doable. The buffers have different lifetimes,
>>>> size and alignment constraints, etc. On top of that, mixing
>>>> buffered
>>>> and O_DIRECT can be painful.
>>>>
>>>> Seems to me that we have an opportunity to provide something that
>>>> sits somewhere in between buffered and O_DIRECT, and this is where
>>>> RWF_UNCACHED enters the picture. If this flag is set on IO, we get
>>>> the following behavior:
>>>>
>>>> - If the data is in cache, it remains in cache and the copy (in or
>>>> out) is served to/from that.
>>>>
>>>> - If the data is NOT in cache, we add it while performing the IO.
>>>> When the IO is done, we remove it again.
>>>>
>>>> With this, I can do 100% smooth buffered reads or writes without
>>>> pushing the kernel to the state where kswapd is sweating bullets.
>>>> In
>>>> fact it doesn't even register.
>>>
>>> A question from a user or Linux Performance trainer perspective:
>>>
>>> How does this compare with posix_fadvise() with POSIX_FADV_DONTNEED,
>>> which for example the nocache¹ command uses? Excerpt from the
>>> posix_fadvise(2) manpage:
>>>        POSIX_FADV_DONTNEED
>>>
>>>               The specified data will not be accessed  in  the  near
>>>               future.
>>>
>>>               POSIX_FADV_DONTNEED  attempts to free cached pages
>>>               associated with the specified region.  This  is  useful,
>>>               for  example,  while streaming large files.  A program
>>>               may periodically request the  kernel  to  free  cached
>>>               data  that  has already been used, so that more useful
>>>               cached pages are not discarded instead.
>>>
>>> [1] Packaged in Debian as nocache, or available here:
>>> https://github.com/Feh/nocache
>>>
>>> In any case, it would be nice to have some such option in rsync… I
>>> still have not changed my backup script to call rsync via nocache.
>>
>> I don't know the nocache tool, but I'm guessing it just does the
>> writes (or reads) and then uses FADV_DONTNEED to drop behind those
>> pages? That's fine for slower use cases, it won't work very well for
>> fast IO. The write side currently works pretty much like that
>> internally, whereas the read side doesn't use the page cache at all.
>
> Yes, it does that. And yeah, I saw you changed the read side to bypass
> the cache entirely.
>
> Also, as I understand it, this is primarily for asynchronous use via
> io_uring?

Or preadv2/pwritev2, they also allow passing in RWF_* flags.
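
For illustration, a minimal sketch of what the synchronous path could
look like from userspace. It assumes glibc's preadv2() wrapper; the
helper name pread_flags() and the SKETCH_RWF macro are made up for this
example. RWF_UNCACHED is only proposed in this patchset, so when the
installed headers don't define it the sketch passes no flag and the call
is an ordinary buffered read.

```c
#define _GNU_SOURCE		/* for preadv2() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * RWF_UNCACHED is the flag proposed in this patchset; installed uapi
 * headers may not define it. If it is missing, pass no flag at all,
 * which makes this a normal buffered read.
 */
#ifdef RWF_UNCACHED
#define SKETCH_RWF	RWF_UNCACHED
#else
#define SKETCH_RWF	0
#endif

/* Hypothetical helper: read len bytes at offset off, with RWF_* flags. */
static ssize_t pread_flags(int fd, void *buf, size_t len, off_t off, int flags)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);

	ret = preadv2(fd, &iov, 1, off, flags);
	/* Flag not supported, or no preadv2 at all: plain pread() instead. */
	if (ret < 0 && (errno == EOPNOTSUPP || errno == ENOSYS))
		ret = pread(fd, buf, len, off);
	return ret;
}
```

A caller would open the file as usual and do something like
pread_flags(fd, buf, sizeof(buf), 0, SKETCH_RWF). The write side would
be the same shape with pwritev2().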

-- 
Jens Axboe