From: Martin Steigerwald
To: Jens Axboe
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, willy@infradead.org, clm@fb.com, torvalds@linux-foundation.org, david@fromorbit.com
Subject: Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED
Date: Thu, 12 Dec 2019 22:45:47 +0100
Message-ID: <2091494.0NDvsO6yje@merkaba>
In-Reply-To: <7bf74660-874e-6fd7-7a41-f908ccab694e@kernel.dk>
References: <20191211152943.2933-1-axboe@kernel.dk> <63049728.ylUViGSH3C@merkaba> <7bf74660-874e-6fd7-7a41-f908ccab694e@kernel.dk>

Jens Axboe - 12.12.19, 16:16:31 CET:
> On 12/12/19 3:44 AM, Martin Steigerwald wrote:
> > Jens Axboe - 11.12.19, 16:29:38 CET:
> >> Recently someone asked me how io_uring buffered IO compares to
> >> mmap'ed IO in terms of performance. So I ran some tests with
> >> buffered IO, and found the experience to be somewhat painful.
> >> The test case is pretty basic, random reads over a dataset that's
> >> 10x the size of RAM. Performance starts out fine, and then the page
> >> cache fills up and we hit a throughput cliff. CPU usage of the IO
> >> threads goes up, and we have kswapd spending 100% of a core trying
> >> to keep up. Seeing that, I was reminded of the many complaints I
> >> hear about buffered IO, and the fact that most of the folks
> >> complaining will ultimately bite the bullet and move to O_DIRECT to
> >> just get the kernel out of the way.
> >>
> >> But I don't think it needs to be like that. Switching to O_DIRECT
> >> isn't always easily doable. The buffers have different lifetimes,
> >> size and alignment constraints, etc. On top of that, mixing buffered
> >> and O_DIRECT can be painful.
> >>
> >> Seems to me that we have an opportunity to provide something that
> >> sits somewhere in between buffered and O_DIRECT, and this is where
> >> RWF_UNCACHED enters the picture. If this flag is set on IO, we get
> >> the following behavior:
> >>
> >> - If the data is in cache, it remains in cache and the copy (in or
> >>   out) is served to/from that.
> >>
> >> - If the data is NOT in cache, we add it while performing the IO.
> >>   When the IO is done, we remove it again.
> >>
> >> With this, I can do 100% smooth buffered reads or writes without
> >> pushing the kernel to the state where kswapd is sweating bullets.
> >> In fact it doesn't even register.
> >
> > A question from a user or Linux Performance trainer perspective:
> >
> > How does this compare with posix_fadvise() with POSIX_FADV_DONTNEED,
> > which for example the nocache¹ command is using? Excerpt from the
> > posix_fadvise(2) manpage:
> >
> >        POSIX_FADV_DONTNEED
> >               The specified data will not be accessed in the near
> >               future.
> >
> >               POSIX_FADV_DONTNEED attempts to free cached pages
> >               associated with the specified region. This is useful,
> >               for example, while streaming large files. A program
> >               may periodically request the kernel to free cached
> >               data that has already been used, so that more useful
> >               cached pages are not discarded instead.
> >
> > [1] packaged in Debian as nocache or available here:
> >     https://github.com/Feh/nocache
> >
> > In any case, it would be nice to have some option in rsync… I still
> > did not change my backup script to call rsync via nocache.
>
> I don't know the nocache tool, but I'm guessing it just does the
> writes (or reads) and then uses FADV_DONTNEED to drop behind those
> pages? That's fine for slower use cases, but it won't work very well
> for fast IO. The write side currently works pretty much like that
> internally, whereas the read side doesn't use the page cache at all.

Yes, it does that. And yeah, I saw you changed the read side to bypass
the cache entirely.

Also, as I understand it, this is primarily for asynchronous I/O using
io_uring?

-- 
Martin
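
For reference, a minimal sketch in C of the drop-behind pattern discussed
above, i.e. roughly what a tool like nocache approximates: read a file in
chunks through the page cache and ask the kernel to reclaim the pages
already consumed with POSIX_FADV_DONTNEED. The file path and chunk size
are made up for illustration.

#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Hypothetical large input file, stands in for e.g. a backup source. */
	int fd = open("/path/to/large/file", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	static char buf[1 << 20];	/* 1 MiB read chunk */
	off_t done = 0;			/* bytes consumed so far */
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		done += n;
		/* Tell the kernel the pages behind us will not be needed
		 * again, so they can be reclaimed instead of more useful
		 * page cache. nocache does this periodically; here it is
		 * done after every chunk for simplicity. */
		posix_fadvise(fd, 0, done, POSIX_FADV_DONTNEED);
	}

	close(fd);
	return 0;
}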
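
And a hedged sketch of what using the flag proposed in this series could
look like from userspace, assuming it ends up exposed as a per-call RWF_*
flag for preadv2()/pwritev2() as the cover letter describes. The
RWF_UNCACHED constant below is only a placeholder, not the real ABI value,
and the file path is again made up.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_UNCACHED
#define RWF_UNCACHED 0x00000040	/* placeholder value for illustration only */
#endif

int main(void)
{
	int fd = open("/path/to/large/file", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	static char buf[1 << 20];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };

	/* Buffered read that should not leave new pages behind: data already
	 * in the page cache is served from it, data that had to be added for
	 * this IO is dropped again once the copy to userspace completes. */
	ssize_t n = preadv2(fd, &iov, 1, 0, RWF_UNCACHED);
	if (n < 0)
		perror("preadv2");

	close(fd);
	return 0;
}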