From: Linus Torvalds
Date: Wed, 11 Dec 2019 11:14:56 -0800
Subject: Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED
To: Jens Axboe, Johannes Weiner
Cc: Linux-MM, linux-fsdevel, linux-block, Matthew Wilcox, Chris Mason, Dave Chinner
References: <20191211152943.2933-1-axboe@kernel.dk>

[ Adding Johannes Weiner to the cc, I think he's looked at the working
  set and the inactive/active LRU lists the most ]

On Wed, Dec 11, 2019 at 9:56 AM Jens Axboe wrote:
>
> > In fact, that you say that just a pure random read case causes lots of
> > kswapd activity makes me think that maybe we've screwed up page
> > activation in general, and never noticed (because if you have enough
> > memory, you don't really see it that often)? So this might not be an
> > io_ring issue, but an issue in general.
>
> This is very much not an io_uring issue, you can see exactly the same
> kind of behavior with normal buffered reads or mmap'ed IO. I do wonder
> if streamed reads are as bad in terms of making kswapd go crazy, I
> forget if I tested that explicitly as well.

We definitely used to have people test things like "read the same
much-bigger-than-memory file over and over", and it wasn't supposed to
be all _that_ painful, because the pages never activated, and they got
moved out of the cache quickly and didn't disturb other activities
(other than the constant IO, of course, which can be a big deal in
itself).

But maybe that was just the streaming case. With read-around and
random accesses, maybe we end up activating too much (and maybe we
always did).

But I wouldn't be surprised if we've lost that as people went from
having 16-32MB to having that many GB instead - simply because a lot
of loads are basically entirely cached, and the few things that are
not tend to be explicitly uncached (ie O_DIRECT etc).

I think the workingset changes actually were maybe kind of related to
this - the inactive list can become too small to ever give people time
to do a good job of picking the _right_ thing to activate.

So this might either be the reverse situation - maybe we let the
inactive list grow too large, and then even a big random load will
activate pages that really shouldn't be activated? Or it might be
related to the workingset issue in that we've activated pages too
eagerly and not ever moved things back to the inactive list (which
then in some situations causes the inactive list to be very small).

Who knows. But this is definitely an area that I suspect hasn't gotten
all that much attention simply because memory has become much more
plentiful, and a lot of regular loads basically have enough memory
that almost all IO is cached anyway, and the old "you needed to be
more clever about balancing swap/inactive/active even under normal
loads" thing may have gone away a bit.

These days, even if you do somewhat badly in that balancing act, a lot
of loads probably won't notice that much.
Either there is still so much memory for caching that the added IO
isn't really ever dominant, or you had such a big load to begin with
that it was long since rewritten to use O_DIRECT.

            Linus