From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 7 Oct 2019 14:50:53 +0200
From: Michal Hocko
To: Konstantin Khlebnikov
Cc: linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org,
	Matthew Wilcox
Subject: Re: [PATCH v2] mm/swap: piggyback lru_add_drain_all() calls
Message-ID: <20191007125053.GK2381@dhcp22.suse.cz>
References: <157019456205.3142.3369423180908482020.stgit@buzz>
	<20191004131230.GL9578@dhcp22.suse.cz>
	<20191004133929.GN9578@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.10.1 (2018-07-13)

On Fri 04-10-19 17:06:13, Konstantin Khlebnikov wrote:
> On 04/10/2019 16.39, Michal Hocko wrote:
> > On Fri 04-10-19 16:32:39, Konstantin Khlebnikov wrote:
> > > On 04/10/2019 16.12, Michal Hocko wrote:
> > > > On Fri 04-10-19 16:09:22, Konstantin Khlebnikov wrote:
> > > > > This is a very slow operation. There is no reason to do it again
> > > > > if somebody else already drained all per-cpu vectors while we
> > > > > waited for the lock.
> > > > > 
> > > > > Piggyback on a drain that started and finished while we waited
> > > > > for the lock: all pages pending at the time of our entry were
> > > > > drained from the vectors.
> > > > > 
> > > > > Callers like POSIX_FADV_DONTNEED retry their operations once
> > > > > after draining per-cpu vectors when pages have unexpected
> > > > > references.
> > > > 
> > > > This describes why we need to wait for pre-existing pages on the
> > > > pvecs, but the changelog doesn't say anything about the
> > > > improvements this leads to. In other words, what kind of workloads
> > > > benefit from it?
> > > 
> > > Right now POSIX_FADV_DONTNEED is the top user because it has to
> > > freeze the page reference when it removes the page from the cache.
> > > invalidate_bdev calls it for the same reason. Both are triggered from
> > > userspace, so it's easy to generate a storm.
> > > 
> > > mlock/mlockall no longer call lru_add_drain_all - I've seen a serious
> > > slowdown here on older kernels.
> > > 
> > > There are some less obvious paths in memory migration/CMA/offlining
> > > which shouldn't be called frequently.
> > 
> > Can you back those claims up with any numbers?
> 
> Well, the worst case requires a non-trivial workload, because
> lru_add_drain_all skips cpus where the vectors are empty. Something must
> constantly generate a flow of pages on each cpu. The cpus must also be
> busy enough to make scheduling the per-cpu works slower. And the machine
> must be big enough (64+ cpus in our case).
> 
> In our case it was a massive series of mlock calls in a map-reduce
> workload while other tasks wrote logs (and generated a flow of new pages
> in the per-cpu vectors). The mlock calls were serialized by the mutex
> and accumulated a latency of up to 10 seconds and more.

This is very useful information!

> The kernel has not called lru_add_drain_all on the mlock paths since
> 4.15, but the same scenario could be triggered by
> fadvise(POSIX_FADV_DONTNEED) or any other remaining user.

OK, so I read it as: you are unlikely to hit problems with the current
tree, but they are still possible in principle. That is useful
information as well.

All of that belongs in the changelog. Do not make us guess, or leave
future generations scratching their heads over WTH is going on with that
weird code.

Thanks!
-- 
Michal Hocko
SUSE Labs
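
For reference, below is a minimal userspace sketch of the "piggyback"
idea described in the changelog above: a generation counter is sampled
before waiting for the drain mutex, and the expensive drain is skipped
when another caller completed one in the meantime. The names (drain_gen,
do_drain_all, drain_all_piggyback) and the pthread setting are
illustrative assumptions, not the actual kernel implementation.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t drain_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long drain_gen;	/* bumped at the start of every drain */

static void do_drain_all(void)
{
	/* stand-in for the expensive per-cpu drain */
	printf("draining\n");
}

static void drain_all_piggyback(void)
{
	/* Sample the generation before we start waiting for the lock. */
	unsigned long gen = __atomic_load_n(&drain_gen, __ATOMIC_ACQUIRE);

	pthread_mutex_lock(&drain_lock);

	/*
	 * The counter is only bumped under the lock, so if it moved past
	 * our sample, a drain started after our entry and - since we now
	 * own the lock - has already finished. Everything pending at the
	 * time we entered has been flushed, so the work can be skipped.
	 */
	if (__atomic_load_n(&drain_gen, __ATOMIC_RELAXED) != gen)
		goto out;

	__atomic_store_n(&drain_gen, gen + 1, __ATOMIC_RELEASE);
	do_drain_all();
out:
	pthread_mutex_unlock(&drain_lock);
}

static void *worker(void *arg)
{
	(void)arg;
	drain_all_piggyback();
	return NULL;
}

int main(void)
{
	pthread_t threads[4];
	int i;

	for (i = 0; i < 4; i++)
		pthread_create(&threads[i], NULL, worker, NULL);
	for (i = 0; i < 4; i++)
		pthread_join(threads[i], NULL);
	return 0;
}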