Date: Mon, 30 Sep 2019 13:28:17 +0200
From: Michal Hocko
To: Linus Torvalds
Cc: David Rientjes, Andrea Arcangeli, Andrew Morton, Mel Gorman,
    Vlastimil Babka, "Kirill A. Shutemov",
    Linux Kernel Mailing List, Linux-MM
Subject: Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages
Message-ID: <20190930112817.GC15942@dhcp22.suse.cz>
References: <20190904205522.GA9871@redhat.com>
    <20190909193020.GD2063@dhcp22.suse.cz>
    <20190925070817.GH23050@dhcp22.suse.cz>
    <20190927074803.GB26848@dhcp22.suse.cz>

On Sat 28-09-19 13:59:26, Linus Torvalds wrote:
> On Fri, Sep 27, 2019 at 12:48 AM Michal Hocko wrote:
> >
> > -       page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> > +       if (!order)
> > +               page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> >         if (page)
> >                 goto got_pg;
> >
> > The whole point of handling this in the page allocator directly is to
> > have a unified solution rather than have each specific caller invent
> > its own way to achieve higher locality.
>
> The above just looks hacky.

It is, and it was meant to help move on when debugging rather than as a
final solution.

> Why would order-0 be special?

Ideally it wouldn't be, but the current implementation makes it special.
Why? Because the whole concept of the low wmark fast path attempt is
based on kswapd balancing for a high watermark, which provides some
space. Kcompactd doesn't have any notion like that. And I believe that
a large part of the problem really is there. If I am wrong here then I
would appreciate being corrected.

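To make the order-0 asymmetry above concrete, here is a minimal userspace
sketch. It is a toy model, not kernel code: struct toy_zone,
toy_fast_path_ok() and toy_kswapd_balance() are hypothetical names, and
the model assumes that reclaim only ever frees order-0 pages. It shows how
balancing up to the high watermark gives an order-0 fast path headroom,
while a higher-order request can still fail because nothing produces
contiguous blocks.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ORDER 11 /* orders 0..10; illustrative value only */

/* Toy stand-in for a zone; NOT the kernel's struct zone. */
struct toy_zone {
	unsigned long free_blocks[TOY_MAX_ORDER]; /* free blocks per order */
	unsigned long wmark_low;                  /* in base pages */
	unsigned long wmark_high;                 /* in base pages */
};

/* Total number of free base pages across all orders. */
static unsigned long toy_free_pages(const struct toy_zone *z)
{
	unsigned long pages = 0;

	for (unsigned int o = 0; o < TOY_MAX_ORDER; o++)
		pages += z->free_blocks[o] << o;
	return pages;
}

/*
 * Fast-path style check against the low watermark: the request must
 * leave more than wmark_low pages free, and a free block of at least
 * the requested order must already exist.
 */
static bool toy_fast_path_ok(const struct toy_zone *z, unsigned int order)
{
	assert(order < TOY_MAX_ORDER);

	if (toy_free_pages(z) < z->wmark_low + (1UL << order))
		return false;

	for (unsigned int o = order; o < TOY_MAX_ORDER; o++)
		if (z->free_blocks[o])
			return true;
	return false;
}

/*
 * kswapd-like balancing (assumption: reclaim yields order-0 pages only).
 * Frees pages until the high watermark is reached; it makes no attempt
 * to create higher-order blocks.
 */
static void toy_kswapd_balance(struct toy_zone *z)
{
	unsigned long free = toy_free_pages(z);

	if (free < z->wmark_high)
		z->free_blocks[0] += z->wmark_high - free;
}
```

With 500 free order-0 pages, wmark_low = 1000 and wmark_high = 2000, the
order-0 check fails before balancing and passes afterwards. An order-9
check still fails after balancing even though the page count is
sufficient, because the toy model has no counterpart that compacts free
memory into larger blocks.
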
If __GFP_THISNODE allows for a better THP utilization on a local node
then the problem points at kcompactd not being proactive enough. And
that is what the first diff was aiming at.

I also claim that this is not a THP-specific problem. You are right
that lower orders are less likely to hit the problem because the memory
is usually not fragmented that heavily, but fundamentally the overeager
fallback in the fast path is still there. And that is the reason for me
to push back against __GFP_THISNODE && fallback allocation open-coded
outside of the allocator. The allocator knows the context can compact,
so why should we require the caller to be doing that?

Do not get me wrong, but we have quite a long history of fine tuning
for THP by adding kludges here and there, and they usually turn out to
break something else. I really want to understand the underlying
problem and base a solution on it rather than "__GFP_THISNODE can cause
overreclaim so pick up a break out condition and hope for the best".
-- 
Michal Hocko
SUSE Labs
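
The allocator-side argument above can be sketched as a toy policy
comparison. This is only one possible reading under stated assumptions,
not the actual page allocator: struct toy_node, toy_alloc() and
toy_alloc_eager() are hypothetical names, and "compaction" is reduced to a
flag that turns into one free block. The point is only where the decision
lives: the allocator knows whether the context can compact, so it can try
the local node again before going remote, and no caller has to open-code
that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_result {
	TOY_LOCAL,                  /* satisfied from the local node directly */
	TOY_LOCAL_AFTER_COMPACTION, /* local node, after compacting it */
	TOY_REMOTE,                 /* fell back to the remote node */
	TOY_FAIL,
};

/* Toy stand-in for a NUMA node; NOT a kernel structure. */
struct toy_node {
	bool has_free_block; /* a suitably sized block is free right now */
	bool compactable;    /* compaction could produce such a block */
};

/* Take the free block if there is one. */
static bool toy_try_node(struct toy_node *n)
{
	if (!n->has_free_block)
		return false;
	n->has_free_block = false;
	return true;
}

/* Pretend to compact: turns "compactable" into one free block. */
static bool toy_compact(struct toy_node *n)
{
	if (!n->compactable)
		return false;
	n->compactable = false;
	n->has_free_block = true;
	return true;
}

/* Eager fallback: go remote as soon as the local fast path misses. */
static enum toy_result toy_alloc_eager(struct toy_node *local,
				       struct toy_node *remote)
{
	assert(local != NULL && remote != NULL);

	if (toy_try_node(local))
		return TOY_LOCAL;
	if (toy_try_node(remote))
		return TOY_REMOTE;
	return TOY_FAIL;
}

/*
 * Allocator-side policy: if the context may compact, compact the local
 * node and retry there before falling back to the remote node.
 */
static enum toy_result toy_alloc(struct toy_node *local,
				 struct toy_node *remote, bool can_compact)
{
	assert(local != NULL && remote != NULL);

	if (toy_try_node(local))
		return TOY_LOCAL;
	if (can_compact && toy_compact(local) && toy_try_node(local))
		return TOY_LOCAL_AFTER_COMPACTION;
	if (toy_try_node(remote))
		return TOY_REMOTE;
	return TOY_FAIL;
}
```

For a fragmented but compactable local node and a remote node with a free
block, toy_alloc_eager() returns TOY_REMOTE, while toy_alloc() with
can_compact set returns TOY_LOCAL_AFTER_COMPACTION. With can_compact
cleared, toy_alloc() behaves like the eager variant.
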