From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 21 Aug 2020 15:01:43 -0700
From: Andrew Morton
To: cgoldswo@codeaurora.org
Cc: linux-mm@kvack.org, linux-arm-msm@vger.kernel.org,
	linux-kernel@vger.kernel.org, pratikp@codeaurora.org,
	pdaly@codeaurora.org, sudraja@codeaurora.org,
	iamjoonsoo.kim@lge.com, linux-arm-msm-owner@vger.kernel.org
Subject: Re: cma_alloc(), add sleep-and-retry for temporary page pinning
Message-Id: <20200821150143.8a8645b3fabc11016311b78d@linux-foundation.org>
In-Reply-To: <896f92e8c37936e7cb2914e79273e9e8@codeaurora.org>
References: <896f92e8c37936e7cb2914e79273e9e8@codeaurora.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
On Tue, 11 Aug 2020 15:20:47 -0700 cgoldswo@codeaurora.org wrote:

> On 2020-08-06 18:31, Andrew Morton wrote:
> > On Wed, 5 Aug 2020 19:56:21 -0700 Chris Goldsworthy wrote:
> >
> > > On mobile devices, failure to allocate from a CMA area constitutes a
> > > functional failure.  Sometimes during CMA allocations, we have
> > > observed that pages in a CMA area allocated through alloc_pages(),
> > > that we're trying to migrate away to make room for a CMA allocation,
> > > are temporarily pinned.  This temporary pinning can occur when a
> > > process that owns the pinned page is being forked (the example is
> > > explained further in the commit text).  This patch addresses this
> > > issue by adding a sleep-and-retry loop in cma_alloc().  There's
> > > another example we know of similar to the above that occurs during
> > > exit_mmap() (in zap_pte_range() specifically), but I need to
> > > determine if this is still relevant today.
> >
> > Sounds fairly serious but boy, we're late for 5.9.
> >
> > I can queue it for 5.10 with a cc:stable so that it gets backported
> > into earlier kernels a couple of months from now, if we think the
> > seriousness justifies backporting(?).
>
> Queuing this seems like the best way to proceed.

I'd really prefer not.  It's very hacky and it isn't a fix - it's a
likelihood-reducer.

> > And... it really is a sad little patch, isn't it?  Instead of fixing
> > the problem, it reduces the problem's probability by 5x.  Can't we do
> > better than this?
>
> I have one alternative in mind.  I have been able to review the
> exit_mmap() case, so before proceeding, let's do a breakdown of the
> problem: we can categorize the pinning issue we're trying to address
> here as being one of (1) incrementing _refcount and getting
> context-switched out before incrementing _mapcount (applies to forking
> a process / copy_one_pte()), and (2) decrementing _mapcount and getting
> context-switched out before decrementing _refcount (applies to tearing
> down a process / exit_mmap()).  So, one alternative would be to insert
> preempt_disable/enable() calls at affected sites.  So, for the
> copy_one_pte() pinning case, we could do the following inside of
> copy_one_pte():
>
> 	if (page) {
> +		preempt_disable();
> 		get_page(page);
> 		page_dup_rmap(page, false);
> +		preempt_enable();
> 		rss[mm_counter(page)]++;
> 	}

This would make the retry loop much more reliable, and preempt_disable()
is fast.  Such additions should be clearly commented (a bare
preempt_disable() explains nothing).  Perhaps by wrapping the
preempt_disable() in a suitably-named wrapper function.

But it's still really unpleasant.

> The good thing about this patch is that it has been stable in our
> kernel for four years (though for some SoCs we increased the retry
> counts).

That's discouraging!

> One thing to stress is that there are other instances of CMA page
> pinning that this patch isn't attempting to address.

Oh.  How severe are these?

> Please let me know if you're okay with queuing this for the 5.10 merge
> window - if you are, I can add an option to configure the number of
> retries, and will resend the patch once the 5.9 merge window closes.

Well.  Why not wait infinitely?  Because there are other sources of CMA
page pinning, I guess.

Could we take a sleeping lock on the exit_mmap() path and on the
migration path?  So that the migration path automatically waits for the
exact amount of time?