Date: Tue, 11 Feb 2025 16:05:54 +0100
Subject: Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic
To: Michal Hocko, Dennis Zhou, Tejun Heo, Filipe Manana
Cc: Andrew Morton, linux-mm@kvack.org, LKML, Michal Hocko
References: <20250206122633.167896-1-mhocko@kernel.org>
From: Vlastimil Babka
In-Reply-To: <20250206122633.167896-1-mhocko@kernel.org>

On 2/6/25 13:26, Michal Hocko wrote:
> From: Michal Hocko
>
> 28307d938fb2 ("percpu: make pcpu_alloc() aware of current gfp context")
> has fixed a reclaim recursion for scoped GFP_NOFS context. It has done
> that by avoiding taking pcpu_alloc_mutex. This is a correct solution as
> the worker context with full GFP_KERNEL allocation/reclaim power and which
> is using the same lock cannot block the NOFS pcpu_alloc caller.
>
> On the other hand this is a very conservative approach that could lead
> to failures because pcpu_alloc lockless implementation is quite limited.
>
> We have a bug report about premature failures when scsi array of 193
> devices is scanned. Sometimes (not consistently) the scanning aborts
> because the iscsid daemon fails to create the queue for a random scsi
> device during the scan. iscsid itself is running with PR_SET_IO_FLUSHER
> set so all allocations from this process context are GFP_NOIO. This in
> turn makes any pcpu_alloc lockless (without pcpu_alloc_mutex) which
> leads to premature failures.
>
> It has turned out that iscsid has worked around this by dropping
> PR_SET_IO_FLUSHER (https://github.com/open-iscsi/open-iscsi/pull/382)
> when scanning host. But we can do better in this case on the kernel side
> and use pcpu_alloc_mutex for NOIO resp. NOFS constrained allocation
> scopes too. We just need the WQ worker to never trigger IO/FS reclaim.
> Achieve that by enforcing scoped GFP_NOIO for the whole execution of
> pcpu_balance_workfn (this will imply NOFS constraint as well). This will
> remove the dependency chain and preserve the full allocation power of
> the pcpu_alloc call.
>
> While at it make is_atomic really test for blockable allocations.
>
> Fixes: 28307d938fb2 ("percpu: make pcpu_alloc() aware of current gfp context")
> Signed-off-by: Michal Hocko

Acked-by: Vlastimil Babka

> ---
>  mm/percpu.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/percpu.c b/mm/percpu.c
> index d8dd31a2e407..192c2a8e901d 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -1758,7 +1758,7 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
>  	gfp = current_gfp_context(gfp);
>  	/* whitelisted flags that can be passed to the backing allocators */
>  	pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
> -	is_atomic = (gfp & GFP_KERNEL) != GFP_KERNEL;
> +	is_atomic = !gfpflags_allow_blocking(gfp);
>  	do_warn = !(gfp & __GFP_NOWARN);
>
>  	/*
> @@ -2204,7 +2204,12 @@ static void pcpu_balance_workfn(struct work_struct *work)
>  	 * to grow other chunks. This then gives pcpu_reclaim_populated() time
>  	 * to move fully free chunks to the active list to be freed if
>  	 * appropriate.
> +	 *
> +	 * Enforce GFP_NOIO allocations because we have pcpu_alloc users
> +	 * constrained to GFP_NOIO/NOFS contexts and they could form lock
> +	 * dependency through pcpu_alloc_mutex
>  	 */
> +	unsigned int flags = memalloc_noio_save();
>  	mutex_lock(&pcpu_alloc_mutex);
>  	spin_lock_irq(&pcpu_lock);
>
> @@ -2215,6 +2220,7 @@ static void pcpu_balance_workfn(struct work_struct *work)
>
>  	spin_unlock_irq(&pcpu_lock);
>  	mutex_unlock(&pcpu_alloc_mutex);
> +	memalloc_noio_restore(flags);
>  }
>
>  /**
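
For readers following the is_atomic hunk, here is a rough userspace sketch of
why the old test classified NOIO/NOFS-scoped callers as atomic while the new
one does not. Everything named DEMO_* or demo_* is hypothetical and exists
only for this illustration; the bit positions are made up and are not the
kernel's values. Only the way the masks are composed is meant to mirror the
kernel's GFP_KERNEL, GFP_NOFS, GFP_NOIO and GFP_ATOMIC, and apply_noio_scope()
is a simplified stand-in for what current_gfp_context() does inside a
memalloc_noio_save() scope.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int demo_gfp_t;

/* Illustrative bit positions only; real kernel values differ. */
#define DEMO_GFP_IO             0x01u
#define DEMO_GFP_FS             0x02u
#define DEMO_GFP_DIRECT_RECLAIM 0x04u
#define DEMO_GFP_KSWAPD_RECLAIM 0x08u
#define DEMO_GFP_HIGH           0x10u

/* Composition assumed to mirror the kernel's definitions. */
#define DEMO_GFP_RECLAIM (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)
#define DEMO_GFP_KERNEL  (DEMO_GFP_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_NOFS    (DEMO_GFP_RECLAIM | DEMO_GFP_IO)
#define DEMO_GFP_NOIO    (DEMO_GFP_RECLAIM)
#define DEMO_GFP_ATOMIC  (DEMO_GFP_HIGH | DEMO_GFP_KSWAPD_RECLAIM)

/* Old test (the removed line): anything short of full GFP_KERNEL is atomic. */
static bool is_atomic_old(demo_gfp_t gfp)
{
	return (gfp & DEMO_GFP_KERNEL) != DEMO_GFP_KERNEL;
}

/* New test: atomic only when direct reclaim (i.e. sleeping) is not allowed. */
static bool is_atomic_new(demo_gfp_t gfp)
{
	return !(gfp & DEMO_GFP_DIRECT_RECLAIM);
}

/* Simplified stand-in for a NOIO scope: strip the IO and FS bits. */
static demo_gfp_t apply_noio_scope(demo_gfp_t gfp)
{
	return gfp & ~(DEMO_GFP_IO | DEMO_GFP_FS);
}

/* Returns 0 if every expectation holds; aborts via assert() otherwise. */
int demo_self_check(void)
{
	/* A GFP_KERNEL request issued from a NOIO-scoped task, e.g. iscsid. */
	demo_gfp_t scoped = apply_noio_scope(DEMO_GFP_KERNEL);

	assert(scoped == DEMO_GFP_NOIO);
	assert(is_atomic_old(scoped));   /* old: forced onto the lockless path */
	assert(!is_atomic_new(scoped));  /* new: sleepable, may take the mutex */

	/* Truly atomic and fully unconstrained requests are unchanged. */
	assert(is_atomic_old(DEMO_GFP_ATOMIC) && is_atomic_new(DEMO_GFP_ATOMIC));
	assert(!is_atomic_old(DEMO_GFP_KERNEL) && !is_atomic_new(DEMO_GFP_KERNEL));
	return 0;
}
```

Calling demo_self_check() from a small main() exercises all three cases. The
only classification that changes is the sleepable-but-constrained one, which
matches the failure described in the changelog.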