Date: Wed, 12 Feb 2025 17:57:04 +0100
From: Michal Hocko
To: Tejun Heo
Cc: Dennis Zhou, Filipe Manana, Andrew Morton, linux-mm@kvack.org, LKML
Subject: Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic
References: <20250206122633.167896-1-mhocko@kernel.org>

On Tue 11-02-25 10:55:20, Tejun Heo wrote:
> Hello, Michal.
>
> On Thu, Feb 06, 2025 at 01:26:33PM +0100, Michal Hocko wrote:
> ...
> > It has turned out that iscsid has worked around this by dropping
> > PR_SET_IO_FLUSHER (https://github.com/open-iscsi/open-iscsi/pull/382)
> > when scanning host. But we can do better in this case on the kernel side
>
> FWIW, requiring GFP_KERNEL context for probing doesn't sound too crazy to
> me.
>
> > @@ -2204,7 +2204,12 @@ static void pcpu_balance_workfn(struct work_struct *work)
> >  	 * to grow other chunks. This then gives pcpu_reclaim_populated() time
> >  	 * to move fully free chunks to the active list to be freed if
> >  	 * appropriate.
> > +	 *
> > +	 * Enforce GFP_NOIO allocations because we have pcpu_alloc users
> > +	 * constrained to GFP_NOIO/NOFS contexts and they could form lock
> > +	 * dependency through pcpu_alloc_mutex
> >  	 */
> > +	unsigned int flags = memalloc_noio_save();
>
> Just for context, the reason why the allocation mask support was limited to
> GFP_KERNEL or not rather than supporting full range of GFP flags is because
> percpu memory area expansion can involve page table allocations in the
> vmalloc area which always uses GFP_KERNEL. memalloc_noio_save() masks IO
> part out of that, right? It might be worthwhile to explain why we aren't
> passing down GPF flags throughout and instead depending on masking.

I have gone with masking because that seemed easier to review and a more
robust solution. vmalloc does support NOFS/NOIO contexts these days (it
just uses scoped masking in those cases). Propagating the gfp
throughout the worker code path is likely possible, but I haven't really
explored that in detail to be sure. Would that be preferable even if the
fix would be more involved?

> Also, doesn't the above always prevent percpu allocations from doing fs/io
> reclaims?

Yes it does. Probably worth mentioning in the changelog. These
allocations should be rare so having a constrained reclaim didn't really
seem problematic to me. There should be kswapd running in the background
with the full reclaim power.

> ie. Shouldn't the masking only be used if the passed in gfp
> doesn't allow fs/io?

This is a good question. I have to admit that my understanding might be
incorrect but wouldn't it be possible that we could get the lock
dependency chain if GFP_KERNEL and scoped NOFS alloc_pcp calls are
competing?

fs/io lock
pcpu_alloc_noprof(NOFS/NOIO)
                                        pcpu_alloc_noprof(GFP_KERNEL)
                                        pcpu_schedule_balance_work
                                        pcpu_alloc_mutex
pcpu_alloc_mutex
                                        allocation_deadlock through fs/io lock

This is currently not possible because constrained allocations only do
trylock. Makes sense?
-- 
Michal Hocko
SUSE Labs
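
For readers following the thread, the scoped masking discussed above can be
sketched in plain userspace C. This is only a model under stated assumptions,
not kernel code: the MODEL_* constants and the model_* functions are
hypothetical names made up for illustration. They stand in for the behaviour
of memalloc_noio_save()/memalloc_noio_restore(), where an allocation made
inside the scope has its IO and FS reclaim bits masked off no matter which
mask the callee asks for.

```c
#include <assert.h>

/* Hypothetical stand-ins for GFP bits; the values are illustrative only. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)

/*
 * Scope state. The kernel keeps this per task; this single-threaded
 * model uses one static word instead.
 */
static unsigned int model_noio_scope;

/* Enter a NOIO scope and return the previous state so scopes can nest. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_noio_scope;

	model_noio_scope = 1;
	return old;
}

/* Leave the scope by restoring whatever state the matching save returned. */
static void model_noio_restore(unsigned int old)
{
	assert(old == 0 || old == 1);
	model_noio_scope = old;
}

/* Mask an allocation request: inside the scope, IO and FS bits are dropped. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_noio_scope)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/*
 * Models a worker that wraps its body in the scope. The callee still
 * asks for MODEL_GFP_KERNEL, but the scope constrains what it gets.
 */
unsigned int model_balance_work(void)
{
	unsigned int flags = model_noio_save();
	unsigned int seen = model_effective_gfp(MODEL_GFP_KERNEL);

	model_noio_restore(flags);
	return seen;
}
```

In this model the worker never has to pass a constrained mask down its call
chain, which is the property that made masking look easier to review than
propagating the gfp. The cost is the one raised in the thread: every
allocation inside the scope loses fs/io reclaim, even when the original
caller would have allowed it.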