From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Sat, 16 Nov 2024 13:13:20 -0800
Subject: Re: [PATCH bpf-next 1/2] mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation
To: Peter Zijlstra
Cc: bpf, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Andrew Morton, Vlastimil Babka, Hou Tao, Johannes Weiner, shakeel.butt@linux.dev, Michal Hocko, Tejun Heo, linux-mm, Kernel Team
Message-ID: (see list archive)
In-Reply-To: <20241116194202.GR22801@noisy.programming.kicks-ass.net>
References: <20241116014854.55141-1-alexei.starovoitov@gmail.com> <20241116194202.GR22801@noisy.programming.kicks-ass.net>
On Sat, Nov 16, 2024 at 11:42 AM Peter Zijlstra wrote:
>
> On Fri, Nov 15, 2024 at 05:48:53PM -0800, Alexei Starovoitov wrote:
> > +static inline struct page *try_alloc_page_noprof(int nid)
> > +{
> > +	/* If spin_locks are not held and interrupts are enabled, use normal path. */
> > +	if (preemptible())
> > +		return alloc_pages_node_noprof(nid, GFP_NOWAIT | __GFP_ZERO, 0);
>
> This isn't right for PREEMPT_RT, spinlock_t will be preemptible, but you
> very much do not want regular allocation calls while inside the
> allocator itself for example.

I'm aware that spinlocks are preemptible in RT.
Here is my understanding of why the above is correct...

- preemptible() means that IRQs are not disabled and preempt_count == 0.

- All page alloc operations are protected either by pcp_spin_trylock()
  or by spin_lock_irqsave(&zone->lock, flags), or by both together.

- In non-RT, spin_lock_irqsave disables IRQs, so the preemptible() check
  guarantees that we're not holding zone->lock. The page alloc logic can
  hold the pcp lock when try_alloc_page() is called, but since it always
  uses pcp_trylock, it's still ok to call it with GFP_NOWAIT: the pcp
  trylock will fail and the allocation will proceed to acquire
  zone->lock instead.

- In RT, spin_lock_irqsave doesn't disable IRQs despite its name. It
  calls rt_spin_lock(), which calls rcu_read_lock(), which increments
  preempt_count. So the preemptible() check guards try_alloc_page()
  against re-entering zone->lock-protected code. And pcp_trylock is a
  trylock.

So I believe the above is safe.

> > +	/*
> > +	 * Best effort allocation from percpu free list.
> > +	 * If it's empty attempt to spin_trylock zone->lock.
> > +	 * Do not specify __GFP_KSWAPD_RECLAIM to avoid wakeup_kswapd
> > +	 * that may need to grab a lock.
> > +	 * Do not specify __GFP_ACCOUNT to avoid local_lock.
> > +	 * Do not warn either.
> > +	 */
> > +	return alloc_pages_node_noprof(nid, __GFP_TRYLOCK | __GFP_NOWARN | __GFP_ZERO, 0);
> > +}