Date: Wed, 7 Apr 2021 14:56:38 +0000
From: Dennis Zhou
To: Wang Yugui
Cc: Vlastimil Babka, linux-mm@kvack.org, linux-btrfs@vger.kernel.org
Subject: Re: unexpected -ENOMEM from percpu_counter_init()
References: <20210401185158.3275.409509F4@e16-tech.com> <60e9b994-e37c-d059-4af5-0cb7860ca4f3@suse.cz> <20210407210905.F790.409509F4@e16-tech.com>
In-Reply-To: <20210407210905.F790.409509F4@e16-tech.com>

Hello,

On Wed, Apr 07, 2021 at 09:09:07PM +0800, Wang Yugui wrote:
> Hi,
>
> > +CC btrfs
> >
> > On 4/1/21 12:51 PM, Wang Yugui wrote:
> > > Hi,
> > >
> > > an unexpected -ENOMEM from percpu_counter_init() happened when xfstest
> > > with kernel 5.11.10 and 5.10.27
> >
> > Is there a dmesg log showing allocation failure or something?
>
> When unexpected -ENOMEM of percpu_counter_init(), btrfs as upper caller
> finally output something to dmesg.
>
> And we add one trace to btrfs source to make sure that.
>
> if (ret == -ENOMEM) printk("ENOMEM btrfs_drew_lock_init\n");
>
>
> Now the reproduce frequency become from >50% to not happen or very slow
> with the following change.
>
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 6596a0a..0127be1 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -104,8 +104,8 @@
>  /* chunks in slots below this are subject to being sidelined on failed alloc */
>  #define PCPU_SLOT_FAIL_THRESHOLD 3
>
> -#define PCPU_EMPTY_POP_PAGES_LOW 2
> -#define PCPU_EMPTY_POP_PAGES_HIGH 4
> +#define PCPU_EMPTY_POP_PAGES_LOW 8
> +#define PCPU_EMPTY_POP_PAGES_HIGH 16
>

These settings date from 2014, when Tejun initially implemented the
atomic allocation float. It is probably time to think about increasing
the number of pages. I'd prefer to do it dynamically though (some X% of
a chunk instead of a fixed increase).

>  #ifdef CONFIG_SMP
>  /* default addr <-> pcpu_ptr mapping, override in asm/percpu.h if necessary */
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 5e76af7..8cc091b 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -14,7 +14,7 @@
>
>  /* enough to cover all DEFINE_PER_CPUs in modules */
>  #ifdef CONFIG_MODULES
> -#define PERCPU_MODULE_RESERVE (8 << 10)
> +#define PERCPU_MODULE_RESERVE (32 << 10)
>  #else
>  #define PERCPU_MODULE_RESERVE 0
>  #endif
>

This is a reserved region used purely for module static inits.
btrfs_drew_lock_init() is a dynamic init, so this change should not
affect it.

>
> Just some guess,
> 1) maybe some relationship to the trigger of 'vm.dirty_bytes=10737418240'.
>
> this problem happen in
> server/T7610 with E5-2660v2 *2 and SSD/SAS(6Gb/s) and 192G memory
> but not happen in
> server/T620 with E5-2680v2 *2 and SSD/NVMe and 192G memory.
>
> 2) maybe some relationship to numa.
> 128G memory in node1(CPU1), and 64G in node2(CPU2)
>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2021/04/07
>
> > >
> > > direct caller:
> > > int btrfs_drew_lock_init(struct btrfs_drew_lock *lock)
> > > {
> > > 	int ret;
> > >
> > > 	ret = percpu_counter_init(&lock->writers, 0, GFP_KERNEL);
> > > 	if (ret)
> > > 		return ret;
> > >
> > > 	atomic_set(&lock->readers, 0);
> > > 	init_waitqueue_head(&lock->pending_readers);
> > > 	init_waitqueue_head(&lock->pending_writers);
> > >
> > > 	return 0;
> > > }
> > >
> > > upper caller:
> > > 	nofs_flag = memalloc_nofs_save();
> > > 	ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > 	memalloc_nofs_restore(nofs_flag);

The issue is here. nofs is set, which means percpu attempts an atomic
allocation. If it cannot find an area that is already populated, the
allocation fails with -ENOMEM. Percpu was written this way before
memalloc_nofs_{save/restore}() were pervasive. Percpu should probably
try to allocate some pages if possible, even if nofs is set.

> > > 	if (ret == -ENOMEM) printk("ENOMEM btrfs_drew_lock_init\n");
> > > 	if (ret)
> > > 		goto fail;
> > >
> > > The hardware of this server:
> > > CPU: Xeon(R) CPU E5-2660 v2(10 core) *2
> > > memory: 192G, no swap
> > >
> > > Only one xfstests job is running in this server, and about 7% of memory
> > > is used.
> > >
> > > Any advice please.
> > >
> > > Best Regards
> > > Wang Yugui (wangyugui@e16-tech.com)
> > > 2021/04/01
> > >
> >

Thanks,
Dennis
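
P.S. To make the nofs point concrete, here is a rough user-space model of
why a GFP_KERNEL request made inside a memalloc_nofs_save() scope ends up
on the atomic path. It is a sketch under assumptions, not kernel code: the
model_* names and the flag values are made up for illustration, and the
check only mirrors my reading of pcpu_alloc() in mm/percpu.c (apply the
task's gfp context, then treat anything short of full GFP_KERNEL as
atomic).

```c
#include <stdbool.h>

/* Illustrative flag bits only; the real values live in include/linux/gfp.h. */
#define MODEL_GFP_IO       0x1u
#define MODEL_GFP_FS       0x2u
#define MODEL_GFP_RECLAIM  0x4u
#define MODEL_GFP_KERNEL   (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS     (MODEL_GFP_RECLAIM | MODEL_GFP_IO)

/* Hypothetical stand-in for the per-task state set by memalloc_nofs_save(). */
static bool model_task_nofs;

/* Models the gfp context step: inside a nofs scope the FS bit is stripped. */
static unsigned int model_gfp_context(unsigned int gfp)
{
	return model_task_nofs ? (gfp & ~MODEL_GFP_FS) : gfp;
}

/*
 * Models the percpu decision: a request that does not carry every
 * GFP_KERNEL bit after the context step is handled as atomic, so it can
 * only be served from pages that are already populated.
 */
static bool model_pcpu_is_atomic(unsigned int gfp)
{
	gfp = model_gfp_context(gfp);
	return (gfp & MODEL_GFP_KERNEL) != MODEL_GFP_KERNEL;
}
```

With model_task_nofs clear, model_pcpu_is_atomic(MODEL_GFP_KERNEL) is
false. With it set, the same call returns true, which corresponds to the
btrfs_drew_lock_init() call above: GFP_KERNEL is passed in, but the
surrounding nofs scope downgrades it.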