Date: Fri, 9 Apr 2021 13:56:09 +0000
From: Dennis Zhou
To: Wang Yugui
Cc: Vlastimil Babka, linux-mm@kvack.org, linux-btrfs@vger.kernel.org
Subject: Re: unexpected -ENOMEM from percpu_counter_init()
In-Reply-To: <20210409153636.C53D.409509F4@e16-tech.com>

On Fri, Apr 09, 2021 at 03:36:39PM +0800, Wang Yugui wrote:
> Hi,
>
> some question about workqueue for percpu.
>
> > > > > > > > And a question about this,
> > > > > > > > upper caller:
> > > > > > > > 	nofs_flag = memalloc_nofs_save();
> > > > > > > > 	ret = btrfs_drew_lock_init(&root->snapshot_lock);
> > > > > > > > 	memalloc_nofs_restore(nofs_flag);
> > > > >
> > > > > The issue is here. nofs is set which means percpu attempts an atomic
> > > > > allocation. If it cannot find anything already allocated it isn't happy.
> > > > > This was done before memalloc_nofs_{save/restore}() were pervasive.
> > > > >
> > > > > Percpu should probably try to allocate some pages if possible even if
> > > > > nofs is set.
> > > >
> > > > Should we check and pre-alloc memory inside memalloc_nofs_restore()?
> > > > another memalloc_nofs_save() may come soon.
> > > >
> > > > something like this in memalloc_nofs_save()?
> > > > 	if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW)
> > > > 		pcpu_schedule_balance_work();
> > > >
> > >
> > > Percpu does do this via a workqueue item. The issue is in v5.9 we
> > > introduced 2 types of chunks. However, the free float page number was
> > > for the total. So even if 1 chunk type dropped below, the other chunk
> > > type might have enough pages. I'm queuing this for 5.12 and will send it
> > > out assuming it does fix your problem.
>
> workqueue for percpu maybe not strong enough( not scheduled?) when high
> CPU load?
>

Percpu is not really cheap memory to allocate because it has an
amplification factor of NR_CPUS. As a result, percpu on the critical
path is really not something that is expected to be high throughput.

Ideally, things like btrfs snapshots should preallocate a number of
these and not try to do atomic allocations, because in theory those
could still fail: even after we go to the page allocator in the future,
we might not get enough pages because that would require going into
reclaim.

The workqueue approach has been good enough so far. Technically there is
a higher priority workqueue that this work could be scheduled on, but
save for this miss on my part, the system workqueue has worked out fine.

In the future, as I mentioned above, it would be good to support
actually getting pages, but it's work that needs to be tackled with a
bit of care. I might target the work for v5.14.

> this is our application pipeline.
> file_pre_process |
> bwa.nipt xx |
> samtools.nipt sort xx |
> file_post_process
>
> file_pre_process/file_post_process is fast, so often are blocked by
> pipe input/output.
>
> 'bwa.nipt xx' is a high-cpu-load, almost all of CPU cores.
>
> 'samtools.nipt sort xx' is a high-mem-load, it keep the input in memory.
> if the memory is not enough, it will save all the buffer to temp file,
> so it is sometimes high-IO-load too (write 60G or more to file).
>
>
> xfstests(generic/476) is just high-IO-load, cpu/memory load is NOT high.
> so xfstests(generic/476) maybe easy than our application pipeline.
>
> Although there is yet not a simple reproducer for another problem
> happened here, but there is a little high chance that something is wrong
> in btrfs/mm/fs-buffer.
> > but another problem (os freezed without call trace, PANIC without OOPS?,
> > the reason is yet unknown) still happen.

I do not have an answer for this. I would recommend looking into kdump.

>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2021/04/09
>
>

Thanks,
Dennis
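
For readers following the thread, here is a minimal userspace sketch of the
accounting problem described above: with two chunk types, a low-watermark
check against the total number of empty populated pages can pass even though
one type has none left. This is a toy model, not the mm/percpu.c code. Only
PCPU_EMPTY_POP_PAGES_LOW appears in the thread; its value here and every
toy_* name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration; the thread only names the constant. */
#define PCPU_EMPTY_POP_PAGES_LOW 2
/* The two chunk types mentioned as introduced in v5.9. */
#define TOY_NR_CHUNK_TYPES 2

/* Hypothetical model: empty populated pages tracked per chunk type. */
struct toy_pcpu_state {
	int nr_empty_pop_pages[TOY_NR_CHUNK_TYPES];
};

/* Check against the total across all types (the behaviour described as the bug). */
static bool toy_needs_balance_total(const struct toy_pcpu_state *s)
{
	int total = 0;

	for (int t = 0; t < TOY_NR_CHUNK_TYPES; t++)
		total += s->nr_empty_pop_pages[t];
	return total < PCPU_EMPTY_POP_PAGES_LOW;
}

/* Check a single chunk type against the same watermark. */
static bool toy_needs_balance_type(const struct toy_pcpu_state *s, int type)
{
	return s->nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW;
}

/* Scenario: type 0 is exhausted while type 1 still has spare pages. */
static void toy_demo(void)
{
	struct toy_pcpu_state s = { .nr_empty_pop_pages = { 0, 4 } };

	/* The total of 4 hides the shortage, so no refill would be scheduled. */
	assert(!toy_needs_balance_total(&s));
	/* The per-type check catches that type 0 is below the watermark. */
	assert(toy_needs_balance_type(&s, 0));
}
```

In this model an atomic allocation from type 0 would find nothing
pre-populated, which is consistent with the -ENOMEM reported in the subject
line; the per-type check is one way to picture the fix being queued, not a
statement of how the actual patch is written.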