Date: Wed, 13 Jan 2021 20:31:23 -0800 (PST)
From: Hugh Dickins
To: Sergey Senozhatsky
Cc: Andrew Morton, Hugh Dickins, "Kirill A. Shutemov", Suleiman Souhlal,
    Matthew Wilcox, Andrea Arcangeli, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: madvise(MADV_REMOVE) deadlocks on shmem THP

On Thu, 14 Jan 2021, Sergey Senozhatsky wrote:

> Hi,
>
> We are running into lockups during the memory pressure tests on our
> boards, which essentially NMI panic them. In short the test case is
>
> - THP shmem
>     echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled
>
> - And a user-space process doing madvise(MADV_HUGEPAGE) on new mappings,
>   and madvise(MADV_REMOVE) when it wants to remove the page range
>
> The problem boils down to the reverse locking chain:
> kswapd does
>
>     lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)
>
> madvise() process does
>
>     down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
>
>
>
> CPU0                                                      CPU1
>
> kswapd                                                    vfs_fallocate()
> shrink_node()                                             shmem_fallocate()
> shrink_active_list()                                      unmap_mapping_range()
> page_referenced() << lock page:PG_locked >>               unmap_mapping_pages() << down_write(mapping->i_mmap_rwsem) >>
> rmap_walk_file()                                          zap_page_range_single()
> down_read(mapping->i_mmap_rwsem) << W-locked on CPU1 >>   unmap_page_range()
> rwsem_down_read_failed()                                  __split_huge_pmd()
> __rwsem_down_read_failed_common()                         __lock_page() << PG_locked on CPU0 >>
> schedule()                                                wait_on_page_bit_common()
>                                                           io_schedule()

Very
interesting, Sergey: many thanks for this report.

There is no doubt that kswapd is right in its lock ordering:
__split_huge_pmd() is in the wrong to be attempting lock_page().

Which used not to be done, but was added in 5.8's c444eb564fb1 ("mm:
thp: make the THP mapcount atomic against __split_huge_pmd_locked()").

Which explains why this deadlock was not seen years ago: that
surprised me at first, since the case you show to reproduce it is good,
but I'd expect more common ways in which that deadlock could show up.

And your report is remarkably timely too: I have two other reasons
for looking at that change at the moment (I'm currently catching up
with recent discussion of page_count versus mapcount when deciding
COW page reuse).

I won't say more tonight, but should have more to add tomorrow.

Hugh
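
For illustration only, the lock inversion discussed in this thread can be modelled as a cycle in a "held before acquired" graph. The sketch below is a toy user-space model in Python, not kernel code and not how the kernel's own lock checking is implemented; the helper names lock_order_edges() and find_cycle() and the chain labels are hypothetical, and the two chains are reduced to the two locks named in the report.

```python
from collections import defaultdict


def lock_order_edges(chains):
    """Build 'held -> acquired next' edges from each context's lock order."""
    edges = defaultdict(set)
    for chain in chains.values():
        for held, wanted in zip(chain, chain[1:]):
            edges[held].add(wanted)
    return edges


def find_cycle(edges):
    """Return the locks forming an ordering cycle, or None if there is none."""
    visiting, done = [], set()

    def visit(lock):
        if lock in done:
            return None
        if lock in visiting:
            # Reached a lock that is already on the current path: a cycle.
            return visiting[visiting.index(lock):] + [lock]
        visiting.append(lock)
        for nxt in sorted(edges.get(lock, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(lock)
        return None

    for lock in sorted(edges):
        cycle = visit(lock)
        if cycle:
            return cycle
    return None


# Acquisition orders taken from the report, reduced to the two locks involved.
chains = {
    "kswapd": ["PG_locked", "i_mmap_rwsem"],
    "madvise(MADV_REMOVE)": ["i_mmap_rwsem", "PG_locked"],
}
cycle = find_cycle(lock_order_edges(chains))
print(cycle)

# Assumed model of the unmap path without the lock_page() in the split.
no_page_lock = dict(chains)
no_page_lock["madvise(MADV_REMOVE)"] = ["i_mmap_rwsem"]
print(find_cycle(lock_order_edges(no_page_lock)))
```

In this model the first call reports a cycle between PG_locked and i_mmap_rwsem, and the second, where the unmap path takes only i_mmap_rwsem, reports none. That matches the reading above that kswapd's ordering is the established one and the lock_page() under i_mmap_rwsem is the odd one out.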