From: Thorvald Natvig
Date: Thu, 25 Jan 2024 12:28:40 -0800
Subject: hugetlbfs: WARNING: bad unlock balance detected during MADV_REMOVE
To: Muchun Song
Cc: linux-mm@kvack.org

We've found what appears to be a lock issue that results in a blocked
process somewhere in hugetlbfs for shared maps, seemingly from an
interaction between hugetlb_vm_op_open and hugetlb_vmdelete_list.

Based on some added pr_warn calls, we believe the following is happening:

When hugetlb_vmdelete_list is entered from the child process,
vma->vm_private_data is NULL, and hence hugetlb_vma_trylock_write does
not lock, since neither __vma_shareable_lock nor __vma_private_lock is
true.

While hugetlb_vmdelete_list is executing, the parent process does
fork(), which ends up in hugetlb_vm_op_open, which in turn allocates a
lock for the same vma.

Thus, when hugetlb_vmdelete_list in the child reaches the end of the
function, vma->vm_private_data is now populated, and hence
hugetlb_vma_unlock_write tries to unlock the vma_lock, which it does
not hold.

dmesg:

WARNING: bad unlock balance detected!
6.8.0-rc1+ #24 Not tainted
-------------------------------------
lock/2613 is trying to release lock (&vma_lock->rw_sema) at:
[] hugetlb_vma_unlock_write+0x48/0x60
but there are no more locks to release!

3 locks held by lock/2613:
 #0: ffff9b4bc6225450 (sb_writers#16){.+.+}-{0:0}, at: madvise_vma_behavior+0x4cc/0xcf0
 #1: ffff9ba4dc34eca0 (&sb->s_type->i_mutex_key#23){+.+.}-{3:3}, at: hugetlbfs_fallocate+0x3fe/0x620
 #2: ffff9ba4dc34ef38 (&hugetlbfs_i_mmap_rwsem_key){+.+.}-{3:3}, at: hugetlbfs_fallocate+0x438/0x620

CPU: 17 PID: 2613 Comm: lock Not tainted 6.8.0-rc1+ #24
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/02/2023
Call Trace:
 dump_stack_lvl+0x77/0xe0
 ? hugetlb_vma_unlock_write+0x48/0x60
 dump_stack+0x10/0x20
 print_unlock_imbalance_bug+0x127/0x150
 lock_release+0x21a/0x3f0
 ? hugetlb_vma_unlock_write+0x48/0x60
 up_write+0x1c/0x1d0
 hugetlb_vma_unlock_write+0x48/0x60
 hugetlb_vmdelete_list+0x93/0xd0
 hugetlbfs_fallocate+0x4e1/0x620
 vfs_fallocate+0x153/0x4b0
 madvise_vma_behavior+0x4cc/0xcf0
 ? mas_prev+0x68/0x70
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? find_vma_prev+0x78/0xc0
 ? __pfx_madvise_vma_behavior+0x10/0x10
 madvise_walk_vmas+0xc4/0x140
 do_madvise+0x3df/0x450
 __x64_sys_madvise+0x2c/0x40
 do_syscall_64+0x8e/0x160
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x9b/0x160
 ? do_syscall_64+0x9b/0x160
 ? do_syscall_64+0x9b/0x160
 entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f55e0b23bbb

Repro:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define PSIZE (2048UL * 1024UL)

int main(int argc, char **argv) {
  /* One shared, anonymous 2 MiB huge page. */
  char *buffer = mmap(NULL, PSIZE, PROT_READ | PROT_WRITE,
                      MAP_ANONYMOUS | MAP_SHARED | MAP_HUGETLB, -1, 0);
  if (buffer == MAP_FAILED) {
    perror("mmap");
    exit(1);
  }

  /* Child: issue MADV_REMOVE on the mapping in a tight loop. */
  pid_t remover = fork();
  if (remover == 0) {
    while (1) {
      if (madvise(buffer, PSIZE, MADV_REMOVE) == -1) {
        perror("madvise");
        exit(1);
      }
    }
  }

  /* Parent: fork() repeatedly while the remover is running. */
  int wstatus;
  for (int l = 0; l < 10000; ++l) {
    pid_t childpid = fork();
    if (childpid == 0) {
      exit(0);
    } else {
      waitpid(childpid, &wstatus, 0);
    }
  }

  kill(remover, SIGKILL);
  waitpid(remover, &wstatus, 0);
  printf("Clean exit\n");
}

- Thorvald
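
The interleaving described above can be modelled in user space with a
simple hold counter. This is only a sketch of the suspected ordering and
is not kernel code: every model_* name is hypothetical, the counter
stands in for the rwsem, and the assumption that the trylock step reports
success when no lock object exists is taken from the description above
rather than verified against the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vma lock: counts lock minus unlock. */
struct model_lock {
    int held;
};

/* Hypothetical stand-in for the vma; private_data mirrors vm_private_data. */
struct model_vma {
    struct model_lock *private_data;
};

/* Trylock step: takes nothing while no lock object is attached (assumed). */
static bool model_trylock_write(struct model_vma *vma)
{
    if (vma->private_data == NULL)
        return true;            /* nothing to take, caller carries on */
    vma->private_data->held++;
    return true;
}

/* Unlock step: looks at private_data again, at unlock time. */
static void model_unlock_write(struct model_vma *vma)
{
    if (vma->private_data != NULL)
        vma->private_data->held--;
}

/* Open step (the fork() path): attaches a fresh lock object to the vma. */
static void model_vm_op_open(struct model_vma *vma, struct model_lock *lock)
{
    assert(vma->private_data == NULL);
    lock->held = 0;
    vma->private_data = lock;
}

/*
 * Runs the three steps in one of two orders and returns the final hold
 * count: 0 means balanced, a negative value means an unlock without a
 * matching lock.
 */
int model_sequence(bool open_before_trylock)
{
    struct model_lock lock = { .held = 0 };
    struct model_vma vma = { .private_data = NULL };

    if (open_before_trylock) {
        model_vm_op_open(&vma, &lock);
        (void)model_trylock_write(&vma);
    } else {
        (void)model_trylock_write(&vma);   /* no lock object yet */
        model_vm_op_open(&vma, &lock);     /* lock appears mid-section */
    }
    model_unlock_write(&vma);
    return lock.held;
}
```

Calling model_sequence(true) leaves the counter balanced, while
model_sequence(false) ends below zero, which is the shape of the
imbalance lockdep reports in the dmesg output.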