To: linux-mm@kvack.org
From: Adrian Moreno
Subject: cgroup_destroy kworker loops on hugetlb_cgroup_css_offline
Message-ID: <0ef61368-4341-d6bb-a383-cf03fdc0117e@redhat.com>
Date: Tue, 17 Nov 2020 09:39:32 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------200ABC5786DCCCA141251A4F"
Content-Language: en-US
Sender: owner-linux-mm@kvack.org

This is a multi-part message in MIME format.
--------------200ABC5786DCCCA141251A4F
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Hello Mike,

I don't usually work on the kernel, so please excuse any inaccuracies.

I'm contacting you off-list because, if what I'm facing is confirmed, it
might be considered a security issue (DoS). I'll leave that to your judgement.

I'm seeing an issue related to hugetlb_cgroup:

I'm running:
kubernetes 1.19 + containerd/docker
kernel 5.9.0-36.fc34.x86_64
kernel params: systemd.unified_cgroup_hierarchy=0 default_hugepagesz=1G
               hugepagesz=1G hugepages=10

I'm still trying to isolate aspects of my setup; currently my reproducer is:

1 - Start a simple pod that uses the recently added HugePages medium
    feature [1] (pod yaml attached).
2 - Start a DPDK app. It doesn't need to run successfully (as in transfer
    packets) nor interact with real hardware. It seems that just
    initializing the EAL layer (which handles hugepage reservation and
    locking) is enough to trigger the issue.
3 - Delete the Pod (or let it "Complete").

This results in what seems to be a thread endlessly looping over a spin_lock.

top:
 1425 root      20   0       0      0      0 R  99.7   0.0   5:22.45 kworker/28:7+cgroup_destroy

'perf top -g' reports:

-   63.28%     0.01%  [kernel]  [k] worker_thread
   - 49.97% worker_thread
      - 52.64% process_one_work
         - 62.08% css_killed_work_fn
            - hugetlb_cgroup_css_offline
                 41.52% _raw_spin_lock
               - 2.82% _cond_resched
                    rcu_all_qs
                 2.66% PageHuge
      - 0.57% schedule
         - 0.57% __schedule

Under certain circumstances (which I'm still trying to understand) this
makes the kernel quite unresponsive, requiring a hard reboot.

I've isolated the issue in a VM and I was about to start bisecting it
(it does not happen on kernel-5.6.6-300.fc32).

Do you have any clue or pointer as to how to further troubleshoot this
issue?

Thanks,
-- 
Adrián Moreno

[1] https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/

--------------200ABC5786DCCCA141251A4F
Content-Type: application/x-yaml; name="test.yaml"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="test.yaml"

apiVersion: v1
kind: Pod
metadata:
  name: cg-test
spec:
  containers:
  - name: test-cg
    image: test-cg:latest
    imagePullPolicy: Never
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
    resources:
      requests:
        memory: 1Gi
      limits:
        hugepages-1Gi: 2Gi
          #command: ["/usr/bin/dpdk-testpmd", "-m", "1024", "-l", "4,5", "--vdev", "net_vhost0,iface=/tmp/doesnotexist", "--", "-i"]
    # This command stats DPDK, makes EAL do some memory pre-allocation ("-m") and then fails when vhost fails to bind to /tmp/doesnotexist
    # It seems enough to trigger the issue

  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages

--------------200ABC5786DCCCA141251A4F--
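
For anyone trying to confirm the same symptom while bisecting, the check
described in the report (a kworker running cgroup_destroy work at close to
100% CPU in top) could be automated roughly as below. This is only a sketch
and not part of the original report: the function names, the 90% threshold
and the header row in the sample are assumptions made for illustration, and
the parser assumes top's default 12-column layout.

```python
from typing import Iterable, List, NamedTuple, Optional


class BusyWorker(NamedTuple):
    pid: int
    cpu_percent: float
    command: str


def parse_top_line(line: str) -> Optional[BusyWorker]:
    """Parse one process row of `top` output, or return None.

    Assumed layout (top's default columns):
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    fields = line.split(None, 11)
    if len(fields) < 12 or not fields[0].isdigit():
        return None  # header, blank line, or summary line
    try:
        cpu = float(fields[8])
    except ValueError:
        return None
    return BusyWorker(int(fields[0]), cpu, fields[11].strip())


def busy_cgroup_destroy_workers(
    lines: Iterable[str], min_cpu: float = 90.0
) -> List[BusyWorker]:
    """Return kworker rows doing cgroup_destroy work at or above min_cpu.

    The 90% default is an arbitrary illustrative threshold.
    """
    hits = []
    for line in lines:
        worker = parse_top_line(line)
        if worker is None:
            continue
        if (
            worker.command.startswith("kworker/")
            and "cgroup_destroy" in worker.command
            and worker.cpu_percent >= min_cpu
        ):
            hits.append(worker)
    return hits


if __name__ == "__main__":
    # The header row is assumed; the process row is the one from the report.
    sample = [
        "  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND",
        " 1425 root      20   0       0      0      0 R  99.7   0.0   5:22.45 kworker/28:7+cgroup_destroy",
    ]
    for w in busy_cgroup_destroy_workers(sample):
        print(w.pid, w.cpu_percent, w.command)
```

Fed with the lines of a batch-mode capture (for example the output of
"top -b -n 1"), a non-empty result after deleting the pod would suggest the
kernel under test shows the behaviour, which could serve as a good/bad
signal for each bisection step.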