From: Alex Bennée
To: Sven Schnelle
Cc: David Hildenbrand, Janosch Frank, Liam Howlett, Heiko Carstens,
 Claudio Imbrenda, Andrew Morton, Guenter Roeck,
 "maple-tree@lists.infradead.org", "linux-mm@kvack.org",
 "linux-kernel@vger.kernel.org", Yu Zhao, Juergen Gross, Vasily Gorbik,
 Alexander Gordeev, Christian Borntraeger, Andreas Krebbel,
 Ilya Leoshkevich, Thomas Huth, richard.henderson@linaro.org,
 qemu-devel@nongnu.org, qemu-s390x@nongnu.org
Subject: Re: qemu-system-s390x hang in tcg
Date: Wed, 29 Jun 2022 15:52:17 +0100
Message-ID: <87v8sjh4t9.fsf@linaro.org>
References: <20220426150616.3937571-24-Liam.Howlett@oracle.com>
 <20220428201947.GA1912192@roeck-us.net>
 <20220429003841.cx7uenepca22qbdl@revolver>
 <20220428181621.636487e753422ad0faf09bd6@linux-foundation.org>
 <20220502001358.s2azy37zcc27vgdb@revolver>
 <20220501172412.50268e7b217d0963293e7314@linux-foundation.org>
 <20220502133050.kuy2kjkzv6msokeb@revolver>
 <20220503215520.qpaukvjq55o7qwu3@revolver>
 <60a3bc3f-5cd6-79ac-a7a8-4ecc3d7fd3db@linux.ibm.com>
 <15f5f8d6-dc92-d491-d455-dd6b22b34bc3@redhat.com>
 <87pmirj3aq.fsf@linaro.org>

Sven Schnelle writes:

> Sven Schnelle writes:
>
>> Alex Bennée writes:
>>
>>> Sven Schnelle writes:
>>>
>>>> Hi,
>>>>
>>>> David Hildenbrand writes:
>>>>
>>>>> On 04.05.22 09:37, Janosch Frank wrote:
>>>>>> I had a short look yesterday and the boot usually hangs in the raid6
>>>>>> code. Disabling vector instructions didn't make a difference but a few
>>>>>> interruptions via GDB solve the problem for some reason.
>>>>>>
>>>>>> CCing David and Thomas for TCG
>>>>>>
>>>>>
>>>>> I somehow recall that KASAN was always disabled under TCG, I might be
>>>>> wrong (I thought we'd get a message early during boot that the HW
>>>>> doesn't support KASAN).
>>>>>
>>>>> I recall that raid code is a heavy user of vector instructions.
>>>>>
>>>>> How can I reproduce? Compile upstream (or -next?) with kasan support and
>>>>> run it under TCG?
>>>>
>>>> I spent some time looking into this. It's usually hanging in
>>>> s390vx8_gen_syndrome(). My first thought was that it is a problem with
>>>> the VX instructions, but turned out that it hangs even if i remove all
>>>> the code from s390vx8_gen_syndrome().
>>>>
>>>> Tracing the execution of TB's, i see that the generated code is always
>>>> jumping between a few TB's, but never exiting the TB's to check for
>>>> interrupts (i.e. return to cpu_tb_exec(). I only see calls to
>>>> helper_lookup_tb_ptr to lookup the tb pointer for the next TB.
>>>>
>>>> The raid6 code is waiting for some time to expire by reading jiffies,
>>>> but interrupts are never processed and therefore jiffies doesn't change.
>>>> So the raid6 code hangs forever.
>>>>
>>>> As a test, i made a quick change to test:
>>>>
>>>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>>>> index c997c2e8e0..35819fd5a7 100644
>>>> --- a/accel/tcg/cpu-exec.c
>>>> +++ b/accel/tcg/cpu-exec.c
>>>> @@ -319,7 +319,8 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>>>>      cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>>>
>>>>      cflags = curr_cflags(cpu);
>>>> -    if (check_for_breakpoints(cpu, pc, &cflags)) {
>>>> +    if (check_for_breakpoints(cpu, pc, &cflags) ||
>>>> +        unlikely(qatomic_read(&cpu->interrupt_request))) {
>>>>          cpu_loop_exit(cpu);
>>>>      }
>>>>
>>>> And that makes the problem go away. But i'm not familiar with the TCG
>>>> internals, so i can't say whether the generated code is incorrect or
>>>> something else is wrong. I have tcg log files of a failing + working run
>>>> if someone wants to take a look. They are rather large so i would have to
>>>> upload them somewhere.
>>>
>>> Whatever is setting cpu->interrupt_request should be calling
>>> cpu_exit(cpu) which sets the exit flag which is checked at the start of
>>> every TB execution (see gen_tb_start).
>>
>> Thanks, that was very helpful. I added debugging and it turned out
>> that the TB is left because of a pending irq. The code then calls
>> s390_cpu_exec_interrupt:
>>
>> bool s390_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
>> {
>>     if (interrupt_request & CPU_INTERRUPT_HARD) {
>>         S390CPU *cpu = S390_CPU(cs);
>>         CPUS390XState *env = &cpu->env;
>>
>>         if (env->ex_value) {
>>             /* Execution of the target insn is indivisible from
>>                the parent EXECUTE insn.  */
>>             return false;
>>         }
>>         if (s390_cpu_has_int(cpu)) {
>>             s390_cpu_do_interrupt(cs);
>>             return true;
>>         }
>>         if (env->psw.mask & PSW_MASK_WAIT) {
>>             /* Woken up because of a floating interrupt but it has already
>>              * been delivered. Go back to sleep. */
>>             cpu_interrupt(CPU(cpu), CPU_INTERRUPT_HALT);
>>         }
>>     }
>>     return false;
>> }
>>
>> Note the 'if (env->ex_value) { }' check. It looks like this function
>> just returns false in case tcg is executing an EX instruction. After
>> that the information that the TB should be exited because of an
>> interrupt is gone. So the TB's are never exited again, although the
>> interrupt wasn't handled. At least that's my assumption now, if i'm
>> wrong please tell me.
>
> Looking at the code i see CF_NOIRQ to prevent TB's from getting
> interrupted. But i only see that used in the core tcg code. Would
> that be a possibility, or is there something else/better?

Yes, CF_NOIRQ is exactly the compile flag you would use to prevent a
block from exiting early when you absolutely want to execute the next
block. We currently only use it from core code to deal with icount
related things but I can see its use here. I would probably still wrap
it in a common function in cpu-exec-common.c.

I'm unsure of the exact semantics for s390 so I will defer to Richard
and others but something like (untested):

  /*
   * Ensure the next N instructions are not interrupted by IRQ checks.
   */
  void cpu_loop_exit_unint(CPUState *cpu, uintptr_t pc, int len)
  {
      if (pc) {
          cpu_restore_state(cpu, pc, true);
      }
      cpu->cflags_next_tb = len | CF_LAST_IO | CF_NOIRQ | curr_cflags(cpu);
      cpu_loop_exit(cpu);
  }

And then in HELPER(ex) you can end the helper with:

  void HELPER(ex)(CPUS390XState *env, uint32_t ilen, uint64_t r1, uint64_t addr)
  {
      ...

      /*
       * We must execute the next instruction exclusively so exit the loop
       * and trigger a NOIRQ TB which won't check for an interrupt until
       * it finishes executing.
       */
      cpu_loop_exit_unint(cpu, 0, 1);
  }

Some notes:

  * Take care to ensure the CPU state is synchronised

    Which means the helper cannot use the flags
    TCG_CALL_NO_(READ_GLOBALS|WRITE_GLOBALS|SIDE_EFFECTS). And you
    will need to make sure you write the current PC in the tcg gen code
    in op_ex()

  * I think the env->ex_value can be removed after this

  * We will actually exit the execution loop (via a siglongjmp) but the
    IRQ check in cpu_handle_interrupt() will be skipped due to the custom
    flags. When the next block is looked up (or generated) it will be
    entered but then immediately exit

  * I think even a branch to self should work because the second
    iteration will be interruptible

> Sorry for the dumb questions, i'm not often working on qemu ;-)

There are no dumb questions, just opportunities for better documentation ;-)

-- 
Alex Bennée
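
To make the last two notes above a bit more concrete, here is a toy
model of the one-shot NOIRQ idea in plain C. It is not QEMU code and
does not use any QEMU API: ToyCPU, TOY_CF_NOIRQ and toy_run() are
made-up names, the flag value is arbitrary, and each loop iteration
simply stands in for one TB. All it tries to show is the ordering: the
block after the "EX" block skips the interrupt check exactly once, and
the check applies again to the block after that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only, NOT QEMU's real CF_NOIRQ constant. */
#define TOY_CF_NOIRQ 0x1u

/* Hypothetical stand-in for the bits of CPUState that matter here. */
typedef struct {
    bool irq_pending;        /* models cpu->interrupt_request */
    uint32_t cflags_next_tb; /* one-shot flags for the next block, 0 = none */
} ToyCPU;

/*
 * Run up to nblocks blocks. Block number ex_block raises an interrupt
 * while it runs and, if protect is true, asks for the following block
 * to run with the interrupt check suppressed (the role played by
 * cpu_loop_exit_unint() in the sketch above).
 *
 * Returns the index of the block before which the interrupt was
 * delivered, or -1 if it was never delivered.
 */
int toy_run(int nblocks, int ex_block, bool protect)
{
    ToyCPU cpu = { .irq_pending = false, .cflags_next_tb = 0 };

    assert(nblocks >= 0 && ex_block >= 0);

    for (int i = 0; i < nblocks; i++) {
        /* Consume the one-shot flags; they apply to this block only. */
        uint32_t cflags = cpu.cflags_next_tb;
        cpu.cflags_next_tb = 0;

        /* Interrupt check at block entry, skipped for a NOIRQ block. */
        if (!(cflags & TOY_CF_NOIRQ) && cpu.irq_pending) {
            return i;
        }

        /* "Execute" block i. */
        if (i == ex_block) {
            cpu.irq_pending = true;
            if (protect) {
                cpu.cflags_next_tb = TOY_CF_NOIRQ;
            }
        }
    }
    return -1;
}
```

With four blocks and the interrupt raised in block 1,
toy_run(4, 1, false) delivers it before block 2, while
toy_run(4, 1, true) lets block 2 run first and delivers it before
block 3.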