Date: Tue, 14 Jan 2025 18:29:58 +0530
From: Joey Jiao
To: Marco Elver
CC: Dmitry Vyukov, Andrey Konovalov, Jonathan Corbet, Andrew Morton, Dennis Zhou, Tejun Heo, Christoph Lameter, Catalin Marinas, Will Deacon
Subject: Re: [PATCH 0/7] kcov: Introduce New Unique PC|EDGE|CMP Modes
References: <20250114-kcov-v1-0-004294b931a2@quicinc.com>

On Tue, Jan 14, 2025 at 11:43:08AM +0100, Marco Elver wrote:
> On Tue, 14 Jan 2025 at 06:35, Jiao, Joey wrote:
> >
> > Hi,
> >
> > This patch series introduces new kcov unique modes:
> > `KCOV_TRACE_UNIQ_[PC|EDGE|CMP]`, which are used to collect unique PC, EDGE,
> > CMP information.
> >
> > Background
> > ----------
> >
> > In the current kcov implementation, when `__sanitizer_cov_trace_pc` is hit,
> > the instruction pointer (IP) is stored sequentially in an area.
> > Userspace programs then read this area to record covered PCs and calculate
> > covered edges. However, recent syzkaller runs show that many syscalls likely
> > have `pos > t->kcov_size`, leading to kcov overflow. To address this issue,
> > we introduce new kcov unique modes.
>
> Overflow by how much? How much space is missing?

Ideally we would report the pos, but the test in syzkaller only counts how
many times the overflow occurs.

I guess the pos is much bigger than the cover size. Originally the cover
size was 64KB and the overflow happened. syzkaller now sets it to 1MB, but
I still see 3535 overflows for the `ioctl$DMA_HEAP_IOCTL_ALLOC` syscall,
which has only 19 inputs. In my case the mmap syscall also overflows 10873
times with 181 inputs.

Internally, I also tried a 64MB cover size, and I still see overflows.

Running syz-execprog with -cover options shows that many PCs are hit
frequently, but disabling instrumentation for each of these PCs is
inefficient and sometimes does not fix the overflow problem at all.

I think the overflow happens more frequently on arm64 devices, because I
found that functions in header files are hit frequently there. I cannot
access the syzbot backend syz-manager data; perhaps the qemu x86_64 setup
has more info.

> >
> > Solution Overview
> > -----------------
> >
> > 1. [P 1] Introduce `KCOV_TRACE_UNIQ_PC` Mode:
> >    - Export `KCOV_TRACE_UNIQ_PC` to userspace.
> >    - Add `kcov_map` struct to manage memory during the KCOV lifecycle.
> >    - `kcov_entry` struct as a hashtable entry containing unique PCs.
> >    - Use hashtable buckets to link `kcov_entry`.
> >    - Preallocate memory using genpool during KCOV initialization.
> >    - Move `area` inside `kcov_map` for easier management.
> >    - Use `jhash` for hash key calculation to support `KCOV_TRACE_UNIQ_CMP`
> >      mode.
> >
> > 2. [P 2-3] Introduce `KCOV_TRACE_UNIQ_EDGE` Mode:
> >    - Save `prev_pc` to calculate edges with the current IP.
> >    - Add unique edges to the hashmap.
> >    - Use a lower 12-bit mask to make hash independent of module offsets.
> >    - Distinguish areas for `KCOV_TRACE_UNIQ_PC` and `KCOV_TRACE_UNIQ_EDGE`
> >      modes using `offset` during mmap.
> >    - Support enabling `KCOV_TRACE_UNIQ_PC` and `KCOV_TRACE_UNIQ_EDGE`
> >      together.
> >
> > 3. [P 4] Introduce `KCOV_TRACE_UNIQ_CMP` Mode:
> >    - Shares the area with `KCOV_TRACE_UNIQ_PC`, making these modes
> >      exclusive.
> >
> > 4. [P 5] Add Example Code Documentation:
> >    - Provide examples for testing different modes:
> >      - `KCOV_TRACE_PC`: `./kcov` or `./kcov 0`
> >      - `KCOV_TRACE_CMP`: `./kcov 1`
> >      - `KCOV_TRACE_UNIQ_PC`: `./kcov 2`
> >      - `KCOV_TRACE_UNIQ_EDGE`: `./kcov 4`
> >      - `KCOV_TRACE_UNIQ_PC|KCOV_TRACE_UNIQ_EDGE`: `./kcov 6`
> >      - `KCOV_TRACE_UNIQ_CMP`: `./kcov 8`
> >
> > 5. [P 6-7] Disable KCOV Instrumentation:
> >    - Disable instrumentation like genpool to prevent recursive calls.
> >
> > Caveats
> > -------
> >
> > The userspace program has been tested on Qemu x86_64 and two real Android
> > phones with different ARM64 chips. More syzkaller-compatible tests have
> > been conducted. However, due to limited knowledge of other platforms,
> > assistance from those with access to other systems is needed.
> >
> > Results and Analysis
> > --------------------
> >
> > 1. KMEMLEAK Test on Qemu x86_64:
> >    - No memory leaks found during the `kcov` program run.
> >
> > 2. KCSAN Test on Qemu x86_64:
> >    - No KCSAN issues found during the `kcov` program run.
> >
> > 3. Existing Syzkaller on Qemu x86_64 and Real ARM64 Device:
> >    - Syzkaller can fuzz, show coverage, and find bugs. Adjusting `procs`
> >      and `vm mem` settings can avoid OOM issues caused by genpool in the
> >      patches, so `procs:4 + vm:2GB` or `procs:4 + vm:2GB` are used for
> >      Qemu x86_64.
> >    - `procs:8` is kept on Real ARM64 Device with 12GB/16GB mem.
> >
> > 4. Modified Syzkaller to Support New KCOV Unique Modes:
> >    - Syzkaller runs fine on both Qemu x86_64 and ARM64 real devices.
> >      Limited `Cover overflows` and `Comps overflows` observed.
> >
> > 5. Modified Syzkaller + Upstream Kernel Without Patch Series:
> >    - Not tested. The modified syzkaller will fall back to `KCOV_TRACE_PC`
> >      or `KCOV_TRACE_CMP` if `ioctl` fails for Unique mode.
> >
> > Possible Further Enhancements
> > -----------------------------
> >
> > 1. Test more cases and setups, including those in syzbot.
> > 2. Ensure `hash_for_each_possible_rcu` is protected for reentrance
> >    and atomicity.
> > 3. Find a simpler and more efficient way to store unique coverage.
> >
> > Conclusion
> > ----------
> >
> > These patches add new kcov unique modes to mitigate the kcov overflow
> > issue, compatible with both existing and new syzkaller versions.
>
> Thanks for the analysis, it's clearer now.
>
> However, the new design you introduce here adds lots of complexity.
> Answering the question of how much overflow is happening, might give
> better clues if this is the best design or not. Because if the
> overflow amount is relatively small, a better design (IMHO) might be
> simply implementing a compression scheme, e.g. a simple delta
> encoding.

I tried several ways to store the unique info: a bitmap, a segmented
bitmap, and a customized allocator plus an allocation index. I also
considered rhashmap. So far a hashmap (maybe rhashmap) looks like the
better option.

I also tried a full bitmap that records all PCs from all threads. It
recorded coverage that syzkaller never reported as new. If I replay the
syzkaller log (or prog), kernel GCOV also shows that these functions/lines
are hit (not because of flakiness or interrupts), but the syzkaller
coverage does not contain that data, which is further evidence of the
kcov overflow.

>
> Thanks,
> -- Marco
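
To illustrate why deduplication helps when a few PCs are hit very often, here is a minimal userspace sketch. It is not the code from the patches: it assumes a simple open-addressing set in place of the kernel hashtable, genpool and `jhash`, and every name in it (`seq_area`, `uniq_set`, `mix64`, `simulate_hot_loop`, the base address) is made up for the example. The sequential recorder only loosely follows the `KCOV_TRACE_PC` layout, where `area[0]` holds the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sequential recording, loosely modelled on KCOV_TRACE_PC: area[0] holds
 * the number of recorded PCs and every hit appends one word. Hits that do
 * not fit are counted here so the overflow becomes visible. */
struct seq_area {
	uint64_t *area;
	size_t size;     /* words in area, including area[0] */
	size_t dropped;  /* hits lost to overflow */
};

static void seq_record(struct seq_area *s, uint64_t pc)
{
	uint64_t pos = s->area[0] + 1;

	if (pos < s->size) {
		s->area[pos] = pc;
		s->area[0] = pos;
	} else {
		s->dropped++;
	}
}

/* Hypothetical unique set: open addressing with linear probing.
 * A slot value of 0 means "empty", so PCs are assumed to be nonzero. */
struct uniq_set {
	uint64_t *slots;
	size_t cap;      /* number of slots, must be a power of two */
	size_t count;    /* distinct PCs stored */
	size_t dropped;  /* distinct PCs that did not fit */
};

/* Simple 64-bit mixer used as the hash; an assumption, not jhash. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 33;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	return x;
}

static void uniq_record(struct uniq_set *u, uint64_t pc)
{
	size_t mask, i, probes;

	assert(u->cap != 0 && (u->cap & (u->cap - 1)) == 0);
	mask = u->cap - 1;
	i = (size_t)mix64(pc) & mask;

	for (probes = 0; probes < u->cap; probes++) {
		if (u->slots[i] == pc)
			return;                 /* already recorded */
		if (u->slots[i] == 0) {
			u->slots[i] = pc;       /* first time this PC is seen */
			u->count++;
			return;
		}
		i = (i + 1) & mask;
	}
	u->dropped++;                           /* table is full */
}

/* Feed both recorders the same trace: n_pcs distinct PCs hit in a loop. */
static void simulate_hot_loop(struct seq_area *s, struct uniq_set *u,
			      size_t n_pcs, size_t iterations)
{
	const uint64_t base = 0xffff800008000000ULL; /* made-up address */
	size_t it, k;

	for (it = 0; it < iterations; it++) {
		for (k = 0; k < n_pcs; k++) {
			uint64_t pc = base + 4 * k;

			seq_record(s, pc);
			uniq_record(u, pc);
		}
	}
}
```

With a 64-word area, a 64-slot set, 16 distinct PCs and 1000 iterations, the sequential area fills after 63 entries and drops the remaining 15937 hits, while the unique set ends up holding 16 entries with nothing dropped. A larger sequential area only postpones the overflow for this kind of trace.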