
Research on Intelligent Speech Technology for Criminal Investigation

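Ding Pan (丁盼). Research on Intelligent Speech Technology for Criminal Investigation [D]. Southwest University of Political Science and Law, 2023.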

  • dc.title
  • 侦查智能语音技术研究
  • dc.title
  • Research on Intelligent Speech Technology for Criminal Investigation
  • dc.contributor.schoolno
  • B20200301Z3115
  • dc.contributor.author
  • Ding Pan (丁盼)
  • dc.contributor.affiliation
  • School of Criminal Investigation
  • dc.contributor.degree
  • Doctorate
  • dc.contributor.childdegree
  • Doctor of Law
  • dc.contributor.degreeConferringInstitution
  • Southwest University of Political Science and Law
  • dc.identifier.year
  • 2023
  • dc.contributor.direction
  • Evidence Technology and Methods
  • dc.contributor.advisor
  • Zhang Cuiling (张翠玲)
  • dc.contributor.advisorAffiliation
  • School of Criminal Investigation
  • dc.language.iso
  • Chinese
  • dc.subject
  • 智能化;侦查技术;智能语音技术;深度学习;人工智能
  • dc.subject
  • Intelligence;Investigation Technology;Intelligent Speech Technology;Deep Learning;Artificial Intelligence
  • dc.description.abstract
  • 当前,智能语音技术作为人工智能领域应用最广泛的技术之一,在教育、医疗、金融服务等领域中已经具有相对成熟的应用模式。在人工智能时代的背景下,智能语音技术迎来了更多发展与创新机遇,其应用场景也在不断拓展和延伸,同时也为刑事侦查领域智能化技术的应用注入了新的科技力量。侦查工作的高效展开离不开先进技术的创新应用,在当前非接触性犯罪持续高发的态势下,智能语音技术在侦查领域中的应用与发展也迎来了新的契机。侦查智能语音技术是指应用于侦查领域、服务于侦查破案工作的与智能语音技术相关的全部方法和手段,不仅包含了适用于侦查场景下的具体智能语音技术及其实施方法,而且还涵盖了利用智能语音技术达成侦查目的采用的所有手段和方式。作为侦查领域的前沿科技发展成果,侦查智能语音技术为侦查工作提供了先进的技术手段,促进了人工智能与侦查技术领域的创新融合。为了提升我国侦查智能语音技术的实战应用水平和促进侦查技术领域的科技创新,有必要构建侦查智能语音技术的运行逻辑体系,以提高我国侦查智能语音技术的应用水平,促进智能语音技术与侦查实战的深度融合。本研究从智能语音技术及其应用出发,论证了智能语音技术与现阶段侦查领域需求的契合性,然后对侦查智能语音技术进行了概述,并且结合侦查智能语音技术在实务工作中的运行情况,对其在推进过程中面临的多维困境和成因进行了深入解析,最后从理论、制度、方法和实践维度对侦查智能语音技术的优化路径进行了阐述。全文共分为五章:第一章对智能语音技术及应用进行了详细论述,涉及智能语音技术的概念、基本种类、发展沿革和发展驱动因素以及其在侦查中的应用进展。首先明确了智能语音技术的概念和基本种类,然后基于智能语音技术的基本种类对其发展沿革进行了回顾,并提出了智能语音技术发展的驱动因素有语音数据量与质的提升、人工智能算法的优化以及计算和存储资源的升级,最后从国内外侦查工作需求与智能语音技术的契合性角度切入,论述了构建面向侦查的智能语音技术运行逻辑体系的合理性和必要性。智能语音技术不仅是侦查智能语音技术的构成基础,而且可以为侦查智能语音技术的应用提供方法支撑,在侦查领域中具有较强的适用性,并且与现阶段的侦查工作需求具有高度契合性。无论从打击新型犯罪的有效性方面考虑,还是从提升侦查工作整体效能的角度出发,智能语音技术在侦查领域的应用都具有较高的合理性和必要性,构建面向侦查领域的智能语音技术应用体系也是迎合侦查实务需求的必然选择。第二章对侦查智能语音技术进行了界定,并对该技术的实施原理、适用原则与应用价值以及实施基础进行了论证。侦查智能语音技术是现代科技环境下侦查实践的产物,是智能语音技术与侦查工作的创新结合,侦查智能语音技术不仅包含了适用于侦查场景下的具体智能语音技术及其实施方法,而且还涵盖了利用智能语音技术达成侦查目的采用的所有手段和方式。侦查智能语音技术的基本种类可分为自动说话人识别、智能音频分析、智能话者人身分析、智能语音处理和智能语音转写,其应用原理不仅涉及侦查学领域的信息转移原理和同一认定原理,而且包含了统计学领域的统计推断原理。侦查智能语音技术的适用原则包括目的正当原则、最佳效益原则、技术引领原则和风险预防原则,其具有拓展侦查模式、丰富侦查技术、提升侦查效率和降低侦查成本的适用价值。从理论、制度、方法、政策层面看,侦查智能语音技术在实务工作中已具备基础实施条件。第三章对侦查智能语音技术在实务中的应用情况进行了详细论述。从国内不同级别、不同地区的公安机关语音数据库和语音检验实验室建设情况入手,分析了侦查智能语音技术的基础建设进展,然后对侦查智能语音技术的运行需要的主体资源、设备资源和制度资源配置进行了总结,并阐述了侦查智能语音技术在实务案件中的应用场景,最后基于实务工作中的典型案例开展了实证分析,并归纳总结了侦查智能语音技术在实务中的实施要点。我国公安机关语音数据库和语音检验实验室的建设已取得显著成效,但不同经济发展水平地区的公安机关在侦查智能语音技术基础建设质量方面存在一定差异。当前,侦查智能语音技术已逐步应用于侦查实战中,并且其在侦查领域中的适用场景正在不断延伸。在实际案件中适用侦查智能语音技术时,注重涉案语音指向的线索、合理选择技术实施路径、扎实开展语音数据采集工作和发挥语音在案件串并中的作用是充分发挥技术效能的有效举措。第四章对侦查智能语音技术在推进工作中面临的多维困境和成因进行了深入剖析。侦查智能语音技术的推进工作在理论、制度、方法和实践维度都面临着诸多困境。在理论维度面临着缺乏成熟的学科交叉理论体系构建范式、学科发展依托的基础理论缺少多元性的困境,该困境的主要成因是不同学科的理论体系框架存在差异以及未实现理论创新与实践创新的良性互动;在制度维度面临着视听资料取证规则要求存在模糊性、缺少基于智能语音技术的检验鉴定规范以及现有语音证据评价模
式主观性较强的困境;在方法维度面临着语音数据质与量不达标、语音处理工作效果不理想以及深度学习模型改进难度高的困境,模型训练需要的语音数据资源获取困难、超大规模语音数据处理成本高昂以及算法模型在侦查场景中的适用性降低是形成该困境的主要成因;在实践维度,面临着专业技术人才队伍建设不规范、刑事技术部门业务分工不明确和公安机关内部语音数据资源相互独立的困境,该困境的主要成因是侦查部门专业化人才极度匮乏、地区经济发展不平衡导致资源分配差异化和缺乏科学的语音数据治理机制。第五章提出了侦查智能语音技术发展的多维优化路径。结合侦查智能语音技术的实务应用情况和现实面临的多维困境,从我国的基本国情出发,分别从侦查智能语音技术应用的理论体系、制度规范、方法效能和实践运用维度提出了优化路径。在理论体系构建方面,可以通过借鉴学科交叉理论体系的构建思路、关注实践工作与理论创新的结合点以及聚焦证据评价领域对科学性的诉求达成目标;在制度规范建设方面,可以考虑从健全音频资料取证规则的内容体系、制定以智能化技术为基础的音频资料检验新规范和贯彻语音证据评价的科学性要求三个角度对侦查智能语音技术应用的配套制度进行完善;在方法效能提升方面,可以从拓宽语音数据资源的获取渠道、提升语音处理的智能化和自动化水平以及提高深度学习算法模型与侦查需求的适配性三个方面促进侦查智能语音技术发挥出最优效能;在实务应用推进方面,可以通过构建科学化人才培养机制、加强资源分配的理性引导、提升资源的协同运用效果以及推动科技与实战的深度融合,进一步推进侦查智能语音技术在实战中的应用。
  • dc.description.abstract
  • Intelligent speech technology is currently one of the most widely applied technologies in the field of artificial intelligence and already has relatively mature application models in domains such as education, healthcare, and financial services. In the era of artificial intelligence, it faces more opportunities for development and innovation, and its application scenarios continue to expand, bringing new technological capabilities to the intelligent transformation of criminal investigation. Efficient investigative work depends on the innovative application of advanced technology, and the persistently high incidence of non-contact crime creates new opportunities for applying and developing intelligent speech technology in the investigative field. Intelligent speech technology for criminal investigation refers to all methods and means related to intelligent speech technology that are applied in the investigative field and serve investigation and case solving. It covers not only the specific intelligent speech technologies suited to investigative scenarios and their implementation methods, but also all means and approaches that use intelligent speech technology to achieve investigative objectives. As a frontier technological development in the investigative field, intelligent speech technology for criminal investigation provides advanced technical means for investigative work and promotes the innovative integration of artificial intelligence with investigative technology. To improve the practical application of this technology in China and to promote innovation in investigative technology, it is necessary to construct an operational logic system for intelligent speech technology for criminal investigation. 
Doing so will raise the level of application of intelligent speech technology for criminal investigation in China and deepen its integration with investigative practice. Starting from intelligent speech technology and its applications, this study demonstrates that the technology fits the current needs of the investigative field. It then gives an overview of intelligent speech technology for criminal investigation and, drawing on how the technology operates in practice, analyzes the multidimensional difficulties it faces and their causes. Finally, it sets out optimization paths from the theoretical, institutional, methodological, and practical dimensions. The thesis is divided into five chapters. Chapter 1 discusses intelligent speech technology and its applications in detail, covering its concept, basic types, historical development, driving factors, and progress in investigative applications. It first defines the concept and basic types of intelligent speech technology and then reviews its development type by type. It identifies three driving factors: improvements in the quantity and quality of speech data, optimization of artificial intelligence algorithms, and upgrades in computing and storage resources. Finally, starting from the fit between investigative needs in China and abroad and intelligent speech technology, it argues that constructing an operational logic system for investigation-oriented intelligent speech technology is both reasonable and necessary. Intelligent speech technology is not only the foundation of intelligent speech technology for criminal investigation but also provides methodological support for its application. 
It is highly applicable in the investigative field and closely matches current investigative needs. Whether considered in terms of effectively combating new types of crime or improving the overall effectiveness of investigative work, applying intelligent speech technology in the investigative field is both reasonable and necessary, and building an investigation-oriented application system is an inevitable response to the needs of investigative practice. Chapter 2 defines intelligent speech technology for criminal investigation and discusses its implementation principles, principles of application, application value, and implementation basis. Intelligent speech technology for criminal investigation is a product of investigative practice in the modern technological environment and an innovative combination of intelligent speech technology with investigative work. It covers not only the specific intelligent speech technologies suited to investigative scenarios and their implementation methods, but also all means and approaches that use intelligent speech technology to achieve investigative objectives. Its basic types are automatic speaker recognition, intelligent audio analysis, intelligent speaker profiling, intelligent speech processing, and intelligent speech transcription. 
Its underlying principles draw not only on the information transfer principle and the principle of identity determination from criminal investigation science but also on the principle of statistical inference from statistics. The principles governing its use are legitimate purpose, optimal benefit, technological leadership, and risk prevention. Its value lies in expanding investigative models, enriching investigative techniques, improving investigative efficiency, and reducing investigative costs. At the theoretical, institutional, methodological, and policy levels, the basic conditions for implementing the technology in practice are already in place. Chapter 3 discusses in detail the application of intelligent speech technology for criminal investigation in practice. Starting from the construction of voice databases and voice examination laboratories in public security agencies at different levels and in different regions of China, it analyzes progress in the technology's basic infrastructure. It then summarizes the allocation of personnel, equipment, and institutional resources needed to operate the technology, describes its application scenarios in real cases, and conducts an empirical analysis based on typical cases from practice. 
The chapter concludes by summarizing the key points for implementing intelligent speech technology for criminal investigation in practice. The construction of voice databases and voice examination laboratories in China's public security agencies has achieved significant results, but the quality of this infrastructure differs among regions at different levels of economic development. Intelligent speech technology for criminal investigation is gradually being applied in investigative operations, and its applicable scenarios continue to expand. In actual cases, effective ways to make full use of the technology are to attend to the leads indicated by case-related speech, choose an appropriate technical implementation path, carry out voice data collection thoroughly, and use speech to link related cases. Chapter 4 analyzes in depth the multidimensional difficulties faced in advancing intelligent speech technology for criminal investigation and their causes. The advancement of the technology faces difficulties in the theoretical, institutional, methodological, and practical dimensions. In the theoretical dimension, it lacks a mature paradigm for constructing interdisciplinary theoretical systems, and the foundational theories on which the discipline depends lack diversity. 
These challenges arise mainly from differences among the theoretical frameworks of different disciplines and from the failure to achieve a positive interaction between theoretical and practical innovation. In the institutional dimension, the rules for collecting audio-visual materials as evidence are ambiguous, examination and appraisal standards based on intelligent speech technology are lacking, and the existing model of voice evidence evaluation is highly subjective. In the methodological dimension, the quality and quantity of voice data fall short, voice processing performs poorly, and deep learning models are difficult to improve; the main causes are the difficulty of obtaining voice data for model training, the high cost of processing very large voice datasets, and the reduced applicability of algorithmic models in investigative scenarios. In the practical dimension, the difficulties are non-standardized development of the professional technical workforce, an unclear division of work within forensic technology departments, and voice data resources that remain isolated from one another within public security agencies; the main causes are an acute shortage of specialized personnel in investigative departments, uneven regional economic development leading to differences in resource allocation, and the lack of a scientific voice data governance mechanism. Chapter 5 proposes multidimensional optimization paths for the development of intelligent speech technology for criminal investigation. Drawing on the technology's practical application and the multidimensional difficulties it faces, and proceeding from China’s basic national conditions, the chapter proposes optimization paths for the theoretical system, institutional norms, methodological effectiveness, and practical use of the technology. 
For constructing the theoretical system, the chapter suggests drawing on approaches to building interdisciplinary theoretical systems, attending to the points where practical work and theoretical innovation meet, and focusing on the demand for scientific rigor in evidence evaluation. For institutional norms, it suggests improving the supporting institutions in three ways: completing the content of the rules for collecting audio materials as evidence, formulating new audio examination standards based on intelligent technology, and implementing the requirement that voice evidence evaluation be scientific. For methodological effectiveness, it suggests broadening the channels for obtaining voice data resources, raising the level of intelligence and automation in voice processing, and improving the fit between deep learning models and investigative needs. For practical application, it recommends building a scientific talent training mechanism, strengthening rational guidance of resource allocation, improving the coordinated use of resources, and promoting the deep integration of technology and operational practice, so as to further advance the application of intelligent speech technology for criminal investigation in operational practice.
  • dc.date.issued
  • 2023-11-23
  • dc.date.oralDefense
  • 2023-11-17