Paper Title
Providing More Efficient Access To Government Records: A Use Case Involving Application of Machine Learning to Improve FOIA Review for the Deliberative Process Privilege
Authors
Abstract
At present, the review process for material that is exempt from disclosure under the Freedom of Information Act (FOIA) in the United States of America, and under many similar government transparency regimes worldwide, is entirely manual. Public access to government records is thus inhibited by the long backlogs of material awaiting such reviews. This paper studies one aspect of that problem by first creating a new public test collection with annotations for one class of exempt material, the deliberative process privilege, and then using that test collection to study the ability of current text classification techniques to identify materials that are exempt from release under that privilege. Results show that when the system is trained and evaluated using annotations from the same reviewer, even difficult cases can often be reliably detected, but that differences in reviewer interpretations, differences in record custodians, and differences in the topics of the records used for training and testing pose additional challenges.