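Improved YOLOv8 for Table Row, Column, and Cell Structure Detection
First published: 2024-05-29
Abstract: Digital office documents contain large amounts of tabular data, so demand for intelligent table structure recognition is growing rapidly. However, table structures are tightly connected and their types are complex and varied, which makes table structure detection very difficult. To address this problem, this work builds on YOLOv8 and proposes a new method for detecting table rows, columns, and cells, using the ICDAR19-cTDaR table cell structures and the TabStructDB table row and column structures as experimental subjects. First, deformable convolution (deformable convolution network, DCN) is introduced to strengthen feature extraction for table cells, rows, and columns. Second, spatial and channel reconstruction convolution (SCConv) is introduced; this convolution not only extracts features effectively but also reduces redundant features, lowering complexity and computational cost. Based on these two convolutions, a new module, the DSC module, is designed to replace the Bottleneck module in C2f; the resulting module is named C2fDSC. In addition, to further strengthen the extraction of local corner features of table structures, an explicit visual center feature adjustment (EVC) module is added to the YOLOv8 backbone network. Finally, the loss function of the original model is replaced with MPDIoU; for dense-object regression, the MPDIoU loss yields more accurate and more efficient bounding box regression than the original model's loss function. Experimental results show that the proposed table structure detection algorithm achieves state-of-the-art (SOTA) results on the ICDAR19-cTDaR dataset, with cell precision, recall, and F1 of 91.7%, 82.3%, and 86.7%, respectively, and that it also achieves practically useful performance on row and column detection on the TabStructDB dataset.
Keywords: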
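As a quick consistency check on the reported metrics, F1 is the harmonic mean of precision and recall. The snippet below is only an illustrative sketch; the function name `f1_score` is a hypothetical helper, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for ICDAR19-cTDaR cells.
reported_f1 = f1_score(0.917, 0.823)
print(f"{reported_f1:.3f}")  # prints 0.867, matching the reported F1 of 86.7%
```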
Table Row, Column, and Cell Structure Detection Based on YOLOv8
Abstract: Digital office documents contain large amounts of tabular data, so demand for intelligent table recognition is growing. However, table structures are tightly connected and include complex types such as embedded, merged, and borderless cells, which makes table structure detection difficult. To address this problem, this paper builds on YOLOv8 and proposes a new method for detecting table rows, columns, and cells, using the ICDAR19-cTDaR table cell structures and the TabStructDB table row-column structures as experimental subjects. First, to enhance feature extraction for table cells, rows, and columns, the paper introduces the deformable convolution network (DCN). Second, it introduces spatial and channel reconstruction convolution (SCConv), which not only has strong feature extraction capability but also reduces redundant features, lowering complexity and computational cost. Based on these convolutions, a new module, the DSC module, is designed to replace the Bottleneck module in C2f; the resulting module is named C2fDSC. In addition, to further enhance the extraction of local corner features of the table structure, an explicit visual center feature adjustment (EVC) module is added to the YOLOv8 backbone network. Finally, the loss function of the original model is replaced with MPDIoU; for dense-object regression, bounding box regression with the MPDIoU loss is more accurate and efficient than with the original loss function. Experiments show that the proposed algorithm achieves the best detection results reported so far on the ICDAR19-cTDaR dataset, with cell precision, recall, and F1 of 91.7%, 82.3%, and 86.7%, respectively, and that it also achieves practically useful performance on row and column detection on the TabStructDB dataset.
Keywords:
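To make the MPDIoU loss mentioned in the abstract more concrete, here is a minimal sketch for a single pair of boxes. It assumes the commonly published definition of MPDIoU (IoU minus the squared top-left and bottom-right corner distances, each normalized by the squared input image diagonal). The function name, the `(x1, y1, x2, y2)` box format, and the `eps` term are assumptions made for illustration; this is not the authors' implementation.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    """Sketch of an MPDIoU-style loss for two axis-aligned boxes.

    pred, target: (x1, y1, x2, y2) with (x1, y1) top-left and (x2, y2) bottom-right.
    img_w, img_h: input image size, used to normalize the corner distances.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU; eps guards against division by zero.
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


# Identical boxes give a loss near 0; disjoint boxes give a loss above 1,
# so the corner terms still provide a signal when IoU is zero.
print(mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100))
print(mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100))
```

Because the corner-distance terms remain nonzero for non-overlapping boxes, a loss of this form can still distinguish near misses from distant ones, which is one plausible reason it suits densely packed targets such as table cells.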