High Performance Secondary Index Design for Complex Queries
First published: 2018-04-28
Abstract: Traditional relational databases struggle to cope with the massive volume of data and attributes produced by the thousands of sensors in a smart grid. HBase, a column-oriented key-value store from the newer generation of big data platforms, has therefore gradually become a mainstream database for such platforms. However, although HBase can support large data volumes, it still has difficulty meeting the demands of power grid data analysis, where queries are frequent and response times must be short. In this study, we propose a query-oriented secondary index scheme that accelerates queries. Experimental results show that, compared with the classic secondary index scheme, our scheme achieves speedups of 1.026 to 4.761 times for join queries over two tables, and 1.797 to 8.581 times for join queries over three tables. After further optimization, the scheme can also save a substantial amount of index table storage space. The proposed secondary index scheme performs well in terms of both query performance and storage efficiency.
Keywords: big data; secondary index; smart grid