全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...
Information  2012 

Enhancing the Search in MOLAP Sparse Data

DOI: 10.3390/info3040661

Keywords: data warehousing, MOLAP, bitmap compression, hashing

Full-Text   Cite this paper   Add to My Lib

Abstract:

Multidimensional on-line analytical processing (MOLAP) systems deal well with dense data than relational ones (ROLAP). In the existence of sparse data, MOLAP systems become memory consuming, which may limit and slow down data processing tasks. Many compression techniques have been proposed to deal with the sparsity of data in MOLAP systems. One of these techniques is the bitmap compression, which allows a significant reduction of the memory space used for data processing. In this article, we propose an extension to the bitmap compression technique by storing the compressed data as bits into multiple efficient data structures based on a new indexing strategy instead of the linear structure. Compared with the classical bitmap, the proposed enhancement not only allows space reduction but also reduces the search time through the compressed data. We present some algorithms that allow maintaining and searching within the compressed structure without the need for decompression. We demonstrate that the complexity of the proposed algorithms varies from logarithmic to constant, compared with the linear complexity of the classical bitmap technique.

References

[1]  Inmon, W.H. The data warehouse environment. In the Building the Data Warehouse, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2002; pp. 31–77.
[2]  Kimball, R.; Ross, M. Dimensional modeling primer. In the Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2002; pp. 1–27.
[3]  Li, J.; Rotem, D.; Wong, H. A new compression method with fast searching on large databases. In Very Large Data Bases: Proceedings of the Thirteenth International Conference on Very Large Data Bases, Brighton, England, September 1—4 1987; Morgan Kaufmann: San Francisco, CA, USA, 1987; pp. 311–318.
[4]  Moffat, A.; Zobel, J. Parameterised compression for sparse bitmaps. In Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Copenhagen, Denmark, June 21–24, 1992; Belkin, N.J., Ingwersen, P., Pejtersen, A.M., Eds.; ACM Press: New York, NY, USA, 1992; pp. 274–285.
[5]  Chan, C.Y.; Ioannidis, Y.E. Bitmap index design and evaluation. Sigmod Rec. 1998, 34, 355–366.
[6]  Vaidyanathan, J.K.; Yang, G.; Agrawal, G. Communication and memory optimal parallel data cube construction. IEEE Trans. Parallel Distrib. Syst. 2005, 16, 1105–1119, doi:10.1109/TPDS.2005.144.
[7]  Ester, M.; Kohlhammer, J.; Kriegel, H.P. The DC-tree: A fully dynamic index structure for data warehouses. In Proceedings of the 16th International Conference on Data Engineering, San Diego, CA, USA, 29 February–3 March 2000; pp. 379–388.
[8]  Allen, B.; Munro, I. Self-organizing binary search trees. J. ACM 1978, 25, 526–535, doi:10.1145/322092.322094.
[9]  Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Binary search trees. In Introduction to Algorithms, 2nd ed.; MIT Press: Cambridge, MA, USA, 1990; pp. 253–272.
[10]  Zalaket, J. Speed up the search in bitmap based compressed sparse arrays. In Proceedings of the International Conference on Information Management and Engineering, Kuala Lumpur, Malaysia, 3–5 April 2009; pp. 142–146.
[11]  In this example, the indexes are arriving in an ascending order, but they can arrive in any order without affecting our goal which is obtaining a compressed structure.
[12]  Adelson-Velskii, G.; Landis, E.M. An algorithm for the organization of information. Sov. Math. Dokl. 1962, 146, 1259–1263. Translated by Ricci, M.J.
[13]  Data can be modified or deleted from dimension MOLAP tables, but here our balanced BST is representing facts which in general are not deleted directly from the fact table but canceled by adding negative entries when it is necessary.
[14]  Elmasri, R.; Navathe, S. Fundamentals of Database Systems, 2nd ed.; Addison Wesley: Boston, MA, USA, 2010; pp. 646–659.
[15]  Fusco, F.; Vlachos, M.; Stoecklin, M. Real-time creation of bitmap indexes on streaming network data. VLDB J. 2012, 21, 287–307, doi:10.1007/s00778-011-0242-x.
[16]  Dichotomous search is applied into the sorted root vector in our implementation which has a logarithmic time compared with the linear time of the invoked one which is illustrated for simplicity reason.
[17]  The compression ratio increases when we increase the amount of data. The same benchmarks of Table 1 are used for the calculation of compression ratio.

Full-Text

comments powered by Disqus

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133