Back to Table of contents

Primeur weekly 2015-03-02

Special

ASCETiC project to reduce energy consumption of Cloud platforms ...

Focus

2014 Another year on the road to Exascale - An Interview with Satoshi Matsuoka and Thomas Sterling - Part II ...

Mont Blanc vision complements On the Road to Exascale interview ...

The Cloud

ISC Cloud & Big Data is now open for research paper submission ...

EuroFlash

Stalprodukt S.A. selects Cray XC30 supercomputer for modelling steel designs ...

Bright Cluster Manager now supports SUSE Linux Enterprise Server 12 ...

Asetek announces global OEM purchase agreement with Fujitsu ...

ADVA Optical Networking to demonstrate virtualization in radio access backhaul networks at Mobile World Congress ...

TDC deploys Oscilloquartz synchronization technology in National Danish Network ...

Queen's researchers in bid to develop world's fastest supercomputers ...

Supermicro expands embedded computing solutions with new wireless IoT gateway at Embedded World, Nürnberg ...

Ensuring security for networks of the future ...

USFlash

Cirrascale announces rackmount multi-device peering platform for highly parallel applications ...

Penguin Computing announces Scyld ClusterWare for Hadoop ...

Adaptive Computing appoints Marty Smuin as CEO ...

Innovative AMD FirePro server GPU supports intense compute workloads on HP ProLiant DL380 Gen9 servers ...

Mellanox ConnectX-4 100Gb/s Interconnect adapter delivers record performance results ...

Tohoku University and Fujitsu succeed in real-time flood analysis using supercomputer-based high-resolution tsunami modelling ...

Fujitsu M10 UNIX server achieves world-record performance once again on two-tier SAP SD standard application benchmark with 20% less CPU resources ...

Fujitsu develops column-oriented data-processing engine enabling fast, high-volume data analysis in database systems ...

AMD discloses architecture details of high-performance, energy-efficient "Carrizo" System-on-Chip ...

Undergraduate OSC researcher heading to UK ...

IBM and Juniper Networks partner to build smarter networks with predictive analytics ...

MySQL Cluster 7.4 now generally available ...

Intersect360 Research releases predictions for HPC in 2015 ...

Fujitsu develops column-oriented data-processing engine enabling fast, high-volume data analysis in database systems


26 Feb 2015 Kawasaki - Fujitsu Laboratories Ltd. has developed a column-oriented data-storage and processing engine that enables fast analysis of large volumes of data in a database system. In recent years, column-oriented databases have emerged as a faster way to read and analyze large volumes of data, complementing existing row-oriented databases, which are better suited to handling data updates. Existing approaches, however, have suffered from one of two problems: either changes to row-oriented data cannot be automatically reflected in the column-oriented data, or the size of the column-oriented data is constrained by installed memory.

Fujitsu has developed an engine that runs on the open-source PostgreSQL database, does not depend on memory capacity, instantly updates column-oriented data in response to changes in row-oriented data, and processes column-oriented data quickly. The engine quickly analyzes indexes, which most database systems provide, and developers can use it without regard to whether the storage method is row-oriented or column-oriented. With a parallel-processing engine designed specifically for column-oriented data, analyses on a single CPU core run 4 times faster than before, and a single server equipped with 15 CPU cores can run analyses at least 50 times faster.

Even on smaller computer systems with little memory, this technology enables real-time data analysis reflecting the latest data.

Details of this technology are being presented at the Seventh Forum on Data Engineering and Information Management (DEIM 2015), opening March 2 in Koriyama, Fukushima.

Database systems can efficiently return processing results to a terminal, a mode of operation known as online transaction processing (OLTP), and are widely used to process changes to data, for example when storing and using data from business systems.

In recent years, demand has grown for high-volume data analysis that is fast and available on demand, creating a need for a single database system that can handle OLTP and high-volume data analysis simultaneously. Row-oriented data is best suited to OLTP, whereas column-oriented data is better for data analysis but gets bogged down when processing changes to data. One relatively recent solution is to store both row-oriented and column-oriented data as a way to accelerate analyses. With previous technologies, however, changes to the row-oriented data are not automatically reflected in the column-oriented data, and memory constraints are also a problem.
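The difference between the two layouts can be illustrated with a minimal sketch. The table, column names, and values below are hypothetical and are not taken from Fujitsu's announcement; the point is only that an update touches one record in the row layout, while an aggregate reads one contiguous list in the column layout.

```python
# Hypothetical three-row "orders" table; names and values are illustrative only.
rows = [
    {"order_id": 1, "customer": "A", "amount": 120},
    {"order_id": 2, "customer": "B", "amount": 80},
    {"order_id": 3, "customer": "A", "amount": 200},
]

# Row-oriented layout: all fields of a record sit together,
# so an OLTP-style update changes a single record in place.
rows[1]["amount"] = 95

# Column-oriented layout: all values of one attribute sit together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An analytic query reads only the column it needs and skips the rest.
total_amount = sum(columns["amount"])
print(total_amount)
```

Note that `columns` here is rebuilt from `rows` after the update. Keeping the two representations in step without a full rebuild is the synchronization problem the article describes.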

Fujitsu has developed an engine for PostgreSQL open-source databases that instantly reflects updates to row-oriented data in the column-oriented data, stores column-oriented data without depending on memory capacity, and analyzes column-oriented data quickly. A new technique for managing column-oriented data allows massive volumes of it to be stored. The engine also enables high-speed analyses of the indexes that typical database systems provide, and can be used without regard to whether the data is stored as row-oriented or column-oriented. On the DBT-3 benchmark query for reading, filtering, and aggregating, the parallel-processing analysis engine, which has been optimized for column-oriented data, runs 4 times faster on a single CPU core than its predecessors. On a single server with 15 CPU cores, performance is at least 50 times faster.
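As a rough reading of these figures, the arithmetic below separates the single-core gain from the gain due to parallelism. It assumes, which the article does not state explicitly, that both speedups are measured against the same single-process baseline, so that the two factors multiply.

```python
# Figures quoted in the article.
single_core_speedup = 4    # 4 times faster on one CPU core
server_speedup = 50        # at least 50 times faster on a 15-core server
cores = 15

# Assumption: both figures share one baseline, so the remaining
# factor is what running on 15 cores contributes.
parallel_gain = server_speedup / single_core_speedup
parallel_efficiency = parallel_gain / cores

print(f"gain from parallelism: {parallel_gain:.1f}x")
print(f"parallel efficiency:   {parallel_efficiency:.0%}")
```

Under that assumption, the 15 cores contribute a factor of about 12.5, or roughly 83% of ideal linear scaling. Because the article says "at least 50 times", this is a lower bound.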

Key features of the technology are as follows:

1. Large-volume column-oriented data storage

To efficiently manage large volumes of column-oriented data that cannot fit into memory, data domains are managed in "extents": large increments (roughly 260,000 records) in which data domains are secured or deleted and free domains are reclaimed. Managing such large increments while analyses are running can result in long wait times, so Fujitsu adopted MultiVersion Concurrency Control (MVCC), which allows analyses to run at the same time that data domains are managed.
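The idea of extent-level versioning can be sketched as follows. This is a simplified illustration, not Fujitsu's implementation: the class names, the single global version counter, and the tiny extent size are all assumptions made for the example, and PostgreSQL's own MVCC works with transaction IDs rather than a counter like this one.

```python
from dataclasses import dataclass, field
from typing import List, Optional

EXTENT_SIZE = 4  # tiny for the demo; the article cites roughly 260,000 records


@dataclass
class Extent:
    values: List[int] = field(default_factory=list)
    created_at: int = 0               # version in which the extent was secured
    dropped_at: Optional[int] = None  # version in which it was deleted, if any

    def visible_to(self, snapshot: int) -> bool:
        # A reader sees extents created at or before its snapshot
        # and not yet deleted as of that snapshot.
        return self.created_at <= snapshot and (
            self.dropped_at is None or self.dropped_at > snapshot
        )


class ExtentColumn:
    """Hypothetical column made of versioned extents."""

    def __init__(self) -> None:
        self.version = 0
        self.extents: List[Extent] = []

    def append_extent(self, values: List[int]) -> None:
        assert len(values) <= EXTENT_SIZE
        self.version += 1
        self.extents.append(Extent(list(values), created_at=self.version))

    def drop_extent(self, index: int) -> None:
        # Mark the extent as deleted; older snapshots can still read it.
        self.version += 1
        self.extents[index].dropped_at = self.version

    def snapshot(self) -> int:
        return self.version

    def scan_sum(self, snapshot: int) -> int:
        return sum(sum(e.values) for e in self.extents if e.visible_to(snapshot))

    def reclaim(self, oldest_active_snapshot: int) -> int:
        # Free extents that no active snapshot can see any more.
        keep = [e for e in self.extents
                if e.dropped_at is None or e.dropped_at > oldest_active_snapshot]
        freed = len(self.extents) - len(keep)
        self.extents = keep
        return freed


col = ExtentColumn()
col.append_extent([1, 2, 3, 4])
col.append_extent([10, 20])
snap = col.snapshot()        # an analysis starts here
col.drop_extent(0)           # extent management continues meanwhile
col.append_extent([100])

old_view = col.scan_sum(snap)             # still sees the dropped extent
new_view = col.scan_sum(col.snapshot())   # sees the latest state
freed_early = col.reclaim(snap)           # old analysis still active
freed_late = col.reclaim(col.snapshot())  # old analysis finished
```

The running analysis keeps a consistent view (`old_view`) without blocking the drop, and the dropped extent is reclaimed only once no snapshot can still see it.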

2. Column-oriented indexes (column-store indexes)

Creating a column-oriented index (a column-store index), like creating any other index, is a way to select the data-storage method, row-oriented or column-oriented, that suits the contents of the database being queried and to process it accordingly. When the row-oriented data from which the column-store index was created is updated, the column-oriented data is updated automatically. This completely frees users from worries about the data-storage method.
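A minimal sketch of this behaviour is shown below. The `Table` class and its methods are hypothetical and written only for illustration; they are not the PostgreSQL or Symfoware interface. The sketch shows a column copy that is maintained on every write, and a query path that uses the column copy when one exists.

```python
from typing import Dict, List


class Table:
    """Hypothetical row store with column-store indexes kept in sync."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, int]] = []
        self.column_indexes: Dict[str, List[int]] = {}

    def create_column_index(self, column: str) -> None:
        # Build the column copy from the existing row data.
        self.column_indexes[column] = [row[column] for row in self.rows]

    def insert(self, row: Dict[str, int]) -> None:
        self.rows.append(dict(row))
        # Every column index receives the new value as part of the write.
        for column, values in self.column_indexes.items():
            values.append(row[column])

    def update(self, row_id: int, column: str, value: int) -> None:
        self.rows[row_id][column] = value
        if column in self.column_indexes:
            self.column_indexes[column][row_id] = value

    def total(self, column: str) -> int:
        # The caller never chooses a layout: use the column copy if present,
        # otherwise fall back to scanning the rows.
        if column in self.column_indexes:
            return sum(self.column_indexes[column])
        return sum(row[column] for row in self.rows)


t = Table()
t.insert({"qty": 2, "price": 10})
t.insert({"qty": 5, "price": 20})
t.create_column_index("price")
t.update(0, "price", 15)           # row update is mirrored in the column copy
t.insert({"qty": 1, "price": 5})

price_total = t.total("price")     # answered from the column-store index
qty_total = t.total("qty")         # answered from the rows
```

The caller issues the same `total` call in both cases, which corresponds to the article's claim that users need not think about the storage method.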

3. Analysis engine optimized for column-oriented data and parallel processing using an original shared-memory structure

Simply using column-oriented data to improve read performance does not make the most of the benefits that column-oriented data can offer. Fujitsu developed an analysis engine that can apply the same process to multiple pieces of data at once (vector processing), which improves performance even when running on a single CPU core. As a parallel-analysis mechanism, the company also developed a new shared-memory structure so that multiple processes operating in parallel in PostgreSQL can hand off data with little slowdown. On a server with 15 CPU cores, this achieves a performance improvement of at least fifty-fold over the previous PostgreSQL.
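The two ideas, batch-at-a-time processing and partitioned parallel aggregation, can be sketched as follows. This is an illustration under stated assumptions, not Fujitsu's engine: the function names, batch size, and data are hypothetical. Threads from Python's standard library stand in for the parallel PostgreSQL processes, so the example shows the partition-and-merge structure rather than real multi-core speedup or the shared-memory hand-off.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

BATCH = 4  # values handled per "vector" step; illustrative only


def filter_sum_batch(prices: Sequence[int], qtys: Sequence[int], min_qty: int) -> int:
    """Apply the same filter-and-multiply step to a whole batch of values."""
    return sum(p * q for p, q in zip(prices, qtys) if q >= min_qty)


def scan_partition(args: Tuple[Sequence[int], Sequence[int], int]) -> int:
    """Scan one partition of the two columns, batch by batch."""
    prices, qtys, min_qty = args
    total = 0
    for start in range(0, len(prices), BATCH):
        stop = start + BATCH
        total += filter_sum_batch(prices[start:stop], qtys[start:stop], min_qty)
    return total


def parallel_revenue(prices: List[int], qtys: List[int],
                     min_qty: int, workers: int) -> int:
    """Split the columns into partitions, scan each, and merge partial sums."""
    size = max(1, -(-len(prices) // workers))  # ceiling division, at least 1
    parts = [(prices[i:i + size], qtys[i:i + size], min_qty)
             for i in range(0, len(prices), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(scan_partition, parts))


prices = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
qtys = [1, 5, 2, 6, 3, 7, 4, 8, 5, 9]

serial = scan_partition((prices, qtys, 5))
parallel = parallel_revenue(prices, qtys, 5, workers=3)
```

Each worker produces only a small partial result, so little data has to be handed back for the final merge. That is the property that makes this read-filter-aggregate query shape well suited to parallel execution.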

This technology enables existing smaller systems with limited memory to achieve real-time analysis and utilization of Big Data in ways that were not possible before.

Fujitsu is aiming for a commercial implementation of this technology during fiscal 2015, as a part of Symfoware Server, Fujitsu's database product.
Source: Fujitsu
