One of the main targets of data analytics is unstructured data, which primarily involves textual data. High-performance processing of textual data is non-trivial. We present the HPTA library for high-performance text analytics. The library helps programmers to map textual data to a dense numeric representation, which can be handled more efficiently. HPTA encapsulates three performance optimizations: (i) efficient memory management for textual data, (ii) parallel computation on associative data structures that map text to values and (iii) optimization of the type of associative data structure depending on the program context. We demonstrate that HPTA outperforms popular frameworks for text analytics such as scikit-learn and Spark.
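The mapping from text to a dense numeric representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the HPTA API: the function names `build_vocabulary` and `to_dense_counts` are hypothetical, tokenization is assumed to be a plain whitespace split, and an ordinary Python dictionary stands in for the associative data structure that maps text to values.

```python
def build_vocabulary(documents):
    """Assign each distinct token a dense integer ID in first-seen order."""
    vocab = {}  # associative structure: token -> dense ID (illustrative only)
    for doc in documents:
        for token in doc.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def to_dense_counts(doc, vocab):
    """Return a list of term counts indexed by token ID."""
    counts = [0] * len(vocab)
    for token in doc.lower().split():
        idx = vocab.get(token)
        if idx is not None:  # tokens outside the vocabulary are skipped
            counts[idx] += 1
    return counts


docs = [
    "big data needs fast text analytics",
    "text analytics maps text to numbers",
]
vocab = build_vocabulary(docs)
vectors = [to_dense_counts(d, vocab) for d in docs]
```

Once tokens are replaced by integer IDs, later stages can work on arrays of numbers rather than on strings, which is the general motivation the abstract gives for a dense representation.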
|Title of host publication||Proceedings of the IEEE International Conference on Big Data|
|Number of pages||8|
|Publication status||Published - 06 Feb 2017|
|Event||2016 IEEE International Conference on Big Data - Washington, DC, United States. Duration: 05 Dec 2016 → 08 Dec 2016|
|Conference||2016 IEEE International Conference on Big Data|
|Period||05/12/2016 → 08/12/2016|
- data analytics
- performance optimization
- text analytics
Fingerprint
Dive into the research topics of 'HPTA: High-Performance Text Analytics'. Together they form a unique fingerprint.