Research projects
A gallery of the main projects from my research. As you can see, I draw inspiration from art, and my projects are named after artists or artworks. I hope you can find inspiration in art as well.
MaGRiTTE
A large pretrained machine learning model that represents tabular files to help solve data preparation problems.
References:
- G. Vitagliano, M. Hameed, F. Naumann: Structural embedding of data files with MaGRiTTE. Table Representation Learning Workshop at NeurIPS (TRL@NeurIPS), 2022
The Pollock benchmark
How well can data management tools load non-standard CSV files?
References:
- G. Vitagliano, M. Hameed, L. Jiang, L. Reisener, E. Wu, F. Naumann: Pollock: A Data Loading Benchmark. PVLDB 16(8):1870–1882, 2023
Mondrian
References:
- G. Vitagliano, L. Jiang, F. Naumann: Detecting Layout Templates in Complex Multiregion Files. PVLDB 15(3):646–658, 2021
- G. Vitagliano, L. Reisener, L. Jiang, M. Hameed, F. Naumann: Mondrian: Spreadsheet Layout Detection. Proceedings of the International Conference on Management of Data (SIGMOD), 2022