TY - GEN
T1 - Synthesizers: A Meta-Framework for Generating and Evaluating High-Fidelity Tabular Synthetic Data
AU - Schneider-Kamp, Peter
AU - Lautrup, Anton Danholt
AU - Hyrup, Tobias
PY - 2024
Y1 - 2024
N2 - Synthetic data is widely expected to have a significant impact on data science by enhancing data privacy, reducing biases in datasets, and enabling the scaling of datasets beyond their original size. However, the current landscape of tabular synthetic data generation is fragmented, with numerous frameworks available, only some of which have integrated evaluation modules. synthesizers is a meta-framework that simplifies the process of generating and evaluating tabular synthetic data. It provides a unified platform that allows users to select generative models and evaluation tools from open-source implementations in the research field and apply them to datasets of any format. The aim of synthesizers is to consolidate the diverse efforts in tabular synthetic data research, making it more accessible to researchers from different sub-domains, including those with less technical expertise such as health researchers. This could foster collaboration and increase the use of synthetic data tools, ultimately leading to more effective research outcomes.
U2 - 10.5220/0012856000003753
DO - 10.5220/0012856000003753
M3 - Article in proceedings
SN - 978-989-758-706-1
T3 - International Conference on Software Technologies
SP - 177
EP - 184
BT - Proceedings of the 19th International Conference on Software Technologies ICSOFT
PB - SCITEPRESS Digital Library
ER -