Talk

🤯 No data? No problem! Synthetic data to the rescue

Thursday, May 29

11:45 - 12:15
Room: Tortellini
Language: English
Audience level: Intermediate
Elevator pitch

Got data problems? Relax. Synthetic data is here to help.

I will go over the fundamentals of synthetic data and show how you can use it to speed up model development, proving that sometimes the best solution isn’t finding the right data but creating it.

Abstract

This talk will start with an overview of the importance of data in general and continue with the what, why and how of synthetic data. It will also briefly cover some use cases in commercial and research scenarios.

Then, we’ll go over a taxonomy of synthetic data types, focusing on rewriting, judging and rationales. We’ll discuss these data types through fun, practical examples and more educational research papers.

The final chapter focuses on the evaluation of synthetic data and on how we can scale generation from prompts to pipelines using useful tools and libraries. We’ll close with several entry points for getting started with synthetic data.

Tags: Machine-Learning, Best Practice, Natural Language Processing, Computer Vision
Participant

David Berenstein