SDK Notebooks
Jupyter notebook-based tutorials for popular use cases around synthetic data.
Description
Train synthetic models on complex datasets with high row and column counts. The code divides the dataset into smaller sets of correlated columns, which are trained in parallel and then joined back together.
Walk through the basics of using Gretel's Python SDK to create a synthetic dataset from a Pandas DataFrame or CSV.
Train a synthetic model locally and generate data in your environment.
Conditional data generation (seeding a model) is helpful when you want to preserve some of the original row data (primary keys, dates, important categorical data) in synthetic datasets.
Balance demographic representation bias in a healthcare dataset using conditional data generation with a synthetic model.
Create synthetic time-series data from a Pandas DataFrame or CSV.
Use a synthetic model to boost the representation of an extreme minority class in a dataset by incorporating features from nearest neighbors.
Use Gretel APIs to anonymize and synthesize a time-series dataset, then compare the accuracy of the synthetic data against real-world data.
Run a sweep to automate hyperparameter optimization for a synthetic model using Weights and Biases.
Augment a popular machine learning dataset with synthetic data to improve downstream accuracy and algorithmic fairness.
Measure the effects of different differential privacy settings on a model's ability to memorize and replay secrets in a dataset.
Generate synthetic data directly from a multi-table relational database to support data augmentation and subsetting use cases.
Generate realistic but synthetic text examples using an open-source implementation of the GPT-3 architecture.
Generate synthetic daily oil price data using the DoppelGANger GAN for time-series data.
Produce a quality score and detailed report for any synthetic dataset vs. real-world data.
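The first notebook description mentions dividing a dataset into smaller sets of correlated columns that can be processed separately and then joined. The sketch below illustrates that general idea in plain Python. It is not Gretel's implementation and does not use the Gretel SDK: the function names (`pearson`, `group_correlated_columns`, `split_dataset`, `join_datasets`), the 0.8 threshold, and the sample data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def group_correlated_columns(columns, threshold=0.8):
    """Group column names whose absolute correlation meets the threshold.

    `columns` maps a column name to its list of numeric values.
    Columns are linked transitively with union-find, so each group is a
    connected component of the "strongly correlated" graph.
    """
    parent = {name: name for name in columns}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(columns, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in columns:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


def split_dataset(columns, threshold=0.8):
    """Return one smaller column dictionary per correlated group."""
    return [
        {name: columns[name] for name in group}
        for group in group_correlated_columns(columns, threshold)
    ]


def join_datasets(parts):
    """Recombine the parts column-wise into a single column dictionary."""
    joined = {}
    for part in parts:
        joined.update(part)
    return joined


# Hypothetical sample data: two pairs of strongly correlated columns.
data = {
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [50, 58, 69, 80, 92],
    "visits": [3, 1, 4, 1, 5],
    "cost": [31, 9, 42, 12, 48],
}
parts = split_dataset(data)
for part in parts:
    print(sorted(part))
```

In a real workflow each part would be handed to its own model training job, and the generated outputs would be joined column-wise in the same way `join_datasets` recombines the inputs here.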
Walk through creating synthetic data with Gretel.ai, Python, Pandas, and Jupyter.