Models
This section covers the generative machine learning models supported by Gretel APIs as well as core use cases and capabilities.
This section compares features of different generative data models supported by Gretel APIs.
✅ = Supported
✖️ = Not yet supported
Feature | LSTM | Gretel GPT | ACTGAN | Amplify | DGAN |
---|---|---|---|---|---|
Tag | synthetics | gpt_x | actgan | amplify | timeseries_dgan |
Type | Language Model | Language Model | Generative Adversarial Network | Statistical | Generative Adversarial Network |
Model | LSTM | Pre-trained Transformer | GAN | Statistical | GAN |
Privacy filters | ✅ | ✖️ | ✅ | ✖️ | ✖️ |
Differential privacy | ✅ | ✖️ | ✖️ | ✖️ | ✖️ |
Synthetic quality report | ✅ | ✖️ | ✅ | ✅ | ✖️ |
Tabular | ✅ | ✖️ | ✅ | ✅ | ✖️ |
Time-series | ✅ | ✖️ | ✖️ | ✖️ | ✅ |
Natural language | ✅ | ✅ | ✖️ | ✖️ | ✖️ |
Conditional generation | ✅ | ✅ | ✅ | ✖️ | ✖️ |
Pre-trained | ✖️ | ✅ | ✖️ | ✖️ | ✖️ |
Gretel cloud | ✅ | ✅ | ✅ | ✅ | ✅ |
On-premises | ✅ | ✅ | ✅ | ✅ | ✅ |
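
The comparison above can also be queried programmatically. The following is a minimal sketch, not part of the Gretel APIs: the `FEATURES` mapping and the `models_supporting` helper are hypothetical names, and the feature keys are informal labels transcribed from the table rows. Gretel cloud and on-premises are left out because every model supports both.

```python
# Feature support per model tag, transcribed from the comparison table.
# The keys and this helper are illustrative only, not a Gretel API.
FEATURES = {
    "synthetics": {
        "privacy_filters", "differential_privacy", "quality_report",
        "tabular", "time_series", "natural_language",
        "conditional_generation",
    },
    "gpt_x": {"natural_language", "conditional_generation", "pre_trained"},
    "actgan": {
        "privacy_filters", "quality_report", "tabular",
        "conditional_generation",
    },
    "amplify": {"quality_report", "tabular"},
    "timeseries_dgan": {"time_series"},
}


def models_supporting(*required):
    """Return the model tags that support every feature in `required`."""
    known = set().union(*FEATURES.values())
    unknown = set(required) - known
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return [tag for tag, feats in FEATURES.items() if set(required) <= feats]


if __name__ == "__main__":
    # Which models handle tabular data and also offer privacy filters?
    print(models_supporting("tabular", "privacy_filters"))
```

For instance, asking for `"time_series"` returns both `synthetics` and `timeseries_dgan`, matching the Time-series row of the table.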

Check out our GitHub repository, gretelai/gretel-synthetics, for research, source code, and examples, including our core synthetic data generation library.
Below is an example configuration that may be used to create and fine-tune a synthetic data model.
- Save the example below to `model-config.yaml`.
- Replace `[model_id]` with the type of model you wish to train (e.g. `synthetics`, `gpt_x`, `actgan`, `timeseries_dgan`, `amplify`).
- `data_source` must point to a valid and accessible file in CSV, JSON, or JSONL format.
  - Supported storage locations include S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or the local filesystem.
  - `data_source: __temp__` can be used when the source file is specified elsewhere using:
    - the `--in-data` parameter via CLI,
    - a parameter via SDK,
    - the dataset button via Console.
schema_version: "1.0"
name: "my-model"
models:
  - [model_id]:
      data_source: foo.csv
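
The substitution steps described for this configuration can be sketched in Python. This is an illustrative helper only: `render_config`, `MODEL_TAGS`, and `DATA_EXTENSIONS` are hypothetical names, the extension check is a simplification that looks only at the end of the path, and the output mirrors the minimal example above (a real configuration may need model-specific parameters that are not shown here).

```python
# Model tags and file formats as listed in this section.
MODEL_TAGS = ("synthetics", "gpt_x", "actgan", "timeseries_dgan", "amplify")
DATA_EXTENSIONS = (".csv", ".json", ".jsonl")


def render_config(model_tag, data_source, name="my-model"):
    """Return the minimal config text with [model_id] and data_source filled in."""
    if model_tag not in MODEL_TAGS:
        raise ValueError(f"unknown model tag: {model_tag!r}")
    # "__temp__" defers the data source to the CLI, SDK, or Console,
    # so it is exempt from the file-format check.
    if data_source != "__temp__" and not data_source.lower().endswith(DATA_EXTENSIONS):
        raise ValueError(f"expected a CSV, JSON, or JSONL file: {data_source!r}")
    return (
        'schema_version: "1.0"\n'
        f'name: "{name}"\n'
        "models:\n"
        f"  - {model_tag}:\n"
        f"      data_source: {data_source}\n"
    )


if __name__ == "__main__":
    # Print the rendered text; redirect it to model-config.yaml to save it.
    print(render_config("actgan", "foo.csv"), end="")
```

Validating the tag and file extension before writing the file catches simple typos locally, before a training job is submitted.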
Use the following CLI command to train and create the synthetic data model.
- The `export` statements are not necessary; they only keep the `models create` command cleaner.
- `--in-data` is optional, and can be used to override the `data_source` specified in the config.
export CONFIG_PATH=model-config.yaml
export DATASOURCE=foo.csv
gretel models create \
    --config $CONFIG_PATH \
    --runner cloud \
    --in-data $DATASOURCE > my-model.json
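
When the training step is driven from a script rather than typed by hand, the same command can be assembled as an argument list. The sketch below assumes the `gretel` CLI is installed and on the `PATH`; `build_create_command` and `create_model` are hypothetical helper names, and only the flags shown in the command above are used.

```python
import shlex
import subprocess


def build_create_command(config_path, in_data=None, runner="cloud"):
    """Assemble the `gretel models create` argument list."""
    cmd = ["gretel", "models", "create", "--config", config_path, "--runner", runner]
    # --in-data is optional; when given it overrides data_source in the config.
    if in_data is not None:
        cmd += ["--in-data", in_data]
    return cmd


def create_model(config_path, output_json="my-model.json", **kwargs):
    """Run the command and save its stdout, like `> my-model.json` in the shell."""
    with open(output_json, "w") as fh:
        subprocess.run(build_create_command(config_path, **kwargs), stdout=fh, check=True)


if __name__ == "__main__":
    # Show the command without running it.
    print(shlex.join(build_create_command("model-config.yaml", in_data="foo.csv")))
```

Passing a list to `subprocess.run` avoids shell quoting problems when file names contain spaces.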
Below is an example CLI command that may be used to generate data from a model.
- `--model-id` supports both a model `uid` and the JSON that `models create` outputs.
- `--in-data` (optional) allows you to specify a CSV file to prompt the model for conditional data generation tasks.
gretel models run --model-id my-model.json \
    --runner cloud \
    --in-data prompts.csv \
    --output .
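
For conditional generation, the prompt file and the command can be prepared together. The sketch below is illustrative only: `write_prompts` and `build_run_command` are hypothetical helpers, the column names `state` and `plan` are made up, and the layout (one prompt per row) is an assumption rather than something this page specifies, so check the documentation for your model before relying on it.

```python
import csv
import shlex


def write_prompts(path, rows):
    """Write prompt rows (a non-empty list of dicts with the same keys) to a CSV file."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


def build_run_command(model_id, output=".", in_data=None, runner="cloud"):
    """Assemble the `gretel models run` argument list.

    `model_id` may be a model uid or the JSON file written by `models create`.
    """
    cmd = ["gretel", "models", "run", "--model-id", model_id, "--runner", runner]
    if in_data is not None:
        cmd += ["--in-data", in_data]  # optional CSV of prompts
    cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    # Hypothetical prompt columns, for illustration only.
    write_prompts("prompts.csv", [{"state": "CA", "plan": "basic"}])
    print(shlex.join(build_run_command("my-model.json", in_data="prompts.csv")))
```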