diff --git a/models/artifacts/artifacts-walkthrough.mdx b/models/artifacts/artifacts-walkthrough.mdx
index 2744c1580c..b96a403b03 100644
--- a/models/artifacts/artifacts-walkthrough.mdx
+++ b/models/artifacts/artifacts-walkthrough.mdx
@@ -4,11 +4,11 @@ description:
title: "Tutorial: Create, track, and use a dataset artifact"
---
-This walkthrough demonstrates how to create, track, and use a dataset artifact.
+This walkthrough demonstrates how to create, track, and use a dataset artifact with W&B. By the end, you've logged a dataset as a versioned artifact to W&B and downloaded it in a subsequent run. This lets you reproducibly share datasets across experiments and track them as inputs and outputs of your runs.
-## 1. Log into W&B
+## Log in to W&B
-Import the W&B library and log in to W&B. You will need to sign up for a free W&B account if you have not done so already.
+Import the W&B library and log in to W&B. If you haven't done so already, sign up for a free W&B account.
```python
import wandb
@@ -16,22 +16,22 @@ import wandb
wandb.login()
```
-## 2. Initialize a run
+## Initialize a run
Use [`wandb.init()`](/models/ref/python/functions/init) to initialize a run. This generates a background process to sync and log data. Provide a project name and a job type:
```python
-# Create a W&B Run. Here we specify 'dataset' as the job type since this example
+# Create a W&B Run. Here you specify 'upload-dataset' as the job type since this example
# shows how to create a dataset artifact.
with wandb.init(project="artifacts-example", job_type="upload-dataset") as run:
# Your code here
```
-## 3. Create an artifact object
+## Create an artifact object
-Create an artifact object with the [`wandb.Artifact()`](/models/ref/python/experiments/artifact). Provide a name for the artifact and a description of the file type for the `name` and `type` parameters, respectively.
+Create an artifact object with [`wandb.Artifact()`](/models/ref/python/experiments/artifact). Provide a name for the artifact and a description of the file type for the `name` and `type` parameters, respectively.
-For example, the following code snippet demonstrates how to create an artifact called `‘bicycle-dataset’` with a `‘dataset’` label:
+For example, the following code snippet demonstrates how to create an artifact called `'bicycle-dataset'` with a `'dataset'` label:
```python
artifact = wandb.Artifact(name="bicycle-dataset", type="dataset")
@@ -39,9 +39,9 @@ artifact = wandb.Artifact(name="bicycle-dataset", type="dataset")
For more information about how to construct an artifact, see [Construct artifacts](./construct-an-artifact).
-## 4. Add the dataset to the artifact
+## Add the dataset to the artifact
-Add a file to the artifact. Common file types include models and datasets. The following example adds a dataset named `dataset.h5` that is saved locally on our machine to the artifact:
+Add a file to the artifact. Common file types include models and datasets. The following example adds a dataset named `dataset.h5` that is saved locally on your machine to the artifact:
```python
# Add a file to the artifact's contents
@@ -51,9 +51,9 @@ artifact.add_file(local_path="dataset.h5")
Replace the filename `dataset.h5` in the previous code snippet with the path to the file you want to add to the artifact.
-## 5. Log the dataset
+## Log the dataset
-Use the W&B run objects `wandb.Run.log_artifact()` method to both save your artifact version and declare the artifact as an [output of the run](/models/artifacts/explore-and-traverse-an-artifact-graph).
+Use the W&B run object's `wandb.Run.log_artifact()` method to both save your artifact version and declare the artifact as an [output of the run](/models/artifacts/explore-and-traverse-an-artifact-graph).
```python
# Save the artifact version to W&B and mark it
@@ -61,10 +61,10 @@ Use the W&B run objects `wandb.Run.log_artifact()` method to both save your arti
run.log_artifact(artifact)
```
-A `'latest'` [alias](/models/artifacts/create-a-custom-alias) is created by default when you log an artifact. For more information about artifact aliases and versions, see [Create a custom alias](./create-a-custom-alias) and [Create new artifact versions](./create-a-new-artifact-version), respectively.
+When you log an artifact, W&B creates a `'latest'` [alias](/models/artifacts/create-a-custom-alias) by default. For more information about artifact aliases and versions, see [Create a custom alias](./create-a-custom-alias) and [Create new artifact versions](./create-a-new-artifact-version), respectively.
-Putting this together, you script so far should look like this:
+Putting this together, your script so far should look like this:
```python
import wandb
@@ -78,17 +78,17 @@ with wandb.init(project="artifacts-example", job_type="upload-dataset") as run:
```
-## 6. Download and use the artifact
+## Download and use the artifact
-The following code example demonstrates the steps you can take to use an artifact you have logged and saved to the W&B servers.
+Now that the dataset is logged as an artifact, you can pull it into other runs as a tracked input. The following code example demonstrates the steps you can take to use an artifact you've logged and saved to the W&B servers:
-1. First, initialize a new run object with **`wandb.init()`.**
-2. Second, use the run objects [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#use_artifact) method to tell W&B what artifact to use. This returns an artifact object.
-3. Third, use the artifacts [`wandb.Artifact.download()`](/models/ref/python/experiments/artifact#download) method to download the contents of the artifact.
+1. Initialize a new run object with `wandb.init()`.
+2. Use the run object's [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#use_artifact) method to specify which artifact to use. This returns an artifact object.
+3. Use the artifact's [`wandb.Artifact.download()`](/models/ref/python/experiments/artifact#download) method to download the contents of the artifact.
```python
-# Create a W&B Run. Here we specify 'training' for 'type'
-# because we will use this run to track training.
+# Create a W&B Run. Here you specify 'training' for `job_type`
+# because you use this run to track training.
with wandb.init(project="artifacts-example", job_type="training") as run:
# Query W&B for an artifact and mark it as input to this run
@@ -98,4 +98,6 @@ with wandb.init(project="artifacts-example", job_type="training") as run:
artifact_dir = artifact.download()
```
-Alternatively, you can use the Public API (`wandb.Api`) to export (or update data) data already saved in a W&B outside of a Run. See [Track external files](./track-external-files) for more information.
+Alternatively, you can use the Public API (`wandb.Api`) to export or update data already saved in W&B outside of a run. For more information, see [Track external files](./track-external-files).
+
+You now have a versioned dataset artifact logged to W&B and consumed by a downstream run. The artifact graph tracks both the upload and the download.
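After `artifact.download()` returns, `artifact_dir` holds the path to a local directory. The following is a minimal sketch, using only the Python standard library, of how you might list what that directory contains before you load the data. The helper name `list_downloaded_files` is hypothetical and is not part of the W&B SDK:

```python
from pathlib import Path


def list_downloaded_files(artifact_dir):
    """Return the relative paths of all files under artifact_dir, sorted."""
    root = Path(artifact_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```

For example, inside the training run you could call `list_downloaded_files(artifact_dir)` to confirm that `dataset.h5` is present before you open it.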
diff --git a/models/artifacts/construct-an-artifact.mdx b/models/artifacts/construct-an-artifact.mdx
index 1fd81bb326..cabe992aef 100644
--- a/models/artifacts/construct-an-artifact.mdx
+++ b/models/artifacts/construct-an-artifact.mdx
@@ -1,46 +1,48 @@
---
-description: Create and log a W&B Artifact. Learn how to add one or more files or a URI reference to an Artifact.
+description: Create and log a W&B artifact. Learn how to add one or more files or a URI reference to an artifact.
title: Create an artifact
---
+This page shows you how to create a W&B artifact, add content to it, and save it so that you can version and share datasets, models, and other files across your machine learning workflows.
+
Use the W&B Python SDK to construct artifacts from [W&B Runs](/models/ref/python/experiments/run). You can add [files, directories, URIs, and files from parallel runs to artifacts](#add-files-to-an-artifact). After you add a file to an artifact, save the artifact to the W&B Server or [your own private server](/platform/hosting/hosting-options/self-managed). Each artifact is associated with a run.
For information on how to track external files, such as files stored in Amazon S3, see the [Track external files](./track-external-files) page.
## Construct an artifact
-Construct a [W&B Artifact](/models/ref/python/experiments/artifact) in three steps:
+Construct a [W&B artifact](/models/ref/python/experiments/artifact) in three steps:
1. [Create an artifact Python object with `wandb.Artifact()`](/models/artifacts/construct-an-artifact#create-an-artifact-python-object-with-wandb-artifact)
2. [Add one or more files to the artifact](/models/artifacts/construct-an-artifact#add-one-or-more-files-to-the-artifact)
-3. [Save your artifact to the W&B server](/models/artifacts/construct-an-artifact#save-your-artifact-to-the-w&b-server)
+3. [Save your artifact to the W&B server](/models/artifacts/construct-an-artifact#save-your-artifact-to-the-wb-server)
### Create an artifact Python object with `wandb.Artifact()`
Initialize the [`wandb.Artifact()`](/models/ref/python/experiments/artifact) class to create an artifact object. Specify the following parameters:
* **Name**: The name of your artifact. The name should be unique, descriptive, and memorable.
-* **Type**: The type of artifact. The type should be simple, descriptive, and correspond to a single step of your machine learning pipeline. Common artifact types include `'dataset'` or `'model'`.
+* **Type**: The type of artifact. The type should be short, descriptive, and correspond to a single step of your machine learning pipeline. Common artifact types include `'dataset'` or `'model'`.
-W&B uses the "name" and "type" you provide to create a directed acyclic graph in the W&B App. See the [Explore and traverse artifact graphs](./explore-and-traverse-an-artifact-graph) for more information.
+W&B uses the `name` and `type` you provide to create a directed acyclic graph in the W&B App. See the [Explore and traverse artifact graphs](./explore-and-traverse-an-artifact-graph) for more information.
-Artifacts can not have the same name, regardless of type. In other words, you can not create an artifact named `cats` of type `dataset` and another artifact with the same name of type `model`.
+Artifacts can't have the same name, regardless of type. In other words, you can't create an artifact named `cats` of type `dataset` and another artifact with the same name of type `model`.
-You can optionally provide a description and metadata when you initialize an artifact object. For more information on available attributes and parameters, see the [`wandb.Artifact`](/models/ref/python/experiments/artifact) Class definition in the Python SDK Reference Guide.
+You can optionally provide a description and metadata when you initialize an artifact object. For more information about available attributes and parameters, see the [`wandb.Artifact`](/models/ref/python/experiments/artifact) class definition in the Python SDK Reference Guide.
-Copy and paste the following code snippet to create an artifact object. Replace the `` and `` placeholders with your own values:
+Copy and paste the following code snippet to create an artifact object. Replace the `[NAME]` and `[TYPE]` placeholders with your own values:
```python
import wandb
# Create an artifact object
-artifact = wandb.Artifact(name="", type="")
+artifact = wandb.Artifact(name="[NAME]", type="[TYPE]")
```
### Add one or more files to the artifact
@@ -50,51 +52,51 @@ artifact = wandb.Artifact(name="", type="")
To add a single file, use the artifact object's [`Artifact.add_file()`](/models/ref/python/experiments/artifact#add_file) method:
```python
-artifact.add_file(local_path="path/to/file.txt", name="")
+artifact.add_file(local_path="path/to/file.txt", name="[NAME]")
```
To add a directory, use the [`Artifact.add_dir()`](/models/ref/python/experiments/artifact#add_dir) method:
```python
-artifact.add_dir(local_path="path/to/directory", name="")
+artifact.add_dir(local_path="path/to/directory", name="[NAME]")
```
-See the next section, [Add files to an artifact](/models/artifacts/construct-an-artifact#add-files-to-an-artifact), for more information on how to add different file types to an artifact.
+See the next section, [Add files to an artifact](/models/artifacts/construct-an-artifact#add-files-to-an-artifact), for more information about how to add different file types to an artifact.
### Save your artifact to the W&B server
-Save your artifact to the W&B server. Use the run object's [`wandb.Run.log_artifact()`](/models/ref/python/experiments/run#log_artifact) method to save the artifact.
+When you save the artifact, W&B uploads its contents and registers it with the run, which makes the artifact available for downstream use and versioning. Use the run object's [`wandb.Run.log_artifact()`](/models/ref/python/experiments/run#log_artifact) method to save the artifact.
```python
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT]", job_type="[JOB-TYPE]") as run:
run.log_artifact(artifact)
```
-**When to use to use `wandb.Run.log_artifact()` or `Artifact.save()`**
+**When to use `wandb.Run.log_artifact()` or `Artifact.save()`**
- Use `wandb.Run.log_artifact()` to create a new artifact and associate it with a specific run.
- Use `Artifact.save()` to update an existing artifact without creating a new run.
-Putting this all together, the following code snippet demonstrates how to create a dataset artifact, add a file to the artifact, and save the artifact to W&B:
+Putting this all together, the following code snippet shows how to create a dataset artifact, add a file to the artifact, and save the artifact to W&B:
```python
import wandb
-artifact = wandb.Artifact(name="", type="")
-artifact.add_file(local_path="path/to/file.txt", name="")
-artifact.add_dir(local_path="path/to/directory", name="")
+artifact = wandb.Artifact(name="[NAME]", type="[TYPE]")
+artifact.add_file(local_path="path/to/file.txt", name="[NAME]")
+artifact.add_dir(local_path="path/to/directory", name="[NAME]")
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT]", job_type="[JOB-TYPE]") as run:
run.log_artifact(artifact)
```
Each time you log an artifact with the same name and type, W&B creates a new version of that artifact. For more information, see [Create a new artifact version](/models/artifacts/create-a-new-artifact-version).
-W&B performs calls `wandb.Run.log_artifact()` asynchronously for performant uploads. This can cause surprising behavior when logging artifacts in a loop. For example:
+W&B performs `wandb.Run.log_artifact()` calls asynchronously for faster uploads. This can cause surprising behavior when you log artifacts in a loop. For example:
```python
with wandb.init() as run:
@@ -109,12 +111,12 @@ with wandb.init() as run:
run.log_artifact(a)
```
-The artifact version **v0** might not have an index of 0 in its metadata because artifacts may be logged in an arbitrary order.
+The artifact version **v0** is not guaranteed to have an index of 0 in its metadata because W&B might log artifacts in an arbitrary order.
## Add files to an artifact
-The following sections demonstrate how to add different types of objects to an artifact. Assume you have a directory with the following structure as you read through the examples:
+After you create an artifact object, populate it with the content you want to track. The following sections show how to add different types of objects to an artifact. Assume you have a directory with the following structure as you read through the examples:
```text
root-directory
@@ -136,7 +138,7 @@ Use [`wandb.Artifact.add_file()`](/models/ref/python/experiments/artifact#method
import wandb
# Initialize an artifact object
-artifact = wandb.Artifact(name="", type="")
+artifact = wandb.Artifact(name="[NAME]", type="[TYPE]")
# Add a single file
artifact.add_file(local_path="path/file.format")
@@ -172,7 +174,7 @@ new/path/hello_world.txt
The following table shows how different API calls produce different artifact contents:
-| API Call | Resulting artifact |
+| API call | Resulting artifact |
| --------------------------------------------------------- | ------------------ |
| `artifact.new_file('hello.txt')` | `hello.txt` |
| `artifact.add_file('model.h5')` | `model.h5` |
@@ -188,22 +190,22 @@ Use the [`wandb.Artifact.add_dir()`](/models/ref/python/experiments/artifact#met
import wandb
# Initialize an artifact object
-artifact = wandb.Artifact(name="", type="")
+artifact = wandb.Artifact(name="[NAME]", type="[TYPE]")
# Add a local directory to the artifact
-artifact.add_dir(local_path="path/file.format", name="optional-prefix")
+artifact.add_dir(local_path="path/to/directory", name="optional-prefix")
```
-The following table show how different API calls produce different artifact contents:
+The following table shows how different API calls produce different artifact contents:
-| API Call | Resulting artifact |
+| API call | Resulting artifact |
| ------------------------------------------- | ------------------------------------------------------ |
| `artifact.add_dir('images')` | cat.png<br/>dog.png |
| `artifact.add_dir('images', name='images')` | images/cat.png<br/>images/dog.png |
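To make the effect of the `name` parameter concrete, the following sketch models the mapping from a local directory to artifact paths with the Python standard library. It is an illustration of the table's behavior under simple assumptions, not the W&B implementation, and `resulting_artifact_paths` is a hypothetical helper name:

```python
from pathlib import Path


def resulting_artifact_paths(local_dir, name=None):
    """Model the artifact paths produced by adding local_dir, with an optional prefix."""
    root = Path(local_dir)
    relative = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    if name is None:
        # No prefix: files keep their paths relative to the directory.
        return relative
    # With a prefix: every file is placed under the given name.
    return [f"{name}/{rel}" for rel in relative]
```

With an `images` directory that contains `cat.png` and `dog.png`, calling the helper without `name` yields the bare file names, and calling it with `name="images"` yields the same names under the `images/` prefix.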
### Add a URI reference
-Artifacts track checksums and other information for reproducibility if the URI has a scheme that the W&B library supports.
+Use a URI reference when you want an artifact to point to content stored outside of W&B, such as in an object store, without copying the underlying bytes. Artifacts track checksums and other information for reproducibility if the URI has a scheme that the W&B library supports.
Add an external URI reference to an artifact with the [`wandb.Artifact.add_reference()`](/models/ref/python/experiments/artifact#method-artifact-add-reference) method. Replace the `'uri'` string with your own URI. Optionally pass the desired path within the artifact for the name parameter.
@@ -214,9 +216,9 @@ artifact.add_reference(uri="uri", name="optional-name")
Artifacts support the following URI schemes:
-* `http(s)://`: A path to a file accessible over HTTP. The artifact will track checksums in the form of etags and size metadata if the HTTP server supports the `ETag` and `Content-Length` response headers.
-* `s3://`: A path to an object or object prefix in S3. The artifact will track checksums and versioning information (if the bucket has object versioning enabled) for the referenced objects. Object prefixes are expanded to include the objects under the prefix, up to a maximum of 10,000 objects.
-* `gs://`: A path to an object or object prefix in GCS. The artifact will track checksums and versioning information (if the bucket has object versioning enabled) for the referenced objects. Object prefixes are expanded to include the objects under the prefix, up to a maximum of 10,000 objects.
+* `http(s)://`: A path to a file accessible over HTTP. The artifact tracks checksums in the form of etags and size metadata if the HTTP server supports the `ETag` and `Content-Length` response headers.
+* `s3://`: A path to an object or object prefix in S3. The artifact tracks checksums and versioning information (if the bucket has object versioning enabled) for the referenced objects. W&B expands object prefixes to include the objects under the prefix, up to a maximum of 10,000 objects.
+* `gs://`: A path to an object or object prefix in GCS. The artifact tracks checksums and versioning information (if the bucket has object versioning enabled) for the referenced objects. W&B expands object prefixes to include the objects under the prefix, up to a maximum of 10,000 objects.
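
Before you call `add_reference()`, you might want to check that a URI uses one of the schemes in the preceding list. The following is a minimal sketch with the Python standard library. The set of schemes is copied from this list and is an assumption, since the W&B library might support others, and `has_listed_scheme` is a hypothetical helper name:

```python
from urllib.parse import urlparse

# Schemes named in the preceding list. Treat this set as an assumption,
# because the W&B library might support additional schemes.
LISTED_SCHEMES = {"http", "https", "s3", "gs"}


def has_listed_scheme(uri):
    """Return True if the URI's scheme is one of the schemes in LISTED_SCHEMES."""
    return urlparse(uri).scheme.lower() in LISTED_SCHEMES
```

A plain local path such as `path/file.format` has no scheme, so the helper returns `False` for it.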
The following table shows how different API calls produce different artifact contents:
@@ -236,7 +238,7 @@ For large datasets or distributed training, multiple parallel runs might need to
import wandb
import time
-# This example uses Ray to runs in parallel
+# This example uses Ray to run in parallel
# for demonstration purposes.
import ray
@@ -256,7 +258,7 @@ group_name = "writer-group-{}".format(round(time.time()))
@ray.remote
def train(i):
"""
- Our writer job. Each writer will add one image to the artifact.
+ The writer job. Each writer adds one image to the artifact.
"""
with wandb.init(group=group_name) as run:
artifact = wandb.Artifact(name=artifact_name, type=artifact_type)
@@ -267,7 +269,7 @@ def train(i):
# Add the table to folder in the artifact
artifact.add(table, "{}/table_{}".format(parts_path, i))
- # Upserting the artifact creates or appends data to the artifact
+ # Upsert the artifact to create or append data to the artifact
run.upsert_artifact(artifact)
@@ -278,7 +280,7 @@ result_ids = [train.remote(i) for i in range(num_parallel)]
# been added before finishing the artifact.
ray.get(result_ids)
-# Once all the writers are finished, finish the artifact
+# After all the writers finish, finish the artifact
# to mark it ready.
with wandb.init(group=group_name) as run:
artifact = wandb.Artifact(artifact_name, type=artifact_type)
@@ -287,21 +289,21 @@ with wandb.init(group=group_name) as run:
# and add it to the artifact.
artifact.add(wandb.data_types.PartitionedTable(parts_path), table_name)
- # Finish artifact finalizes the artifact, disallowing future "upserts"
+ # Finish the artifact to finalize it, disallowing future "upserts"
# to this version.
run.finish_artifact(artifact)
```
## Find path for logged artifacts and other metadata
-The following code snippet shows how to use the [W&B Public API](/models/ref/python/public-api/) to list the files in a run, including their names and URLs. Replace the `` placeholder with your own values:
+After you log an artifact, you might want to inspect the files associated with the run that produced it. The following code snippet shows how to use the [W&B Public API](/models/ref/python/public-api/) to list the files in a run, including their names and URLs. Replace the `[ENTITY/PROJECT/RUN-ID]` placeholder with your own values:
```python
from wandb.apis.public.files import Files
from wandb.apis.public.api import Api
# Example run object
-run = Api().run("")
+api = Api()
+run = api.run("[ENTITY/PROJECT/RUN-ID]")
# Create a Files object to iterate over files in the run
files = Files(api.client, run)
@@ -313,4 +315,4 @@ for file in files:
print(f"Path to file in the bucket: {file.direct_url}")
```
-See the [File](/models/ref/python/public-api/file) Class for more information on available attributes and methods.
\ No newline at end of file
+See the [File](/models/ref/python/public-api/file) class for more information about available attributes and methods.
\ No newline at end of file
diff --git a/models/artifacts/create-a-custom-alias.mdx b/models/artifacts/create-a-custom-alias.mdx
index f1632a789f..5bf82c4f79 100644
--- a/models/artifacts/create-a-custom-alias.mdx
+++ b/models/artifacts/create-a-custom-alias.mdx
@@ -7,20 +7,20 @@ Use aliases as pointers to specific versions. By default, `wandb.Run.log_artifac
W&B creates an artifact version `v0` and attaches it to your artifact when you log that artifact for the first time. W&B checksums the contents when you log again to the same artifact. If the artifact changed, W&B saves a new version `v1`.
-For example, if you want your training script to pull the most recent version of a dataset, specify `latest` when you use that artifact. The following code example demonstrates how to download a recent dataset artifact named `bike-dataset` that has an alias, `latest`:
+For example, if you want your training script to pull the most recent version of a dataset, specify `latest` when you use that artifact. The following code example downloads the most recent version of a dataset artifact named `bike-dataset` by using the `latest` alias. Replace `[PROJECT]` with your W&B project name:
```python
import wandb
-with wandb.init(project="") as run:
+with wandb.init(project="[PROJECT]") as run:
artifact = run.use_artifact("bike-dataset:latest")
artifact.download()
```
-You can also apply a custom alias to an artifact version. For example, if you want to mark that model checkpoint is the best on the metric AP-50, you could add the string `'best-ap50'` as an alias when you log the model artifact.
+You can also apply a custom alias to an artifact version. For example, if you want to mark that a model checkpoint is the best on the AP-50 metric, you could add the string `'best-ap50'` as an alias when you log the model artifact.
```python
-with wandb.init(project="") as run:
+with wandb.init(project="[PROJECT]") as run:
artifact = wandb.Artifact("run-3nq3ctyy-bike-model", type="model")
artifact.add_file("model.h5")
run.log_artifact(artifact, aliases=["latest", "best-ap50"])
diff --git a/models/artifacts/create-a-new-artifact-version.mdx b/models/artifacts/create-a-new-artifact-version.mdx
index 699875f6d1..d908d98001 100644
--- a/models/artifacts/create-a-new-artifact-version.mdx
+++ b/models/artifacts/create-a-new-artifact-version.mdx
@@ -4,32 +4,33 @@ description: Create a new artifact version from a single run or from a distribut
title: Create an artifact version
---
-Create a new artifact version with a single [run](/models/runs/) or collaboratively with distributed runs. You can optionally create a new artifact version from a previous version known as an [incremental artifact](#create-a-new-artifact-version-from-an-existing-version).
+This page shows you how to create a new artifact version so you can track, share, and reuse datasets, models, or other files across experiments. Create a new artifact version with a single [run](/models/runs/) or collaboratively with distributed runs. You can optionally create a new artifact version from a previous version known as an [incremental artifact](#create-a-new-artifact-version-from-an-existing-version), which avoids re-uploading files that didn't change.
-We recommend that you create an incremental artifact when you need to apply changes to a subset of files in an artifact, where the size of the original artifact is significantly larger.
+Create an incremental artifact when you need to change a subset of files in an artifact and the original artifact is much larger than the changes.
## Create new artifact versions from scratch
-There are two ways to create a new artifact version: from a single run and from distributed runs. They are defined as follows:
+You can create a new artifact version in two ways: from a single run and from distributed runs. The following list defines each method:
-* **Single run**: A single run provides all the data for a new version. This is the most common case and is best suited when the run fully recreates the needed data. For example: outputting saved models or model predictions in a table for analysis.
-* **Distributed runs**: A set of runs collectively provides all the data for a new version. This is best suited for distributed jobs which have multiple runs generating data, often in parallel. For example: evaluating a model in a distributed manner, and outputting the predictions.
+* **Single run**: A single run provides all the data for a new version. This is the most common case and is best suited when the run fully recreates the needed data. For example, a run that outputs saved models or model predictions in a table for analysis.
+* **Distributed runs**: A set of runs collectively provides all the data for a new version. This is best suited for distributed jobs in which multiple runs generate data, often in parallel. For example, a set of runs that evaluates a model in a distributed manner and outputs the predictions.
-W&B will create a new artifact and assign it a `v0` alias if you pass a name to the `wandb.Artifact` API that does not exist in your project. W&B checksums the contents when you log again to the same artifact. If the artifact changed, W&B saves a new version `v1`.
+W&B creates a new artifact and assigns it a `v0` alias if you pass a name to the `wandb.Artifact` API that doesn't exist in your project. W&B checksums the contents when you log again to the same artifact. If the artifact changed, W&B saves a new version `v1`.
-W&B will retrieve an existing artifact if you pass a name and artifact type to the `wandb.Artifact` API that matches an existing artifact in your project. The retrieved artifact will have a version greater than 1.
+W&B retrieves an existing artifact if you pass a name and artifact type to the `wandb.Artifact` API that matches an existing artifact in your project. The retrieved artifact already has at least one version (`v0` or later).
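
To illustrate the idea of checksumming contents to decide whether a new version is needed, the following sketch computes a single digest over a directory with the Python standard library. It is a conceptual model only and is not how W&B computes its checksums. `directory_digest` is a hypothetical helper name:

```python
import hashlib
from pathlib import Path


def directory_digest(directory):
    """Return a SHA-256 digest over the relative paths and contents of every file."""
    root = Path(directory)
    digest = hashlib.sha256()
    # Sort the files so the digest doesn't depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

If two calls return the same digest, the contents didn't change between them. A different digest signals a change, which is the situation in which W&B saves a new version.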
### Single run
-Log a new version of an Artifact with a single run that produces all the files in the artifact. This case occurs when a single run produces all the files in the artifact.
-Based on your use case, select one of the tabs below to create a new artifact version inside or outside of a run:
+Log a new version of an artifact with a single run that produces all the files in the artifact.
+
+You can create a new artifact version either as part of an active W&B run (so the artifact is tracked as that run's output) or outside of a run (when you want to log artifacts independently of experiment tracking). Based on your use case, select one of the following tabs to create a new artifact version inside or outside of a run:
@@ -53,7 +54,7 @@ with wandb.init() as run:
Create an artifact version outside of a W&B run:
-1. Create a new artifact or retrieve an existing one with `wanb.Artifact`.
+1. Create a new artifact or retrieve an existing one with `wandb.Artifact`.
2. Add files to the artifact with `.add_file`.
3. Save the artifact with `.save`.
@@ -72,19 +73,18 @@ artifact.save()
### Distributed runs
-Allow a collection of runs to collaborate on a version before committing it. This is in contrast to single run mode described above where one run provides all the data for a new version.
+Allow a collection of runs to collaborate on a version before they commit it. This is in contrast to single-run mode described previously, where one run provides all the data for a new version. Use distributed runs when no single run has access to all the files that belong in the artifact (for example, when several parallel jobs each produce a portion of the output).
-1. Each run in the collection needs to be aware of the same unique ID (called `distributed_id`) in order to collaborate on the same version. By default, if present, W&B uses the run's `group` as set by `wandb.init(group=GROUP)` as the `distributed_id`.
-2. There must be a final run that "commits" the version, permanently locking its state.
+1. Each run in the collection needs the same unique ID (called `distributed_id`) to collaborate on the same version. By default, if present, W&B uses the run's `group` as set by `wandb.init(group=GROUP)` as the `distributed_id`.
+2. A final run must "commit" the version, permanently locking its state.
3. Use `upsert_artifact` to add to the collaborative artifact and `finish_artifact` to finalize the commit.
-Consider the following example. Different runs (labelled below as **Run 1**, **Run 2**, and **Run 3**) add a different image file to the same artifact with `upsert_artifact`.
-
+Consider the following example, which demonstrates how multiple runs share a `distributed_id` to contribute to a single artifact version and how a final run commits it. Each run (labeled as **Run 1**, **Run 2**, and **Run 3** in the following examples) adds a different image file to the same artifact with `upsert_artifact`.
-#### Run 1
+**Run 1**:
```python
with wandb.init() as run:
@@ -95,7 +95,7 @@ with wandb.init() as run:
run.upsert_artifact(artifact, distributed_id="my_dist_artifact")
```
-#### Run 2
+**Run 2**:
```python
with wandb.init() as run:
@@ -106,9 +106,9 @@ with wandb.init() as run:
run.upsert_artifact(artifact, distributed_id="my_dist_artifact")
```
-#### Run 3
+**Run 3**:
-Must run after Run 1 and Run 2 complete. The Run that calls `wandb.Run.finish_artifact()` can include files in the artifact, but does not need to.
+Run 3 commits the artifact version and permanently locks its state, so no further runs can add files under the same `distributed_id`. Run 3 must run after Run 1 and Run 2 complete. The run that calls `wandb.Run.finish_artifact()` can include files in the artifact, but doesn't need to.
```python
with wandb.init() as run:
@@ -130,13 +130,13 @@ Add, modify, or remove a subset of files from a previous artifact version withou
-Here are some scenarios for each type of incremental change you might encounter:
+The following list describes scenarios for each type of incremental change you might encounter:
-- add: you periodically add a new subset of files to a dataset after collecting a new batch.
-- remove: you discovered several duplicate files and want to remove them from your artifact.
-- update: you corrected annotations for a subset of files and want to replace the old files with the correct ones.
+- **Add**: You periodically add a new subset of files to a dataset after you collect a new batch.
+- **Remove**: You discovered several duplicate files and want to remove them from your artifact.
+- **Update**: You corrected annotations for a subset of files and want to replace the old files with the correct ones.
-You could create an artifact from scratch to perform the same function as an incremental artifact. However, when you create an artifact from scratch, you will need to have all the contents of your artifact on your local disk. When making an incremental change, you can add, remove, or modify a single file without changing the files from a previous artifact version.
+You could create an artifact from scratch to perform the same function as an incremental artifact. However, when you create an artifact from scratch, you need to have all the contents of your artifact on your local disk. When you make an incremental change, you can add, remove, or modify a single file without changing the files from a previous artifact version.
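
To make the difference concrete, the following is a minimal sketch, not W&B's implementation, that models an artifact version as a mapping from file path to content digest. The `new_draft` helper, the file names, and the digest strings are hypothetical and used only for illustration. It shows that an incremental change starts from the previous version's entries and touches only the paths you add, remove, or update:

```python
# Illustrative model only: W&B does not expose artifact versions as plain dicts.
def new_draft(previous_version: dict[str, str]) -> dict[str, str]:
    """Start a draft from a copy of the previous version's entries."""
    return dict(previous_version)


# Hypothetical previous version: file path -> content digest.
v0 = {"cat.png": "digest-a", "dog.png": "digest-b", "labels.csv": "digest-c"}

draft = new_draft(v0)
draft["bird.png"] = "digest-d"    # Add: a new file from the latest batch
del draft["dog.png"]              # Remove: a duplicate file
draft["labels.csv"] = "digest-e"  # Update: corrected annotations

# Entries whose digest is identical in both versions can be reused as-is.
unchanged = sorted(path for path in draft if v0.get(path) == draft[path])
print(unchanged)
```

In this model, the previous version (`v0`) stays untouched, and only `cat.png` carries over unchanged into the draft.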
@@ -144,9 +144,9 @@ You can create an incremental artifact within a single run or with a set of runs
-Follow the procedure below to incrementally change an artifact:
+To incrementally change an artifact, follow this procedure:
-1. Obtain the artifact version you want to perform an incremental change on:
+1. Obtain the artifact version you want to incrementally change:
@@ -171,9 +171,9 @@ saved_artifact = client.artifact("my_artifact:latest")
draft_artifact = saved_artifact.new_draft()
```
-3. Perform any incremental changes you want to see in the next version. You can either add, remove, or modify an existing entry.
+3. Perform any incremental changes you want to see in the next version. You can add, remove, or modify an existing entry.
-Select one of the tabs for an example on how to perform each of these changes:
+Select one of the tabs for an example of how to perform each of these changes:
@@ -214,7 +214,7 @@ draft_artifact.add_file("modified_file.txt")
The methods to add and modify an entry are the same. Entries are replaced (as opposed to duplicated) when you pass a filename for an entry that already exists.
*/}
-4. Lastly, log or save your changes. The following tabs show you how to save your changes inside and outside of a W&B run. Select the tab that is appropriate for your use case:
+4. Log or save your changes to commit the draft as a new artifact version. The following tabs show you how to save your changes inside and outside of a W&B run. Select the tab that is appropriate for your use case:
@@ -230,7 +230,9 @@ draft_artifact.save()
-Putting it all together, the code examples above look like:
+After you log or save the draft, W&B creates a new artifact version that records only the incremental changes while reusing the unchanged files from the previous version.
+
+The preceding code examples combined look like the following:
diff --git a/models/artifacts/data-privacy-and-compliance.mdx b/models/artifacts/data-privacy-and-compliance.mdx
index c60086c9f4..83f66ec58e 100644
--- a/models/artifacts/data-privacy-and-compliance.mdx
+++ b/models/artifacts/data-privacy-and-compliance.mdx
@@ -4,17 +4,23 @@ description: Learn where W&B files are stored by default. Explore how to save, s
title: Artifact data privacy and compliance
---
-Files are uploaded to a Google Cloud bucket managed by W&B when you log artifacts. The contents of the bucket are encrypted both at rest and in transit. Artifact files are only visible to users who have access to the corresponding project.
+This page explains where W&B stores artifact files by default and how deletion and retention work. It also covers options for sensitive datasets that can't reside in a multi-tenant environment.
+
+When you log artifacts, W&B uploads files to a Google Cloud bucket that W&B manages. W&B encrypts the contents of the bucket both at rest and in transit. Only users who have access to the corresponding project can view artifact files.
-When you delete a version of an artifact, it is marked for soft deletion in our database and removed from your storage cost. When you delete an entire artifact, it is queued for permanent deletion and all of its contents are removed from the W&B bucket. If you have specific needs around file deletion, reach out to [Customer Support](mailto:support@wandb.com).
+When you delete a version of an artifact, W&B marks it for soft deletion in the database and removes it from your storage cost. When you delete an entire artifact, W&B queues it for permanent deletion and removes all of its contents from the W&B bucket.
+
+
+If you have specific needs around file deletion, contact [Customer Support](mailto:support@wandb.com).
+
-By default, deleted artifacts are retained for 7 days and can be restored during this period, which is configurable for Dedicated Cloud. Learn more about data retention in [Multi-tenant Cloud](/platform/hosting/hosting-options/multi_tenant_cloud#data-retention-policy) or [Dedicated Cloud](/platform/hosting/hosting-options/dedicated-cloud#data-retention-policy).
+By default, W&B retains deleted artifacts for 7 days, and you can restore them during this period. This period is configurable for Dedicated Cloud. Learn more about data retention in [Multi-tenant Cloud](/platform/hosting/hosting-options/multi_tenant_cloud#data-retention-policy) or [Dedicated Cloud](/platform/hosting/hosting-options/dedicated-cloud#data-retention-policy).
-For sensitive datasets that cannot reside in a multi-tenant environment, you can use [W&B Dedicated Cloud](/platform/hosting/hosting-options/dedicated-cloud) or [reference artifacts](/models/artifacts/track-external-files). Reference artifacts track references to private buckets without sending file contents to W&B. Reference artifacts maintain links to files on your buckets or servers. W&B only keeps track of the metadata associated with the files, not the files themselves.
+For sensitive datasets that can't reside in a multi-tenant environment, you can use [W&B Dedicated Cloud](/platform/hosting/hosting-options/dedicated-cloud) or [reference artifacts](/models/artifacts/track-external-files). Reference artifacts track references to private buckets without sending file contents to W&B, maintaining links to files on your buckets or servers. W&B only tracks the metadata associated with the files, not the files themselves.
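
As a rough illustration of the retention window, the following sketch checks whether a deleted artifact would still fall inside the restore period. The `is_restorable` helper is hypothetical and not part of the W&B SDK, and treating the window as ending exactly `retention_days` after deletion is an assumption:

```python
from datetime import datetime, timedelta, timezone

# Default from the text above; Dedicated Cloud deployments can configure it.
DEFAULT_RETENTION_DAYS = 7


def is_restorable(
    deleted_at: datetime,
    now: datetime,
    retention_days: int = DEFAULT_RETENTION_DAYS,
) -> bool:
    """Return True while `now` is still inside the retention window."""
    # Assumption: the window closes exactly `retention_days` after deletion.
    return now - deleted_at < timedelta(days=retention_days)


deleted_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
four_days_later = datetime(2024, 1, 5, tzinfo=timezone.utc)
eight_days_later = datetime(2024, 1, 9, tzinfo=timezone.utc)

print(is_restorable(deleted_at, four_days_later))   # Inside the 7-day window
print(is_restorable(deleted_at, eight_days_later))  # Outside the 7-day window
```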
diff --git a/models/artifacts/delete-artifacts.mdx b/models/artifacts/delete-artifacts.mdx
index af6c966ae9..a1b7ea8845 100644
--- a/models/artifacts/delete-artifacts.mdx
+++ b/models/artifacts/delete-artifacts.mdx
@@ -4,9 +4,9 @@ description: Delete artifacts interactively with the App UI or programmatically
title: Delete an artifact
---
-Delete artifacts interactively with the W&B App or programmatically with the W&B Python SDK. When you delete an artifact, W&B marks that artifact as a *soft-delete*. In other words, the artifact is marked for deletion but files are not immediately deleted from storage.
+This page shows you how to delete W&B artifacts so you can remove unneeded data, free up storage, and manage retention. Delete artifacts interactively with the W&B App or programmatically with the W&B Python SDK. When you delete an artifact, W&B marks that artifact as a *soft-delete*. In other words, W&B marks the artifact for deletion but doesn't immediately delete files from storage.
-The contents of the artifact remain as a soft-delete, or pending deletion state, until a regularly run garbage collection process reviews all artifacts marked for deletion. The garbage collection process deletes associated files from storage if the artifact and its associated files are not used by a previous or subsequent artifact versions.
+The contents of the artifact remain as a soft-delete, or pending deletion state, until a regularly run garbage collection process reviews all artifacts marked for deletion. The garbage collection process deletes associated files from storage if a previous or subsequent artifact version doesn't use the artifact and its associated files.
Garbage collection is **best-effort**. W&B does not guarantee how quickly freed space appears in your object storage after you delete an artifact. Large deployments or backlogs can take longer than expected. For how this fits with run data, retention settings, and optional operator actions, see [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage).
@@ -14,7 +14,7 @@ Garbage collection is **best-effort**. W&B does not guarantee how quickly freed
## Artifact garbage collection workflow
-The following diagram illustrates the complete artifact garbage collection process:
+This section provides a visual overview of how artifact deletion flows through soft-delete and garbage collection before files are permanently removed. The following diagram illustrates the complete artifact garbage collection process:
```mermaid
graph TB
@@ -46,12 +46,12 @@ graph TB
style KeepFiles fill:#e8f5e9,stroke:#333,stroke-width:2px,color:#000
style DeleteFiles fill:#ffebee,stroke:#333,stroke-width:2px,color:#000
style End fill:#e0e0e0,stroke:#333,stroke-width:2px,color:#000
-```
+```
You can schedule when artifacts are deleted from W&B with TTL policies. For more information, see [Manage data retention with Artifact TTL policy](./ttl).
-Artifacts deleted by a TTL policy, the W&B Python SDK, or the W&B App are first soft-deleted. Soft-deleted artifacts are then garbage-collected before they are permanently deleted.
+Artifacts deleted by a TTL policy, the W&B Python SDK, or the W&B App are first soft-deleted. Soft-deleted artifacts are then garbage-collected before they're permanently deleted.
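
The rule that garbage collection removes a file only when no remaining artifact version uses it can be sketched as follows. This is a simplified model under stated assumptions, not W&B's garbage collector: the version names, the digests, and the `collectable_files` helper are all hypothetical.

```python
# Illustrative model only: each version maps to the set of file digests it uses.
versions = {
    "dataset:v0": {"digest-a", "digest-b"},
    "dataset:v1": {"digest-b", "digest-c"},
    "dataset:v2": {"digest-c", "digest-d"},
}


def collectable_files(
    versions: dict[str, set[str]], soft_deleted: set[str]
) -> set[str]:
    """Return files that only soft-deleted versions reference."""
    still_used: set[str] = set()
    for name, files in versions.items():
        if name not in soft_deleted:
            still_used |= files
    candidates: set[str] = set()
    for name in soft_deleted:
        candidates |= versions[name]
    return candidates - still_used


# Deleting only v1 frees nothing because v0 and v2 still use its files.
print(sorted(collectable_files(versions, {"dataset:v1"})))

# Deleting v0 and v1 frees the files that v2 doesn't use.
print(sorted(collectable_files(versions, {"dataset:v0", "dataset:v1"})))
```

This is why deleting a single version doesn't always reduce storage: files shared with a previous or subsequent version remain until every version that uses them is deleted.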
@@ -94,7 +94,7 @@ artifact.delete(delete_aliases=True)
## Delete multiple artifact versions
-The following code example shows how to delete multiple artifact versions. Provide the entity, project name, and run ID that created the artifact as arguments to `wandb.Api.run()`. This returns a run object that you can use to access all artifact versions created by that run. Next, iterate through the artifact versions and delete the ones that match your criteria.
+Use this approach when you want to clean up several artifact versions logged by a single run in one operation, rather than deleting each version individually. The following code example shows how to delete multiple artifact versions. Provide the entity, project name, and run ID that created the artifact as arguments to `wandb.Api.run()`. This returns a run object that you can use to access all artifact versions created by that run. Next, iterate through the artifact versions and delete the versions that match your criteria.
Set the `delete_aliases` parameter to `True` (`wandb.Artifact.delete(delete_aliases=True)`) to delete an artifact version and any aliases associated with it.
@@ -125,7 +125,7 @@ for artifact in run.logged_artifacts():
## Delete multiple artifact versions with a specific alias
-The following code demonstrates how to delete multiple artifact versions that have a specific alias.
+To remove only versions tagged with a particular alias (for example, an obsolete release tag), filter on the alias before deleting. The following code demonstrates how to delete multiple artifact versions that have a specific alias.
Replace the placeholder values in the following code example with your own values:
@@ -144,7 +144,7 @@ artifact_name = ""
# Specify the alias to filter artifact versions for deletion
desired_alias = ""
-# Delete artifacts logged to run with alias 'v3' and 'v4
+# Delete artifact versions logged by the run that match the artifact name and the desired alias
for artifact in run.logged_artifacts():
print(f"Found artifact: {artifact.name}")
if (artifact.name.split(":")[0] == artifact_name) and (desired_alias in artifact.aliases):
@@ -153,18 +153,20 @@ for artifact in run.logged_artifacts():
## Delete an artifact collection
+Deleting a collection removes the collection and all artifact versions it contains. Deleted versions follow the same soft-delete and garbage collection workflow described in the preceding sections.
+
To delete an artifact collection:
1. Navigate to the artifact collection you want to delete.
-3. Select the **action ()** menu next to the artifact collection name.
-4. From the dropdown menu, select **Delete**.
+2. Click the **action ()** menu next to the artifact collection name.
+3. From the dropdown menu, select **Delete**.
-Delete artifact collection programmatically with the [wandb.Artifact.delete()](/models/ref/python/experiments/artifact#delete) method.
+Delete an artifact collection programmatically with the [wandb.Artifact.delete()](/models/ref/python/experiments/artifact#delete) method.
Provide the full path of the artifact collection to `wandb.Api.artifact_collection(name="")`. The full path consists of `//`.
@@ -192,9 +194,9 @@ Artifacts with protected aliases have special deletion restrictions. [Protected
**Important considerations for protected aliases:**
-- Artifacts with protected aliases cannot be deleted by non-registry admins.
-- Within a registry, registry admins can unlink protected artifact versions and delete collections/registries that contain protected aliases.
-- For source artifacts: if a source artifact is linked to a registry with a protected alias, it cannot be deleted by any user
+- Non-registry admins can't delete artifacts with protected aliases.
+- Within a registry, registry admins can unlink protected artifact versions and delete collections and registries that contain protected aliases.
+- For source artifacts: if a source artifact is linked to a registry with a protected alias, no user can delete it.
- Registry admins can remove the protected aliases from source artifacts and then delete them.
@@ -203,17 +205,15 @@ Artifacts with protected aliases have special deletion restrictions. [Protected
Garbage collection timing is not guaranteed. See [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage) for details.
-Garbage collection is active by default if you use W&B Multi-tenant Cloud. In W&B Dedicated and Self-Managed, you might need to take these additional steps to activate garbage collection.
+Whether garbage collection is active by default depends on your W&B deployment type, so the steps to enable it vary. Garbage collection is active by default if you use W&B Multi-tenant Cloud. In W&B Dedicated and Self-Managed, you might need to take the following steps to activate garbage collection.
1. **W&B Self-Managed**: Set `GORILLA_ARTIFACT_GC_ENABLED=true`.
1. **Dedicated Cloud**: Contact support to verify that garbage collection is active.
-1. Enable bucket versioning if you use [AWS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/manage-versioning-examples.html), [Google Cloud](https://cloud.google.com/storage/docs/object-versioning) or any other storage provider such as [Minio](https://min.io/docs/minio/linux/administration/object-management/object-versioning.html#enable-bucket-versioning). If you use Azure, [enable soft deletion](https://learn.microsoft.com/azure/storage/blobs/soft-delete-blob-overview), which is equivalent to bucket versioning.
-
+1. Enable bucket versioning if you use [AWS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/manage-versioning-examples.html), [Google Cloud](https://cloud.google.com/storage/docs/object-versioning), or any other storage provider such as [Minio](https://min.io/docs/minio/linux/administration/object-management/object-versioning.html#enable-bucket-versioning). If you use Azure, [enable soft deletion](https://learn.microsoft.com/azure/storage/blobs/soft-delete-blob-overview), which is equivalent to bucket versioning.
-The following table describes how to satisfy requirements to enable garbage collection based on your deployment type.
-The `X` indicates you must satisfy the requirement:
+Use the following table to confirm which requirements apply to your deployment. An `X` indicates you must satisfy the requirement:
| | Environment variable | Enable versioning |
| -----------------------------------------------| ------------------------| ----------------- |
@@ -225,6 +225,5 @@ The `X` indicates you must satisfy the requirement:
-note
-Secure storage connector is currently only available for Google Cloud Platform and Amazon Web Services.
+Secure storage connector is available only for Google Cloud Platform and Amazon Web Services.
diff --git a/models/artifacts/download-and-use-an-artifact.mdx b/models/artifacts/download-and-use-an-artifact.mdx
index 959986502a..86a9469aa1 100644
--- a/models/artifacts/download-and-use-an-artifact.mdx
+++ b/models/artifacts/download-and-use-an-artifact.mdx
@@ -1,31 +1,36 @@
---
-description: Download and use Artifacts from multiple projects.
+description: Download and use artifacts from multiple projects.
title: Download and use artifacts
---
-Download and use an artifact that is already stored on the W&B server or construct an artifact object and pass it in to for de-duplication as necessary.
+This page shows you how to download and use an artifact that's already stored on the W&B server, or construct an artifact object and pass it in for de-duplication. Use these workflows when you need to consume datasets, models, or other versioned files produced earlier in your pipeline, either inside a W&B run or as a standalone operation.
+
+Replace placeholders in the code examples with your own values:
+
+- `[PROJECT-NAME]`: The name of your W&B project.
+- `[JOB-TYPE]`: A label describing the type of run, such as `training` or `eval`.
-Team members with view-only seats cannot download artifacts.
+Team members with view-only seats can't download artifacts.
-### Download and use an artifact stored on W&B
+## Download and use an artifact stored on W&B
-Download and use an artifact stored in W&B either inside or outside of a W&B Run. Use the Public API ([`wandb.Api`](/models/ref/python/public-api/api)) to export (or update data) already saved in W&B.
+Download and use an artifact stored in W&B either inside or outside a W&B run. Use the Public API ([`wandb.Api`](/models/ref/python/public-api/api)) to export or update data already saved in W&B.
-First, import the W&B Python SDK. Next, create a W&B [Run](/models/ref/python/experiments/run):
+Import the W&B Python SDK, then create a W&B [Run](/models/ref/python/experiments/run):
```python
import wandb
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT-NAME]", job_type="[JOB-TYPE]") as run:
# See next step
```
-Indicate the artifact you want to use with the [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#use_artifact) method. This returns a run object. In the following code snippet specifies an artifact called `'bike-dataset'` with the alias `'latest'`:
+Indicate the artifact you want to use with the [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#use_artifact) method. This returns an artifact object. The following code snippet specifies an artifact called `bike-dataset` with the alias `latest`:
```python
# Indicate the artifact to use. Format is "name:alias"
@@ -39,7 +44,7 @@ Use the object returned to download all the contents of the artifact:
datadir = artifact.download()
```
-You can optionally pass a path to the root parameter to download the contents of the artifact to a specific directory.
+You can optionally pass a path to the `root` parameter to download the contents of the artifact to a specific directory.
Use the [`wandb.Artifact.get_entry()`](/models/ref/python/experiments/artifact#get_entry) method to download only a subset of files:
@@ -51,9 +56,9 @@ entry = artifact.get_entry(name)
Putting this together, the complete code example looks like this:
```python
-import wandb
+import wandb
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT-NAME]", job_type="[JOB-TYPE]") as run:
# Indicate the artifact to use. Format is "name:alias"
artifact = run.use_artifact("bike-dataset:latest")
@@ -67,14 +72,14 @@ with wandb.init(project="", job_type="") as run:
This fetches only the file at the path `name`. It returns an `Entry` object with the following methods:
-* `Entry.download`: Downloads file from the artifact at path `name`
-* `Entry.ref`: If `add_reference` stored the entry as a reference, returns the URI
+* `Entry.download`: Downloads the file from the artifact at path `name`.
+* `Entry.ref`: Returns the URI if `add_reference` stored the entry as a reference.
{/* References that have schemes that W&B knows how to handle get downloaded just like artifact files. For more information, see [Track external files](/models/artifacts/track-external-files/). */}
-
-First, import the W&B SDK. Next, create an artifact object from the Public API Class. Provide the entity, project, artifact, and alias associated with that artifact:
+
+Import the W&B SDK, then create an artifact object from the Public API class. Provide the entity, project, artifact, and alias associated with that artifact:
```python
import wandb
@@ -89,45 +94,46 @@ Use the object returned to download the contents of the artifact:
artifact.download()
```
-You can optionally pass a path the `root` parameter to download the contents of the artifact to a specific directory. For more information, see the [Python SDK Reference Guide](/models/ref/python/experiments/artifact#download).
+You can optionally pass a path to the `root` parameter to download the contents of the artifact to a specific directory. For more information, see the [Python SDK Reference Guide](/models/ref/python/experiments/artifact#download).
Use the `wandb artifact get` command to download an artifact from the W&B server.
-```
-$ wandb artifact get project/artifact:alias --root mnist/
+```bash
+wandb artifact get project/artifact:alias --root mnist/
```
-### Partially download an artifact
+## Partially download an artifact
-You can optionally download part of an artifact based on a prefix. Use the `path_prefix=` (`wandb.Artifact.download(path_prefix=)`) parameter to download a single file or the content of a sub-folder.
+If you only need a subset of an artifact's contents, download part of an artifact based on a prefix. Use the `path_prefix=` (`wandb.Artifact.download(path_prefix=)`) parameter to download a single file or the content of a sub-folder.
```python
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT-NAME]", job_type="[JOB-TYPE]") as run:
# Indicate the artifact to use. Format is "name:alias"
artifact = run.use_artifact("bike-dataset:latest")
# Download a specific file or sub-folder
- artifact.download(path_prefix="bike.png") # downloads only bike.png
+ artifact.download(path_prefix="bike.png") # Downloads only bike.png
```
Alternatively, you can download files from a certain directory. To do so, specify the directory within the `path_prefix=` parameter. Continuing from the previous code snippet:
```python
-# downloads files in the images/bikes directory
-artifact.download(path_prefix="images/bikes/")
+# Downloads files in the images/bikes directory
+artifact.download(path_prefix="images/bikes/")
```
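
To reason about which files a given prefix selects, the following sketch filters a list of paths the same way. It assumes that `path_prefix` is matched as a plain string prefix against forward-slash paths. The `select_by_prefix` helper and the file names are hypothetical and not part of the W&B SDK.

```python
def select_by_prefix(paths: list[str], path_prefix: str) -> list[str]:
    """Keep only the paths that start with `path_prefix`."""
    return [path for path in paths if path.startswith(path_prefix)]


# Hypothetical artifact contents.
paths = [
    "bike.png",
    "images/bikes/a.png",
    "images/bikes/b.png",
    "images/cars/c.png",
]

print(select_by_prefix(paths, "bike.png"))       # A single file
print(select_by_prefix(paths, "images/bikes/"))  # The contents of a sub-folder
```

Under this assumption, include the trailing slash when you target a directory. Without it, a prefix such as `images/bikes` would also match a sibling path such as `images/bikes-old/`.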
-### Use an artifact from a different project
-Specify the name of artifact along with its project name to reference an artifact. You can also reference artifacts across entities by specifying the name of the artifact with its entity name.
+## Use an artifact from a different project
+
+Specify the name of the artifact along with its project name to reference an artifact. You can also reference artifacts across entities by specifying the name of the artifact with its entity name.
-The following code example demonstrates how to query an artifact from another project as input to the current W&B run.
+The following code example queries an artifact from another project and uses it as input to the current W&B run.
```python
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT-NAME]", job_type="[JOB-TYPE]") as run:
# Query W&B for an artifact from another project and mark it
# as an input to this run.
artifact = run.use_artifact("my-project/artifact:alias")
@@ -135,17 +141,16 @@ with wandb.init(project="", job_type="") as run:
# Use an artifact from another entity and mark it as an input
# to this run.
artifact = run.use_artifact("my-entity/my-project/artifact:alias")
-
```
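
The path formats used in the preceding example can be broken down as follows. This is a sketch for illustration only. The `parse_artifact_path` helper and the `ArtifactPath` type are hypothetical, and the W&B SDK resolves these strings internally.

```python
from typing import NamedTuple, Optional


class ArtifactPath(NamedTuple):
    entity: Optional[str]
    project: Optional[str]
    name: str
    alias: Optional[str]


def parse_artifact_path(path: str) -> ArtifactPath:
    """Split "[entity/][project/]name[:alias]" into its parts."""
    *scope, last = path.split("/")
    if len(scope) > 2:
        raise ValueError(f"Too many '/' separators in {path!r}")
    name, _, alias = last.partition(":")
    # Pad on the left so a lone project is not mistaken for an entity.
    entity, project = [None] * (2 - len(scope)) + scope
    return ArtifactPath(entity, project, name, alias or None)


print(parse_artifact_path("bike-dataset:latest"))
print(parse_artifact_path("my-project/artifact:alias"))
print(parse_artifact_path("my-entity/my-project/artifact:alias"))
```

In this sketch, a missing entity or project is returned as `None`, which stands in for the current run's entity and project.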
-### Construct and use an artifact simultaneously
+## Construct and use an artifact simultaneously
-Simultaneously construct and use an artifact. Create an artifact object and pass it to use_artifact. This creates an artifact in W&B if it does not exist yet. The [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#use_artifact) API is idempotent, so you can call it as many times as you like.
+When you want to log an artifact and immediately mark it as an input to the same run, construct and use the artifact in a single step. Create an artifact object and pass it to `use_artifact`. This creates the artifact in W&B if it doesn't exist yet. The [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#use_artifact) API is idempotent, so you can call it as many times as needed.
```python
import wandb
-with wandb.init(project="", job_type="") as run:
+with wandb.init(project="[PROJECT-NAME]", job_type="[JOB-TYPE]") as run:
    # wandb.Artifact requires a type, and artifact names can't contain spaces.
    artifact = wandb.Artifact("reference-model", type="model")
artifact.add_file("model.h5")
run.use_artifact(artifact)
diff --git a/models/artifacts/explore-and-traverse-an-artifact-graph.mdx b/models/artifacts/explore-and-traverse-an-artifact-graph.mdx
index eec9a780ca..857d58b754 100644
--- a/models/artifacts/explore-and-traverse-an-artifact-graph.mdx
+++ b/models/artifacts/explore-and-traverse-an-artifact-graph.mdx
@@ -6,12 +6,14 @@ title: Explore artifact lineage graphs
W&B tracks the inputs and outputs of runs using directed acyclic graphs (DAGs) called _lineage graphs_. Lineage graphs are visual representations of the relationships between artifacts and runs in an ML experiment. They show how data and models
flow through different stages of the ML lifecycle, from raw data ingestion to model training and evaluation.
+This page is for ML practitioners and team members who want to view, navigate, and enable lineage tracking for the artifacts produced and consumed by their W&B runs.
+
Tracking artifact lineage provides several key advantages:
-* **Reproducibility**: Enables teams to reproduce experiments, models, and results for debugging, experimentation, and validation.
-* **Version control**: Tracks changes to artifacts over time, allowing teams to revert to previous data or model versions when needed.
-* **Auditing**: Maintains a detailed record of artifacts and transformations to support compliance and governance.
-* **Collaboration**: Helps to improve teamwork by making experiment history transparent, reducing duplicated effort, and accelerating development.
+* **Reproducibility**: Teams can reproduce experiments, models, and results for debugging, experimentation, and validation.
+* **Version control**: Track changes to artifacts over time so that teams can revert to previous data or model versions when needed.
+* **Auditing**: Maintain a detailed record of artifacts and transformations to support compliance and governance.
+* **Collaboration**: Improve teamwork by making experiment history transparent, reducing duplicated effort, and accelerating development.
## View an artifact's lineage graph
@@ -19,9 +21,11 @@ Tracking artifact lineage provides several key advantages:
To view an artifact's lineage graph:
1. Navigate to your project's workspace in the W&B App.
-2. Click on the **Artifacts** tab in the project sidebar.
+2. Click the **Artifacts** tab in the project sidebar.
3. Select an artifact, then click the **Lineage** tab.
+The Lineage tab displays the artifact's lineage graph. For more information, see [Navigate lineage graphs](#navigate-lineage-graphs).
+
## Navigate lineage graphs
The lineage graph is a visual representation of the relationships between artifacts and runs in an ML experiment. Use the W&B App UI or the Python SDK to explore and traverse an artifact's lineage graph.
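
Conceptually, traversing a lineage graph means walking a directed acyclic graph from one node to the runs and artifacts it depends on. The following sketch models that walk with plain Python data structures. It's an illustration under stated assumptions, not the W&B SDK: the node names and the `upstream` helper are hypothetical.

```python
# Hypothetical lineage: each key is a node, each value lists the nodes it feeds.
edges = {
    "raw-data:v0": ["run:preprocess"],
    "run:preprocess": ["clean-data:v0"],
    "clean-data:v0": ["run:train"],
    "run:train": ["model:v0"],
    "model:v0": [],
}


def upstream(edges: dict[str, list[str]], target: str) -> list[str]:
    """Return every node that `target` depends on, nearest first."""
    # Invert the edges so each node maps to the nodes that feed it.
    parents: dict[str, list[str]] = {}
    for source, outputs in edges.items():
        for output in outputs:
            parents.setdefault(output, []).append(source)

    # Breadth-first walk from the target back toward the raw inputs.
    ordered: list[str] = []
    queue = list(parents.get(target, []))
    while queue:
        node = queue.pop(0)
        if node not in ordered:
            ordered.append(node)
            queue.extend(parents.get(node, []))
    return ordered


print(upstream(edges, "model:v0"))
```

In this model, runs and artifacts alternate along each path, which mirrors how arrows in the lineage graph connect artifacts to the runs that use or produce them.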
@@ -31,10 +35,10 @@ The lineage graph is a visual representation of the relationships between artifa
Nodes with green icons represent runs. Nodes with blue icons represent artifacts. Arrows between nodes indicate the input and output of a run or artifact.
-Artifact nodes display the artifact's name along with the version of the artifact in the form `:`. An artifact's type is displayed above the name of the artifact.
+Artifact nodes display the artifact's name along with the version of the artifact in the form `[ARTIFACT_NAME]:[VERSION]`. The artifact's type appears above the name of the artifact.
-You can view the type and the name of artifact in both the left sidebar and in the lineage graph node.
+You can view the type and the name of an artifact in both the left sidebar and in the lineage graph node.
@@ -44,13 +48,13 @@ Run nodes display the run's name.
-Click any individual run to get more information about that runs such as the run's: start time, time duration, author, job type, and more. Click any individual artifact to get more information about the artifact's: aliases, creation time, type, version, description, the run that logged the artifact, file size, and more.
+Click any individual run to get more information about that run, such as the run's start time, time duration, author, and job type. Click any individual artifact to get more information about the artifact, such as its aliases, creation time, type, version, description, the run that logged the artifact, and file size.
-Runs that create multiple versions of the same artifact are grouped together in a cluster. Click on a specific artifact version listed within the cluster to view specific information about that artifact version.
+W&B groups runs that create multiple versions of the same artifact into a cluster. Click a specific artifact version listed within the cluster to view specific information about that artifact version.
@@ -62,7 +66,7 @@ Click and drag a node to rearrange the graph to customize the layout. You can al
-Hover your mouse over a node and click on the eye icon to hide or show a node in the graph. This is useful for decluttering the graph to focus on specific nodes and their relationships.
+Point to a node and click the eye icon to hide or show a node in the graph. Use this to declutter the graph and focus on specific nodes and their relationships.
@@ -83,26 +87,26 @@ with wandb.init() as run:
-## Enable lineage graph tracking
+## Enable lineage graph tracking
-To enable lineage graph tracking, you need to mark artifacts as [inputs](/models/artifacts/explore-and-traverse-an-artifact-graph) or
-[outputs](/models/artifacts/explore-and-traverse-an-artifact-graph#track-the-output-of-a-run) of a run using the W&B Python SDK.
+To enable lineage graph tracking, you must mark artifacts as [inputs](/models/artifacts/explore-and-traverse-an-artifact-graph#track-the-input-of-a-run) or
+[outputs](/models/artifacts/explore-and-traverse-an-artifact-graph#track-the-output-of-a-run) of a run using the W&B Python SDK. The following sections describe how to mark each.
### Track the input of a run
Mark an artifact as the input (or dependency) of a run with the [`wandb.Run.use_artifact()`](/models/ref/python/experiments/run#method-runuse_artifact)
method. Specify the name of the artifact and an optional alias to reference a specific version of that artifact. The name of the
-artifact is in the format `:` or `:`.
+artifact is in the format `[ARTIFACT_NAME]:[VERSION]` or `[ARTIFACT_NAME]:[ALIAS]`.
-Replace values enclosed in angle brackets (`< >`) with your values:
+Replace the bracketed placeholders with your values:
```python
import wandb
# Initialize a run
-with wandb.init(entity="", project="") as run:
+with wandb.init(entity="[ENTITY]", project="[PROJECT]") as run:
    # Get artifact, mark it as a dependency
-    artifact = run.use_artifact(artifact_or_name="", aliases="")
+    artifact = run.use_artifact(artifact_or_name="[NAME]:[ALIAS]")
```
@@ -112,17 +116,17 @@ Use [`wandb.Run.log_artifact()`](/models/ref/python/experiments/run#log_artifact
create an artifact with the [`wandb.Artifact()`](/models/ref/python/experiments/artifact#wandb.Artifact) constructor. Then, log the
artifact as an output of the run with `wandb.Run.log_artifact()`.
-Replace values enclosed in angle brackets (`< >`) with your values:
+Replace the bracketed placeholders with your values:
```python
import wandb
# Initialize a run
-with wandb.init(entity="", project="") as run:
+with wandb.init(entity="[ENTITY]", project="[PROJECT]") as run:
    # Create an artifact
-    artifact = wandb.Artifact(name = "", type = "")
-    artifact.add_file(local_path = "", name="")
+    artifact = wandb.Artifact(name="[ARTIFACT_NAME]", type="[ARTIFACT_TYPE]")
+    artifact.add_file(local_path="[LOCAL_FILEPATH]", name="[OPTIONAL_NAME]")
    # Log the artifact as an output of the run
    run.log_artifact(artifact_or_path = artifact)
@@ -130,9 +134,11 @@ with wandb.init(entity="", project="") as run:
## Artifact clusters
-When a level of the graph has five or more runs or artifacts, it creates a cluster. A cluster has a search bar to find specific versions of runs or artifacts and pulls an individual node from a cluster to continue investigating the lineage of a node inside a cluster.
+To keep large lineage graphs readable, W&B groups dense levels of the graph into clusters that you can search and expand.
+
+When a level of the graph has five or more runs or artifacts, W&B creates a cluster. A cluster has a search bar to find specific versions of runs or artifacts and lets you pull an individual node from a cluster to continue investigating the lineage of a node inside a cluster.
-Clicking on a node opens a preview with an overview of the node. Clicking on the arrow extracts the individual run or artifact so you can examine the lineage of the extracted node.
+Click a node to open a preview with an overview of the node. Click the arrow to extract the individual run or artifact so you can examine the lineage of the extracted node.
diff --git a/models/artifacts/storage.mdx b/models/artifacts/storage.mdx
index 0081c4ecb0..f1fc940317 100644
--- a/models/artifacts/storage.mdx
+++ b/models/artifacts/storage.mdx
@@ -1,6 +1,6 @@
---
-description: Manage storage, memory allocation of W&B Artifacts.
-title: Manage artifact storage and memory allocation
+description: Manage where W&B stores artifact files and clean up the local artifact cache.
+title: Manage artifact storage
---
W&B stores artifact files in a private Google Cloud Storage bucket located in the United States by default. All files are encrypted at rest and in transit.
@@ -20,15 +20,15 @@ During training, W&B locally saves logs, artifacts, and configuration files in t
For a complete guide to using environment variables to configure W&B, see the [environment variables reference](/models/track/environment-variables/).
-Depending on the machine on `wandb` is initialized on, these default folders may not be located in a writeable part of the file system. This might trigger an error.
+Depending on the machine where `wandb` is initialized, these default folders may not be located in a writable part of the file system. This might trigger an error.
-### Clean up local artifact cache
+## Clean up local artifact cache
-W&B caches artifact files to speed up downloads across versions that share files in common. Over time this cache directory can become large. Run the [`wandb artifact cache cleanup`](/models/ref/cli/wandb-artifact/wandb-artifact-cache/) command to prune the cache and to remove any files that have not been used recently.
+W&B caches artifact files to speed up downloads across versions that share files in common. Over time this cache directory can become large. To prune the cache and remove any files you haven't used recently, run the [`wandb artifact cache cleanup`](/models/ref/cli/wandb-artifact/wandb-artifact-cache/) command.
-The following code snippet demonstrates how to limit the size of the cache to 1GB. Copy and paste the code snippet into your terminal:
+To limit the cache size to 1 GB, run:
```bash
-$ wandb artifact cache cleanup 1GB
+wandb artifact cache cleanup 1GB
```
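Before you prune the cache, it can help to check how much disk space it currently uses. The following is a minimal sketch that sums file sizes under a directory with the Python standard library. The `directory_size_bytes` helper is hypothetical, and the fallback path `~/.cache/wandb` is an assumption; adjust it to wherever your environment stores the cache.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            file_path = Path(dirpath) / filename
            # Skip symlinks so that linked files aren't counted twice.
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


if __name__ == "__main__":
    # Assumed cache location; change it to match your environment.
    cache_dir = Path(os.environ.get("WANDB_CACHE_DIR", "~/.cache/wandb")).expanduser()
    size_gb = directory_size_bytes(cache_dir) / 1e9
    print(f"{cache_dir}: {size_gb:.2f} GB")
```

If the reported size is larger than you want to keep, run the cleanup command with a smaller target size.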
diff --git a/models/artifacts/track-external-files.mdx b/models/artifacts/track-external-files.mdx
index 7ce9747e47..bbcadab4db 100644
--- a/models/artifacts/track-external-files.mdx
+++ b/models/artifacts/track-external-files.mdx
@@ -3,23 +3,23 @@ description: Track files saved in an external bucket, HTTP file server, or an NF
title: Track external files
---
-Use *reference artifacts* to track and use files saved outside of W&B servers. Common external storage solutions include: CoreWeave AI Object Storage, an Amazon Simple Storage Service (Amazon S3) bucket, GCS bucket, Azure blob, HTTP file server, or NFS share.
+Use *reference artifacts* to track and use files saved outside W&B servers, so you can version and reference large datasets and models without copying them into W&B. Common external storage solutions include CoreWeave AI Object Storage, Amazon Simple Storage Service (Amazon S3) buckets, GCS buckets, Azure blobs, HTTP file servers, and NFS shares.
-Reference artifacts behave similar to non-reference artifacts. The key difference is that the reference artifacts only consists of metadata about the files, such as their sizes and MD5 checksums. The files themselves never leave your system.
+Reference artifacts behave like non-reference artifacts. The key difference is that a reference artifact consists only of metadata about the files, such as their sizes and MD5 checksums. The files themselves never leave your system.
-You can interact with reference artifact similarly to non-reference artifacts. In the W&B App, you can browse the contents of the reference artifact using the file browser, explore the full dependency graph, and scan through the versioned history of your artifact. However, the UI cannot render rich media such as images, audio, because the data itself is not contained within the artifact.
+You can interact with reference artifacts the same way you interact with non-reference artifacts. In the W&B App, you can browse the contents of the reference artifact using the file browser, explore the full dependency graph, and scan through the versioned history of your artifact. However, the UI can't render rich media such as images and audio, because the artifact doesn't contain the data itself.
-If you log an artifact that does not track external files, W&B saves the artifact's files to W&B servers. This is the default behavior when you log artifacts with the W&B Python SDK.
+If you log an artifact that doesn't track external files, W&B saves the artifact's files to W&B servers. This is the default behavior when you log artifacts with the W&B Python SDK.
-If you log an artifact that tracks external files, W&B logs metadata about the object, such as the object's ETag and size. If object versioning is enabled on the bucket, the version ID is also logged.
+If you log an artifact that tracks external files, W&B logs metadata about the object, such as the object's ETag and size. If object versioning is enabled on the bucket, W&B also logs the version ID.
The following sections describe how to track external reference artifacts.
## Track an artifact in an external bucket
-Use the W&B Python SDK to track references to files stored outside of W&B.
+Use the W&B Python SDK to track references to files stored outside W&B.
1. Initialize a run with `wandb.init()`.
2. Create an artifact object with `wandb.Artifact()`.
@@ -56,9 +56,9 @@ s3://my-bucket
The `datasets/mnist/` directory contains a collection of images. To track the `datasets/mnist/` directory as a dataset artifact:
1. Provide a name for the artifact, such as `"mnist"`.
-1. Set the `type` parameter to `"dataset"` when you construct the artifact object (`wandb.Artifact(type="dataset")`).
-1. Provide the path to the `datasets/mnist/` directory as an Amazon S3 URI (`s3://my-bucket/datasets/mnist/`) when you call `wandb.Artifact.add_reference()`.
-1. Log the artifact with `run.log_artifact()`.
+2. Set the `type` parameter to `"dataset"` when you construct the artifact object (`wandb.Artifact(type="dataset")`).
+3. When you call `wandb.Artifact.add_reference()`, provide the path to the `datasets/mnist/` directory as an Amazon S3 URI (`s3://my-bucket/datasets/mnist/`).
+4. Log the artifact with `run.log_artifact()`.
The following code sample creates a reference artifact `mnist:latest`:
@@ -71,10 +71,10 @@ with wandb.init(project="my-project") as run:
    run.log_artifact(artifact)
```
-Within the W&B App, you can look through the contents of the reference artifact using the file browser, [explore the full dependency graph](/models/artifacts/explore-and-traverse-an-artifact-graph/), and scan through the versioned history of your artifact. The W&B App does not render rich media such as images, audio, and so forth because the data itself is not contained within the artifact.
+Within the W&B App, you can browse the contents of the reference artifact using the file browser, [explore the full dependency graph](/models/artifacts/explore-and-traverse-an-artifact-graph/), and scan through the versioned history of your artifact. The W&B App doesn't render rich media such as images or audio because the artifact doesn't contain the data itself.
-W&B Artifacts support any Amazon S3 compatible interface, including CoreWeave Storage and MinIO. The scripts described below work as-is with both providers, when you set the `AWS_S3_ENDPOINT_URL` environment variable to point at your CoreWeave Storage or MinIO server.
+W&B Artifacts support any Amazon S3-compatible interface, including CoreWeave AI Object Storage and MinIO. The following scripts work without modification with both providers when you set the `AWS_S3_ENDPOINT_URL` environment variable to point at your CoreWeave AI Object Storage or MinIO server.
@@ -83,7 +83,9 @@ By default, W&B imposes a 10,000 object limit when adding an object prefix. You
## Download an artifact from an external bucket
-W&B retrieves the files from the underlying bucket when it downloads a reference artifact using the metadata recorded when the artifact is logged. If your bucket has object versioning enabled, W&B retrieves the object version that corresponds to the state of the file at the time an artifact was logged. As you evolve the contents of your bucket, you can always point to the exact version of your data a given model was trained on, because the artifact serves as a snapshot of your bucket during the training run.
+After you log a reference artifact, you can download it later to retrieve the original files from the bucket.
+
+When W&B downloads a reference artifact, it retrieves the files from the underlying bucket using the metadata recorded when you logged the artifact. If your bucket has object versioning enabled, W&B retrieves the object version that corresponds to the state of the file at the time the artifact was logged. As you evolve the contents of your bucket, you can point to the exact version of your data a given model was trained on, because the artifact serves as a snapshot of your bucket during the training run.
The following code sample shows how to download a reference artifact. The APIs for downloading artifacts are the same for both reference and non-reference artifacts:
@@ -96,14 +98,14 @@ with wandb.init(project="my-project") as run:
```
-W&B recommends that you enable 'Object Versioning' on your storage buckets if you overwrite files as part of your workflow.
+If you overwrite files as part of your workflow, W&B recommends that you enable 'Object Versioning' on your storage buckets.
-If versioning is enabled, W&B can always retrieve the correct version of the file when you download an artifact, even if the file has been overwritten since the artifact was logged.
+If versioning is enabled, W&B can retrieve the correct version of the file when you download an artifact, even if the file has been overwritten since you logged the artifact.
Based on your use case, read the instructions to enable object versioning: [AWS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/manage-versioning-examples.html), [Google Cloud](https://cloud.google.com/storage/docs/using-object-versioning#set), [Azure](https://learn.microsoft.com/azure/storage/blobs/versioning-enable).
-## Add and download an external from a bucket
+## Add and download an external file from a bucket
The following code sample uploads a dataset to an Amazon S3 bucket, tracks it with a reference artifact, then downloads it:
@@ -123,7 +125,7 @@ with wandb.init() as run:
    run.log_artifact(model_artifact)
```
-At a later point, you can download the model artifact. Specify the name of the artifact and its type:
+To download the model artifact later, specify the name of the artifact and its type:
```python
import wandb
@@ -142,27 +144,25 @@ See the following reports for an end-to-end walkthrough on how to track artifact
## Cloud storage credentials
-W&B uses the default mechanism to look for credentials based on the cloud provider you use. Read the documentation from your cloud provider to learn more about the credentials used:
+To read from and write to your external bucket, W&B needs credentials configured in your environment. W&B uses the default mechanism to look for credentials based on the cloud provider you use. To learn more about the credentials used, read the documentation from your cloud provider:
-| Cloud provider | Credentials Documentation |
+| Cloud provider | Credentials documentation |
| -------------- | ------------------------- |
| CoreWeave AI Object Storage | [CoreWeave AI Object Storage documentation](https://docs.coreweave.com/docs/products/storage/object-storage/how-to/manage-access-keys/cloud-console-tokens) |
| AWS | [Boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials) |
| Google Cloud | [Google Cloud documentation](https://cloud.google.com/docs/authentication/provide-credentials-adc) |
| Azure | [Azure documentation](https://learn.microsoft.com/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) |
-For AWS, if the bucket is not located in the configured user's default region, you must set the `AWS_REGION` environment variable to match the bucket region.
+For AWS, if the bucket isn't located in the configured user's default region, you must set the `AWS_REGION` environment variable to match the bucket region.
-Rich media such as images, audio, video, and point clouds may fail to render in the App UI depending on the CORS configuration of your bucket. Allow listing **app.wandb.ai** in your bucket's CORS settings will allow the W&B App to properly render such rich media.
-
-If rich media such as images, audio, video, and point clouds does not render in the App UI, ensure that `app.wandb.ai` is allowlisted in your bucket's CORS policy.
+Rich media such as images, audio, video, and point clouds may fail to render in the App UI depending on the CORS configuration of your bucket. To resolve rendering issues, allowlist `app.wandb.ai` in your bucket's CORS policy.
## Track an artifact in a filesystem
-A common pattern for accessing datasets is to expose an NFS mount point to a remote filesystem on all machines running training jobs. This can be an alternative solution to a cloud storage bucket because from the perspective of the training script, the files appear local to your filesystem.
+A common pattern for accessing datasets is to expose an NFS mount point to a remote filesystem on all machines running training jobs. This can be an alternative to a cloud storage bucket because, from the perspective of the training script, the files appear local to your filesystem.
{/* Use W&B Artifacts to track references to file systems, regardless if they are mounted or not. */}
@@ -174,7 +174,7 @@ To track an artifact in a filesystem:
3. Specify the reference to the filesystem path with the artifact object's `wandb.Artifact.add_reference()` method.
4. Log the artifact's metadata with `run.log_artifact()`.
-Copy and paste the following code snippet to track files in a mounted filesystem. Replace the values enclosed in angle brackets (`< >`) with your own values.
+To track files in a mounted filesystem, copy and paste the following code snippet. Replace the values enclosed in angle brackets (`< >`) with your own values.
```python
import wandb
@@ -204,7 +204,7 @@ mount
|-- cnn/
```
-You want to track the `datasets/mnist/` directory as a dataset artifact. To track it, you could use the following code snippet.
+To track the `datasets/mnist/` directory as a dataset artifact, use the following code snippet:
```python
import wandb
@@ -221,7 +221,7 @@ This creates a reference artifact `mnist:latest` that points to the files stored
By default, W&B imposes a 10,000 file limit when adding a reference to a directory. You can adjust this limit by specifying `max_objects=` when you call `wandb.Artifact.add_reference()`.
-Similarly, to track a model stored at `models/cnn/my_model.h5`, you could use the following code snippet:
+Similarly, to track a model stored at `models/cnn/my_model.h5`, use the following code snippet:
```python
import wandb
@@ -244,12 +244,13 @@ with wandb.init() as run:
## Download an artifact from an external filesystem
+After you log a filesystem reference artifact, you can download it later to retrieve the original files from the mounted filesystem.
Download files from a referenced filesystem using the same APIs as non-reference artifacts:
1. Initialize a run with `wandb.init()`.
2. Use the `wandb.Run.use_artifact()` method to indicate the artifact you want to download.
-3. Call the artifact's `wandb.Artifact.download()` method to download the files from the referenced filesystem
+3. Call the artifact's `wandb.Artifact.download()` method to download the files from the referenced filesystem.
```python
with wandb.init() as run:
@@ -257,11 +258,11 @@ with wandb.init() as run:
    artifact_dir = artifact.download()
```
-W&B copies the contents of `/mount/datasets/mnist` to the `artifacts/mnist:v0/` directory.
+W&B copies the contents of `/mount/datasets/mnist` to the `artifacts/mnist:v0/` directory.
-`Artifact.download()` throws an error if it cannot reconstruct the artifact. For example, if an artifact contains a reference to a file that was overwritten, `Artifact.download()` will throw an error because the artifact can no longer be reconstructed.
+`Artifact.download()` throws an error if it can't reconstruct the artifact. For example, if an artifact contains a reference to a file that was overwritten, `Artifact.download()` throws an error because the artifact can no longer be reconstructed.
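To illustrate why an overwritten file can no longer be reconstructed, the following sketch records a file's size and MD5 checksum and later checks whether the file on disk still matches. This is a simplified illustration under stated assumptions, not W&B's implementation; `FileRecord`, `record_file`, and `matches_record` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata captured about a file when it was logged."""
    size: int
    md5: str


def record_file(path: Path, chunk_size: int = 1 << 20) -> FileRecord:
    """Read the file in chunks and return its size and MD5 hex digest."""
    digest = hashlib.md5()
    size = 0
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return FileRecord(size=size, md5=digest.hexdigest())


def matches_record(path: Path, record: FileRecord) -> bool:
    """Return True only if the file exists and still matches the record."""
    return path.exists() and record_file(path) == record
```

If `matches_record()` returns `False`, the file changed after its metadata was recorded, which is the situation in which a download of the referenced file can't succeed.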
diff --git a/models/artifacts/ttl.mdx b/models/artifacts/ttl.mdx
index e6339c066d..4dcc301176 100644
--- a/models/artifacts/ttl.mdx
+++ b/models/artifacts/ttl.mdx
@@ -6,51 +6,54 @@ import { ColabLink } from '/snippets/_includes/colab-link.mdx';
-Schedule when artifacts are deleted from W&B with a W&B Artifact time-to-live (TTL) policy. When you delete an artifact, W&B marks that artifact as a *soft-delete*. In other words, the artifact is marked for deletion but files are not immediately deleted from storage. For more information on how W&B deletes artifacts, see the [Delete artifacts](./delete-artifacts) page.
+Schedule when W&B deletes artifacts with a W&B Artifact time-to-live (TTL) policy. Use TTL policies to automate cleanup of stale artifact data, manage storage consumption, and enforce data retention requirements without manually tracking which artifacts to remove. When you delete an artifact, W&B marks that artifact as a "soft-delete." In other words, W&B marks the artifact for deletion but doesn't immediately delete files from storage. For more information on how W&B deletes artifacts, see the [Delete artifacts](./delete-artifacts) page.
Watch the [Managing data retention with Artifacts TTL](https://www.youtube.com/watch?v=hQ9J6BoVmnc) video tutorial to learn how to manage data retention with Artifacts TTL in the W&B App.
-W&B deactivates the option to set a TTL policy for artifacts linked to the Registry. This is to help ensure that linked artifacts do not accidentally expire if used in production workflows.
+W&B deactivates the option to set a TTL policy for artifacts linked to the Registry. This helps ensure that linked artifacts don't accidentally expire when used in production workflows.
-* Only team admins can view a [team's settings](/platform/app/settings-page/teams) and access team level TTL settings such as (1) permitting who can set or edit a TTL policy or (2) setting a team default TTL.
-* If you do not see the option to set or edit a TTL policy in an artifact's details in the W&B App UI or if setting a TTL programmatically does not successfully change an artifact's TTL property, your team admin has not given you permissions to do so.
+* Only team admins can view a [team's settings](/platform/app/settings-page/teams) and access team-level TTL settings such as (1) permitting who can set or edit a TTL policy or (2) setting a team default TTL.
+* If you don't see the option to set or edit a TTL policy in an artifact's details in the W&B App UI, or if setting a TTL programmatically doesn't change an artifact's TTL property, your team admin hasn't given you permissions to do so.
-## Auto-generated Artifacts
-Only user-generated artifacts can use TTL policies. Artifacts auto-generated by W&B cannot have TTL policies set for them.
+## Autogenerated artifacts
+
+Before you set a TTL policy, confirm that the target artifact is eligible. Only user-generated artifacts can use TTL policies. W&B-generated artifacts can't have TTL policies set for them.
+
+The following artifact types indicate an autogenerated artifact:
-The following Artifact types indicate an auto-generated Artifact:
- `run_table`
- `code`
- `job`
-- Any Artifact type starting with: `wandb-*`
+- Any artifact type starting with `wandb-`
-You can check an Artifact's type on the [W&B platform](/models/artifacts/explore-and-traverse-an-artifact-graph/) or programmatically:
+You can check an artifact's type on the [W&B platform](/models/artifacts/explore-and-traverse-an-artifact-graph/) or programmatically:
```python
import wandb
-with wandb.init(project="") as run:
-    artifact = run.use_artifact(artifact_or_name="")
+with wandb.init(project="[MY-PROJECT-NAME]") as run:
+    artifact = run.use_artifact(artifact_or_name="[MY-ARTIFACT-NAME]")
    print(artifact.type)
```
-Replace the values enclosed with `<>` with your own.
+Replace the bracketed placeholders with your own values.
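The eligibility rule described in this section can be sketched as a small helper that takes the type string you retrieved and checks it against the listed autogenerated types. The `is_autogenerated_type` function and the `AUTOGENERATED_TYPES` constant are hypothetical names used for illustration and aren't part of the W&B SDK.

```python
# Types listed in this section as autogenerated by W&B.
AUTOGENERATED_TYPES = {"run_table", "code", "job"}


def is_autogenerated_type(artifact_type: str) -> bool:
    """Return True if the type is listed as autogenerated or starts with 'wandb-'."""
    return artifact_type in AUTOGENERATED_TYPES or artifact_type.startswith("wandb-")


print(is_autogenerated_type("dataset"))  # False
print(is_autogenerated_type("code"))     # True
```

In practice, you would pass `artifact.type` to the helper before you attempt to set a TTL policy.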
## Define who can edit and set TTL policies
-Define who can set and edit TTL policies within a team. You can either grant TTL permissions only to team admins, or you can grant both team admins and team members TTL permissions.
+
+Define who can set and edit TTL policies within a team to control which users can change artifact retention. Grant TTL permissions only to team admins, or grant both team admins and team members TTL permissions.
Only team admins can define who can set or edit a TTL policy.
-1. Navigate to your team’s profile page.
+1. Navigate to your team's profile page.
2. Select the **Settings** tab.
3. Navigate to the **Artifacts time-to-live (TTL) section**.
4. From the **TTL permissions dropdown**, select who can set and edit TTL policies.
-5. Click on **Review and save settings**.
+5. Click **Review and save settings**.
6. Confirm the changes and select **Save settings**.
@@ -58,15 +61,17 @@ Only team admins can define who can set or edit a TTL policy.
## Create a TTL policy
-Set a TTL policy for an artifact either when you create the artifact or retroactively after the artifact is created.
-For all the code snippets below, replace the content wrapped in `<>` with your information to use the code snippet.
+Set a TTL policy for an artifact either when you create the artifact or retroactively after you create the artifact. The following sections describe both approaches and additional options for setting team-wide defaults or working outside of a run.
+
+In the following code snippets, replace the bracketed placeholders with your own values.
### Set a TTL policy when you create an artifact
-Use the W&B Python SDK to define a TTL policy when you create an artifact. TTL policies are typically defined in days.
+
+Use the W&B Python SDK to define a TTL policy when you create an artifact. You typically define TTL policies in days.
-Defining a TTL policy when you create an artifact is similar to how you normally [create an artifact](/models/artifacts/construct-an-artifact/). With the exception that you pass in a time delta to the artifact's `ttl` attribute.
+Defining a TTL policy when you create an artifact is similar to how you normally create an artifact, except that you pass in a time delta to the artifact's `ttl` attribute.
The steps are as follows:
@@ -74,7 +79,7 @@ The steps are as follows:
1. [Create an artifact](/models/artifacts/construct-an-artifact/).
2. [Add content to the artifact](/models/artifacts/construct-an-artifact/#add-files-to-an-artifact) such as files, a directory, or a reference.
3. Define a TTL time limit with the [`datetime.timedelta`](https://docs.python.org/3/library/datetime.html) data type that is part of Python's standard library.
-4. [Log the artifact](/models/artifacts/construct-an-artifact/#3-save-your-artifact-to-the-wb-server).
+4. [Log the artifact](/models/artifacts/construct-an-artifact/#save-your-artifact-to-the-wb-server).
The following code snippet demonstrates how to create an artifact and set a TTL policy.
@@ -82,9 +87,9 @@ The following code snippet demonstrates how to create an artifact and set a TTL
import wandb
from datetime import timedelta
-with wandb.init(project="", entity="") as run:
-    artifact = wandb.Artifact(name="", type="")
-    artifact.add_file("")
+with wandb.init(project="[MY-PROJECT-NAME]", entity="[MY-ENTITY]") as run:
+    artifact = wandb.Artifact(name="[ARTIFACT-NAME]", type="[TYPE]")
+    artifact.add_file("[MY-FILE]")
    artifact.ttl = timedelta(days=30)  # Set TTL policy
    run.log_artifact(artifact)
@@ -93,10 +98,11 @@ with wandb.init(project="", entity="") as run:
The preceding code snippet sets the TTL policy for the artifact to 30 days. In other words, W&B deletes the artifact after 30 days.
### Set or edit a TTL policy after you create an artifact
+
Use the W&B App UI or the W&B Python SDK to define a TTL policy for an artifact that already exists.
-When you modify an artifact's TTL, the time the artifact takes to expire is still calculated using the artifact's `createdAt` timestamp.
+When you modify an artifact's TTL, W&B still calculates the expiration time using the artifact's `createdAt` timestamp.
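As a worked example of that rule, the following sketch computes an expiration time by adding the TTL to the creation timestamp. It assumes the expiration is the creation time plus the TTL, as described above; `expiration_time` is a hypothetical helper, not a W&B SDK function.

```python
from datetime import datetime, timedelta, timezone


def expiration_time(created_at: datetime, ttl: timedelta) -> datetime:
    """Return the expiration time measured from the creation timestamp."""
    return created_at + ttl


created_at = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Original policy: 30 days from creation.
print(expiration_time(created_at, timedelta(days=30)))   # 2024-01-31 00:00:00+00:00

# Policy edited later to 365 days: still measured from creation, not from the edit.
print(expiration_time(created_at, timedelta(days=365)))  # 2024-12-31 00:00:00+00:00
```

Because the calculation starts at the creation time, shortening the TTL of an old artifact can place its expiration in the past.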
@@ -105,15 +111,15 @@ When you modify an artifact's TTL, the time the artifact takes to expire is stil
2. Pass in a time delta to the artifact's `ttl` attribute.
3. Update the artifact with the [`save`](/models/ref/python/experiments/run#save) method.
-
The following code snippet shows how to set a TTL policy for an artifact:
```python
import wandb
from datetime import timedelta
-artifact = run.use_artifact("")
-artifact.ttl = timedelta(days=365 * 2) # Delete in two years
-artifact.save()
+with wandb.init(project="[MY-PROJECT]") as run:
+    artifact = run.use_artifact("[MY-ENTITY]/[MY-PROJECT]/[MY-ARTIFACT]:[ALIAS]")
+    artifact.ttl = timedelta(days=365 * 2)  # Delete in two years
+    artifact.save()
```
The preceding code example sets the TTL policy to two years.
@@ -121,9 +127,9 @@ The preceding code example sets the TTL policy to two years.
1. Navigate to your W&B project in the W&B App UI.
2. Select the artifact icon in the project sidebar.
-3. From the list of artifacts, expand the artifact type you
-4. Select on the artifact version you want to edit the TTL policy for.
-5. Click on the **Version** tab.
+3. From the list of artifacts, expand the artifact type that contains your artifact.
+4. Select the artifact version you want to edit the TTL policy for.
+5. Click the **Version** tab.
6. From the dropdown, select **Edit TTL policy**.
7. Within the modal that appears, select **Custom** from the TTL policy dropdown.
8. Within the **TTL duration** field, set the TTL policy in units of days.
@@ -135,23 +141,21 @@ The preceding code example sets the TTL policy to two years.
-
-
### Set default TTL policies for a team
Only team admins can set a default TTL policy for a team.
-Set a default TTL policy for your team. Default TTL policies apply to all existing and future artifacts based on their respective creation dates. Artifacts with existing version-level TTL policies are not affected by the team's default TTL.
+Set a default TTL policy for your team to apply a consistent retention period across artifacts without configuring each one individually. Default TTL policies apply to all existing and future artifacts based on their respective creation dates. The team's default TTL doesn't affect artifacts with existing version-level TTL policies.
-1. Navigate to your team’s profile page.
+1. Navigate to your team's profile page.
2. Select the **Settings** tab.
3. Navigate to the **Artifacts time-to-live (TTL) section**.
-4. Click on the **Set team's default TTL policy**.
+4. Click **Set team's default TTL policy**.
5. Within the **Duration** field, set the TTL policy in units of days.
-6. Click on **Review and save settings**.
-7/ Confirm the changes and then select **Save settings**.
+6. Click **Review and save settings**.
+7. Confirm the changes and then select **Save settings**.
@@ -159,14 +163,17 @@ Set a default TTL policy for your team. Default TTL policies apply to all existi
### Set a TTL policy outside of a run
-Use the public API to retrieve an artifact without fetching a run, and set the TTL policy. TTL policies are typically defined in days.
+Use the public API to retrieve an artifact without fetching a run, and set the TTL policy. You typically define TTL policies in days.
The following code sample shows how to fetch an artifact using the public API and set the TTL policy.
-```python
+```python
+import wandb
+from datetime import timedelta
+
api = wandb.Api()
-artifact = api.artifact("entity/project/artifact:alias")
+artifact = api.artifact("[ENTITY]/[PROJECT]/[ARTIFACT]:[ALIAS]")
artifact.ttl = timedelta(days=365) # Delete in one year
@@ -174,9 +181,10 @@ artifact.save()
```
## Deactivate a TTL policy
-Use the W&B Python SDK or W&B App UI to deactivate a TTL policy for a specific artifact version.
+
+Use the W&B Python SDK or W&B App UI to deactivate a TTL policy for a specific artifact version when you no longer want the artifact to expire automatically.
{/*
-Artifacts with TTL turned off will not inherit an artifact collection's TTL. Refer to (## Inherit TTL Policy) on how to delete artifact TTL and inherit from the collection level TTL.
+Artifacts with TTL turned off don't inherit an artifact collection's TTL. Refer to (## Inherit TTL Policy) on how to delete artifact TTL and inherit from the collection level TTL.
*/}
@@ -185,21 +193,23 @@ Artifacts with TTL turned off will not inherit an artifact collection's TTL. Ref
2. Set the artifact's `ttl` attribute to `None`.
3. Update the artifact with the [`save`](/models/ref/python/experiments/run#save) method.
-
-The following code snippet shows how to turn off a TTL policy for an artifact:
+The following code snippet shows how to deactivate a TTL policy for an artifact:
```python
-artifact = run.use_artifact("")
-artifact.ttl = None
-artifact.save()
+import wandb
+
+with wandb.init(project="[MY-PROJECT]") as run:
+    artifact = run.use_artifact("[MY-ENTITY]/[MY-PROJECT]/[MY-ARTIFACT]:[ALIAS]")
+    artifact.ttl = None
+    artifact.save()
```
1. Navigate to your W&B project in the W&B App UI.
2. Select the artifact icon in the project sidebar.
-3. From the list of artifacts, expand the artifact type you
-4. Select on the artifact version you want to edit the TTL policy for.
-5. Click on the Version tab.
-6. Click the **action ()** menu next to the **Link to registry** button.
+3. From the list of artifacts, expand the artifact type that contains your artifact.
+4. Select the artifact version you want to edit the TTL policy for.
+5. Click the **Version** tab.
+6. Click the **action ()** menu next to the **Link to registry** button.
7. From the dropdown, select **Edit TTL policy**.
8. Within the modal that appears, select **Deactivate** from the TTL policy dropdown.
9. Select the **Update TTL** button to save your changes.
@@ -210,19 +220,20 @@ artifact.save()
-
-
-
## View TTL policies
-View TTL policies for artifacts with the Python SDK or with the W&B App UI.
+
+View TTL policies for artifacts with the Python SDK or with the W&B App UI to confirm the retention period applied to each artifact.
Use a print statement to view an artifact's TTL policy. The following example shows how to retrieve an artifact and view its TTL policy:
```python
-artifact = run.use_artifact("")
-print(artifact.ttl)
+import wandb
+
+with wandb.init(project="[MY-PROJECT]") as run:
+    artifact = run.use_artifact("[MY-ENTITY]/[MY-PROJECT]/[MY-ARTIFACT]:[ALIAS]")
+    print(artifact.ttl)
```
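
If you want a more readable message than the raw printed value, a small formatter can help. The sketch below assumes that `artifact.ttl` holds either a `timedelta` or `None`, as the assignments elsewhere on this page suggest; `describe_ttl` is a hypothetical helper, not part of the W&B SDK.

```python
from datetime import timedelta
from typing import Optional


def describe_ttl(ttl: Optional[timedelta]) -> str:
    """Render a TTL value as a short human-readable string (hypothetical helper)."""
    if ttl is None:
        return "no active TTL policy"
    return f"expires {ttl.days} day(s) after creation"


# Stand-in values; in practice you would pass `artifact.ttl`.
one_year = describe_ttl(timedelta(days=365))
deactivated = describe_ttl(None)
```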
@@ -230,10 +241,10 @@ View a TTL policy for an artifact with the W&B App UI.
1. Navigate to the [W&B App](https://wandb.ai).
2. Navigate to your W&B Project.
-3. Within your project, select the Artifacts tab in the project sidebar.
-4. Click on a collection.
+3. Within your project, select the **Artifacts** tab in the project sidebar.
+4. Click a collection.
-Within the collection view you can see all of the artifacts in the selected collection. Within the `Time to Live` column you will see the TTL policy assigned to that artifact.
+Within the collection view you can see all of the artifacts in the selected collection. The **Time to Live** column shows the TTL policy assigned to each artifact.
diff --git a/models/artifacts/update-an-artifact.mdx b/models/artifacts/update-an-artifact.mdx
index a8228245ac..021cca037f 100644
--- a/models/artifacts/update-an-artifact.mdx
+++ b/models/artifacts/update-an-artifact.mdx
@@ -3,7 +3,7 @@ description: Update an existing artifact while a run is active or using only the
title: Update an artifact
---
-Pass desired values to update the `description`, `metadata`, and `alias` of an artifact. Update a run previously logged to W&B with the W&B Public API with ([`wandb.Api`](/models/ref/python/public-api/api)). Use `wandb.Run.save()` to update an artifact when is first initialized and still active.
+This page describes how to update the `description`, `metadata`, and `alias` of an existing artifact. Update an artifact when you want to refine its documentation, adjust its metadata, or manage its aliases without creating a new version. To update an artifact from a previously logged run, use the W&B Public API ([`wandb.Api`](/models/ref/python/public-api/api)). To update an artifact while its run is still active, use the [`wandb.Artifact`](/models/ref/python/experiments/artifact) class and call `Artifact.save()`.
**When to use wandb.Artifact.save() or wandb.Run.log_artifact()**
@@ -12,22 +12,22 @@ Pass desired values to update the `description`, `metadata`, and `alias` of an a
- Use `wandb.Run.log_artifact()` to create a new artifact and associate it with a specific run.
-Use the W&B Public API ([`wandb.Api`](/models/ref/python/public-api/api)) to update an artifact. Use the wandb.Artifact ([`wandb.Artifact`](/models/ref/python/experiments/artifact)) Class while a run is active.
+Choose the approach that matches your workflow. Use the W&B Public API ([`wandb.Api`](/models/ref/python/public-api/api)) to update an artifact outside of a run, or use the [`wandb.Artifact`](/models/ref/python/experiments/artifact) class while a run is active.
-You can not update the alias of artifact linked to a model in Model Registry.
+You can't update the alias of an artifact linked to a model in Model Registry.
-The following code example demonstrates how to update the description of an artifact using the [`wandb.Artifact`](/models/ref/python/experiments/artifact) API:
+The following code example shows how to update the description of an artifact with the [`wandb.Artifact`](/models/ref/python/experiments/artifact) API:
```python
import wandb
-with wandb.init(project="") as run:
-    artifact = run.use_artifact(":")
-    artifact.description = ""
+with wandb.init(project="[EXAMPLE]") as run:
+    artifact = run.use_artifact("[ARTIFACT-NAME]:[ALIAS]")
+    artifact.description = "[DESCRIPTION]"
    artifact.save()
```
@@ -56,25 +56,25 @@ artifact.aliases.append("best")
# Remove an alias
artifact.aliases.remove("latest")
-# Completely replace the aliases
+# Replace the aliases
artifact.aliases = ["replaced"]
# Persist all artifact modifications
artifact.save()
```
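
One thing to watch for: if the alias list behaves like a regular Python list (an assumption here), calling `remove()` with an alias that isn't present raises `ValueError`. The following sketch shows one defensive way to compute the new alias list before assigning it; `update_aliases` is a hypothetical helper, not part of the W&B SDK.

```python
from typing import Iterable, List


def update_aliases(
    aliases: Iterable[str],
    add: Iterable[str] = (),
    remove: Iterable[str] = (),
) -> List[str]:
    """Return a new alias list with `remove` dropped and `add` appended.

    Hypothetical helper: missing aliases are ignored instead of raising,
    and duplicates are not added twice.
    """
    to_remove = set(remove)
    updated = [alias for alias in aliases if alias not in to_remove]
    for alias in add:
        if alias not in updated:
            updated.append(alias)
    return updated


# Stand-in list; in practice you would pass `artifact.aliases`.
new_aliases = update_aliases(["latest", "v3"], add=["best"], remove=["latest"])

# Removing an alias that is not present does not raise.
unchanged = update_aliases(["v3"], remove=["latest"])
```

You could then assign the result with `artifact.aliases = new_aliases` and call `artifact.save()` as in the example above.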
-For more information, see the Weights and Biases [Artifact API](/models/ref/python/experiments/artifact).
+For more information, see the W&B [Artifact API](/models/ref/python/experiments/artifact).
-You can also update an Artifact collection in the same way as a singular artifact:
+You can also update an artifact collection the same way as a single artifact. The following example renames a collection and updates its description:
```python
import wandb
-with wandb.init(project="") as run:
+with wandb.init(project="[EXAMPLE]") as run:
    api = wandb.Api()
-    artifact = api.artifact_collection(type="", collection="")
-    artifact.name = ""
-    artifact.description = ""
+    artifact = api.artifact_collection(type_name="[TYPE-NAME]", name="[COLLECTION-NAME]")
+    artifact.name = "[NEW-COLLECTION-NAME]"
+    artifact.description = "[COLLECTION-DESCRIPTION]"
    artifact.save()
```
For more information, see the [Artifacts Collection](/models/ref/python/public-api/api) reference.