319 changes: 220 additions & 99 deletions docs/docs/writing-plugins/the-rules-api/concepts.mdx


93 changes: 58 additions & 35 deletions docs/docs/writing-plugins/the-rules-api/file-system.mdx
@@ -46,13 +46,14 @@ A `Snapshot` is useful when you want to know which files a `Digest` refers to.
Given a `Digest`, you may use the engine to enrich it into a `Snapshot`:

```python
from pants.engine.fs import Digest, Snapshot
from pants.engine.rules import Get, rule
from pants.engine.fs import Snapshot
from pants.engine.intrinsics import digest_to_snapshot
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
snapshot = await Get(Snapshot, Digest, my_digest)
snapshot: Snapshot = await digest_to_snapshot(my_digest)
```

## `CreateDigest`: create new files
@@ -61,12 +62,15 @@

```python
from pants.engine.fs import CreateDigest, Digest, FileContent
from pants.engine.rules import Get, rule
from pants.engine.intrinsics import create_digest
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
digest = await Get(Digest, CreateDigest([FileContent("f1.txt", b"hello world")]))
digest: Digest = await create_digest(
CreateDigest([FileContent("f1.txt", b"hello world")])
)
```

The `CreateDigest` constructor expects an iterable including any of these types:
@@ -83,12 +87,15 @@ This does _not_ write the `Digest` to the build root. Use `Workspace.write_digest()` for that.

```python
from pants.engine.fs import Digest, PathGlobs
from pants.engine.rules import Get, rule
from pants.engine.intrinsics import path_globs_to_digest
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
digest = await Get(Digest, PathGlobs(["**/*.txt", "!ignore_me.txt"]))
digest: Digest = await path_globs_to_digest(
PathGlobs(["**/*.txt", "!ignore_me.txt"])
)
```

- All globs must be relative paths, relative to the build root.
@@ -120,16 +127,17 @@
)
```

If you only need to resolve the file names—and don't actually need to use the file content—you can use `await Get(Paths, PathGlobs)` instead of `await Get(Digest, PathGlobs)` or `await Get(Snapshot, PathGlobs)`. This will avoid "digesting" the files to the LMDB Store cache as a performance optimization. `Paths` has two properties: `files: tuple[str, ...]` and `dirs: tuple[str, ...]`.
If you only need to resolve the file names—and don't actually need to use the file content—you can use `await path_globs_to_paths()` instead of `await path_globs_to_digest()` or `await digest_to_snapshot(**implicitly(PathGlobs(...)))`. This will avoid "digesting" the files to the LMDB Store cache, as a performance optimization. The returned `Paths` instance has two properties: `files: tuple[str, ...]` and `dirs: tuple[str, ...]`.

```python
from pants.engine.fs import Paths, PathGlobs
from pants.engine.rules import Get, rule
from pants.engine.fs import PathGlobs, Paths
from pants.engine.intrinsics import path_globs_to_paths
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
paths = await Get(Paths, PathGlobs(["**/*.txt", "!ignore_me.txt"]))
    paths: Paths = await path_globs_to_paths(PathGlobs(["**/*.txt", "!ignore_me.txt"]))
logger.info(paths.files)
```

@@ -139,12 +147,13 @@ async def demo(...) -> Foo:

```python
from pants.engine.fs import Digest, DigestContents
from pants.engine.rules import Get, rule
from pants.engine.intrinsics import get_digest_contents
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
digest_contents = await Get(DigestContents, Digest, my_digest)
digest_contents: DigestContents = await get_digest_contents(my_digest)
for file_content in digest_contents:
logger.info(file_content.path)
logger.info(file_content.content) # This will be `bytes`.
@@ -155,9 +164,9 @@ The result will be a sequence of `FileContent` objects, which each have a property `path` and a property `content` (of type `bytes`).
:::caution You may not need `DigestContents`
Only use `DigestContents` if you need to read and operate on the content of files directly in your rule.

- If you are running a `Process`, you only need to pass the `Digest` as input and that process will be able to read all the files in its environment. If you only need a list of files included in the digest, use `Get(Snapshot, Digest)`.
- If you are running a `Process`, you only need to pass the `Digest` as input and that process will be able to read all the files in its environment. If you only need the list of files included in the digest, use `get_digest_entries()`.

- If you just need to manipulate the directory structure of a `Digest`, such as renaming files, use `DigestEntries` with `CreateDigest` or use `AddPrefix` and `RemovePrefix`. These avoid reading the file content into memory.
- If you only need to manipulate the directory structure of a `Digest`, such as by renaming files, use `DigestEntries` with `create_digest()` or use `add_prefix()` and `remove_prefix()`. These avoid reading the file content into memory.

:::

@@ -172,13 +181,14 @@ Only use `DigestContents` if you need to read and operate on the content of files directly in your rule.
This is useful if you need to manipulate the directory structure of a `Digest` without actually needing to bring the file contents into memory (which is what occurs if you were to use `DigestContents`).

```python
from pants.engine.fs import Digest, DigestEntries, Directory, FileEntry
from pants.engine.rules import Get, rule
from pants.engine.fs import DigestEntries, Directory, FileEntry
from pants.engine.intrinsics import get_digest_entries
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
digest_entries = await Get(DigestEntries, Digest, my_digest)
digest_entries: DigestEntries = await get_digest_entries(my_digest)
for entry in digest_entries:
if isinstance(entry, FileEntry):
logger.info(entry.path)
@@ -194,34 +204,39 @@ Often, you will need to provide a single `Digest` somewhere in your plugin—such as the `input_digest` for a `Process`.

```python
from pants.engine.fs import Digest, MergeDigests
from pants.engine.rules import Get, rule
from pants.engine.intrinsics import merge_digests
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
digest = await Get(
Digest,
MergeDigests([downloaded_tool_digest, config_file_digest, source_files_snapshot.digest],
digest: Digest = await merge_digests(
MergeDigests([
downloaded_tool_digest,
config_file_digest,
source_files_snapshot.digest
])
)
```

- It is okay if multiple digests include the same file, so long as they have identical content.
- If any digests have different content for the same file, the engine will error. Unlike Git, the engine does not attempt to resolve merge conflicts.
- It is okay if some digests are empty, i.e. `EMPTY_DIGEST`.
- If any digests have different content for the same file, the engine will error.
- It is okay if some digests are empty. The `pants.engine.fs.EMPTY_DIGEST` constant represents an empty digest.
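
The merge semantics described above can be sketched in plain Python. This is an illustrative analogy, not the Pants API: file trees are modeled as `dict[str, bytes]`, an empty dict plays the role of `EMPTY_DIGEST`, and conflicting content for the same path raises, just as the engine errors rather than resolving conflicts.

```python
def merge_file_trees(*trees: dict[str, bytes]) -> dict[str, bytes]:
    """Plain-Python sketch of `MergeDigests` semantics (not the Pants API)."""
    merged: dict[str, bytes] = {}
    for tree in trees:  # an empty dict behaves like EMPTY_DIGEST: it adds nothing
        for path, content in tree.items():
            if merged.get(path, content) != content:
                # Same path, different content: error instead of merging.
                raise ValueError(f"conflicting content for {path!r}")
            merged[path] = content
    return merged

# Duplicate paths with identical content are fine; empty trees are ignored.
result = merge_file_trees({"a.txt": b"x"}, {"a.txt": b"x", "b.txt": b"y"}, {})
```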

## `DigestSubset`: extract certain files from a `Digest`

To get certain files out of a `Digest`, use `DigestSubset`.

```python
from pants.engine.fs import Digest, DigestSubset, PathGlobs
from pants.engine.rules import Get, rule
from pants.engine.intrinsics import digest_subset_to_digest
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
new_digest = await Get(
Digest, DigestSubset(original_digest, PathGlobs(["file1.txt"])
new_digest: Digest = await digest_subset_to_digest(
DigestSubset(original_digest, PathGlobs(["file1.txt"]))
)
```

@@ -233,13 +248,18 @@ Use `AddPrefix` and `RemovePrefix` to change the paths of every file in the digest.

```python
from pants.engine.fs import AddPrefix, Digest, RemovePrefix
from pants.engine.rules import Get, rule
from pants.engine.intrinsics import add_prefix, remove_prefix
from pants.engine.rules import rule

@rule
async def demo(...) -> Foo:
...
added_prefix = await Get(Digest, AddPrefix(original_digest, "new_prefix/subdir"))
removed_prefix = await Get(Digest, RemovePrefix(added_prefix, "new_prefix/subdir"))
added_prefix: Digest = await add_prefix(
AddPrefix(original_digest, "new_prefix/subdir")
)
removed_prefix: Digest = await remove_prefix(
RemovePrefix(added_prefix, "new_prefix/subdir")
)
assert removed_prefix == original_digest
```

Expand All @@ -256,7 +276,7 @@ from pants.engine.rules import goal_rule
@goal_rule
async def run_my_goal(..., workspace: Workspace) -> MyGoal:
...
# Note that this is a normal method; we do not use `await Get`.
# Note that this is a regular synchronous method; we do not use `await`.
workspace.write_digest(digest)
```

@@ -277,7 +297,7 @@ for digest in all_digests:
Good:

```python
merged_digest = await Get(Digest, MergeDigests(all_digests))
merged_digest = await merge_digests(MergeDigests(all_digests))
workspace.write_digest(merged_digest)
```

@@ -286,8 +306,9 @@ workspace.write_digest(merged_digest)
`DownloadFile` allows you to download an asset using a `GET` request.

```python
from pants.engine.fs import DownloadFile, FileDigest
from pants.engine.rules import Get, rule
from pants.engine.fs import Digest, DownloadFile, FileDigest
from pants.engine.download_file import download_file
from pants.engine.rules import implicitly, rule

@rule
async def demo(...) -> Foo:
@@ -297,7 +318,9 @@ async def demo(...) -> Foo:
"12937da9ad5ad2c60564aa35cb4b3992ba3cc5ef7efedd44159332873da6fe46",
2637138
)
downloaded = await Get(Digest, DownloadFile(url, file_digest)
downloaded: Digest = await download_file(
DownloadFile(url, file_digest), **implicitly()
)
```

`DownloadFile` expects a `url: str` parameter pointing to a stable URL for the asset, along with an `expected_digest: FileDigest` parameter. A `FileDigest` is like a normal `Digest`, but represents a single file, rather than a set of files/directories. To determine the `expected_digest`, manually download the file, then run `shasum -a 256` to compute the fingerprint and `wc -c` to compute the expected length of the downloaded file in bytes.
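
Equivalently, both values can be computed in Python. This is a sketch: the byte string below stands in for the raw bytes of the real downloaded file.

```python
import hashlib

# Stand-in for the raw bytes of the downloaded asset.
data = b"hello world\n"

fingerprint = hashlib.sha256(data).hexdigest()  # the expected SHA-256 fingerprint
length = len(data)                              # the expected length in bytes
print(fingerprint, length)
```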
11 changes: 9 additions & 2 deletions docs/docs/writing-plugins/the-rules-api/goal-rules.mdx
@@ -9,7 +9,7 @@ How to create new goals.

For many [plugin tasks](../common-plugin-tasks/index.mdx), you will be extending existing goals, such as adding a new linter to the `lint` goal. However, you may instead want to create a new goal, such as a `publish` goal. This page explains how to create a new goal.

As explained in [Concepts](./concepts.mdx), `@goal_rule`s are the entry-point into the rule graph. When a user runs `pants my-goal`, the Pants engine will look for the respective `@goal_rule`. That `@goal_rule` will usually request other types, either as parameters in the `@goal_rule` signature or through `await Get`. But unlike a `@rule`, a `@goal_rule` may also trigger side effects (such as running interactive processes, writing to the filesystem, etc) via `await Effect`.
As explained in [Concepts](./concepts.mdx), `@goal_rule`s are the entry-point into the rule graph. When a user runs `pants my-goal`, the Pants engine will look for the respective `@goal_rule`. That `@goal_rule` will usually request other types, either as parameters in the `@goal_rule` signature or through `await`ing another rule. But unlike a `@rule`, a `@goal_rule` may also trigger side effects (such as running interactive processes, writing to the filesystem, etc) via `await Effect`.

Often, you can keep all of your logic inline in the `@goal_rule`. As your `@goal_rule` gets more complex, you may end up factoring out helper `@rule`s, but you do not need to start with writing helper `@rule`s.

@@ -201,7 +201,14 @@ async def hello_world(console: Console, specs_paths: SpecsPaths) -> HelloWorld:

`SpecsPaths.files` will list all files matched by the specs, e.g. `::` will match every file in the project (regardless of if targets own the files).

To convert `SpecsPaths` into a [`Digest`](./file-system.mdx), use `await Get(Digest, PathGlobs(globs=specs_paths.files))`.
To convert `SpecsPaths` into a [`Digest`](./file-system.mdx), use:

```python
from pants.engine.intrinsics import path_globs_to_digest
...
await path_globs_to_digest(PathGlobs(globs=specs_paths.files))
```


:::note Name clashing
It is very unlikely, but still possible, that adding a custom goal with an unfortunate name may cause issues when certain existing Pants options are passed on the command line. For instance, executing a goal named `local` with a particular option (in this case, the global `local_cache` option), e.g. `pants --no-local-cache local ...`, would fail since there's no `--no-cache` flag defined for the `local` goal.
1 change: 1 addition & 0 deletions docs/docs/writing-plugins/the-rules-api/index.mdx
@@ -18,3 +18,4 @@ Adding logic to your plugin.
- [Logging and dynamic output](./logging-and-dynamic-output.mdx)
- [Testing rules](./testing-plugins.mdx)
- [Tips and debugging](./tips-and-debugging.mdx)
- [Migrating from call-by-type](./migrating-gets.mdx)
60 changes: 38 additions & 22 deletions docs/docs/writing-plugins/the-rules-api/installing-tools.mdx
@@ -19,14 +19,18 @@ If you instead want to allow the binary to be located anywhere on a user's machine:
from pants.core.util_rules.system_binaries import (
BinaryPathRequest,
BinaryPaths,
find_binary,
)
from pants.engine.process import (
Process,
ProcessResult,
execute_process_or_raise
)
from pants.engine.rules import implicitly

@rule
async def demo(...) -> Foo:
docker_paths = await Get(
BinaryPaths,
docker_paths: BinaryPaths = await find_binary(
BinaryPathRequest(
binary_name="docker",
search_path=["/usr/bin", "/bin"],
@@ -35,7 +39,9 @@
docker_bin = docker_paths.first_path
if docker_bin is None:
raise OSError("Could not find 'docker'.")
result = await Get(ProcessResult, Process(argv=[docker_bin.path, ...], ...))
result: ProcessResult = await execute_process_or_raise(
**implicitly(Process(argv=[docker_bin.path, ...], ...))
)
```

`BinaryPaths` has a field called `paths: tuple[BinaryPath, ...]`, which stores all the discovered absolute paths to the specified binary. Each `BinaryPath` object has the fields `path: str`, such as `/usr/bin/docker`, and `fingerprint: str`, which is used to invalidate the cache if the binary changes. The results will be ordered by the order of `search_path`, meaning that earlier entries in `search_path` will show up earlier in the result.
@@ -113,26 +119,35 @@ You must also define the methods `generate_url`, which is the URL to make a GET request to.

Because an `ExternalTool` is a subclass of [`Subsystem`](./options-and-subsystems.mdx), you must also define an `options_scope`. You may optionally register additional options from `pants.option.option_types`.

In your rules, include the `ExternalTool` as a parameter of the rule, then use `Get(DownloadedExternalTool, ExternalToolRequest)` to download and extract the tool.
In your rules, include the `ExternalTool` as a parameter of the rule, then `await download_external_tool()` to download and extract the tool.

```python
from pants.core.util_rules.external_tool import DownloadedExternalTool, ExternalToolRequest
from pants.core.util_rules.external_tool import (
DownloadedExternalTool,
ExternalToolRequest,
download_external_tool,
)
from pants.engine.platform import Platform
from pants.engine.process import (
Process,
ProcessResult,
execute_process_or_raise
)
from pants.engine.rules import implicitly

@rule
async def demo(shellcheck: Shellcheck, ...) -> Foo:
shellcheck = await Get(
DownloadedExternalTool,
ExternalToolRequest,
async def demo(shellcheck: Shellcheck, platform: Platform) -> Foo:
shellcheck: DownloadedExternalTool = await download_external_tool(
shellcheck.get_request(platform)
)
result = await Get(
ProcessResult,
Process(argv=[shellcheck.exe, ...], input_digest=shellcheck.digest, ...)
result: ProcessResult = await execute_process_or_raise(
**implicitly(
Process(argv=[shellcheck.exe, ...], input_digest=shellcheck.digest, ...)
)
)
```

A `DownloadedExternalTool` object has two fields: `digest: Digest` and `exe: str`. Use the `.exe` field as the first value of a `Process`'s `argv`, and use the `.digest` in the `Process's` `input_digest`. If you want to use multiple digests for the input, call `Get(Digest, MergeDigests)` with the `DownloadedExternalTool.digest` included.
A `DownloadedExternalTool` object has two fields: `digest: Digest` and `exe: str`. Use the `.exe` field as the first value of a `Process`'s `argv`, and use the `.digest` in the `Process`'s `input_digest`. If you want to use multiple digests for the input, call `merge_digests()` with the `DownloadedExternalTool.digest` included.

## `Pex`: Install binaries through pip

Expand All @@ -146,13 +161,15 @@ from pants.backend.python.util_rules.pex import (
PexProcess,
PexRequest,
PexRequirements,
create_pex,
)
from pants.engine.intrinsics import execute_process
from pants.engine.process import FallibleProcessResult
from pants.engine.rules import implicitly

@rule
async def demo(...) -> Foo:
pex = await Get(
Pex,
pex: Pex = await create_pex(
PexRequest(
output_filename="black.pex",
internal_only=True,
Expand All @@ -161,9 +178,8 @@ async def demo(...) -> Foo:
main=ConsoleScript("black"),
)
)
result = await Get(
FallibleProcessResult,
PexProcess(pex, argv=["--check", ...], ...),
result: FallibleProcessResult = await execute_process(
**implicitly(PexProcess(pex, argv=["--check", ...], ...)),
)
```

Expand All @@ -181,9 +197,9 @@ There are several other optional parameters that may be helpful.

The resulting `Pex` object has a `digest: Digest` field containing the built `.pex` file. This digest should be included in the `input_digest` to the `Process` you run.

Instead of the normal `Get(ProcessResult, Process)`, you should use `Get(ProcessResult, PexProcess)`, which will set up the environment properly for your Pex to execute. There is a predefined rule to go from `PexProcess -> Process`, so `Get(ProcessResult, PexProcess)` will cause the engine to run `PexProcess -> Process -> ProcessResult`.
Instead of the usual `execute_process(**implicitly(Process(...)))`, you should use `execute_process(**implicitly(PexProcess(...)))`, which will set up the environment properly for your Pex to execute. There is a rule to convert `PexProcess -> Process`, so this will cause the engine to run `PexProcess -> Process -> FallibleProcessResult`.

`PexProcess` requires arguments for `pex: Pex`, `argv: Iterable[str]`, and `description: str`. It has several optional parameters that mirror the arguments to `Process`. If you specify `input_digest`, be careful to first use `Get(Digest, MergeDigests)` on the `pex.digest` and any of the other input digests.
`PexProcess` requires arguments for `pex: Pex`, `argv: Iterable[str]`, and `description: str`. It has several optional parameters that mirror the arguments to `Process`. If you specify `input_digest`, be careful to first use `merge_digests()` on the `pex.digest` and any of the other input digests.

:::note Use `PythonToolBase` when you need a Subsystem
Often, you will want to create a [`Subsystem`](./options-and-subsystems.mdx) for your Python tool
@@ -221,7 +237,7 @@ Then, you can set up your `Pex` like this:
```python
@rule
async def demo(black: Black, ...) -> Foo:
pex = await Get(Pex, PexRequest, black.to_pex_request())
pex = await create_pex(black.to_pex_request())
```

:::