Skip to content
Open
Changes from 16 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
179 changes: 179 additions & 0 deletions proposals/4194-user-redaction.md
Copy link
Copy Markdown
Member

@tulir tulir Sep 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation requirements:

  • Server implementing the endpoint
  • Client or bot using the endpoint

Original file line number Diff line number Diff line change
@@ -0,0 +1,179 @@
# MSC4194: User redaction
Comment thread
clokep marked this conversation as resolved.
Outdated

Targeted attacks are a problem in Matrix. Typically throwaway users
are used to target matrix rooms with abuse in the form of multiple
messages. The intent of the spammer is to fill the room timeline view
in clients with abuse.

Typically, a moderator will be pinged by a bystander, and the
moderator will respond by first banning the throwaway account and then
issuing redactions via a client for each event the throwaway account
sent.

Due to the architecture of various Matrix APIs, there are multiple
issues that can occur in the moderator's flow.

1. Calling `/messages` is required for some clients to accurately get
the full list of `event_id`s the throwaway account sent. This is a
compounding problem because homeservers can be slow to respond to a
filtered `/messages` request and then the client must still process
the response and issue a redaction for each event returned.
This is how both Mjolnir and Draupnir source events for redaction.

2. The client must call `/rooms/redact/event` for each target event,
requiring the client to make multiple requests, further impeding
response time. Homeservers typically rate limit the number of
redactions that a client can send to `/redact` evenly with `/send`,
which further compounds response time.
[MSC2244](https://github.com/matrix-org/matrix-spec-proposals/pull/2244)
tried to alleviate part of the problem by allowing the redactions
to be sent in one event. But there currently are no endpoints that
allow you to create this event[^create-mass-redaction].

3. Soft failure of events sent by the throwaway account after the
moderator bans the throwaway. These events are not visible to the
moderator's matrix client, and there is no way for a moderator to
redact these events. This is a huge problem that occurs far more
frequently than it should[^often-soft-failure].

[^create-mass-redaction]: I think?

[^often-soft-failure]: https://github.com/matrix-org/synapse/issues/9329
## Proposal

This proposal introduces a very simple client-server endpoint to
alleviate all three of the concerns surrounding the moderators
redaction flow.

### Redacting a user's events
Comment thread
Gnuxie marked this conversation as resolved.

A new endpoint is introduced into the client-server spec.

`POST /_matrix/client/v1/rooms/{roomID}/redact/user/{userID}`
Comment thread
Gnuxie marked this conversation as resolved.

This endpoint redacts the target matrix user's unredacted events in
reverse chronological order, starting from the most
chronologically-recent visible event in the room history for the
requesting user.
Comment thread
Gnuxie marked this conversation as resolved.
Outdated
Comment thread
clokep marked this conversation as resolved.
Outdated

This endpoint is blind to the distinction between normal and
soft-failed events, and will cause redactions to be issued
for any soft-failed events that match the scope of the
request.

Events known to have already been redacted SHOULD be ignored and MUST
NOT contribute to the request limit.

It is left to an implementation detail whether a `m.room.redaction`
event is created for each event, or
[MSC2244](https://github.com/matrix-org/matrix-spec-proposals/pull/2244)
is employed. We expect that for now, most implementations will
issue one `m.room.redaction` event for each event under
the scope of the request.

The action of redacting the events that are determined to be within
the scope of the request should be seen as an atomic operation from
the perspective of the client, regardless of whether multiple
`m.room.redaction` events are issued by the server.

#### Rate limiting

This endpoint SHOULD be rate limited, but the number `m.room.redaction`
events emitted SHOULD NOT be the subject of the rate limit.
But instead the number of events that have been redacted overall.
This is to cover semantics of MSC2244.

Servers SHOULD be more liberal in the number of events that
can be redacted in comparison to rate limits for `/send` when
the endpoint is being used to redact another user's events.
Rather than their own events. The intent of being liberal
is to allow moderators to remove spam faster.

#### Query parameters

`limit`: `integer` - The maximum number of events to
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MSC seems to heavily assume that this endpoint will only be called after a user has been banned, however the pagination defined here doesn't really account for a scenario where that isn't the case. How should pagination work if a user is not banned and continues to send events after the endpoint has already been called? Does it reset position to the latest event from them, working back from there (presumably skipping over events that already have been redacted)? Without a pagination token that's the only way I can see this working, but then, that makes this endpoint kinda ineffective if you're trying to redact someone who hasn't been removed from the room, since they can just send another event to interrupt the redaction progress.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, bit of a nitpick, but there's no ceiling suggested here - is it reasonable for a client to assume that a server will usually cap out at 100 events, like /messages typically does?

redact. Default: 25.

Client authors should be aware that the server may return less than
the `limit` even when `is_more_events` is `true`, and so should always
check the response.

#### Response
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a communication of which Events failed to redact? For example due to a db failure. Similar to how s-s api handles failed events where you get a list of Event ids back. Or should the server retry forever to redact the event?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should ever fail to redact individual events 🤔

If the database explodes halfway through, that's the server's problem, it can throw an error on the entire request (but really the server should tell the database to not explode)


`is_more_events`: `boolean` - Whether there are more events outside of
the scope of the request sent by the user that could be redacted, but
have not been because of the `limit` query parameter. If
`is_more_events` is true, then the server should expect that the
client can optionally call the endpoint again, to redact even more events.

`redacted_events`: `integer` - The number of events that have been redacted, including soft failed events.

`soft_failed_events`: `integer` - The number of soft failed events that have been redacted.

```
200 OK
Content-Type: application/json

{
"is_more_events": false,
"redacted_events": 5,
"soft_failed_events": 1
}
Comment on lines +117 to +141
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like some thoughts on the names of these fields. I'd also like to know if people agree that it makes less sense to return say an array of event_ids

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like some thoughts on the names of these fields.

This is always going to be subjective but here goes:

soft_failed_eventsmight be slightly ambiguous in that it doesn't explicitly express that these events have also been redacted. Maybe the counters could be nested under a redactedkey?

"redacted": {
  "total": 5,
  "soft_failed": 1
}

I'd also like to know if people agree that it makes less sense to return say an array of event_ids

IIUC the main use case for this API is to redact all of a user's events. The counters in the response could nicely be used to display progress. It might be nice to also return the total number of events to let clients display a completion percentage. Not sure if that's easy to get though?

I can't really think of a reason why the caller would need event IDs for the redacted events.

Copy link
Copy Markdown
Contributor Author

@Gnuxie Gnuxie Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the counters could be nested under a redactedkey?

Yeah that sounds like a good idea, 8f1900a

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really think of a reason why the caller would need event IDs for the redacted events.

Yeah, i'll leave this open for now just in case someone else has thoughts

```

#### Limited response

Servers may wish to limit the maximum size of the limit that can be
specified. This maximum limit may even be dynamic to comply with rate
limiting.

To do this, they should simply redact less events than the limit provided
by the client.

```
POST /.../?limit=1000
200 OK
Content-Type: application/json

{
"is_more_events": true,
"redacted_events": 25,
"soft_failed_events": 3
}
```

## Potential issues

### Forward compatibility with `/messages` style query parameters

We probably want to keep the endpoint compatible with `/messages`
style query pamarameters if the endpoint needs to be extended in the
future. It is our current assessment that we are currently forward
compatible. But someone should double check.

## Alternatives

* An endpoint could be developed to expose soft-failed events to
clients or appservices such as in
[MSC4109](https://github.com/matrix-org/matrix-spec-proposals/pull/4109).

## Security considerations

### Use case for self redaction

Implementers should be cautious over the use of this API for self
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about limiting self redaction to non dag critical Events? So messages and custom state events but not for example powerlevel Events or similar.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DAG critical events are supposed to be redaction exempt already for critical fields i thought.

redaction. We might omit that use case from the MSC if there are concerns.

## Unstable prefix

Until the MSC is accepted, implementations can use `org.matrix.msc4194` as the
unstable prefix and as a flag in the `unstable_features` section of `/versions`:
* `/_matrix/client/unstable/org.matrix.msc4194/rooms/{roomID}/redact/user/{userID}`
instead of `/_matrix/client/v1/rooms/{roomID}/redact/user/{userID}`
Once the MSC is merged, the `org.matrix.msc4194.stable` unstable feature flag
can be used to advertise support for the stable version of the endpoint, until
the spec release with the endpoint is advertised.

## Dependencies

None