Skip to content

ESRI Rest Serivce Import support#1102

Open
hamishcampbell wants to merge 7 commits into
koordinates:masterfrom
hamishcampbell:feat/esri-rest-importer
Open

ESRI Rest Serivce Import support#1102
hamishcampbell wants to merge 7 commits into
koordinates:masterfrom
hamishcampbell:feat/esri-rest-importer

Conversation

@hamishcampbell

@hamishcampbell hamishcampbell commented May 7, 2026

Copy link
Copy Markdown
Contributor

Description

Esri Rest services were already importable with the (slightly secret) OGR:ESRIJSON driver but it was painful to get the correct parameters. E.g. you needed to know that the where, outFields and returnGeometry fields should be set, and you couldn't import multiple layers from a single source easily.

This PR adds a dedicated esri: handler that accepts an Esri ArcGIS Map/Feature Server URL and simplifies the import process. For example:

> kart import esri:https://HOST/PATH/MapServer/ -a
Importing 19 features from https://HOST/PATH/MapServer/0:ESRIJSON) to First Layer/ ...
Added 19 Features to index in 0.0s
Overall rate: 2959 features/s)
Closed in 0s
Importing 3 features from PkGeneratingTableImportSource(https://HOST/PATH/MapServer/1:ESRIJSON) to Second Layer/ ...
Added 3 Features to index in 0.0s
Overall rate: 4675 features/s)
Closed in 0s
Importing 303 features from PkGeneratingTableImportSource(https://HOST/PATH/MapServer/2:ESRIJSON) to Third Layer/ ...
Added 303 Features to index in 0.0s
Overall rate: 10269 features/s)
Closed in 0s

Any supplied URL parameters are preserved. E.g. a narrower set of features & fields can be imported from a specific source:

# Only import the attributes `field1` and `field2` of features with `status=current`
kart import esri:https://HOST/PATH/MapServer/1?where=status%3Dcurrent&outFields=field1,field2

This PR includes other improvement to the importer framework to make it simple to add new handlers. See d823ec7 for more detail. The subsequent commit 17b7dc4 has the esri-specific changes.

Future Improvements

This doesn't recurse into folders, so the multi-layer importer will only work from a single MapServer/FeatureService endpoint. It would be straight-forward to extent this to iterate into folders in future.

Related links:

N/A

Checklist:

  • Have you reviewed your own change?
  • Have you included test(s)?
  • Have you updated the changelog?

  * Removes the unused ImportType.optional_prefix
  * Simplifes import source selection to use kart.import_sources.from_spec
    to actually pick from the import options rather than indirectly via
    DbType.from_spec
  * Removed unused OgrTableImportSource handle_source_string logic
    which was unused and made creating custom handlers more difficult
  * Fixes some minor formatting and code hinting problems to keep
    the IDE happy
Rest services were already importable with the OGR:ESRIJSON driver. This
change adds improved support via a dedicated importer that simplifes the
approach. The new esri:https://HOST/PATH/FeatureServer/[LAYER_ID] import
source handles multi-table import from a single service endpoint and
automatically adds necessary query parameters.

Includes tests and (minor) documentation updates.

Example usage:

    kart import esri:https://example.com/arcgis/rest/services/Public/MapServer -a
Comment thread kart/import_sources.py
Comment on lines +42 to +45
elif self is self.ESRI_REST_TABLE:
from kart.tabular import ESRIRestImportSource

return ESRIRestImportSource

@hamishcampbell hamishcampbell May 7, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awkward IMO, but necessary without further refactoring of ImportType. A future improvement here would be to simplify the types to TABULAR, POINT_CLOUD, RASTER and define the import source class in ImportSourceType rather than here. That would simplify some other pieces too - e.g. the warning if you try to --link a tabular source.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to address if you're feeling keen 😃

# see if any subclasses have a handler for this.
for subclass in cls._all_subclasses():
if "handle_source_string" in subclass.__dict__:
retval = subclass.handle_source_string(ogr_source)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this approach first but ultimately added extra steps vs adding the custom esri: prefix. I'm not 100% sure what the intention with this code is, but it's not used by anything so I've removed it on the basis that import source selection via from_spec is more reliable/universal.

@craigds craigds requested a review from olsen232 May 7, 2026 23:25
@hamishcampbell

Copy link
Copy Markdown
Contributor Author

Hm seems I might've broken the old OGR:[DRIVER]:... approach here, which was unintended.

Comment thread tests/test_esri_rest_import_source.py Outdated
source = ESRIRestImportSource.open(
"esri:https://example.com/arcgis/rest/services/MyService/MapServer/0"
)
assert isinstance(source, ESRIJSONImportSource)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't look quite right:

Firstly, with pytest.raises is preferable to try ... except ... pass

Having an assert within the block from whence we except an exception to be raised doesn't look right either - the assert could easily be skipped when we just straight to the except, or, the assert could raise an exception that we catch and ignore, in which case, why assert it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah correct, I've fixed the test and made it clearer what it is trying to achieve

# For ESRI JSON sources, the datasource typically has only one layer
# If no table is specified or the specified table doesn't exist,
# use the first (and usually only) layer in the datasource
if table is None or ogr_ds.GetLayerByName(table) is None:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this could be rewritten for clarity. I think there's effectively 2 sets of 2 branches that needn't actually interfere with each other, which should make it easier to follow:

1a) if there is one layer in the source import that layer
1b) otherwise, import the specified one (argument "table")

2a) if argument table is set, use that as dest_path
2b) otherwise, use the name of the layer

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I just did some testing and I don't think the case where the table doesn't match is reachable currently - at least when I tried it, this was prevented by a check elsewhere:

kart/tabular/import_.py(280)table_import()
    279             meta_overrides["metadata.xml"] = meta_overrides.pop("xmlMetadata")
--> 280         import_source = base_import_source.clone_for_table(
    281             table,

kart/tabular/ogr_import_source.py(211)clone_for_table()
    210         meta_overrides = {**self.meta_overrides, **meta_overrides}
--> 211         self.check_table(table)
    212
    
kart/tabular/ogr_import_source.py(262)check_table()
    261         if table_name not in self.get_tables():
--> 262             raise NotFound(
    263                 f"Table '{table_name}' not found",

What should happen here might depend on if or how you address my other comment about finding the names of these layers.

Comment thread kart/import_sources.py
Comment on lines +42 to +45
elif self is self.ESRI_REST_TABLE:
from kart.tabular import ESRIRestImportSource

return ESRIRestImportSource

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to address if you're feeling keen 😃

from kart.exceptions import ImportSourceError


class ESRIRestImportSource(TableImportSource):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK this is great in that it supports everything - you can import all tables from FeatureServer, or just one table from FeatureServer by name, or you can specify FeatureServer/0 and it'll import specifically that. Users probably don't want to read the help files so it's good if they can throw a URL at it and it will do something sensible.

What I don't like so much is that if you import FeatureServer/0, then the name is lost and replaced with ESRIJSON. This is probably Jack Dangermond's fault - a better URL would maybe be FeatureServer/name_of_dataset. but we can't fix that here. However, I wonder can we do something a bit cleverer where when someone asks to import FeatureServer/0, we also hit FeatureServer to find the name of the layer, and use that name if no other name is specified by the user.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll update to fetch the metadata from the feature service URL (.../0?f=json)

Comment thread kart/tabular/esri_rest_import_source.py Outdated
Comment thread kart/tabular/esri_rest_import_source.py Outdated

Example:
# List all layers in a service
source = ESRIRestServerSource.open("https://example.com/.../MapServer")

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a mix of docs here, some aimed at callers and some aimed at end-users.

For end-users who want to know all the layers, there's also:
kart import "esri:https://example.com/.../MapServer" --list

And for importing a particular layer:
kart import "esri:https://example.com/.../MapServer" LAYER

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants