ESRI Rest Serivce Import support#1102
Conversation
* Removes the unused ImportType.optional_prefix
* Simplifes import source selection to use kart.import_sources.from_spec
to actually pick from the import options rather than indirectly via
DbType.from_spec
* Removed unused OgrTableImportSource handle_source_string logic
which was unused and made creating custom handlers more difficult
* Fixes some minor formatting and code hinting problems to keep
the IDE happy
Rest services were already importable with the OGR:ESRIJSON driver. This
change adds improved support via a dedicated importer that simplifes the
approach. The new esri:https://HOST/PATH/FeatureServer/[LAYER_ID] import
source handles multi-table import from a single service endpoint and
automatically adds necessary query parameters.
Includes tests and (minor) documentation updates.
Example usage:
kart import esri:https://example.com/arcgis/rest/services/Public/MapServer -a
| elif self is self.ESRI_REST_TABLE: | ||
| from kart.tabular import ESRIRestImportSource | ||
|
|
||
| return ESRIRestImportSource |
There was a problem hiding this comment.
This is awkward IMO, but necessary without further refactoring of ImportType. A future improvement here would be to simplify the types to TABULAR, POINT_CLOUD, RASTER and define the import source class in ImportSourceType rather than here. That would simplify some other pieces too - e.g. the warning if you try to --link a tabular source.
There was a problem hiding this comment.
Feel free to address if you're feeling keen 😃
| # see if any subclasses have a handler for this. | ||
| for subclass in cls._all_subclasses(): | ||
| if "handle_source_string" in subclass.__dict__: | ||
| retval = subclass.handle_source_string(ogr_source) |
There was a problem hiding this comment.
I tried this approach first but ultimately added extra steps vs adding the custom esri: prefix. I'm not 100% sure what the intention with this code is, but it's not used by anything so I've removed it on the basis that import source selection via from_spec is more reliable/universal.
|
Hm seems I might've broken the old |
| source = ESRIRestImportSource.open( | ||
| "esri:https://example.com/arcgis/rest/services/MyService/MapServer/0" | ||
| ) | ||
| assert isinstance(source, ESRIJSONImportSource) |
There was a problem hiding this comment.
This test doesn't look quite right:
Firstly, with pytest.raises is preferable to try ... except ... pass
Having an assert within the block from whence we except an exception to be raised doesn't look right either - the assert could easily be skipped when we just straight to the except, or, the assert could raise an exception that we catch and ignore, in which case, why assert it.
There was a problem hiding this comment.
Ah correct, I've fixed the test and made it clearer what it is trying to achieve
| # For ESRI JSON sources, the datasource typically has only one layer | ||
| # If no table is specified or the specified table doesn't exist, | ||
| # use the first (and usually only) layer in the datasource | ||
| if table is None or ogr_ds.GetLayerByName(table) is None: |
There was a problem hiding this comment.
I feel like this could be rewritten for clarity. I think there's effectively 2 sets of 2 branches that needn't actually interfere with each other, which should make it easier to follow:
1a) if there is one layer in the source import that layer
1b) otherwise, import the specified one (argument "table")
2a) if argument table is set, use that as dest_path
2b) otherwise, use the name of the layer
There was a problem hiding this comment.
Actually I just did some testing and I don't think the case where the table doesn't match is reachable currently - at least when I tried it, this was prevented by a check elsewhere:
kart/tabular/import_.py(280)table_import()
279 meta_overrides["metadata.xml"] = meta_overrides.pop("xmlMetadata")
--> 280 import_source = base_import_source.clone_for_table(
281 table,
kart/tabular/ogr_import_source.py(211)clone_for_table()
210 meta_overrides = {**self.meta_overrides, **meta_overrides}
--> 211 self.check_table(table)
212
kart/tabular/ogr_import_source.py(262)check_table()
261 if table_name not in self.get_tables():
--> 262 raise NotFound(
263 f"Table '{table_name}' not found",
What should happen here might depend on if or how you address my other comment about finding the names of these layers.
| elif self is self.ESRI_REST_TABLE: | ||
| from kart.tabular import ESRIRestImportSource | ||
|
|
||
| return ESRIRestImportSource |
There was a problem hiding this comment.
Feel free to address if you're feeling keen 😃
| from kart.exceptions import ImportSourceError | ||
|
|
||
|
|
||
| class ESRIRestImportSource(TableImportSource): |
There was a problem hiding this comment.
OK this is great in that it supports everything - you can import all tables from FeatureServer, or just one table from FeatureServer by name, or you can specify FeatureServer/0 and it'll import specifically that. Users probably don't want to read the help files so it's good if they can throw a URL at it and it will do something sensible.
What I don't like so much is that if you import FeatureServer/0, then the name is lost and replaced with ESRIJSON. This is probably Jack Dangermond's fault - a better URL would maybe be FeatureServer/name_of_dataset. but we can't fix that here. However, I wonder can we do something a bit cleverer where when someone asks to import FeatureServer/0, we also hit FeatureServer to find the name of the layer, and use that name if no other name is specified by the user.
There was a problem hiding this comment.
Thanks, I'll update to fetch the metadata from the feature service URL (.../0?f=json)
|
|
||
| Example: | ||
| # List all layers in a service | ||
| source = ESRIRestServerSource.open("https://example.com/.../MapServer") |
There was a problem hiding this comment.
Bit of a mix of docs here, some aimed at callers and some aimed at end-users.
For end-users who want to know all the layers, there's also:
kart import "esri:https://example.com/.../MapServer" --list
And for importing a particular layer:
kart import "esri:https://example.com/.../MapServer" LAYER
Description
Esri Rest services were already importable with the (slightly secret) OGR:ESRIJSON driver but it was painful to get the correct parameters. E.g. you needed to know that the
where,outFieldsandreturnGeometryfields should be set, and you couldn't import multiple layers from a single source easily.This PR adds a dedicated
esri:handler that accepts an Esri ArcGIS Map/Feature Server URL and simplifies the import process. For example:Any supplied URL parameters are preserved. E.g. a narrower set of features & fields can be imported from a specific source:
This PR includes other improvement to the importer framework to make it simple to add new handlers. See d823ec7 for more detail. The subsequent commit 17b7dc4 has the esri-specific changes.
Future Improvements
This doesn't recurse into folders, so the multi-layer importer will only work from a single MapServer/FeatureService endpoint. It would be straight-forward to extent this to iterate into folders in future.
Related links:
N/A
Checklist: