Skip to content

fix: handle conversion of native geometry columns#116

Open
saidy-moregeo wants to merge 5 commits into
mainfrom
bugFix/conversion
Open

fix: handle conversion of native geometry columns#116
saidy-moregeo wants to merge 5 commits into
mainfrom
bugFix/conversion

Conversation

@saidy-moregeo

@saidy-moregeo saidy-moregeo commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator
  • Handle the conversion of datasets with native geoarrow geometry column.
  • Investigate issue with conversion of locally uploaded files.
  • Disable conversion for local files.

@saidy-moregeo saidy-moregeo self-assigned this Jun 5, 2026
@saidy-moregeo saidy-moregeo marked this pull request as ready for review June 5, 2026 14:00
Comment thread src/converter.js Outdated
* @param {Array} opts.schema - Schema columns ({name, type, ...}).
* @param {Array<string>} opts.geoColumns - Names of geometry columns.
* @param {string|null} opts.primaryGeoColumn - Primary geometry column name.
* @param {string|null} opts.encoding - Geometry encoding ('wkb' or 'wkt').

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't it also be the arrow types?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it can - will update it to include the native geoArrow geometry type.

Comment thread src/db.js
Comment on lines +188 to +192
const ringToWkt = (inner) =>
`ring -> '(' || array_to_string(list_transform(ring, ${inner}), ', ') || ')'`;
const wrapGeom = (wktExpr) =>
`CASE WHEN ${col} IS NOT NULL AND len(${col}) > 0 THEN ST_GeomFromText(${wktExpr}) ELSE NULL END`;
const arr = (expr, lambda) => `array_to_string(list_transform(${expr}, ${lambda}), ', ')`;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an abstraction helper functions to avoid having to repeated the same code multiple times on the different cases of the native geoArrow.

ringToWkt: Helper to transform list of points or lines into lambda strings that can be easily converted to duckDB's Geometry type.
wrapGeom: Checks for nulls before converting the constructed wkt to the actual duckDb's geometry type that can be transform.
arr : Helper function to map each element of a list to a lambda that's join in a comma separated list.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Any reason these are redefined on each function call instead of defining them once?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are referring to having them defined on both the db.js and converter.worker.js, frankly was planning to have a single buildGeomExpr function that can be invoked on either files to avoid multiple definition, but didn't follow through as was experience some errors after refactor and I decided to not waste more time on such.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I meant why you use inline functions.

@m-mohr

m-mohr commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Looks generally okay, but I left some comments.

Generally, we may benefit from documenting our design decisions with regards to the different data types and conversions so that we can reflect on them later.

@saidy-moregeo

Copy link
Copy Markdown
Collaborator Author

Regarding documentation of the query geometry expression of the different data types for future reference - we already have an open issue explain the decision and to find an alternative in the future.

Unless if it needs to be on the code base as a comment or added on the readme.

@m-mohr

m-mohr commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Extending/Updating the issue while we develop is okay. Is the issue up-to-date?

@saidy-moregeo

Copy link
Copy Markdown
Collaborator Author

Extending/Updating the issue while we develop is okay. Is the issue up-to-date?

Yes it entails our current implementation and the reason - it's on issue #111 you can take a look to double check.

Comment thread src/converter.js
* @param {Array<string>} opts.geoColumns - Names of geometry columns.
* @param {string|null} opts.primaryGeoColumn - Primary geometry column name.
* @param {string|null} opts.encoding - Geometry encoding ('wkb' or 'wkt').
* @param {string|null} opts.encoding - Geometry encoding ('wkb', 'wkt', or 'native geoArrow').

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So was pass in "native geoArrow"? I thought it would be "point", "polygon", etc...?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, isn't the list (points, linestrings, multipoints, etc )considered native geoArrow.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can update it. I just thought native geoArrow reflects the list that's not wkt or wkb.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should reflect the API as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants