T-07: A hands-on introduction to Spatial ETL with GeoKettle
ETL (Extract, Transform and Load) tools are designed to facilitate the integration of heterogeneous data sources, often in order to feed data warehouses. This tutorial is a practical introduction to the open source spatial ETL tool GeoKettle (http://www.geokettle.org). GeoKettle is an extended version of Pentaho Data Integration (Kettle) that adds a geospatial data type for managing vector geometries, along with a set of functions dedicated to processing geospatial data.
This tutorial will alternate between short presentations of the concepts and modules on which the tool is based and progressive exercises, so that attendees can practice using the features GeoKettle offers. The tutorial will rely on the conference Live DVD, which already includes GeoKettle. Each GeoKettle module will be briefly presented and demonstrated on screen, and attendees will then be able to try out the different steps on their own computers. The tutorial will cover the following topics:
- Short introduction to the fundamental concepts of ETL and data warehouses
- Installation of GeoKettle: software prerequisites and supported platforms
- Exploration of the graphical user interface
- Basic concepts: transformations, steps, jobs and entries
- The transformation repository
- Access to heterogeneous data sources: ESRI Shapefiles, spatial DBMS, ...
- Writing geospatial data: PostGIS tables, Shapefiles, ...
- Transformations: spatial selection and filtering, schema changes, SRS handling, coordinate transformations, integration of several data sources, geoprocessing, use of scripting capabilities, ...
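As a rough illustration of the extract, transform and load pattern behind the topics above, here is a minimal sketch in plain Python. It does not use GeoKettle, where equivalent pipelines are built graphically from steps. The function names, the sample records and the bounding box are all hypothetical, and the in-memory list only stands in for a target table.

```python
import csv
import io

# Hypothetical sample input: point features as CSV (id, name, x, y).
SAMPLE_CSV = """id,name,x,y
1,Quebec,-71.21,46.81
2,Montreal,-73.57,45.50
3,Paris,2.35,48.86
"""


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, bbox):
    """Transform: keep points inside bbox = (min_x, min_y, max_x, max_y)
    and change the schema by replacing x/y columns with a WKT geometry."""
    min_x, min_y, max_x, max_y = bbox
    selected = []
    for row in rows:
        x, y = float(row["x"]), float(row["y"])
        if min_x <= x <= max_x and min_y <= y <= max_y:
            selected.append(
                {"id": int(row["id"]), "name": row["name"], "wkt": f"POINT ({x} {y})"}
            )
    return selected


def load(features, target):
    """Load: append features to a list standing in for a target table."""
    target.extend(features)
    return len(target)


target_table = []
rows = extract(SAMPLE_CSV)
features = transform(rows, bbox=(-80.0, 40.0, -60.0, 50.0))
load(features, target_table)
```

Each function corresponds loosely to one stage of a transformation: reading a source, applying a spatial selection with a schema change, and writing to a target.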
By the end of the tutorial, attendees should have a working knowledge of GeoKettle and should be able to design advanced geospatial data transformations and carry out efficient integration tasks with this open source spatial ETL tool.
User Level: Beginner
Prerequisites: Practical experience with GIS and spatial DBMSs.
Instructor bio: http://ca.linkedin.com/in/thierrybadard
Instructor experience/relevant material: http://docs.google.com/View?id=dcsb4h2h_17cggf9vc2