Cloud Native Geospatial Analytics with Apache Sedona A Hands-On Guide for Working with Large-Scale Spatial Data Paweł Tokaj, Jia Yu & Mo Sarwat Foreword by Qiusheng Wu
ISBN: 978-1-098-17400-2
DATA
Navigating the complexities of large-scale spatial data can be daunting. To unleash the power of massive and complex datasets, you’ll need a cutting-edge tool like Apache Sedona. This innovative distributed computing system, designed specifically for spatial data, has diverse applications in fields such as mobility, telematics, agriculture, climate science, and more. This book serves as your guide to leveraging this tool, along with other technologies, to unlock the potential of geospatial analytics.
Authors Paweł Tokaj, Jia Yu, and Mo Sarwat provide practical solutions to the challenges of working with geospatial data at scale. Ideal for developers, data scientists, engineers, and analysts, this guide uses real-world examples to help you integrate Python data ecosystems, apply machine learning, build geospatial data lakehouses, and handle modern geospatial data formats like GeoParquet.
• Understand how Apache Sedona helps data practitioners address challenges with geospatial data
• Learn how to run Apache Sedona, both locally and in cloud environments
• Efficiently load, query, and analyze geospatial datasets using spatial SQL
• Employ machine learning techniques to derive strategy-defining insights from spatial data
• Manage and optimize large-scale geospatial data within a data lakehouse architecture
Paweł Tokaj is a staff software engineer at Splunk and a PMC member of the Apache Sedona project. He’s passionate about distributed and streaming systems, cloud native architectures, microservices, and efficient geospatial processing.
Jia Yu is cofounder of Wherobots and cocreator of Apache Sedona. His research focuses on large-scale database systems and geospatial data management.
Mo Sarwat is CEO of Wherobots and cocreator of Apache Sedona. He specializes in large-scale data systems, AI infrastructure, and geospatial analytics.
“I have used Apache Sedona with my students for years. It stands out as one of the very few geospatial systems with sustained, measurable impact in both academia and industry. This book is a must-read for data practitioners working with large-scale geospatial data.”
Amr Magdy, associate professor of computer science, UC Riverside
Cloud Native Geospatial Analytics with Apache Sedona
by Paweł Tokaj, Jia Yu, and Mo Sarwat
Copyright © 2026 O’Reilly Media, Inc. All rights reserved.
Published by O’Reilly Media, Inc., 141 Stony Circle, Suite 195, Santa Rosa, CA 95401.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Acquisitions Editor: Aaron Black
Development Editor: Gary O’Brien
Production Editor: Clare Laylock
Copyeditor: Charles Roumeliotis
Proofreader: Carol McGillivray
Indexer: BIM Creatives, LLC
Cover Designer: Karen Montgomery
Cover Illustrator: José Marzan Jr.
Interior Designer: David Futato
Interior Illustrator: Kate Dullea
December 2025: First Edition
Revision History for the First Edition
2025-12-05: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781098173999 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Cloud Native Geospatial Analytics with Apache Sedona, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the authors, and do not represent the publisher’s views or the views of the authors’ current or former employers. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk.
978-1-098-17400-2 [LSI]
If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights. This work is part of a collaboration between O’Reilly and Wherobots. See our statement of editorial independence.
Table of Contents
Foreword
Preface
1. Introduction to Apache Sedona
    Introduction to Cloud Native Geospatial Analysis and Its Challenges
    The Geospatial Analytics Ecosystem
    Leveraging Cloud Native Architecture
    Apache Sedona Overview
    Spatial Query Processing
    Apache Spark Overview
    Understanding Apache Sedona’s Architecture and Components
    Apache Sedona Data Structures
    Spatial SQL
    Spatial Query Optimizations
    Support for Spatial File Formats
    Visualization
    Integration with PyData Ecosystem
    Benefits of Apache Sedona
    The Developer Experience
    Who Uses Apache Sedona
    Common Apache Sedona Use Cases
    Community Adoption
    The Future of the Project
    Resources
    Summary
2. Getting Started with Apache Sedona
    How to Run the Apache Sedona Python Program
    The Apache Sedona Docker Image
    Overview of the Notebook Environment
    The Spatial DataFrame
    Introduction to Spatial SQL
    Working with the DataFrame API
    Visualizing Data
    Summary
3. Loading Geospatial Data into Apache Sedona
    Loading Vector Data Formats
    Vector Data Serialization
    Apache Sedona Serialization
    Differences Between Vector Data Formats
    Distributed Versus Nondistributed Files
    Reading Flat Files
    Reading Shapefiles
    Reading GeoJSON
    Reading GeoPackage
    Reading GeoParquet
    Introduction to Raster Data Formats (GeoTIFF)
    How Sedona Processes Rasters
    Loading Data from Databases
    Reading from PostgreSQL (PostGIS)
    Reading from MySQL
    Reading from MongoDB
    Data Synchronization
    CDC with PostgreSQL to GeoParquet Source
    Hands-On Use Case: New York Taxi Data Analysis
    The Most Popular Areas for Pickups and Dropoffs
    The 10 Most Popular Routes
    Summary
4. Points, Lines, and Polygons: Vector Data Analysis with Spatial SQL
    Vector Data Model and Spatial Relationship
    Spatial Relationships
    Dimensionally Extended 9-Intersection Model (DE-9IM)
    Spatial Reference System and the Geography Model
    Coordinate Reference System
    Datum
    Map Projections
    Transformation
    Spatial SQL and Vector Data Manipulation
    Spatial Queries
    Spark Distributed Joins
    Spatial Joins
    Spatial Indexes
    Optimized Spatial Joins
    Spatial Partitioning
    Distributed Spatial Joins
    Distributed KNN Joins
    Hands-On Use Case: Real Estate Analysis
    Summary
5. Raster Data Analysis
    The Raster Data Model
    Raster SQL and Raster Data Manipulation
    Raster Loader
    Writing to Raster Formats
    Pixel Functions
    Geometry Functions
    Raster Accessors
    Raster Band Accessors
    Raster Predicates
    Raster-Based Operators
    Raster Tiles
    Map Algebra Functions
    Raster Visualization
    Zonal Statistics
    Map Algebra
    Joining Raster Data
    Hands-On Use Case: Insurance Risk Modeling
    Population Density (building_population)
    Flood Risk (flood_stats)
    Fire Risk (fire_risk_stats)
    Closest Police and Fire Departments
    Residential Building Density (building_density)
    Summary
6. Apache Sedona and the PyData Ecosystem
    Manipulating Geospatial Vector Data
    Working with GeoPandas and Shapely
    Raster Data Tools
    Scheduling Your Geospatial Code
    Transforming Your Geospatial Data with dbt
    Writing dbt Applications Using Apache Sedona
    Testing dbt Applications Using Apache Sedona
    Vector Geospatial Visualization
    Kepler.gl
    GeoPandas
    PyDeck
    Raster Geospatial Visualization
    Summary
7. Geospatial Data Science and Machine Learning
    Geospatial Clustering with Apache Sedona (DBSCAN)
    Outlier Detection (Local Outlier Factor)
    Hot Spot Analysis (Local Getis–Ord Gi(*))
    Autocorrelation (Moran’s I)
    Classification, Segmentation, and Object Detection from a Raster
    Creating Geospatial Machine Learning Models with MLlib
    Hands-On Use Case: Analyzing Road Accidents in Germany
    Summary
8. Building a Geospatial Data Lakehouse with Apache Parquet and Apache Iceberg
    Overview of Data Lakehouse Architecture
    Parquet Deep Dive
    Columnar Versus Row Data Formats
    Parquet Data Format
    GeoParquet
    Iceberg Tables
    Data Transactions
    Schema Evolution
    Apache Iceberg Specification
    Apache Iceberg Features
    Iceberg Geospatial Types
    Hands-On Use Case: Geospatial Data Lakehouse Deep Dive
    Summary
9. Using Apache Sedona with Cloud Data Providers
    Prerequisites
    Sedona Spark
    Databricks
    AWS EMR
    AWS Glue
    Microsoft Fabric
    GCP Dataproc
    Wherobots Cloud
    SedonaSnow in Snowflake
    Sedona EWKB-Encoded Geometry Constructor Functions
    Snowflake GeoJSON-Encoded Geometry Constructor Functions
    Spatial Joins
    Sedona Flink (Ververica)
    Summary
10. Optimizing Apache Sedona Applications
    Optimizing the Apache Sedona Program
    Select Only the Needed Columns
    Filter Early
    Reduce the Number of Vertices
    Limit Spheroid Distance Use in Joins
    Cache Reused DataFrames
    Modify the Partition Number for Join Operations
    Avoid Unnecessary Shuffling
    Avoid Wide Operations
    Avoid Collecting Large Amounts of Data for Your Application Driver
    Use Window Functions Over GROUP BY and JOIN
    Use Native Apache Sedona Methods
    Use Apache Sedona Serializers
    Avoiding Skew Joins
    Spatial Partitioning
    Spatial Join
    Apache Sedona Python
    Python (Vectorized) UDF Versus Using Apache Sedona SQL Function
    Apache Sedona DataFrame to GeoPandas With and Without GeoArrow
    GeoPandas DataFrame to Apache Sedona DataFrame With and Without GeoArrow
    GeoParquet and Spatial Parquet
    Apache Iceberg
    Summary
Index
Foreword
Geospatial data has become central to how we understand and respond to the world around us. From monitoring ecosystems to precise map matching, location is often the key to unlocking insight. However, as the volume and velocity of geospatial data have surged, our analytical tools have struggled to keep pace. Traditional GIS tools excel at analysis but are often limited to single-machine environments. Meanwhile, cloud data warehouses offer impressive scalability but often treat geospatial data as an afterthought.
Apache Sedona bridges this divide. Sedona is an open source framework that embeds geospatial analysis directly into distributed computing platforms such as Apache Spark, Flink, and Snowflake. It treats spatial data as a first-class concern, enabling complex spatial joins, queries, and raster processing across billions of records. With Sedona, we gain both the depth of geospatial science and the elasticity of the cloud.
I introduced Sedona briefly in my previous book, Introduction to GIS Programming: A Practical Python Guide to Open Source Geospatial Tools, where I included a chapter on distributed computing with Apache Sedona. That chapter sparked strong interest among readers, but it could only scratch the surface. Sedona is far too powerful and comprehensive to be condensed into a single chapter. It deserves a full-length treatment, and that is precisely what Cloud Native Geospatial Analytics with Apache Sedona provides.
This book is authored by Sedona’s creators and core developers, who not only understand its technical architecture but also know how to make it approachable. It begins with fundamentals such as the spatial DataFrame and spatial SQL, then advances into topics including distributed joins, raster analysis, and geospatial data lakehouses. It also introduces readers to running Sedona in real environments, from Docker containers to managed platforms like Wherobots Cloud. With Wherobots Cloud, practitioners can quickly launch Sedona clusters, experiment with scalable analytics, and focus on solving problems rather than managing infrastructure. That accessibility lowers the barrier for anyone eager to bring distributed spatial analytics into their work.
Equally important is how the book situates Sedona within the broader ecosystem. Geospatial practitioners often rely on Python libraries like GeoPandas, Shapely, and Rasterio for analysis, as well as visualization frameworks like Kepler.gl for exploration and communication. Sedona integrates seamlessly with these tools, scaling familiar workflows across distributed clusters while adhering to open standards such as GeoParquet and spatial SQL. The authors highlight these connections, making Sedona valuable not only for GIS professionals but also for data engineers and scientists building modern data pipelines.
The timing of this book could not be better. We face global challenges in disaster response, public health, and urban mobility, each of which relies on location-aware data at massive scale. Sedona has already been widely adopted by the geospatial community, and it is being used by organizations including Amazon and the Overture Maps Foundation. By walking through practical examples such as analyzing New York City taxi trips, visualizing Overture Maps building data, and processing global flood hazard maps, the book demonstrates Sedona’s versatility and its ability to tackle real-world problems.
For me, this book represents a natural extension of my journey in geospatial analytics. In my Introduction to GIS Programming book, I could only provide a glimpse of what Sedona enables. This book delves deeper, teaching not only how Sedona functions but also how to deploy it effectively in scalable production environments.
Whether you are a data scientist enhancing machine learning with spatial features, an engineer designing large-scale ETL pipelines, or an analyst seeking patterns hidden within billions of records, this book is tailored to meet your needs. Even if you are simply curious about the future of geospatial analytics, you will discover why cloud native approaches are essential and why Apache Sedona is a cornerstone of this evolution.
I am delighted to recommend Cloud Native Geospatial Analytics with Apache Sedona. It is a technical guide, a practical manual, and an invitation to join a vibrant open source community. With Sedona and platforms like Wherobots Cloud, the future of geospatial analytics is already here: scalable, accessible, and open.
Qiusheng Wu
Associate Professor at the University of Tennessee, Knoxville
Knoxville, TN, September 2025
Preface
Welcome to Cloud Native Geospatial Analytics with Apache Sedona. Thank you for choosing this book as your learning companion. The following pages introduce what lies ahead, our reasons for writing it, and how to get the most value from your reading.
About This Book
In this book, you will learn how to process and analyze geospatial data with Apache Sedona. Sedona is designed for data engineers and geospatial analysts working with datasets about the physical world, and this book explores the core concepts, inner workings, and practical applications of Apache Sedona.
By the time you reach the end, you will have grasped the essentials and will possess the practical knowledge to use Apache Sedona effectively in your data projects. You will have learned how to use state-of-the-art geospatial data processing techniques on different engines and for different types of spatial data. You will learn how to perform spatial joins, run KNN queries, and build spatial indexes using different structures such as the R-tree and quadtree.
No matter your level of experience, Cloud Native Geospatial Analytics with Apache Sedona offers a comprehensive and accessible path to mastering Apache Sedona in the realm of cloud native geospatial analytics.
Why We Wrote This Book
As adoption of Apache Sedona accelerates across industries, a clear need has emerged for a comprehensive technical resource that addresses the complexities of large-scale geospatial data processing. The volume of available spatial data, such as weather maps, socioeconomic data, vegetation indices, and geo-tagged social media, has increased tremendously, and scalable tools like Sedona are needed to extract insights from these datasets.
Additionally, the growing ecosystem of spatial file formats, the integration of both vector and raster data, and the demand for scalable spatial analytics have created challenges for geospatial professionals working with distributed systems. We wrote Cloud Native Geospatial Analytics with Apache Sedona to bridge this gap and provide a centralized, authoritative reference for leveraging Sedona’s capabilities in spatial indexing, spatial joins, coordinate system handling, and format interoperability. Our goal is to equip practitioners with the knowledge required to process and analyze geospatial data efficiently in modern big data environments.
What You Will Find Inside
The upcoming chapters will teach you Apache Sedona’s fundamentals and operational mechanics, demonstrate how to leverage the framework across different platforms and tools, and share proven strategies for handling geospatial data efficiently with Apache Sedona. Following is an overview of what each chapter covers:
Chapter 1, “Introduction to Apache Sedona”
    Provides a foundational overview of Apache Sedona, explaining what it is, its core capabilities for geospatial data processing, and why it’s valuable for modern data analytics workflows.
Chapter 2, “Getting Started with Apache Sedona”
    Walks you through the initial setup and configuration of Apache Sedona, covering installation requirements, environment preparation, and your first basic operations with the framework.
Chapter 3, “Loading Geospatial Data into Apache Sedona”
    Demonstrates how to efficiently load and manage large-scale geospatial datasets in Apache Sedona, covering various data formats, ingestion methods, and optimization techniques for handling big geospatial data.
Chapter 4, “Points, Lines, and Polygons: Vector Data Analysis with Spatial SQL”
    Explores how to analyze vector geospatial data using Apache Sedona’s SQL capabilities for spatial operations and geometric computations.
Chapter 5, “Raster Data Analysis”
    Covers techniques for processing and analyzing raster geospatial data using Apache Sedona, including operations on gridded datasets like satellite imagery and digital elevation models.
Chapter 6, “Apache Sedona and the PyData Ecosystem”
    Demonstrates how Apache Sedona integrates with Python’s PyData ecosystem, showing how to leverage popular libraries like GeoPandas, Shapely, and Jupyter Notebooks for geospatial analytics workflows.
Chapter 7, “Geospatial Data Science and Machine Learning”
    Explores how to apply data science techniques and machine learning algorithms to geospatial datasets using Apache Sedona, covering spatial feature engineering, predictive modeling, and analytical workflows.
Chapter 8, “Building a Geospatial Data Lakehouse with Apache Parquet and Apache Iceberg”
    Demonstrates how to construct a modern geospatial data lakehouse architecture using Apache Sedona with Apache Parquet files and Apache Iceberg for efficient storage, versioning, and querying of large-scale geospatial datasets.
Chapter 9, “Using Apache Sedona with Cloud Data Providers”
    Shows how to deploy and utilize Apache Sedona with major cloud platforms and data services, covering integration patterns, configuration options, and best practices for cloud-based geospatial analytics.
Chapter 10, “Optimizing Apache Sedona Applications”
    Highlights essential methods for making Apache Sedona applications faster and more memory-efficient, with a focus on optimizing spatial join queries, accelerating Python applications, and efficiently storing data in Parquet and Apache Iceberg format.
How to Use This Book
This book is designed to deepen your understanding of and hands-on expertise with Apache Sedona, and it is suitable for both newcomers and experienced practitioners. Although the chapters are organized in a progressive sequence that lets you build mastery from beginning to end, the structure also supports more flexible study. Every chapter stands on its own, so you can jump straight into the subjects or scenarios that interest you without having read the earlier chapters. This makes the book useful both for structured study and for focused, on-demand learning.
Within these pages, you’ll encounter numerous code examples and hands-on demonstrations.
To support your learning, we have created a GitHub repository to accompany this book. The repository follows a chapter-based structure, providing convenient access to the resources, code samples, and illustrations relevant to each section’s material. The repository contains more examples than the book does; it is also where we put design and architectural patterns for using Apache Sedona in production systems and for understanding its internals better. For instance, in the book we cover Apache Airflow for scheduling your geospatial workflows, but in the repository you can find examples with other schedulers, like Prefect. Many upcoming features, along with others that did not fit in the book, are covered in the repository, which will be updated as needed. We will include examples with SedonaDB, including the geography data type, GeoPandas data type, and other new features that will be released after the book’s publication. We have a limited amount of space for figures in the book, so we can put many more in the repository. Many of us are visual learners, and we believe that interactive visualizations and figures are the best way to understand geospatial systems and algorithms. All contributions are welcome, including issue reports and example requests.
Whether you aim to understand Apache Sedona’s technical architecture or need to use particular features, this repository serves as a supporting resource to strengthen your understanding and implementation of the principles presented throughout the book.
Whether you decide to study this guide from beginning to end or concentrate on specific chapters according to your current requirements, this book serves as a thorough and user-friendly reference for Apache Sedona, enhanced by practical, interactive elements available through our dedicated GitHub repository.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.
This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.
Using Code Examples
Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/wherobots/apache-sedona-book.
This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Cloud Native Geospatial Analytics with Apache Sedona by Paweł Tokaj, Jia Yu, and Mo Sarwat (O’Reilly). Copyright 2026 O’Reilly Media, Inc., 978-1-098-17400-2.”
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.
O’Reilly Online Learning
For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.
Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
    O’Reilly Media, Inc.
    141 Stony Circle, Suite 195
    Santa Rosa, CA 95401
    800-889-8969 (in the United States or Canada)
    707-827-7019 (international or local)
    707-829-0104 (fax)
    support@oreilly.com
    https://oreilly.com/about/contact.html
We have a web page for this book, where we list errata and any additional information. You can access this page at https://oreil.ly/cloud-native-geospatial.
For news and information about our books and courses, visit https://oreilly.com.
Find us on LinkedIn: https://linkedin.com/company/oreilly-media.
Watch us on YouTube: https://youtube.com/oreillymedia.
Acknowledgments
We would like to thank the technical reviewers of this book, Kamil Raczycki, Kacper Leśniara, and Vladislav Bilay. Your professionalism, creative ideas, and valuable suggestions not only helped us catch errors but also made the book easier and more enjoyable to read.
Most of all, thank you to the many contributors to Apache Sedona and its ecosystem. Without their work, this book would not have been possible.
Paweł Tokaj
First and foremost, I would like to thank my close family, especially my wife and best friend, Zosia, whose support and patience meant a great deal to me. You were the first reviewer of the chapters before they came into Gary’s hands. I really appreciate the time we spent together reading all the chapters. You were always asking how the writing was going, and you prepared beverages when I stayed up late writing. Also, a huge thank you to my brother Bartek, who is an inspiration and proof that with hard work, nothing is impossible.
Thanks to the Sedona community, first to coauthor Jia, who accepted my first Python SDK MR back in late 2019. This has been an incredible journey that I never thought would go in this direction. Thank you for all the discussions, MR reviews, and for always being friendly and open-minded. Secondly, thanks to Matthew for early reviews of the book and willingness to help.
I would like to thank Gary for his patience and detailed explanations of the process of writing a book. I learned a lot from you. Thanks for the friendly atmosphere and your professionalism, which made writing seamless and possible!
To all the great engineers I have had the pleasure of working with, including Filip Mikina, JC Arbelbide, James Burkhart, Przemysław Walczyk, Piotr Bartkiewicz, Marek Wiewiórka, and other great engineers I met in my career at PwC, Allegro, GetinData, and Splunk: I’ve learned a lot from you and become a better engineer. I am lucky that our paths crossed and we had the opportunity to work together.
To the geospatial community, whose resources were invaluable in writing this book, and especially to Matt Forest and Qiusheng Wu for their efforts in evangelizing geospatial data and making it more accessible and easier to understand.
Last but not least, to everything that inspired me throughout my career: book authors, blog posts, tutorials, and many more. Still, three books made a huge impact: Designing Data-Intensive Applications by Martin Kleppmann, The Pragmatic Programmer by David Thomas and Andrew Hunt, and the book that everybody should read, How to Win Friends and Influence People by Dale Carnegie.
Jia Yu
This book is a milestone in my journey with Apache Sedona, and I am deeply grateful to everyone who made it possible.
I want to thank the Apache Sedona community (committers, contributors, and users) for building and shaping Sedona together. Every contribution, no matter how small, has helped Sedona grow into what it is today, and I have learned so much from working alongside this incredible group of people.
I am also thankful to the Apache Software Foundation (ASF) for providing the framework of open collaboration and governance that allowed Sedona to thrive. The ASF’s value of community over code has guided me not just in this project, but in my broader open source work.
Beyond Sedona, I owe much to the geospatial and big data communities. The standards, tools, and datasets created by so many researchers and engineers laid the foundation for Sedona’s capabilities. This book stands on the shoulders of decades of work from those who came before me.
I also want to express gratitude to my colleagues at Wherobots who gave feedback on early drafts and shared their real-world experiences with Sedona. Your support and insights pushed me to refine my thinking and make this book as useful as possible.
Most of all, I thank my family and loved ones for their patience. Your encouragement gave me the strength to finish this work.