Eave-Sense: Geospatial Urban Intelligence on Open Buildings with BigQuery AI
We explore how SQL-native and Python-native workflows on BigQuery AI and BigFrames turn billions of building polygons into actionable urban intelligence.
The Foundation: Google's Open Buildings Dataset
We use Google's Open Buildings (v3): polygon WKT + centroid, area_in_meters, confidence, and per-tile precision thresholds (P80/P85/P90). Files are sharded by S2 Level 4 for efficient access; we query them in place via BigLake external tables, no data copy required.
The dataset covers 1.8 billion building detections across Africa, Asia, and Latin America, with each polygon representing a structure identified through machine learning on high-resolution satellite imagery.
The Eave-Sense Workflow: End-to-End in BigQuery
Step 1: Ingesting & Preparing Data with BigLake
External tables point to GCS CSVs; then we standardize types, build GEOGRAPHY, and join per-tile thresholds to flag precision-aware counts.
-- External tables (zero-copy on GCS)
CREATE OR REPLACE EXTERNAL TABLE `archaios-463614.eave_sense.polygons_raw_ext`
OPTIONS (
  format = 'CSV',
  uris = ['gs://open-buildings-data/v3/polygons_s2_level_4/*.csv'],
  field_delimiter = ',',
  skip_leading_rows = 1
);

CREATE OR REPLACE EXTERNAL TABLE `archaios-463614.eave_sense.s2l4_thresholds_ext`
OPTIONS (
  format = 'CSV',
  uris = ['gs://open-buildings-data/v3/score_thresholds_s2_level_4.csv'],
  field_delimiter = ',',
  skip_leading_rows = 1
);
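The threshold join described in Step 1 can be illustrated in plain Python. This is a minimal sketch, not the post's actual SQL: the `Building` class, the `flag_p85` helper, the tile tokens, and all values are hypothetical, and it assumes that a building "passes P85" when its confidence is at least its tile's 85%-precision threshold.

```python
from dataclasses import dataclass


@dataclass
class Building:
    s2_token: str      # hypothetical S2 level-4 tile identifier
    confidence: float  # detection confidence score


def flag_p85(buildings, p85_threshold_by_tile):
    """Pair each building with a flag that is True when its confidence
    meets the 85%-precision threshold of its tile."""
    flagged = []
    for b in buildings:
        threshold = p85_threshold_by_tile.get(b.s2_token)
        # Assumption: tiles with no published threshold never pass.
        passes = threshold is not None and b.confidence >= threshold
        flagged.append((b, passes))
    return flagged


# Made-up values for illustration only.
thresholds = {"tile_a": 0.70, "tile_b": 0.75}
sample = [
    Building("tile_a", 0.72),
    Building("tile_b", 0.72),
    Building("tile_c", 0.99),
]
flags = [passes for _, passes in flag_p85(sample, thresholds)]
print(flags)
```

The same confidence score passes in one tile and fails in another, which is why the thresholds are joined per tile rather than applied as a single global cutoff.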
Step 2: Aggregating Billions of Buildings to S2 Tiles
We aggregate polygons to S2 level 10 cells (roughly 81 km² each on average) and focus on the East African Community (TZ/KE/UG/RW/BI) for deep-dive analysis. This hierarchical approach keeps spatial queries efficient while preserving tile-level detail.
CREATE OR REPLACE TABLE `archaios-463614.eave_sense.tiles_eac_l10` AS
SELECT
  a.iso2,
  a.country,
  -- The cell level is passed as a named argument
  S2_CELLIDFROMPOINT(p.centroid, level => 10) AS s2_id_l10,
  COUNT(*) AS bldg_count_all,
  COUNTIF(p.pass_p85) AS bldg_count_p85,
  AVG(p.area_m2) AS avg_area_m2,
  AVG(p.confidence) AS avg_conf
FROM `archaios-463614.eave_sense.polygons_qc` p
JOIN `archaios-463614.eave_sense.eac_aoi_boxes` a
  ON ST_WITHIN(p.centroid, a.aoi)
GROUP BY iso2, country, s2_id_l10;
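When choosing an S2 level for aggregation, it helps to know how large a cell is. The sketch below estimates the average cell area by dividing Earth's surface area by the number of cells at a given level (6 faces, each subdivided into 4^level cells). The helper name and the spherical-Earth approximation are assumptions made for illustration; real cells vary in area around this average.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def avg_s2_cell_area_km2(level: int) -> float:
    """Average S2 cell area in km^2 at the given level (0 to 30)."""
    if not 0 <= level <= 30:
        raise ValueError("S2 levels range from 0 to 30")
    earth_area_km2 = 4 * math.pi * EARTH_RADIUS_KM ** 2
    # 6 cube faces, each split into 4 children per level
    return earth_area_km2 / (6 * 4 ** level)


for level in (4, 10, 13):
    print(level, round(avg_s2_cell_area_km2(level), 2))
```

By this estimate a level 10 cell averages about 81 km², while a cell of around 1 km² corresponds to level 13.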
EAC Country Summary
Unleashing BigQuery AI: From Numbers to Knowledge
BigQuery's integrated AI capabilities transform raw building statistics into actionable insights. We leverage multiple AI functions to generate summaries, validate hypotheses, and forecast trends.
Concise Summaries with ML.GENERATE_TEXT
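-- ML.GENERATE_TEXT expects the input column to be named `prompt`.
SELECT
  prompt,
  ml_generate_text_llm_result AS insights
FROM ML.GENERATE_TEXT(
  MODEL `archaios-463614.eave_sense.text_bison`,
  (
    SELECT 'Given these EAC country stats (...), provide 5 concise insights...' AS prompt
  ),
  -- flatten_json_output returns the generated text as a plain string column.
  STRUCT(0.2 AS temperature, 512 AS max_output_tokens, TRUE AS flatten_json_output)
);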
Generated Insight Bullets
Country Briefs with AI.GENERATE (Gemini 2.5 Pro)
Using Gemini 2.5 Pro, we generate comprehensive country briefs that contextualize building patterns within broader urban development narratives.
Gemini Brief
Typed Action Plans with AI.GENERATE_TABLE
AI.GENERATE_TABLE turns free-form model output into typed, queryable columns, which suits per-country action plans based on settlement patterns.
Per-Country Action Plan
Fact Check: "TZ > KE?" (AI.GENERATE_BOOL)
Numeric Extraction (AI.GENERATE_INT/DOUBLE)
Illustrative Forecasting with AI.FORECAST
We demonstrate time series forecasting on synthetic monthly building counts, showing how urban growth patterns could be projected forward for planning purposes.
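-- Column names below are assumptions; replace them with the actual
-- value and timestamp columns of tz_monthly_synth.
SELECT *
FROM AI.FORECAST(
  TABLE `archaios-463614.eave_sense.tz_monthly_synth`,
  data_col => 'bldg_count',
  timestamp_col => 'month',
  horizon => 6
);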
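The post does not show how the synthetic monthly series was built, so the following is only one plausible way to produce such a table. It is a hypothetical sketch: the function name, the parameters, and the compound-growth-plus-seasonality shape are all assumptions, chosen so the series is deterministic and easy to inspect.

```python
import math
from datetime import date


def synthetic_monthly_counts(start_year, n_months, base,
                             monthly_growth, seasonal_amplitude):
    """Return (month_start, count) rows: compound growth plus a
    12-month sine-wave seasonal term, rounded to whole buildings."""
    rows = []
    for i in range(n_months):
        year = start_year + i // 12
        month = i % 12 + 1
        trend = base * (1 + monthly_growth) ** i
        seasonal = seasonal_amplitude * math.sin(2 * math.pi * i / 12)
        rows.append((date(year, month, 1), round(trend + seasonal)))
    return rows


# Made-up parameters for illustration only.
series = synthetic_monthly_counts(
    2022, 24, base=100_000, monthly_growth=0.01, seasonal_amplitude=500
)
for month_start, count in series[:3]:
    print(month_start.isoformat(), count)
```

Rows like these could be loaded into a two-column table (a timestamp and a numeric value), which is the shape a time series forecasting function needs as input.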

Forecast Output
The BigFrames Alternative: Pythonic Analysis at Scale
BigFrames brings the familiar pandas API to BigQuery's distributed compute, enabling Python developers to work with massive datasets without leaving their comfort zone.
import bigframes.pandas as bpd
from bigframes.ml.llm import GeminiTextGenerator

bpd.options.bigquery.project = "archaios-463614"
bpd.options.bigquery.location = "US"

# Per-country totals from the S2 L10 tile table
facts = bpd.read_gbq("""
    SELECT country,
           SUM(bldg_count_p85) AS buildings_p85,
           AVG(avg_area_m2) AS mean_area_m2
    FROM `archaios-463614.eave_sense.tiles_eac_l10`
    GROUP BY country
""")

# The constructor takes the model as `model_name`
gen = GeminiTextGenerator(model_name="gemini-2.5-pro")

# Build one prompt per country
prompts = facts.assign(
    prompt=lambda df: (
        "Summarize settlement intensity for " + df["country"] +
        " using buildings_p85=" + df["buildings_p85"].astype(str) +
        " and mean_area_m2=" + df["mean_area_m2"].round(1).astype(str) +
        ". Keep it under 25 words."
    )
)[["country", "prompt"]]

# predict() uses the column named "prompt" when the input has several columns
out = gen.predict(prompts)
df_bf = out.to_pandas()
BigFrames: Country Insights
BigFrames: Forecast Output

Interactive Map: EAC S2 L10 Tiles by P85 Quantile
Fully interactive Folium map (pan/zoom, tooltips) visualizing building density across East Africa. Each S2 tile is colored by its P85 building count quantile, revealing urban concentration patterns.
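Coloring tiles by quantile means assigning each tile to a bucket based on where its P85 building count falls in the overall distribution. The sketch below shows one way to do that with the standard library. The helper name, the choice of five bins, and the sample counts are assumptions for illustration, not the post's actual mapping code.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_bins(counts, n_bins=5):
    """Assign each count a bin index from 0 (lowest) to n_bins - 1
    (highest) using quantile cut points of the counts themselves."""
    # quantiles() returns n_bins - 1 cut points
    cuts = quantiles(counts, n=n_bins, method="inclusive")
    # bisect_right counts how many cut points are <= the value
    return [bisect_right(cuts, c) for c in counts]


# Made-up per-tile P85 building counts for illustration only.
counts = [0, 3, 10, 25, 40, 80, 150, 400, 900, 2500]
bins = quantile_bins(counts)
print(list(zip(counts, bins)))
```

Each bin index can then be mapped to a color in a palette. Quantile bins keep the map readable when a few dense urban tiles have counts far above the rest, because each color covers roughly the same number of tiles.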


Global Top Tiles
EAC Tile Features (Sample)
Conclusion: A New Paradigm for Geospatial Intelligence
Eave-Sense demonstrates how BigQuery AI and BigFrames can turn a single public dataset, Open Buildings, into narratives, typed plans, and visual artifacts in a reproducible, mostly SQL-native workflow. The result: faster time to insight for urban planning, population workflows, environmental screening, and disaster response targeting.
By combining the scale of BigQuery's distributed compute with the intelligence of modern LLMs, we've shown that geospatial analysis need not be confined to traditional GIS tools. The future of urban intelligence lies in the convergence of spatial data, cloud computing, and AI, a future where billions of observations transform into actionable insights with just a few SQL queries.
The entire workflow, from raw polygons to interactive maps and AI-generated insights, runs in under 15 minutes on BigQuery's serverless infrastructure. This changes how we approach large-scale geospatial analysis: no infrastructure to manage and no data to move.
Cite This Post
@online{alfaxad_eavesense_2025,
  author  = {Alfaxad Eyembe},
  title   = {Eave-Sense: Geospatial Urban Intelligence on Open Buildings with BigQuery AI},
  date    = {2025-09-22},
  url     = {https://alfaxad.github.io/eavesense.html},
  urldate = {2025-09-22}
}