Materialized Views in Postgres
I got a lot of great feedback on my tutorial at PGCon. The slides, as well as a sample chapter from my upcoming O’Reilly Book are posted on the PGCon web site. You can also grab them here:
- Slides from PGCon 2008: Materialized Views That Work.
- Sample Chapter: Materialized Views from Enterprise Rails (O’Reilly Media).
I highly recommend the sample chapter, as it takes you through an example in prose form. The slides from the talk are pretty content-heavy but some code slides may be difficult to grasp without commentary.
PGCon 2008
My presentation on materialized views in Postgres, and especially materialized views with time dependencies, has been accepted to PGCon 2008.
Abstract: Complex queries repeated in application code often are recast as views. This makes programming easier, but it does nothing to improve the performance of the underlying query. Materializing the view is the next logical step. Materialized views are stored in standard tables, so joins, procedure evaluation, and conditions from the original query are pre-computed before queries are run against the materialized view. Because the data is in a table, performance can be further improved by adding indexes.
Many developers shy away from materializing views in Postgres because the functionality is not built in, as it is in Oracle. In this talk, I will present a straightforward approach that can be applied to materialize any view, even those that Oracle cannot handle. Case in point: views that have a time-dependency (such as those based on the current time, now()) don’t have convenient trigger points (such as commits) to signal that the view should be refreshed, and are a major stumbling block for view materialization. This talk presents a technique to handles those cases as well, bringing materialized views even more powerful than those available in commercial products to Postgres.
Hope to see you in Ottawa!
Geographic Distance in Postgres
To help proliferate the idea that the database is very much a part of your Rails app, I’m going to be posting useful tidbits of PL/pgSQL that can help you application go and go fast. Feel free to send in your own PL/pgSQL snippets and I’ll post them here as well.
I find that when folks say Ruby/Rails is slow, they’re doing things in Ruby/Rails application code that they shouldn’t be. The following PL/pgSQL function will compute an approximate distance between two sets of latitude/longitude pairs.
create or replace function miles_between_lat_long( lat1 numeric, long1 numeric, lat2 numeric, long2 numeric ) returns numeric language 'plpgsql' as $$ declare x numeric = 69.1 * (lat2 - lat1); y numeric = 69.1 * (long2 - long1) * cos(lat1/57.3); begin return sqrt(x * x + y * y); end $$;
Why is this useful? Well, if your web application needs to find distances, such as “drug stores within 10 miles of zip code 02139,” then you can construct a single SQL query that will give you the answer. This can be far more efficient than loading the entire zip code database in memory and trudging through the data in the application layer, which may not be optimized for doing these sorts of computations. For example, assume you’ve got a table drug_stores with latitude and longitude columns, and a table zip_codes that has the latitude and longitude for each zip code in the United States. To find the drug stores within 10 miles of 02139, you issue the following query:
select *
from drug_stores d,
zip_codes z
where z.zip_code = '02139'
and miles_between_lat_long(d.latitude, d.longitude, z.latitude, z.longitude) < 10;

1 comment