[OAM-talk] OpenAerialMap questions
Christopher Schmidt
crschmidt at crschmidt.net
Sat Dec 8 09:13:39 MST 2007
On Fri, Dec 07, 2007 at 12:09:25PM -0800, Matthew Perry wrote:
> On Dec 7, 2007 11:15 AM, Christopher Schmidt <crschmidt at crschmidt.net> wrote:
> The NAIP imagery I am interested in comes as mrsid in UTM projection.
> (http://new.casil.ucdavis.edu/casil/remote_sensing/naip_2005/county_mosaics/).
>
> Is the GDAL version on the server built with Mr Sid support? What
> about supporting on-the-fly reprojection via mapserver? Or is this
> just two strikes against this dataset and it will require a lot of
> pre-processing?
Decompressing from MrSID on the fly is slow. Reprojecting on the fly is
very slow. However, needing pre-processing is not some kind of
permanent black mark against a dataset: indeed, it is my
expectation that any serious dataset will need lots of reprocessing in
order to be usable. That's part of what makes OAM hard with big datasets: I've
been processing the MassGIS data for literally weeks, and I'm still not
done. With the exception of the landsat and blue marble data,
*everything* in OAM has required some level of reprocessing in order to
get it to work: whether it's just mosaicking and reprojecting,
decompressing and reprojecting, or rectifying to the earth. (And of
course, the landsat data is no small piece of work to get working in and
of itself.)
The important thing to remember, though, is that once it's been done, it
*never has to be done again* for that dataset. Imagine that the primary
use case for web mapping is in EPSG:4326 (which is my experience with
OpenLayers). If someone takes a set of 10 different aerial shots over
the city of Boston -- each of which means clearing out the cache under
the image so that the tiles can be regenerated -- that means that,
without preprocessing, I have to decompress the SID imagery and reproject
it 10 times -- for each zoom level where a tile has been generated.
So yes -- preprocessing is expensive. In addition, you have to do things
like add overlays to your imagery -- definitely not a computationally
cheap thing to do. But it only has to be done
once for each dataset, not a dozen or more times. Whether it happens
before the imagery is loaded or as it's browsed, it has to happen
sometime, and the former is better in the long run.
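
As a minimal sketch of the one-time preprocessing step described above,
the following Python builds a gdalwarp command that decodes a source
image and reprojects it to EPSG:4326 as a tiled GeoTIFF. The file names
and the bilinear resampling choice are assumptions for illustration, and
reading MrSID input requires a GDAL build with MrSID support. This is
not the actual OAM processing pipeline.

```python
import subprocess


def build_warp_command(src, dst, t_srs="EPSG:4326"):
    """Build a gdalwarp command that reprojects src into dst as a tiled GeoTIFF."""
    return [
        "gdalwarp",
        "-t_srs", t_srs,      # target projection
        "-r", "bilinear",     # resampling method (an assumption)
        "-of", "GTiff",       # write GeoTIFF output
        "-co", "TILED=YES",   # internal tiling for faster window reads
        src,
        dst,
    ]


def preprocess(src, dst):
    """Run the reprojection once; requires gdalwarp on PATH."""
    subprocess.run(build_warp_command(src, dst), check=True)


# Hypothetical file names, for illustration only.
cmd = build_warp_command("county_mosaic.sid", "county_mosaic_4326.tif")
print(" ".join(cmd))
```

Once the output file exists, tiles can be regenerated from it any number
of times without decompressing or reprojecting the source again.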
> I think supporting other input projections is going to be crucial in
> the long run since plate carrée is almost never the preferred
> projection for storing imagery for a number of reasons.
I agree with the premise, but not the result.
> Most organizations doing aerial
> photography in America, for instance, will reference the data to
> stateplane or UTM.
Speaking as someone who has been processing this data for weeks, I fully
understand this.
> > Instead of hosting all the data locally, the longer term plan is that
> > any sizable dataset for which there is a willing provider will be
> > populated by connecting to a remote WMS. This 'distributed' style of
> > data loading is something that we're currently working on building the
> > infrastructure for, after which there is a plan to document how you
> > should set up your own data -- after that OSGeo/OpenAerialMap is only
> > responsible for hosting datasets for which there is no willing provider
> > available. We have not yet discussed how to scale that aspect of it: my
> > hope is that we can avoid it for a while, since I think for the most
> > part there are organizations that are able to provide WMS of most large
> > datasets, meaning that OAM hosts the small stuff directly, and the big
> > stuff is shared.
> >
>
> I like the distributed model as well. It's just that I don't
> personally have the resources to serve up this volume of imagery. But
> I do know that it's available, how to obtain it, how to work with it
> and would be willing to set it up if there was a home available for
> it.
I'll keep that in mind. Is there no university or other organization in
the area that does have the resources? I'm biased, living in Cambridge:
I have MIT down the street. (Not only that, but it's likely that I could
use some of MetaCarta's servers to serve the imagery as well.) I'd like
to encourage the use of OAM as a project where we can get together and
figure out how we can convince local organizations that serving this
data is in their best interest -- not just for OAM, but for others as
well.
> I'd imagine there would be a lot of folks in this situation that could
> contribute time but not servers.
Part of contributing time can be looking into how to get servers
contributed. But it is true that as OAM grows, I have a hope that we
will be able to provide more servers for those who can't find other
homes as well: we just aren't at that point yet. "First things first."
> > > What are the storage limits and scalability of the site?
> >
> > I'm not sure what you mean by 'storage limits and scalability'.
> > Currently, there is enough space to host any imagery that users could
> > reasonably upload, but there are plans to change a lot of the
> > infrastructure once GDAL 1.5 is out: specifically, to using JPG encoding
> > for storing data, which gains us an order of magnitude in terms of what
> > we can do with the storage we have available. Again, the goal is not to
> > have all the data hosted at OAM in the long term, but instead to move to
> > a situation where the data is hopefully requested via WMS instead.
>
> I was assuming OAM was going to be a repository, not a gateway.
> Perhaps this could be clarified on the wiki or the roadmap.
Well, it's not entirely clear what it's going to be yet -- but I do know
that we don't have the > 100TB of disk space it would take to store the
entire country's NAIP data uncompressed. :) I'm working within the
constraints of reality, which is important even if it's frustrating :)
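
A back-of-the-envelope sketch of that storage argument, in Python. The
100 TB figure is the lower bound mentioned above; the 10:1 ratio is only
an assumed reading of "an order of magnitude" for JPG encoding, not a
measured value.

```python
def compressed_size_tb(uncompressed_tb, compression_ratio):
    """Estimate on-disk size after compression, in terabytes."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return uncompressed_tb / compression_ratio


# 100 TB is the lower bound quoted above; 10:1 is an assumed ratio.
estimate = compressed_size_tb(100, 10)
print(f"{estimate:.0f} TB")
```

Under these assumptions the compressed data would still need at least
10 TB, which is why the plan leans on remote WMS providers for the big
datasets.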
> > WCS is not
> > terribly hard, I don't think, just changing a mapfile, though I'm not
> > sure if it's useful and I'd prefer to have a use case before offering a
> > service. Downloading data by bounding box is on the 'Things that are
> > being pondered' plate, but not near term.
> >
>
> Having access to the actual data (not a WMS-ified image) is certainly
> useful! Let's say I want to create a map of my neighborhood but, for
> whatever reason, can't rely on a WMS service for that data. Maybe I
> need to do some manipulation, feature extraction, reprojection,
> resampling, etc.
Using what toolset? How are you going to get the raw data out via WCS?
What tool are you going to use to browse the imagery? Does it support
GeoTIFFs? Does the output need to be jp2 in order to make sense? Does it
support BigTIFF? Etc.
Without a concrete use case, I can't tell if WCS is solving an *actual*
problem or a theoretical problem. Getting the data out in a way that
anyone who wants to solve an *actual* problem can solve it is clearly
important: what's not clear yet is whether WCS helps solve actual
problems in any specific configuration.
If you have an actual problem, and can describe the changes that allow
it to be solved via WCS, then I'll gladly take a suggestion on how to
make the mapfile support it. And of course, if you don't like that path,
you can set up your *own* mapserver: the wiki documents that:
http://wiki.openaerialmap.org/Using_With_MapServer
This lets you drop the OAM data into a layer in any mapfile, which you
can then use to set up your own WCS service, without it needing to live
on the OAM server.
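
For anyone who does set up their own WCS, a request for raw data by
bounding box might look like the sketch below. It builds a WCS 1.0.0
GetCoverage URL in Python. The endpoint, the coverage name, and the
GEOTIFF format string are all hypothetical: the real values depend on
how your own mapfile is configured and on what the server advertises.

```python
from urllib.parse import urlencode


def wcs_getcoverage_url(base_url, coverage, bbox, width, height,
                        crs="EPSG:4326", fmt="GEOTIFF"):
    """Build a WCS 1.0.0 GetCoverage URL for a bounding box."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,  # must match a format the server advertises
    }
    return base_url + "?" + urlencode(params)


url = wcs_getcoverage_url(
    "http://example.org/cgi-bin/mapserv",  # hypothetical endpoint
    "oam",                                 # hypothetical coverage name
    (-71.2, 42.3, -71.0, 42.4),            # rough Boston-area extent
    1024, 512,
)
print(url)
```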
Am I missing something important here?
Regards,
--
Christopher Schmidt
Web Developer