[OAM-talk] OpenAerialMap questions
Christopher Schmidt
crschmidt at crschmidt.net
Sat Dec 8 13:04:19 MST 2007
On Sat, Dec 08, 2007 at 10:37:10AM -0800, Matthew Perry wrote:
> On Dec 8, 2007 8:13 AM, Christopher Schmidt <crschmidt at crschmidt.net> wrote:
> > On Fri, Dec 07, 2007 at 12:09:25PM -0800, Matthew Perry wrote:
> > > On Dec 7, 2007 11:15 AM, Christopher Schmidt <crschmidt at crschmidt.net> wrote:
> > > The NAIP imagery I am interested in comes as mrsid in UTM projection.
> > > (http://new.casil.ucdavis.edu/casil/remote_sensing/naip_2005/county_mosaics/).
> > >
> > > Is the GDAL version on the server built with MrSID support? What
> > > about supporting on-the-fly reprojection via mapserver? Or is this
> > > just two strikes against this dataset and it will require a lot of
> > > pre-processing?
> >
> > Decompressing from MrSID on the fly is slow. Reprojecting on the fly is
> > very slow. However, needing pre-processing is not some kind of
> > permanent black mark against a dataset: indeed, it is my
> > expectation that any serious dataset will need lots of pre-processing
> > before it can be used.
>
> Good points. I was looking for efficiency on the data provider's end to
> hopefully get more organizations to contribute. But pre-processing is
> certainly more efficient on the delivery end.
I think that
> > The important thing to remember though, is that once it's done once, it
> > *never has to be done again* for that dataset. Imagine that the primary
> > use case for web mapping is in EPSG:4326 (which is my experience with
> > OpenLayers).
> >
> > > I think supporting other input projections is going to be crucial in
> > > the long run since plate carrée is almost never the preferred
> > > projection for storing imagery for a number of reasons.
> >
> > I agree with the premise, but not the result.
> >
>
> I strongly disagree with that. The plate carrée projection is
> ill-suited for almost every purpose. It is distorted, has inconsistent
> scale (no scalebars can be used), can't accurately measure areas, etc.
That was the premise. I agreed with it.
> I'd encourage OAM to at least anticipate the future need to store and
> deliver imagery in a more suitable projection.
I agree with half of this: delivery in alternative projections is a
requisite. *Storage* in alternative projections is not a requisite. I
think that http://openaerialmap.org/map/mercator.html is relatively
effective, and it doesn't involve storing anything differently.
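
To make the storage-versus-delivery distinction concrete: data stored in
EPSG:4326 can be mapped to Mercator coordinates at request time. The
sketch below shows the forward spherical Mercator formula. The function
name, and the choice of the spherical form with a radius of 6378137 m,
are assumptions for illustration; which Mercator variant the
mercator.html page uses is not stated here.

```python
import math

# Assumed sphere radius in metres (the WGS84 semi-major axis).
EARTH_RADIUS_M = 6378137.0


def lonlat_to_mercator(lon_deg, lat_deg):
    """Project EPSG:4326 degrees to spherical Mercator metres."""
    # The projection is undefined at the poles, so reject them.
    if not -90.0 < lat_deg < 90.0:
        raise ValueError("latitude must be strictly between -90 and 90")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y
```

Because x depends only on longitude and y only on latitude, the
conversion can be applied per request without touching what is on disk.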
> > > > Instead of hosting all the data locally, the longer term plan is that
> > > > any sizable dataset for which there is a willing provider will be
> > > > populated by connecting to a remote WMS. This 'distributed' style of
> > > > data loading is something that we're currently working on building the
> > > > infrastructure for, after which there is a plan to document how you
> > > > should set up your own data -- after that OSGeo/OpenAerialMap is only
> > > > responsible for hosting datasets for which there is no willing provider
> > > > available. We have not yet discussed how to scale that aspect of it: my
> > > > hope is that we can avoid it for a while, since I think for the most
> > > > part there are organizations that are able to provide WMS of most large
> > > > datasets, meaning that OAM hosts the small stuff directly, and the big
> > > > stuff is shared.
> > > >
> > >
> > > I like the distributed model as well. It's just that I don't
> > > personally have the resources to serve up this volume of imagery. But
> > > I do know that it's available, how to obtain it, how to work with it
> > > and would be willing to set it up if there was a home available for
> > > it.
> >
> > I'll keep that in mind. Is there no university or other organization in
> > the area that does have the resources?
>
> Again, I'm thinking of efficiency on the data provider's end. Say some
> small city has a stock of digital aerial photos in the public domain.
> They *could* contribute them but they might conclude that there are
> too many roadblocks and expenses in doing so. They are more inclined,
> in that case, to keep the data.
If the options are to make it available by giving it to OAM, or to make
it not available at all, we would find the resources to make it happen
by giving it to OAM. There's no question in my mind on that front. *How*
we would find the resources is still up in the air. We don't have the
resources to do it for everyone, and so we need to find a mechanism that
scales.
The processing of the data can be largely automated, and packaged up
with FWTools, so if they have a Linux or Windows machine, they have the
toolset that they need to process their imagery all wrapped up in a box.
Given that, the next step is sufficient instructions that they can build
what they need using only that. And, if we have those instructions, all
they need to find is a machine with a thick pipe and a fair amount of
disk.
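
As a rough illustration of what "largely automated" could look like,
here is a minimal sketch that builds one gdalwarp command per source
file, reprojecting to EPSG:4326 as tiled GeoTIFF. The function names,
the "*.sid" pattern, and the choice of bilinear resampling are
assumptions, not an agreed OAM recipe. It also assumes a GDAL build
with MrSID support and source files that carry their own
georeferencing.

```python
from pathlib import Path


def build_warp_command(src, dst_dir, t_srs="EPSG:4326"):
    """Return the gdalwarp argument list for one source image."""
    src = Path(src)
    # Hypothetical naming scheme: same base name, .tif extension.
    dst = Path(dst_dir) / (src.stem + ".tif")
    return [
        "gdalwarp",
        "-t_srs", t_srs,      # target projection
        "-r", "bilinear",     # resampling method
        "-of", "GTiff",       # output format
        "-co", "TILED=YES",   # internally tiled GeoTIFF
        str(src),
        str(dst),
    ]


def plan_batch(src_dir, dst_dir, pattern="*.sid"):
    """Build commands for every matching file in src_dir, sorted."""
    sources = sorted(Path(src_dir).glob(pattern))
    return [build_warp_command(p, dst_dir) for p in sources]
```

Each command can then be handed to subprocess.run(cmd, check=True) on a
machine with the GDAL tools installed, such as an FWTools install.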
In the near term, OAM has the resources to, on a case by case basis, act
as the thick pipe for data which doesn't have a willing thick pipe
available. For the long term, that's not going to scale to the whole
world, hence the need to make things distributed.
> But I think you've hit on a good idea; having a network of larger
> organizations that could provide the infrastructure and expertise that
> the smaller groups need to distribute the imagery via OAM.
Right. And TelaScience can be the first node, until we have a wider
network.
> > > > > What are the storage limits and scalability of the site?
> > > >
> > > > I'm not sure what you mean by 'storage limits and scalability'.
> >
> > Well, it's not entirely clear what it's going to be yet -- but I do know
> > that we don't have the > 100TB of disk space it would take to store the
> > entire country's NAIP data uncompressed. :) I'm working within the
> > constraints of reality, which is important even if it's frustrating :)
>
> Ah that's what I was getting at. So anything under 100 TB is fine? ;-)
> ... Seriously though, I'm processing the Santa Barbara County NAIP
> imagery right now as a test case. It will be ~35 GB... is this a
> reasonable amount of data or would you prefer I find a way to host it
> elsewhere?
35GB of data is fine, if you can't host it elsewhere: I'm still
personally working on trying to figure out how we can store this stuff
compressed, so I haven't concentrated on the 'making it easy to store
your data and share it' instructions that are going to be necessary.
If you want help, we can offer it.
> > > > WCS is not
> > > > terribly hard, I don't think, just changing a mapfile, though I'm not
> > > > sure if it's useful and I'd prefer to have a use case before offering a
> > > > service. Downloading data by bounding box is on the 'Things that are
> > > > being pondered' plate, but not near term.
> > > >
> > >
> > > Having access to the actual data (not a WMS-ified image) is certainly
> > > useful! Let's say I want to create a map of my neighborhood but, for
> > > whatever reason, can't rely on a WMS service for that data. Maybe I
> > > need to do some manipulation, feature extraction, reprojection,
> > > resampling, etc.
> >
> > Using what toolset? How are you going to get the raw data out via WCS?
> > What tool are you going to use to browse the imagery? Does it support
> > GeoTIFFs? Does the output need to be JP2 in order to make sense? Does it
> > support BigTIFF? Etc.
>
> In the past, I've built a web map front end with a little "download"
> button which would construct a WCS call based on the current bbox
> extent. Pretty simple and it did the trick.. zoom in, hit download,
> get a full-res, geo-referenced dataset of that area.
And what did users do with this?
> As for the
> format, I played with a few options and found HFA/imagine format to be
> a good fit (for DEMs which we were serving). JP2 was definitely
> smaller and more suited to imagery.
Encoding JP2 with the free encoders is painful, as far as I can tell. I
do it for the MetaCarta Labs rectifier, but that's a bit of a mistake,
and until OAM is big enough that we can afford a commercial license to a
non-free encoder or big enough that we can pay someone to improve the
existing free encoders, we probably won't be getting that type of
imagery out.
> > If you have an actual problem, and can describe the changes that allow
> > it to be solved via WCS, then I'll gladly take a suggestion on how to
> > make the mapfile support it.
>
> Well if the goal of OAM is to provide access to imagery, I'm not sure
> why you wouldn't want to provide access to the data. It doesn't have
> to be WCS but some mechanism to download a geo-referenced dataset
> by bbox seems crucial. I showed OAM to many of my colleagues yesterday
> and the first question from everyone was "Where do you go to download
> an image?".
>
> > And of course, if you don't like that path,
> > you can set up your *own* mapserver: the wiki documents that:
> >
> > http://wiki.openaerialmap.org/Using_With_MapServer
> >
> > This lets you drop the OAM data into a layer in any mapfile, which you
> > can then use to set up your own WCS service, without it needing to live
> > on the OAM server.
> >
> > Am I missing something important here?
> >
>
> Well you can't have WCS access to data that is cascaded through WMS.
> The whole point of WCS is that it allows access to the actual raw
> data. So WCS would only make sense on the first link in the chain (ie
> the host server of the data in question).
I don't think that's true. WCS provides 'raw data', but it can provide
'raw data' in any format you ask for, regardless of the format of the
data. Assuming the source data is JPG encoded (some is, some isn't, at
the moment), then there's nothing lost from loading up the WCS from JPG
tiles from a WMS, other than perhaps a slight increase in the number of
compression artifacts you see. Right?
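
For reference, the bbox-driven download described earlier in this
thread amounts to a WCS GetCoverage request. A minimal sketch of
building such a URL with WCS 1.0.0 key-value parameters follows. The
function name, the example.org endpoint, and the coverage name in the
usage note are hypothetical, and the accepted FORMAT values depend on
how the server is configured.

```python
from urllib.parse import urlencode


def wcs_getcoverage_url(base_url, coverage, bbox, width, height,
                        crs="EPSG:4326", fmt="GTiff"):
    """Build a WCS 1.0.0 GetCoverage URL for a bounding box."""
    minx, miny, maxx, maxy = bbox
    if minx >= maxx or miny >= maxy:
        raise ValueError("bbox must be (minx, miny, maxx, maxy)")
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,    # output size in pixels
        "HEIGHT": height,
        "FORMAT": fmt,     # server-specific format name (assumed)
    }
    # Append to any query string the endpoint already carries.
    sep = "&" if "?" in base_url else "?"
    return base_url + sep + urlencode(params)
```

A "download" button would call this with the current map extent, for
example wcs_getcoverage_url("http://example.org/wcs", "naip",
(-120.5, 34.0, -119.5, 35.0), 512, 512).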
> I'd also like to note that I'm not trying to be difficult here or to
> take away from the great accomplishments to date. I'm also not trying
> to suggest that these things need to be changed immediately. It's just
> that, as OAM moves forward, I think it's important to consider the
> needs of a broad spectrum of spatial imagery producers and consumers.
> I'm just trying to provide a different perspective on these issues
> based on my experience.
Understood. Some of my needs are practical: we're currently working with
only a couple terabytes of disk space, and I don't want to offer hosting
to everyone on the planet... and run out after satisfying 1% of
requests. Some of my goals are technical: I need to figure out how to
solve that first problem while keeping in mind the technical limitations
of the software as it stands, and how to modify the software.
I'm not being argumentative, but I'm a big fan of concrete use cases.
The number of patches that have sat outstanding in OpenLayers land for
lack of concrete use cases grows by the day -- and I don't consider that
a bad thing :)
Regards,
--
Christopher Schmidt
Web Developer