TITLE: Efficient search index for geo-spatial data in a relational model
AUTHORS: G. Fekete, A. Szalay, T. Budavari, Dept. Physics and Astronomy, The Johns Hopkins University.
ABSTRACT: We discuss a project to develop a system for rapid data storage and retrieval that uses the Hierarchical Triangular Mesh (HTM) to perform fast indexing over a spherical spatial domain, in order to accelerate storing and finding data over the Earth and sky. Spatial searches over the sky are the most frequent queries on astrophysics data, and as such are central to the National Virtual Observatory (NVO) effort and beyond. The implementation of an HTM-based system has been highly successful; it produced a very fast search engine now used by at least 20 institutions around the world (including STScI, IPAC, ROE, IoA Cambridge, SDSS, UCSD, Leiden, and Groningen). The library has applications in both astronomy and Earth science. The goal is to speed up a query that involves an object (observation, location, etc.) and a region of interest of arbitrary shape (political boundary, satellite track, etc.). In a very large database of hundreds of millions of objects, one wants to minimize the number of calculations needed to decide whether or not an object meets a spatial search criterion. We use an HTM-index-based method to build, on the fly, a coarse representation (a cover map) of the query region, which is then used to eliminate most of the objects that are clearly outside the region. False positives that pass the coarse test are then eliminated with more precise calculations. One of the challenging problems, cross matching, is to find data on the same object in separate archives. Simple bounding boxes built from rectilinear constraints (RA/Dec or lat/lon limits) are inadequate: they are singular at the poles and unstable near them, and the actual shapes of areas of interest do not always fit neatly within a box.
For example, the track of an orbiting satellite with a high inclination is a narrow strip, so the lat/lon-limited rectangle that encloses it consists mostly of area outside the track. Furthermore, because of constraints imposed by instruments, engineering, etc., scientists may need to define their own irregularly shaped query regions. With the recent advances in the worldwide Virtual Observatory effort, we now have a standard XML data model for space-time data. This data model also provides a new standard way to express spherical polygons as search criteria. A planned future outcome of this project is a layer that enables our search engine to run inside relational databases, whether open source (e.g., MySQL) or commercial (e.g., SQL Server), and to participate as a first-class access method in relational database queries. The toolkit is implemented in a highly portable framework in C#. This allows seamless integration with relational database engines and web services. In particular, this makes it possible to develop a full Web Services implementation of the library that can be accessed through remote calls.
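The coarse-then-precise filtering described in the abstract can be illustrated with a small sketch. It is written in Python for brevity (the toolkit itself is in C#) and is not the authors' implementation. Several things here are assumptions made for illustration: the function names (`trixel_id`, `cover`, `build_index`, `search`) are hypothetical, the query region is restricted to a circular cap rather than an arbitrary spherical polygon, trixels are classified with a conservative bounding-cap test rather than an exact intersection test, and the trixel labels follow the commonly described octahedron subdivision without any guarantee that they match the IDs the library produces.

```python
import math

def norm(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angle(a, b):
    """Angular separation in radians between unit vectors a and b."""
    c = cross(a, b)
    return math.atan2(math.sqrt(dot(c, c)), dot(a, b))

def unit(lon_deg, lat_deg):
    """Unit vector for a lon/lat (or RA/Dec) pair given in degrees."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))

# Octahedron vertices and the eight root triangles, corners counterclockwise.
V = [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]
ROOTS = {
    "S0": (V[1], V[5], V[2]), "S1": (V[2], V[5], V[3]),
    "S2": (V[3], V[5], V[4]), "S3": (V[4], V[5], V[1]),
    "N0": (V[1], V[0], V[4]), "N1": (V[4], V[0], V[3]),
    "N2": (V[3], V[0], V[2]), "N3": (V[2], V[0], V[1]),
}

def children(tri):
    """Split a spherical triangle into four by joining its edge midpoints."""
    a, b, c = tri
    w0, w1, w2 = (norm(tuple(x + y for x, y in zip(p, q)))
                  for p, q in ((b, c), (a, c), (a, b)))
    return [(a, w2, w1), (b, w0, w2), (c, w1, w0), (w0, w1, w2)]

def contains(tri, p, eps=1e-12):
    """True if p lies on the inner side of all three edge great circles."""
    a, b, c = tri
    return all(dot(cross(u, v), p) >= -eps for u, v in ((a, b), (b, c), (c, a)))

def trixel_id(p, depth):
    """Label of the trixel containing p after `depth` subdivisions."""
    name, tri = next((n, t) for n, t in ROOTS.items() if contains(t, p))
    for _ in range(depth):
        i, tri = next((i, t) for i, t in enumerate(children(tri)) if contains(t, p))
        name += str(i)
    return name

def cover(center, radius, depth):
    """Coarse cover of a cap: trixels fully inside, and trixels possibly overlapping."""
    full, partial = set(), set()
    stack = list(ROOTS.items())
    while stack:
        name, tri = stack.pop()
        m = norm(tuple(sum(xs) for xs in zip(*tri)))   # trixel center
        rho = max(angle(m, v) for v in tri)            # bounding-cap radius
        d = angle(m, center)
        if d > radius + rho:
            continue                                   # certainly outside
        if d + rho <= radius:
            full.add(name)                             # certainly inside
        elif len(name) - 2 == depth:
            partial.add(name)                          # undecided at finest level
        else:
            stack.extend((name + str(i), t) for i, t in enumerate(children(tri)))
    return full, partial

def build_index(objects, depth):
    """Precompute the trixel label of each (name, unit_vector) object."""
    return [(trixel_id(p, depth), name, p) for name, p in objects]

def search(index, center, radius, depth):
    full, partial = cover(center, radius, depth)
    hits = []
    for tid, name, p in index:
        if any(tid[:k] in full for k in range(2, len(tid) + 1)):
            hits.append(name)          # accepted by the coarse test alone
        elif tid in partial and angle(p, center) <= radius:
            hits.append(name)          # candidate confirmed by the precise test
    return hits

if __name__ == "__main__":
    # Hypothetical catalog of 500 points spread over the sphere.
    catalog = [(i, unit(i * 137.5 % 360, -90 + 180 * (i + 0.5) / 500)) for i in range(500)]
    index = build_index(catalog, depth=5)
    # A 12-degree cap centered at latitude 85 degrees, so it contains the pole.
    print(search(index, unit(10, 85), math.radians(12), depth=5))
```

The example region deliberately contains a pole, where a lat/lon box would be singular; the mesh-based test treats it like any other part of the sphere. Only objects in the undecided trixels need the precise distance calculation. In a relational setting one would presumably store the precomputed trixel label (or an integer encoding of it) as an indexed column, so that the cover translates into a handful of lookups or range scans; that mapping is not shown here.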