Open-source fast file storage engine to use in a project?

Posted by: siberia37

Open-source fast file storage engine to use in a project? - 17/10/2008 13:33

I am working on a project for Windows Mobile that involves storing thousands of bitmap tiles on a device's flash card. For obvious reasons I don't want to store every bitmap tile as an individual file. Instead, I want to store all the tiles in one (or a few) large files. The tiles have to be stored in such a way that they are easily accessible by their x, y and z properties. Does anyone know of any open-source storage engines that would handle this for me?

I have created my own "database" format for this, but I get the feeling there are probably much faster ways to handle it. The other complication is that I want to be able to add tiles to this database on-the-fly from the device. The database format I dreamt up a long time ago doesn't handle this well.

Can anyone direct me to an open-source storage engine that does this or give me some pointers on "best practices"? I really don't want to mess with installing an RDBMS on a portable device.
Posted by: tfabris

Re: Open-source fast file storage engine to use in a project? - 17/10/2008 13:43

There was an article in Game Developer magazine about just that very thing a while back. I'd say it was within the last three years, but I don't remember which issue.

3D video games need to do this as a matter of course in the process of texture management, especially with regard to how the textures fit onto the limited space of a graphics card's VRAM.

I don't have a link to the article handy or anything, I'm afraid, but perhaps it's on gamasutra.com or something. Look for articles about texture management.
Posted by: DWallach

Re: Open-source fast file storage engine to use in a project? - 17/10/2008 14:32

It's been a long time since I worried about this, but I'll bet you could roll your own without too much trouble. You can use a large, flat file (which you mmap() into memory) for storing your textures. You just lay them out sequentially in whatever format your graphics library calls for. The only trick is that you need a table somewhere that tells you the offset, size, and so forth of each texture tile. Given that, you can figure out the offset and convert that to a pointer of the appropriate type. (Let's hear it for C and C++'s ability to casually convert types back and forth.)
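
A minimal sketch of that flat-file idea, written in Python rather than C for brevity. The file layout (a tile count, then a table of x, y, z, offset, size records, then the tile bytes) and the names `write_tile_file` and `TileFile` are assumptions made up for illustration, not anything specified in this thread:

```python
import mmap
import struct

# Hypothetical file layout (not from the thread):
#   [tile count][index records: x, y, z, offset, size][tile bytes...]
HEADER = struct.Struct("<I")
RECORD = struct.Struct("<iiiII")


def write_tile_file(path, tiles):
    """Write tiles, a dict mapping (x, y, z) to raw bitmap bytes, to one flat file."""
    offset = HEADER.size + RECORD.size * len(tiles)  # first byte after the table
    records, blobs = [], []
    for (x, y, z), blob in sorted(tiles.items()):
        records.append(RECORD.pack(x, y, z, offset, len(blob)))
        blobs.append(blob)
        offset += len(blob)
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(tiles)))
        f.write(b"".join(records))
        f.write(b"".join(blobs))


class TileFile:
    """Read-only view of a tile file, mapped into memory."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        (count,) = HEADER.unpack_from(self._map, 0)
        self._index = {}
        for i in range(count):
            x, y, z, offset, size = RECORD.unpack_from(
                self._map, HEADER.size + i * RECORD.size)
            self._index[(x, y, z)] = (offset, size)

    def get(self, x, y, z):
        """Return the tile's bytes (a copy sliced out of the mapping)."""
        offset, size = self._index[(x, y, z)]
        return self._map[offset:offset + size]

    def close(self):
        self._map.close()
        self._file.close()
```

Because the table sits at the front of the file, this layout is cheap to read but awkward to grow: adding a tile means rewriting the table, which is the same limitation the original post complains about.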
Posted by: LittleBlueThing

Re: Open-source fast file storage engine to use in a project? - 17/10/2008 14:56

Why not one tile per file? After all, a filing system is a database.

(Or is it the space overhead of the FAT filing system?)

On a Linux fs I'd use a hash/hierarchy approach.
Take the unique info (x,y,z), hash it to a unique fixed-length value (think MD5 hash), then create a directory structure that has no more than a few thousand files per directory. FAT32 may break with fewer files per directory, so consider that.

eg (12,992,10034) -> AE4 FF3 25F A1D
create \db\AE4\FF3\25F\A1D.bmp
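
A rough sketch of how that path could be derived, assuming MD5 over the text "x,y,z" and three hex characters per path component as in the example above. The function name `tile_path` and its parameters are hypothetical:

```python
import hashlib


def tile_path(x, y, z, root="\\db", levels=3, width=3):
    """Map a tile's (x, y, z) to a nested path such as \\db\\AE4\\FF3\\25F\\A1D.bmp."""
    digest = hashlib.md5(f"{x},{y},{z}".encode("ascii")).hexdigest().upper()
    # One directory per group of `width` hex characters, `levels` deep.
    dirs = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # The next group of characters becomes the file name.
    name = digest[levels * width:(levels + 1) * width] + ".bmp"
    return "\\".join([root, *dirs, name])
```

With three hex characters per level, no directory can hold more than 4096 entries. Note that this keeps only 12 hex characters of the digest, so two different tiles could in principle map to the same path; using more of the digest in the file name reduces that risk.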

Alternatively use a dbm (gdbm)
http://en.wikipedia.org/wiki/Dbm
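
To give a feel for the dbm style of storage, here is a small sketch using Python's standard `dbm` module, which wraps whichever dbm-family backend is installed (gdbm is only one possibility). The binary key encoding and the helper names `put_tile` and `get_tile` are assumptions for illustration:

```python
import dbm
import struct

# Fixed-width binary key for (x, y, z); this encoding is an assumption.
KEY = struct.Struct("<iii")


def put_tile(db, x, y, z, bitmap):
    """Store one tile's raw bytes under its coordinates."""
    db[KEY.pack(x, y, z)] = bitmap


def get_tile(db, x, y, z):
    """Return the tile's bytes, or None if the tile is not stored."""
    key = KEY.pack(x, y, z)
    return db[key] if key in db else None
```

Usage would look something like `with dbm.open("tiles", "c") as db: put_tile(db, 12, 992, 10034, data)`. The store is a plain key/value file, so tiles can be added later without rebuilding anything.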

Posted by: siberia37

Re: Open-source fast file storage engine to use in a project? - 17/10/2008 15:24

Originally Posted By: LittleBlueThing
Why not one tile per file? After all, a filing system is a database.

(Or is it the space overhead of the FAT filing system?)


That doesn't work since Storage Cards are formatted as FAT16 most of the time. That means you have a limitation on the number of files in a single directory. Not to mention you would be wasting a certain amount of space on every file because of the cluster size/file size differences.
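
To put a rough number on the wasted space: a file always occupies whole clusters, so the slack is the gap between its size and the next cluster boundary. The figures below (20,000-byte tiles, 32 KiB clusters, 10,000 tiles) are made-up assumptions for illustration, not measurements from this project:

```python
def slack_bytes(file_size, cluster_size):
    """Bytes allocated but unused when a file occupies whole clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


# Hypothetical figures: 20,000-byte tiles on a card with 32 KiB clusters.
per_tile = slack_bytes(20_000, 32 * 1024)  # 12,768 bytes wasted per tile
total = per_tile * 10_000                  # 127,680,000 bytes over 10,000 tiles
```

Under those assumptions, more than a third of the space each tile occupies on the card would be slack.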

I am talking about thousands of tiles here: enough to cover several National Parks and National Forests at a resolution of 2 meters per pixel, if that makes things clearer. Thanks for the DBM idea; I had never heard of those.
Posted by: siberia37

Re: Open-source fast file storage engine to use in a project? - 17/10/2008 15:54

Originally Posted By: DWallach
You just lay them out sequentially in whatever format your graphics library calls for. The only trick is that you need a table somewhere that tells you the offset, size, and so forth of each texture tile. Given that, you can figure out the offset and convert that to a pointer of the appropriate type. (Let's hear it for C and C++'s ability to casually convert types back and forth.)


Ok, it's probably best if I describe what I'm doing now in detail. We have a bunch of tiles that each represent about 400 square meters of the earth. The tiles are separated by UTM zones and then by divisions of these UTM zones that I call subzones. Each subzone gets its own file in the database directory. Within each subzone file there is a list of the tiles (by x,y,z) in that subzone, along with their file offsets into a larger file. The larger file is just a huge collection of bitmap tiles. So when the program renders a map, it finds the subzone files it needs, loads them into a cache, and then uses the subzone file information to seek to the right place in the large image file. Then it reads the bitmap from the image file and displays it on the screen.
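
For readers following along, the layout described above could be sketched roughly like this. The actual record format is not given in the thread, so the `ENTRY` layout and the function names are hypothetical. The sketch also shows one way to add tiles on-the-fly: append the bitmap to the image file and append a matching entry to the subzone file.

```python
import os
import struct

# Hypothetical subzone record: x, y, z, offset into the image file, size.
ENTRY = struct.Struct("<iiiQI")


def load_subzone_index(index_path):
    """Read a subzone file into a dict mapping (x, y, z) to (offset, size)."""
    index = {}
    if os.path.exists(index_path):
        with open(index_path, "rb") as f:
            for x, y, z, offset, size in ENTRY.iter_unpack(f.read()):
                index[(x, y, z)] = (offset, size)  # later entries win
    return index


def read_tile(image_path, index, x, y, z):
    """Seek to the tile's offset in the large image file and read it."""
    offset, size = index[(x, y, z)]
    with open(image_path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def append_tile(image_path, index_path, index, x, y, z, bitmap):
    """Add a tile without rewriting either file."""
    with open(image_path, "ab") as img:
        img.seek(0, os.SEEK_END)
        offset = img.tell()
        img.write(bitmap)
    with open(index_path, "ab") as idx:
        idx.write(ENTRY.pack(x, y, z, offset, len(bitmap)))
    index[(x, y, z)] = (offset, len(bitmap))
```

Because both files only ever grow at the end, replacing a tile just appends a new copy and a new entry; the old bytes stay in the image file as dead space until the files are rebuilt.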

So I think I'm doing this in a pretty efficient way already. What I was hoping was that there was some storage engine out there that would be just as fast and would handle all these file I/O internals for me. That may be wishful thinking though.
Posted by: mlord

Re: Open-source fast file storage engine to use in a project? - 17/10/2008 17:56

If you were using Linux, you could still use FAT16 on the card, and store your files within the directory structure of a Linux filesystem within a file on the FAT16 card.

But you're not.

Cheers
Posted by: DWallach

Re: Open-source fast file storage engine to use in a project? - 18/10/2008 00:22

Okay, sounds like your subzone bitmap data is laid out in exactly the way that I would have recommended. The only way you might be able to squeeze more performance out of this system is if you can be smart enough to prefetch things in advance of when you actually need them. That would require all manner of UI integration. "Ahh, he's scrolling north; prefetch the blocks!" Alternately, you could look at compression of various sorts.
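
As a rough illustration of the prefetch idea, here is a sketch of a small least-recently-used tile cache that also loads the tiles one step ahead in the scroll direction. The class name `TileCache`, the `loader` callback and the synchronous loading are all assumptions; a real UI would more likely prefetch on a background thread.

```python
from collections import OrderedDict


class TileCache:
    """Small LRU cache that can load tiles ahead of the scroll direction."""

    def __init__(self, loader, capacity=64):
        self._loader = loader        # function taking (x, y, z), returning a tile
        self._capacity = capacity
        self._tiles = OrderedDict()

    def get(self, key):
        if key in self._tiles:
            self._tiles.move_to_end(key)     # mark as most recently used
            return self._tiles[key]
        tile = self._loader(key)
        self._tiles[key] = tile
        if len(self._tiles) > self._capacity:
            self._tiles.popitem(last=False)  # evict the least recently used
        return tile

    def prefetch(self, visible, dx, dy):
        """Load the tiles one step beyond the visible ones in direction (dx, dy)."""
        visible = list(visible)
        on_screen = set(visible)
        for x, y, z in visible:
            ahead = (x + dx, y + dy, z)
            if ahead not in on_screen:
                self.get(ahead)
```

When the user scrolls north, for example, the UI would call something like `cache.prefetch(visible_tiles, 0, 1)`, with the sign of `dy` depending on how the tile grid is numbered.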

Still, it doesn't look like image data performance is really your problem. Instead, your problem seems to be all about managing your metadata efficiently, so you can quickly go from a (zone/subzone,x,y,z) tuple to the actual image tile file and offset. That sounds like a job for a database of some sort. You (apparently) don't need a full-blown relational database, but you may find something like Berkeley DB to be a useful in-between, particularly when other tools like Perl play very nicely with it, which would help you to build the database in the first place.
Posted by: mlord

Re: Open-source fast file storage engine to use in a project? - 18/10/2008 09:07

Still sounds like something that would come for free with a real O/S filesystem.