About half a year ago, some really intriguing ideas came to me out of the blue dealing with ways to speed up raytracing. My plan was to create a couple games, showing off these techniques, and after some curiosity was piqued, write an article up talking about how it worked to share it with others in case they found it useful.
For those of you who don’t know what raytracing is, check out this wikipedia article:
Due to some distractions and technical setbacks unrelated to the raytracing itself, I’ve only gotten one small game working with these techniques. One of the distractions is that I implemented the game using google’s native client (NaCl) and for some reason so far unknown to me, it doesn’t work on some people’s machines which is very annoying! I plan to get a mac mini in the next couple months to try and repro / solve the problem.
Check out the game if you want to. Open this link in google chrome to give it a play:
The sections of this article are:
- Geometry Caching
- Pixel Caching
- Future Work
- Compatible Game Ideas
Admittedly, these techniques have some pretty big limitations. I’ve explained my techniques to quite a few fellow game devs and when I mention the limitations, the first reaction people give me is the same one you are probably going to have, which is “oh… well THAT’S dumb!”. Usually after explaining it a bit more, people perk up again, realizing that you can still work within the limitations to make some neat things. So please hold off judgement until checking out the rest of the article! (:
The big fat, unashamed limitations are:
- The camera can’t move (*)
- Objects in the scene shouldn’t move too much, at least not all at the same time
(* there is a possible exception to the camera not moving limitation in the “Future Work” section)
So…. what the heck kind of games can you make with that? We’ll get to that later on, but here’s some things these techniques are good at:
- Changing light color and intensity is relatively inexpensive
- Changing object color and animating textures is inexpensive
- These techniques don’t break the parallel-izable nature of raytracing. Use all those CPU and GPU cores to your heart’s content!
Seems a bit dodgy I’m sure, but read on.
The first technique is geometry caching.
The idea behind geometry caching is: If no objects have moved since the last frame, why should we test which objects each ray hits? It’s a costly part of the ray tracing, and we already KNOW that we are going to get the same results as last frame, so why even bother? Let’s just use the info we calculated last frame instead.
Also, if some objects HAVE moved, but we know that the moving objects don’t affect all rays, we can just recalculate the rays that have been affected, without needing to recalculate all rays.
Just because we know the collision points for rays doesn’t mean that we can just skip rendering all together though. Several things that can make us still need to re-render a ray include: Animating textures, objects changing colors, lights dimming, lights changing color. When these things happen, we can re-render a ray much less expensively than normal (just recalculate lighting and shading and such), so they are comparatively inexpensive operations compared to objects actually moving around.
How I handle geometry caching is give each ray (primary and otherwise) a unique ID, and I have a dynamic array that holds the collision info for each ID.
In the part of the code that actually casts a single ray, i pass the ID and a flag saying whether it’s allowed to use the geometry cache. If it isn’t allowed to use the geometry cache, or there is no entry in the cache for the ID, the code calculates intersection info and puts it into the geometry cache.
It then uses the geometry cache information (whether it was re-calculated, or was usable as is) and applies phong shading, does texture lookups, recurses for ray refraction and reflection, and does the other things to figure out the color of the pixel.
In my first implementation of the geometry cache, it was very fast to render with once it was filled in, but it was really expensive to invalidate individual cache items. If an object moved and a couple hundred geometry cache items needed to be marked as dirty, it was a really computationally expensive operation!
A better option, the one i use now, involves both a 2d grid (for the screen pixels) and a 3d grid (to hold the geometry of the world).
Breaking the screen into a grid, when each ray is cast into the world, I’m able to tell the ray what screen cell it belongs to. This way, as a ray traverses the 3d grid holding the world geometry, it’s able to add itself each world grid maintains to keep track of which rays pass through that 3d cell (keeping just the unique values of course!). Child rays know what screen cell they are in by getting that value from their parent.
If an object moves in the world, you can make a union of which world cells it occupied before it moved, and which world cells it occupies after the move. From there, you can make a union of which screen cells sent rays into that world cell. The last step is to mark all those screen cells as “geometry dirty” so that next frame, the rays in those cells are disallowed from using the geometry cache data, and instead will re-calculate new intersection info.
This method makes it so potentially a lot of rays re-calcuate their intersection data that don’t really need to, but by tuning the size of the screen and world grids, you can find a good happy medium for your use cases.
If you have an idea to better maintain the geometry cache, feel free to post a comment about it!
The second technique is pixel caching which is a fancy way of saying “don’t redraw pixels that we don’t have to”. The less rays you have to cast, the faster your scene will render.
The first challenge to tackle in this problem is how do you know which pixels will be affected when an object changes color? That is solved by the same mechanism that tells us when geometry cache data is invalidated.
When an object changes color (or has some other non-geometry change), you just get the list of world cells the object resides in, and then get the union of screen cells that sent rays through those world cells.
When you have that list, instead of marking the screen cell “geometry dirty”, you mark it as “pixel dirty”.
When rendering the screen cells, any screen cell that isn’t marked as dirty in either way can be completely skipped. Rendering it is a no-op because it would be the same pixels as last time! (:
This is the reason why you want to minimize geometry changes (objects moving, rotating, resizing, etc) and if you have to, rely instead on animating textures, object colors, and lighting colors / intensities.
Here’s a smattering of ideas for future work that I think ought to bear fruit:
- Replace the screen and/or world grid with better performing data structures
- Pre-compute (a pack time process) the primary rays and subsequent rays of static geometry and don’t store static geometry in the world grid, but store it in something else instead like perhaps a BSP tree. This way, at run time, if a ray misses all objects in the dynamic geometry world grid, it can just use the info from the pre-computed static geometry, no matter how complex the static geometry is. If something DOES hit a dynamic object however, you’ll have to test subsequent rays against both the dynamic object world grid, and the data structure holding the info about the static geometry but hopefully it’ll be a net win in general.
- Investigate to see how to integrate this with photon mapping techniques and data structures. Photon mapping is essentially ray tracing from the opposite direction (from light to camera, instead of from camera to light). Going the opposite direction, there are some things it’s really good at – like caustics – which ray tracing alone just isn’t suited for: http://en.wikipedia.org/wiki/Photon_mapping
- In a real game, some things in the world will be obscured by the UI overlays. There might be an oportunity in some places to “early out” when rendering a single ray if it was obscured by UI. It would complicate caching those since an individual ray could remain dirty while the screen cell itself was marked as clean.
- Orthographic camera: If the camera is orthographic, that means you could pan the camera without invalidating the pixel and geometry cache. This would allow the techniques to be used for a side scrolling game, overhead view game, and things of that nature – so long as orthographic projection looks good enough for the needs of the game. I think if you got creative, it could end up looking pretty nice.
- Screen space effects: enhance the raytracing visuals with screen space particles and such. Could also keep a “Z-Buffer” by having a buffer that holds the time each ray took to hit the first object. This would allow more advanced effects.
- Interlaced rendering: to halve the rendering time, every frame could render every other horizontal line. Un-dirtying a screen cell would take 2 frames but this ought to be a pretty straight forward and decent win if losing a little bit of quality is ok.
- red/blue 3d glasses mode: This is actually a feature of my snake game but figured i’d call it out. It works by rendering the scene twice which is costly (each “camera” has it’s own geometry and pixel cache at least). If keeping the “Z-Buffer” as mentioned above, there might be a way to fake it more cheaply but not sure.
Compatible Game Ideas
Despite the limitations, I’ve been keeping a list of games that would be compatible with these ideas. Here’s the highlights of that list!
- Pinball: Only flippers, and the area around the ball would actually have geometry changes, limiting geometry cache invalidating. Could do periodic, cycling color / lighting animations on other parts of the board to spice the board up in the “non active” areas.
- Marble Madness Clone: Using an orthographic camera, to allow camera paning, a player could control a glass or mirrored ball through a maze with dangerous traps and time limits. Marble Madness had very few moving objects and was more about the static geometry so there’d probably be a lot of mileage here. You could also have animated textures for pools of acid so that they didn’t impact the geometry cache.
- Zelda 1 and other overhead view type games: Using ortho camera to allow panning, or in the case of Zelda 1, have each “room” be a static camera. You’d have to keep re-rendering down somehow by minimizing enemy count, while still making it challenging. Could be difficult.
- Metroidvania: side scroller with ortho camera to allow panning. Could walk behind glass pillars and waterfalls for cool refractive effects.
- Monkey Island type game: LOTS of static geometry in a game like that which would be nice.
- Arkanoid type game: static camera, make use of screen space effects for break bricking particles etc
- Mystery game: Static scenes where you can use a magnifying glass to LITERALLY view things better (magnification due to refraction, just like in real life) to find clues and solve the mystery. Move from screen to screen to find new clues and find people to talk to, progress the storyline etc.
- Puzzle Game: could possibly do a traditional “block based” puzzle game like puzzle fighters, tetris, etc.
- Physics based puzzle game: You set up pieces on a board (only one object moves at once! your active piece!) then press “play”. Hopefully it’d be something like a ball goes through your contraption which remains mostly motionless and you beat the level if you get the ball in the hole or something.
- Somehow work optics into gameplay… maybe a puzzle game based on lasers and lights or something
- Pool and board games: as always, gotta have a chess game with insane, state of the art graphics hehe
- mini golf: A fixed camera when you are taking your shot, with a minimum of moving objects (windmills, the player, etc). When you hit the ball, it rolls, and when it stops, the camera teleports to the new location.
- Security gaurd game: Have several raytraced viewports which are played up to be security camera feeds. Could have scenes unfold in one feed at a time to keep screen pixel redraw low.
- Turn based overhead view game: Ortho camera for panning, and since it’s turn based, you can probably keep object movement down to one at a time.
Lastly, here’s a video describing this stuff in action. When you view the video, the orange squares are screen tiles that are completely clean (no rendering required, they are a no-op). Purple squares are screen tiles that were able to use the geometry cache. Where you don’t see any squares at all, it had to re-render the screen tile from scratch and wasn’t able to make use of either caching feature.
Feedback and comments welcomed! I’d be really interested too in hearing if anyone actually uses this for anything or tries to implement in on their own.