Blog

NVidia Driver Update

In a previous post I discussed some of the hang-on-start problems we were seeing on NVidia cards.  Since these problems started relatively recently, even though the sim code hadn’t changed much, we suspected a driver issue.

Since then we’ve collected a few more data points:

  • Some users with older cards are seeing problems starting X-Plane – there are shader-compile errors in the log file when this happens.  Our shaders haven’t changed.
  • Some users with newer cards see rendering artifacts: typically the cloud shadows darken some terrain triangles but not others, which makes the terrain look very weird under broken cloud cover.
  • Some users still see a hang on startup which is “fixed” with --no_fbos.

Now here’s the interesting thing: the shader-compile errors and rendering artifacts appear to be happening with the 270.61 drivers; going back to the 260.99 drivers seems to help.

So if you have NVidia hardware, there are a few data points I am looking for.  Please post a comment if you meet any of the following:

  • If you have the 270.61 drivers and see visual corruption, crashes, or hangs, please try going back to 260.99 and post your results, including what card you have.
  • If you have the 270.61 drivers and a new card (GeForce 400 or 500) and you do not see corrupt cloud shadows, please post so – we need to know if the bug only affects some users.
  • If you have the 270.61 drivers and an old card (GeForce 7xxx or older) and you can run without crashing, please post so – we need to know if the crash bugs only affect some users.

As always, we don’t know if this is a driver bug or an X-Plane bug, and who “fixes” the bug may not be an indication of whose fault it is.  We will work around the bug even if it is in the drivers, and sometimes this kind of problem is due to X-Plane doing something naughty that some but not all drivers tolerate.  We won’t know if this is really a driver bug until we have a full diagnosis.

Edit: These bugs are Windows/Linux specific; Macintosh users will not see them, nor do they have driver version numbers that match the normal NVidia scheme.

Posted in Development | 21 Comments

That’s One Powerful Landing Light

A while back I described some of the new lighting features in version 10, including lighting calculations done in linear space.  The very short version of this is: X-Plane 10’s lighting will be more physically realistic, and this is important when you have a lot of lights shining all over the place.
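
For the curious, here is a rough sketch of what “lighting in linear space” means; this is my own illustration of the general technique, not X-Plane’s actual shader code, and the helper names (toLinear, toGamma, lightChannel) are hypothetical.  Texture colors are stored gamma-encoded for the monitor; if you add several lights directly to those encoded values, the result brightens in a non-physical way.  The fix is to decode to linear light, do the math there, and re-encode for display:

    // Sketch of the general technique only (hypothetical helpers, not X-Plane code).
    #include <cmath>

    float toLinear(float c) { return std::pow(c, 2.2f); }        // approximate sRGB decode
    float toGamma(float c)  { return std::pow(c, 1.0f / 2.2f); } // re-encode for display

    // One color channel of a texel lit by the sun plus a landing light:
    // the light contributions are summed on the *linear* value.
    float lightChannel(float texel, float sunLight, float landingLight)
    {
        float lit = toLinear(texel) * (sunLight + landingLight);
        return toGamma(lit);
    }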

The change doesn’t affect how you author content (you still texture your models and we shine the sun on them) but it has been a source of bugs, as we find all of the different parts of the sim that use lighting equations.  In the picture on the right, the landing light hasn’t been updated, and as a result, it lights up the city 2 miles away.  The picture on the left has the landing light turned off.

I kind of like the right hand version, but that’s why I’m not in charge of the overall “look” of the sim.

(Will the runway lights look that “splattery”?  Probably not, but I don’t know; Alex will be in charge of the final decision.  That picture is zoomed in, which makes the lights look a lot bigger, but also the tuning of the runway lights for size is only partially done right now.)

Posted in Development | 12 Comments

Why not GPGPU?

A commenter asked if we were planning to use a GPGPU API in X-Plane 10 or beyond.  I really can’t answer for the far future, but I can say that we aren’t planning to use GPGPU for X-Plane 10.  This post will explain a little bit about what GPGPU is and why we haven’t jumped on it yet.

Every new technology has an adoption cost.  So the question of GPGPU isn’t just “will it help?” but “is it better than the alternative uses of our time?”  For example, do we spend time coding GPGPU, or do we spend time optimizing the existing code to run faster on all hardware?  But this post is about GPGPU itself.

GPGPU stands for General-Purpose computing on Graphics Processing Units – the Wiki article is a bit technical, but the short of it is: graphics cards have become more and more programmable, and they are extremely powerful.  GPGPU technologies let you write programs that run on the GPU but do work other than graphics.

There are two major APIs for writing GPGPU programs: OpenCL and CUDA.  OpenCL is designed to be an open standard and is heavily backed by Apple and ATI; CUDA is NVidia-specific.  (At least, I don’t think you can get CUDA to run on other GPUs.)  I believe that NVidia does support OpenCL on their hardware.  (There is a third compute option, DirectCompute, which is part of DX11, but that is moot for X-Plane because we don’t use Windows-only technologies.)

If that seemed confusing as hell, well, it is.  The key to understanding the alphabet soup is that there are API standards (which essentially define a language for how a program talks to hardware) and then there are actual pieces of hardware that make applications that use that language fast.  For drawing, there are two APIs (OpenGL and Direct3D) and there are GPUs from 2+ companies (ATI, NVidia, and those other companies whose GPUs we make fun of) that implement the APIs with their drivers.

The situation is the same for GPGPU as for graphics: there are two APIs (CUDA and OpenCL) and there is a bunch of hardware (from ATI and NVidia) that can run some of those APIs.*

So the question then is: why don’t we use a GPGPU API like OpenCL to speed up X-Plane’s physics model?  If we used OpenCL, then the physics model could run on the GPU instead of on the CPU.

There are two reasons why we don’t use OpenCL for the physics engine:

  1. OpenCL and CUDA programs aren’t like “normal” programs.  We can’t just pick up and move the flight model to OpenCL.  In fact, most of what goes on in the flight model is not code that OpenCL would be particularly good at running.
  2. For a GPGPU program to be fast, it has to be running on the GPU.  That’s where the win would be: moving work from the poor CPU to the nice fast GPU.  But…we’re already using the GPU – for drawing!

And this gets to the heart of the problem.  The vast majority of the cost of the flight model comes from interaction with the scenery – a data structure that isn’t particularly GPU-friendly at this point.  Those interactions are also not very expensive in the bigger picture of X-Plane, particularly when the AI aircraft are threaded.

The biggest chunk of CPU time is being spent drawing the scenery.  So to make X-Plane faster, what we really need to do is move the graphics work from the CPU to the GPU – more time spent on the GPU and less time on the CPU for each frame of drawing we run through.

And the answer for why we don’t use OpenCL or CUDA for that should be obvious: we already have OpenGL!

So to summarize: CUDA and OpenCL let you run certain kinds of mathematically intense programs on the GPU instead of the CPU.  But X-Plane’s flight model isn’t that expensive for today’s computers.  X-Plane spends its time drawing, so we need to move more of the rendering engine to the GPU, and we can do that using OpenGL.
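
To make the first point above concrete, here is a toy example of what GPGPU code looks like; this is purely my illustration – X-Plane ships nothing like it, and the kernel name and variables are invented.  The kernel (written in OpenCL C, held here in a C++ string) is a tiny function that the driver compiles and then launches across thousands of GPU threads, one per array element.  That style is great for big uniform arrays of math, and a poor fit for the branchy, scenery-walking code that dominates the flight model:

    // Hypothetical example only -- not X-Plane code.  The host program would
    // hand this source to the OpenCL driver to compile, then enqueue it over
    // a large array of elements.
    static const char* kScaleKernel = R"(
    __kernel void scale(__global const float* in,
                        __global float*       out,
                        const float           factor)
    {
        size_t i = get_global_id(0);   /* which element this GPU thread owns */
        out[i] = in[i] * factor;
    }
    )";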

* Technically, your CPU can run OpenGL via software rendering.  The results look nice, but aren’t fast enough to run a program like X-Plane.  Similarly, OpenCL programs can be run on the CPU too.

Posted in Development | 3 Comments

Real Physics on an Airplane

A while ago I wrote two posts trying to explain why we would use real physics for the AI planes.  Looking back over the comments, I think my message missed the mark for a number of readers.  The basic idea was this:

  • It’s quicker to get the new ATC system done by using the existing physics model than by inventing a brand new, parallel “fake physics” model for AI planes.  So using real physics lets us focus on other features.
  • The physics model is not more expensive than a fake physics model would be; the few things that actually take CPU time in the real physics model are things a fake model would have to do anyway: check for ground collisions, etc.

In other words, using real physics doesn’t hurt the schedule and it doesn’t hurt fps.  I followed that up with a bunch of talk about how you incrementally build a complex piece of software, and off we went.

What I didn’t do is make the argument for why real physics might be better than fake physics.  So: I made a video of the 777 taxiing and taking off.

Some disclaimers: this isn’t a marketing video, it’s what was on my machine at this instant.  This is an older v9 plane with v9 scenery*.  Version 10 is in development and has plenty of weird stuff going on, and the AI still needs a lot of tuning. Anyway:

With the video compression, calm conditions, v9 airplane, etc. it’s a bit tough to see what’s going on here, but I’ve seen a few AI takeoffs as I run X-Plane in development and it seems to me that the real physics model provides a nuance and depth to how the planes move that would be impossible to duplicate with a “fake” AI (e.g. move the plane forward by this much per frame).  When the airport isn’t flat, the plane sways depending on its landing gear, weight, and wheelbase.  The plane turns based on its rotational inertia, easing into the turn (and skidding if Austin dials in too much tiller).  When the plane accelerates, the rate of acceleration takes into account drag, engine performance, and wind.

 

* Except for Logan – that’s George Grimshaw’s excellent KBOS version 2 – version 3 is payware and I hope they’ll some day bring it to X-Plane.  Unfortunately there is some Z-thrash in the conversion.

Posted in Development | 41 Comments

Hardware Buying Advice (or Lack Thereof)

If I could have a nickel for every time I get asked “should I buy X for X-Plane 10”, well, I’d at least have enough nickels to buy a new machine.  But what new machine would I buy?  What hardware will be important for X-Plane 10?

The short answer is: I don’t know, it’s too soon.  The reason it’s too soon is because we have a lot of the new technology for version 10 running, but there’s still a lot of optimization to be done.

As I have posted before, the weakest link in your rendering pipeline is what limits framerate.  But what appears to be the weakest link now in our in-development builds of X-Plane 10 might turn out not to be the weakest link once we optimize everything.  I don’t want to say “buy the fastest clocked CPU you can” if it turns out that, after optimization, CPU is not the bottleneck.

One thing is clear: X-Plane 10 will be different from X-Plane 9 in how it uses your hardware.  There has been a relatively straight line from X-Plane 6 to 7 to 8 of being bottlenecked on single-core CPU performance; GPU fill rate has stayed ahead of X-Plane pixel shaders (with the possible exception of massive multi-monitor displays on graphics cards that were never meant for this use).  X-Plane 10 introduces enough new technology (instancing, significantly more complex pixel shaders, deferred rendering) that I don’t think we can extrapolate.
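
For reference, “instancing” means asking the GPU to draw many copies of the same mesh from a single draw call, with a small per-instance attribute (a position offset, say) telling the copies apart – a natural fit for autogen.  Here is a minimal OpenGL sketch of the idea; it is not X-Plane’s actual code, and it assumes a 3.3+ context, a loader like GLEW, a VAO with the mesh already bound, and a shader that reads attribute 3 as the per-instance offset:

    #include <GL/glew.h>

    struct Vec3 { float x, y, z; };

    void drawHousesInstanced(GLsizei indexCount, const Vec3* offsets, GLsizei numHouses)
    {
        GLuint instanceVBO;
        glGenBuffers(1, &instanceVBO);
        glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);
        glBufferData(GL_ARRAY_BUFFER, numHouses * sizeof(Vec3), offsets, GL_STATIC_DRAW);

        glEnableVertexAttribArray(3);                  // per-instance offset
        glVertexAttribPointer(3, 3, GL_FLOAT, GL_FALSE, sizeof(Vec3), nullptr);
        glVertexAttribDivisor(3, 1);                   // advance once per instance, not per vertex

        // One draw call produces numHouses copies of the mesh -- the win over
        // issuing numHouses separate draw calls from the CPU.
        glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr, numHouses);

        glDeleteBuffers(1, &instanceVBO);
    }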

To truly tune a system for X-Plane 10, I fear you may need to wait until users are running X-Plane 10 and reporting back results.  We don’t have the data yet.

I can make two baseline recommendations though, if you are putting together a new system and can’t wait:

  1. Make sure your video card is “DirectX 11 class”.  (This confuses everyone, because of course X-Plane uses OpenGL.  I am referring to its hardware capabilities.)  This means a Radeon HD 5000 or higher, or an NVidia GeForce 400 or higher.  DirectX 11 cards all do complete hardware instancing (something X-Plane 10 will use) and they have other features (like tessellation) that we hope to use in the future.  We’re far enough into DX11 that these cards can be obtained at reasonable prices.
  2. Get at least a quad-core CPU.  It won’t be a requirement, but we have been pushing to get more work onto more cores in X-Plane 10; I think we’ll start to see a utilization point where it’s worth it.  The extra cores will help you run with more autogen during flight, cut down load time, and allow you to run smoother AI aircraft with the new ATC system.

Finally, please don’t ask me what hardware you need to buy to set everything to maximum; I’ve tried to cover that here and here.

Posted in Development | 28 Comments

Managed Expectations

I’m a bit behind on posting; I’ll try to post an update on scenery tools in the next few days.  In the meantime, another “you see the strangest things when debugging pixel shaders” post.

(My secret plan: to drive down expectations by posting shader bugs.  When you see X-Plane 10 without any wire-frames, giant cyan splotches, or three copies of the airplane, it’ll seem like a whole new sim even without the new features turned on!)

Posted in Development | 5 Comments

What Is the Cost of 1 Million Vertices?

Hint: it might not be what you think!  Vertex count isn’t usually the limiting factor on frame-rate (usually the problem is fill-rate, that is, how many pixels on screen get fiddled with, or CPU time spent talking to the GPU about changing attributes and shaders).  But because vertex count isn’t usually the problem, it’s an area where an author might be tempted to “go a little nuts”.  It’s fairly easy to add more vertices in a high-powered 3-d modeling program, and they seem free at first.  But eventually, they do have a cost.

Vertex costs are divided into two broad categories based on where your mesh lives.  Your mesh might live in VRAM (in which case the GPU draws the mesh by reading it from VRAM), or it might live in main memory (in which case the GPU draws the mesh by fetching it from main memory over the PCIe bus).  Fortunately it’s easy to know which case you have in X-Plane (a rough OpenGL sketch of the difference follows the list):

  • For OBJs, meshes live in VRAM!  (Who knew?)
  • For everything else, they live in main memory.  This includes the terrain, forests, roads, facades, you name it.
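
In OpenGL terms, the split looks roughly like this.  It’s a sketch of typical driver behavior with hypothetical helper names, not a statement about X-Plane’s internals: a buffer created with the GL_STATIC_DRAW hint (write once, draw many times) is normally parked in VRAM by the driver, while data kept in system memory and streamed has to cross the PCIe bus when it is drawn.

    #include <GL/glew.h>

    // OBJ-style mesh: uploaded once, then drawn from VRAM.
    GLuint makeResidentMesh(const void* verts, GLsizeiptr bytes)
    {
        GLuint vbo;
        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, bytes, verts, GL_STATIC_DRAW);  // driver keeps it on the card
        return vbo;
    }

    // Terrain/forest/road-style mesh: the data stays in system memory and is
    // pushed across the bus (here, re-specified) whenever it is streamed.
    void streamMesh(GLuint vbo, const void* verts, GLsizeiptr bytes)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, bytes, verts, GL_STREAM_DRAW);  // crosses the PCIe bus
    }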

Meshes In VRAM

If a mesh is in VRAM, the cost of drawing it is relatively unimportant.  My 4870 can draw just under 400 million triangles per second – and it’s probably limited by communication to the GPU.  And ATI has created two new generations of cards since the 4870.

Furthermore, mesh draw costs are only paid when the mesh is actually drawn, so with some careful LOD you can get away with the occasional “huge mesh” – the GPU has the capacity as long as not everyone tries to push a million vertices at once.  (Obviously a million vertices in an autogen house that is repeated 500 times is going to cause problems.)

But there is a cost here, and it is – the VRAM itself!  A mesh costs 32 bytes per vertex (plus 4 bytes per index), so our mesh is going to eat at least 32 MB of VRAM.  That’s not inconsequential; for a user with a 256 MB card we just used up 1/8th of all VRAM on a single mesh.
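
Where does “32 bytes per vertex” come from?  One plausible layout (an illustration only – X-Plane’s real vertex format may differ) is a position, a normal, and a pair of texture coordinates, all single-precision floats:

    struct MeshVertex {
        float x, y, z;      // position           (12 bytes)
        float nx, ny, nz;   // normal             (12 bytes)
        float s, t;         // texture coordinates ( 8 bytes)
    };                      // total: 32 bytes
    static_assert(sizeof(MeshVertex) == 32, "expected a 32-byte vertex");

    // 1,000,000 vertices * 32 bytes = 32 MB of VRAM, before indices.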

One note about LOD here: the vertex cost of drawing is a function of what is actually drawn, so if we have a million-vertex high LOD mesh and a thousand-vertex low LOD mesh, we only burn a (small) chunk of our vertex budget when the high LOD is drawn.

But the entire mesh must be in VRAM to draw either LOD!  Only meshes that are drawn at all need to be in VRAM, but a mesh (like a texture) goes into VRAM as a whole, all LODs included.  So we only save our 32 MB of VRAM by not drawing the object at all (e.g. when it is farther away than its farthest LOD).

Meshes in Main Memory

For anything that isn’t an object, the mesh lives in main system memory, and is transferred over the PCIe bus when it needs to be drawn.  (This is sometimes called “AGP memory” because this first became possible when the AGP slot was introduced.)  Here we have a new limitation: we can run out of capacity to transfer data over the PCIe bus.

Let’s go back to our mesh: our million vertex mesh probably takes around 32 MB.  It will have to be transferred over the bus each time we draw.  At 60 fps that’s over 1.8 GB of data per second.  A 16x PCIe 2.0 slot only has 8 GB/second of total bandwidth from the computer to the graphics card.  So we just ate 25% of the bus with our one mesh!  (In fact, the real situation is quite a bit worse; on my Mac Pro, even with simple performance test apps, I can’t push much more than 2.5 GB/second to the card, so we’ve really used 75% of our budget.)
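
A quick sanity check of that arithmetic, using the hypothetical mesh from above rather than measured X-Plane data:

    #include <cstdio>

    int main()
    {
        const double vertices       = 1000000.0;  // one million vertices
        const double bytesPerVertex = 32.0;       // 32-byte vertex format, as above
        const double framesPerSec   = 60.0;

        const double gbPerSec = vertices * bytesPerVertex * framesPerSec / 1e9;
        // ~1.9 GB/s: roughly a quarter of a PCIe 2.0 x16 slot's 8 GB/s, and the
        // bulk of a ~2.5 GB/s real-world budget.
        std::printf("Mesh traffic at 60 fps: %.2f GB/s\n", gbPerSec);
        return 0;
    }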

On the bright side, storage in main memory is relatively plentiful, so if we don’t draw our mesh, there’s not a huge penalty.  Careful LOD can keep the total number of vertices emitted low.

Take-Away Points

  • Non-OBJ vertex count is significantly more expensive than OBJ vertex count.
  • OBJ meshes take up VRAM; the high LOD takes up VRAM even when the low LOD is in use.
  • To reduce the cost of OBJ meshes, limit the total LOD of the object.

Posted in Scenery | 7 Comments

CRJ Video

I don’t want to say anything and risk Murphy’s Law, but it looks like the CRJ will see the light of day after all.

I always enjoy seeing third party add-ons that really show what the rendering engine is capable of.  Also, it’s good to know that Javier brushes his teeth. 🙂

Posted in Aircraft & Modeling | 22 Comments

What Do You Mean I’ve Had Too Much to Drink?

More dubious screen-shots of in-development pixel shaders gone bad.  This one was taken while working on full-screen anti-aliasing for X-Plane’s deferred renderer.

Deferred renderers cannot use the normal hardware-accelerated full-screen anti-aliasing (FSAA) that you’re used to in X-Plane 9.  (This problem isn’t specific to X-Plane – most new first-person shooter games now use deferred rendering, so presentations from game conferences are full of work-around tricks.)

It looks like we will have a few anti-aliasing options for when X-Plane is running with deferred rendering (which is what makes global lighting possible): a 4x super-sampled image (looks nice, hurts fps), a cheaper edge-detection algorithm, and possibly also FXAA.
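
For those wondering what “4x super-sampled” means in practice: render the deferred scene into a color buffer twice the window size in each direction, then filter it back down so four rendered pixels average into each screen pixel.  Here is a bare-bones OpenGL sketch of that idea; the function name and setup are assumptions of mine, not X-Plane’s renderer, and the G-buffer and depth attachments are omitted:

    #include <GL/glew.h>

    GLuint makeSupersampleTarget(int winW, int winH, GLuint* colorTexOut)
    {
        GLuint fbo, color;
        glGenFramebuffers(1, &fbo);
        glGenTextures(1, &color);

        // Color buffer at 2x the window size in each dimension (4x the pixels).
        glBindTexture(GL_TEXTURE_2D, color);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, winW * 2, winH * 2, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                               GL_TEXTURE_2D, color, 0);

        *colorTexOut = color;
        return fbo;
    }

    // Each frame: render into this FBO at 2x resolution, then scale the result
    // down to the window (a full-screen quad, or glBlitFramebuffer with
    // GL_LINEAR), averaging the extra samples into each screen pixel.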

Posted in Development | 3 Comments