In 11.50b6 we added a command line argument to run Aftermath, a debugging utility, hoping it will give us more insight into device loss errors.
A “device loss” error is specifically the crash that accompanies the on-screen (or log.txt) error message “Encountered Vulkan device loss error!” Aftermath will not help us investigate VRAM issues; those are a different problem entirely.
If you are on Windows, have an NVidia GPU and you see a device loss error followed by a crash, you can help us track these bugs down by running X-Plane with Aftermath enabled. We know from 11.50b5 that many devices are not compatible with Aftermath, so if you crash and burn immediately, you can go back to using beta 6 without the extra command line option.
Aftermath Instructions
We will be using the command line via Command Prompt. (Here are instructions on getting started with this if needed.)
Launch X-Plane from the command line with the following flag:
--aftermath
You can then try to reproduce the steps that caused the initial device loss, or just fly as usual. If device loss happens again, the auto crash report form should come up again. Please fill out your email and submit the auto report to us for investigation.
Well, that was something. I had a very nice post written up last week on the state of beta. We had spent a week very carefully trying to improve stability and then…beta 5 exploded on the launch pad.
So…let’s try this again. But before we get into beta 6, a few graphs:
That’s a graph of auto-reported crashes over time – the big spike up is April 2nd when 11.50 came out. The gap in the timeline at the end is when our crash reporter temporarily was shut off for exceeding quota! From this I can derive two take-away points:
A lot of people are really excited to try the 11.50 beta even though it’s early and unstable and
The 11.50 beta crashes a lot.
The silver lining is that the crashes we have been collecting are very very informative so it’s been a really great data stream.
Here’s one more graph:
That’s bug reports and they’re up something like 1000% – we have received close to 1800 reports since then. Of these reported bugs, over 500 are in the category of “it crashed” or some other similarly catastrophic, bad thing happened.
So with those graphs in mind, let’s talk about where we are at with the beta.
This post is just targeted at plugin developers who are modernizing their object drawing – if you don’t write plugin code, the Cincinnati Zoo has been showing their animals on YouTube – it’ll be a lot more entertaining than this post. (An XPLMInstance cannot tunnel down two feet in fifteen seconds – one point for the zoo animals.)
XPLMInstance makes a persistent object that lives inside X-Plane that is visible in the 3-d world. It changes how you draw from “run some drawing code every frame” to “tell X-Plane that there is a thing and update its data every now and then.”
Instancing is actually a lot easier than draw callbacks! But there are two tricky gotchas:
1. You must create the custom DataRefs for your OBJ’s animation before you load the object itself with the SDK. (If the DataRefs do not exist at load time, the animations are disabled as “unresolved to any DataRef”.)
2. When you create the instance, make sure your custom DataRefs are on the list of DataRefs for that instance.
Here’s the really baffling thing: if you create the custom DataRef and then add it to the instance’s list, your DataRef callbacks will not be called.
Wha?
Here’s the trick: the DataRef you register is a global identifier, allowing the object to refer to what it wants to listen to. That’s why you have to create the DataRef – so that the identifier exists.
But when you create an instance, each instance has memory that holds a different copy of those DataRefs.
For example, let’s say you have a truck with four DataRefs, and you make five instances. X-Plane allocates 20 slots (four DataRefs times five instances) to store five copies of each DataRef’s values.
The instances never look at the DataRef itself. They only look at their local copies. That’s why when you push different data to the instance with XPLMInstanceSetPosition, each instance animates with its own values – each instance looks at its own local data.
This is also why you won’t see your DataRef callbacks called (unless you use DataRefEditor or some other tool). The object rendering engine isn’t looking at the DataRefs themselves, it’s looking at the local copies.
In other words, XPLMInstance turns DataRefs from the pull model you are used to (X-Plane pulls on your read function to get the value) to a push model (you push data into the instance’s memory with XPLMInstanceSetPosition).
This implies two things about your add-on:
It doesn’t really matter what your DataRef read functions do – they can just return zero, and
You can’t use tools like DataRefEditor or DataRefTool to debug your animations. (That didn’t work well in legacy code either, but it really won’t work now.)
If you try the obvious optimization of not creating your custom DataRefs (“hey, no one calls them”) before you create your instance, you will find that animation just stops working. This is because we need the DataRef to be that global identifier to match your instance data with the animations of the object itself.
One last note: if your old code used sim/graphics/animation/draw_object_x/y/z to determine which object was being animated (from inside a plugin “get” function) you do not need to do this anymore. Because each instance has its own local copies and your DataRef function isn’t called, this technique is obsolete.
In summary:
You must register custom DataRefs.
Their callbacks can just return 0 – the object renderer never calls them.
Always list your custom DataRefs for animation when you create an instance.
Do not use draw_object_x/y/z; use XPLMInstanceSetPosition to animate each instance individually.
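To make the push model concrete, here is a toy model in plain C. It does not call the X-Plane SDK: ToyInstance, toy_instance_set_values and every other name in it are made up for illustration, and the real storage inside X-Plane may look nothing like this. It only mirrors the truck example above: each instance owns its own slots, pushed data lands in those slots, and the registered read callback is never consulted when animating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_DATAREFS  4   /* DataRefs listed when the instance is created */
#define MAX_INSTANCES 5   /* five trucks */

/* Hypothetical stand-in for one instance: a private copy of every listed value. */
typedef struct {
    float local[NUM_DATAREFS];
} ToyInstance;

static ToyInstance instances[MAX_INSTANCES];   /* 4 x 5 = 20 slots in total */
static int read_callback_calls = 0;

/* The registered read callback. It only has to exist so that the name
   resolves; in this model nothing on the drawing side ever calls it. */
float truck_dataref_read(void) {
    read_callback_calls++;
    return 0.0f;
}

ToyInstance *toy_get_instance(size_t index) {
    assert(index < MAX_INSTANCES);
    return &instances[index];
}

/* Push: copy the caller's values into this instance's own slots. */
void toy_instance_set_values(ToyInstance *inst, const float *data) {
    assert(inst != NULL && data != NULL);
    memcpy(inst->local, data, sizeof inst->local);
}

/* What the animation code reads: the local slot, not the callback. */
float toy_animation_value(const ToyInstance *inst, size_t dataref_index) {
    assert(inst != NULL && dataref_index < NUM_DATAREFS);
    return inst->local[dataref_index];
}

size_t toy_total_slots(void) {
    return (size_t)NUM_DATAREFS * MAX_INSTANCES;
}
```

Pushing different arrays into two instances and then reading the same DataRef index from each gives two different values, and read_callback_calls stays at zero the whole time.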
I was going to write a post about X-Plane 11.50 beta 5 – what’s new in it, the new ways we are debugging GPU crashes, the crash bugs we’ve fixed, etc. A lot of stuff that we thought was pretty good went into beta 5. Cool new technology! Big bug fixes! Lots of winning!
As it turns out, beta 5 is dead. I hit “go” on the release this afternoon, and half an hour ago, I hit “stop.” The auto crash reporter was showing way too many new crashes in memory management that we had not seen before, and this strongly implies a new and serious bug.
Laminar Installer Users: if you were auto-notified to update to beta five and did so, and you are not crashing, you can keep flying! If your beta five is just a smoldering wreckage of crumpled VRAM and GPU parts, you can re-run the installer with “get betas” option checked, and it will take you back to beta 4.
If you were not auto-notified to update to beta five, that’s probably for the best. Please stand by and keep flying beta four; we’ll post a new beta when we’ve gotten to the bottom of this. We have enough captured crash data to investigate.
Steam Users: we did not release the beta five build to Steam and this is probably a good thing; we’ll try again with a new release that isn’t made of plutonium and unicorn hallucinations.
And if you’re going “why didn’t y’all test it before you released it”…we did! None of our machines show these crashes. But we also have probably a dozen PCs total we can run on. Moving to a new driver stack has meant learning about the weird things that happen on your computers and not ours.
Do We Need a Two Tiered Beta System?
This came up in our impromptu beta five post-mortem meeting: do we need to bring people into new betas in stages? With code for new drivers, beta five probably won’t be the last beta where we code something we think is helping and discover that it fails catastrophically, but not on our hardware. We need beta victims^H^H^H^H^Htesters to find these bugs, but once we get a dozen crashes, we don’t need anyone else to stub their toe for us to fix our problems.
So we thought about two possible ways to do this:
A two-tiered system. Early adopters could get an email and hand-update to the new beta before it is put out for auto-update notification.
Send out the beta update notifications over time, e.g. 10% of users get notified immediately, then another 40% if we don’t see crashes, then the last 50%. (This practice is actually industry standard on mobile apps.)
If you are reading this blog post, this far down, you are probably participating in the beta; I’d be curious what approach you’d find most useful.
X-Plane 11.50b4 is now available if you update via the Laminar Research installer. (Steam users: it’s on the servers and we’ll hit go in a few hours if we don’t hear reports of massive crashing and pain again.)
This update was focused on crash fixes and better triaging. We’ve been seeing a huge uptick in volume of bug reports and auto reported crashes since the initial 11.50 public beta release. We are trying to cut through the noise and provide better information in logs and in the remaining crash reports to fix issues faster, and let our support team (primarily me) get the inbox under control.
The best way to help us handle crashes on Windows and Linux is still to submit the auto report form. You can include your email if you want us to be able to find your specific crash, but we do not need the message field–the log and back trace will have pretty much all the info we need. If you send an auto report, please do not also send a bug report form email.
Mac users do not have the ability to auto report, so they should fill out the bug report form, and include the Apple crash report as well as the log.txt. The crash report can be found in your user folder under ~/Library/Logs/DiagnosticReports. The name will include the date & time of the crash and will end in .crash. You may need to show hidden folders to access it.
We were discussing a particularly exasperated sounding bug report on one of the internal Slack channels when I realized that this might not be obvious: a crash with the error message “pipeline must not be null” – it’s one error message that covers a whole category of bugs. We fixed one major case (skycolors were broken) in b1 and added one major case (custom billboard lights on aircraft) in b2 – conservation of pipeline bugs!
Null pipelines are a new category of crash in X-Plane 11.50, so here are a few notes on what this error is and what you can do to help us fix them (and what you don’t need to bother with).
What Is a Pipeline?
A pipeline is just the Vulkan and Metal term for a shader (plus some extra gak (1)) that we use to do our drawing.
X-Plane 11.41 would ask the OpenGL driver to build shaders as it needed them, and then the driver would turn those GL shaders into hardware pipelines on the fly as it got presented with different scenarios.
Not 11.50. We build everything up front. Vulkan has two rules:
Using a pipeline is fast.
Building a pipeline is not fast.
This is a great pair of rules for us – it means if we build our pipelines at load time, we are not going to have stutters mid-frame.
Why Are We Crashing?
There is one down-side to the 11.50 way of doing things: if we don’t build all of the pipelines we need up front during load, then when it comes time to draw, we’re toast. That’s what a “pipeline must not be null” error is – it just means the loading code did not create the pipeline the drawing code needs.
Why not just build every pipeline we could ever possibly need? Load time. X-Plane can build hundreds of thousands of pipelines depending on rendering settings, scenery packs, custom aircraft, etc. We actually did “just build everything” early in our development process and the sim could take half an hour to load.
So we try to build only the pipelines we need. If we build too many, we slow load, and if we build too few, you see this error.
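Here is a toy sketch of that trade-off in plain C. It is not X-Plane or Vulkan code: ToyPipeline, the state_key and all the function names are invented for the example, and a real pipeline is keyed on far more state than one integer. It just shows the shape of the problem: the loader builds a chosen set of pipelines, the draw path only looks them up, and a lookup miss is the “pipeline must not be null” case.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PIPELINES 64

/* Hypothetical: shader plus fixed-function state squeezed into one key. */
typedef struct {
    unsigned state_key;
} ToyPipeline;

static ToyPipeline cache[MAX_PIPELINES];
static size_t cache_count = 0;

/* Fast path used while drawing: look up only, never build. */
const ToyPipeline *toy_find_pipeline(unsigned state_key) {
    for (size_t i = 0; i < cache_count; ++i) {
        if (cache[i].state_key == state_key) {
            return &cache[i];
        }
    }
    return NULL;
}

/* Slow path used at load time: build once, reuse if already built. */
const ToyPipeline *toy_build_pipeline(unsigned state_key) {
    const ToyPipeline *existing = toy_find_pipeline(state_key);
    if (existing != NULL) {
        return existing;
    }
    assert(cache_count < MAX_PIPELINES);
    cache[cache_count].state_key = state_key;
    return &cache[cache_count++];
}

/* Returns 0 on success, -1 when the loader never built this pipeline. */
int toy_draw(unsigned state_key) {
    return toy_find_pipeline(state_key) != NULL ? 0 : -1;
}
```

If the loader builds keys 1 and 2 and the drawing code later asks for key 3, toy_draw returns -1. The fix is always on the load side: work out why key 3 was not on the list.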
What Do You Do When You See This Error?
On Windows and Linux, it’s really easy: close the alert box and when the auto crash report form comes up, please press “send”. Don’t bother with your email or a message; everything we need to kill this bug is already in the auto report! (Jennifer’s edit: please DO include your email address with any auto report if you want us to be able to confirm we have your specific report! This is the only way we have of identifying who it came from.)
The good news is: the auto crash reports for the pipeline crashes are insanely easy to find and fix.
Mac users: if you see one of these, we need the Apple crash report – please send it in a bug report.
(1) For the plugin developers that know some OpenGL: a pipeline is basically a GL program (shader) plus a bunch of the fixed function state that goes with it: blending, depth/stencil, vertex format, FBO format, and some rando stuff thrown in.
The idea is to have the pipeline contain so much information that there is no risk that the driver has to build two hardware shaders for one Vulkan shader (to cope with other fixed function state) no matter how weird the hardware is.
On a lot of real hardware, the pipeline includes state that is not part of the shader itself, but some surprising things, like vertex format, often are.
X-Plane 11.50b3 is now available if you update via the Laminar Research installer. (Steam users: it’s on the servers and we’ll hit go in a few hours if we don’t hear reports of massive crashing and pain like we did last night.)
We waited on releasing beta 2 on Steam after we started hearing reports of new, unintended crashes, and we spent the last 24 hours coding and testing the fixes. The only new fixes in beta 3 are for crashing with Linux + Vulkan, and null pipeline crashes with third party aircraft.
Hopefully this update will be more stable and we can get back to our regularly scheduled programming of working on a wider range of fixes for beta 4 next week.
Updated 4/8/2020 8:25 PM: Beta 2 is…not our best work. It crashes on start on Linux and crashes on load for a wide variety of third party aircraft (but not LR ones). We are cutting a beta 3 with these two issues fixed; it should be live in the next twenty four hours. We are holding with Beta 1 on Steam until Beta 3 is available.
X-Plane 11.50 Beta 2 is now available. (Steam users: it’s on the servers and we’ll hit go in a few hours if we don’t hear reports of massive crashing and pain.)
We received a lot of bug reports from X-Plane 11.50 beta 1. This is good! I’d much rather have multiple reports of a bug than no reports. Every now and then someone tells us about something and we go “how long has this been going on” and they say “oh for a year now” and we’re like “why didn’t you file a bug???” Don’t assume someone else will file it!
So with beta two, here’s what we need:
Read the release notes – Jennifer puts real effort into documenting everything that is fixed to save you time.
If your bug is listed as fixed and you still see it, please file a new bug. If you mention the bug number that we listed as fixed in your “it’s still broken” report, this is really helpful for us.
If your bug is not listed as fixed, please do not re-file it. If we didn’t say it was fixed in the release notes, we already know it is still broken, and a re-file of the bug just takes time away from other bug reports.
Beta 2 does not fix all bugs – it doesn’t even come close, so most bugs do not need to be refiled.
With that in mind, there are a few high profile bug fixes in beta 2:
The sky colors dialog box does not crash! We are actually astonished at how many people reported this – we didn’t think it was a heavily used feature, but … who knew.
VR – the right eye is fixed! It turns out this was broken twice; we have fixed both bugs.
Plugins: object drawing in OpenGL for legacy plugins turns out to have been massively borked; this could cause wrong drawing and crashes in all of the pilot clients, ground traffic, push back add-ons, etc. So a large swathe of popular add-ons should work better in OpenGL mode.
Older NVidia cards should now work and not have a black screen. This covers the 600, 700, 800, and some 900 NVidia cards.
Mac users who were getting “out of memory” – this should be a lot better now.
Users with multiple GPUs and SLI should be able to launch without disabling things.
Probably the most common and annoying bug report we get that is not fixed here is blurry textures. Basically if X-Plane thinks it is running out of VRAM, it will lower the resolution on textures where it is allowed to lower the resolution. We have seen cases of this code behaving very poorly and turning texture resolution all the way down.
First, just to state the obvious, this is a bug. You do not need more VRAM to run with Vulkan than OpenGL, we just need to fix the pager. If you have less than 8 GB of VRAM, do not panic.
I am not surprised that we have seen this bug – texture paging is very much about tuning our decisions to match real-world use, and we have shipped with something that works decently in our test cases and sometimes quite badly in real-world use cases that are very different from our test cases. So we will adapt the algorithm over time based on data we collect, and it will take a few betas to get better.
We have updated the X-Plane plugin SDK and related documentation for X-Plane 11.50. Here are some links:
New Plugin SDK: version 3.02 adds the new modern 3-d drawing callback for OpenGL/Vulkan, and deprecates the drawing callbacks that are unavailable in Vulkan/Metal. Download it here.
(Updating to the new SDK is not mandatory for your add-on to be Vulkan-compatible, but it can help catch drawing callbacks that will not work, and it is necessary to use the new 3-d drawing tech.)
Plugin Compatibility Guide: a short guide that focuses on issues existing plugins might have with X-Plane 11.50. If you are updating a plugin, read this!
OpenGL Drawing Guide: complete documentation for all aspects of drawing using OpenGL with X-Plane. If your plugin uses OpenGL, this is a must-read.
Drawing Over 3-d in 2-d: this sample code shows how to draw in a 2-d window or drawing callback with coordinates that match the 3-d world. Many 3-d plugins that draw need this kind of drawing, e.g. to label routes or aircraft with UI that matches the 3-d world.
Instancing Example Plugin: a sample plugin that draws objects using the XPLMInstance APIs. Many add-ons will need to transition from object drawing to instancing to be compatible with Vulkan. We strongly recommend moving to instancing – it provides the fastest, most compatible drawing path, particle system support, and in the future will support FMOD sound.
Instancing is actually much easier to use than XPLMDrawObject and bypasses a lot of complexity and chaos. Instancing works with all versions of X-Plane 11, and the sample has everything you need.
X-Plane 11.50 has been out for a little bit more than 24 hours, and things have been a little bit nuts. Here are a few quick notes, in no particular order.
Bug Fixes and Work Arounds
While I don’t have work-arounds for the missing right eye in VR or older NVidia cards that won’t run in Vulkan, the good news is that we have fixes for these already. We are going to start testing beta two on Monday and try to get the fixes we have out as soon as possible. While we don’t have every major reported bug fixed, beta two should make a real difference.
Users who can’t start and have SLI setups: disable SLI in the NVidia control panel and you will be able to run Vulkan. We are still investigating this – our goal is a bug fix so you don’t need to turn SLI off. (We do not expect X-Plane to leverage both cards – we expect it to run without failing.)
Finally, one thing I should have mentioned in the announcement: if you have scripts that modify art controls, please remove them, and don’t put them back.
The art controls are undocumented and subject to change, and they have changed a lot since X-Plane 11.41. Realistically the authors of these tweak scripts need to go back and re-evaluate every one of their tweaks in 11.50 to see if they actually help or are actually making things worse.
This is not like plugins or scenery; we tell you to get a second installation for the beta not because we want you to run add-on free but rather so that if the beta fails you are not locked out of X-Plane. We expect add-ons to work and we are taking plugin, scenery and aircraft bugs seriously.
By comparison, the art controls are “do what you want, but you void the warranty if you mess with this.” If you are running scripts that hack the art controls, we cannot tell the difference between real bugs in the early betas and art controls screwing things up.
The Road Map For Betas
Looking over the bug reports we have received, I think we are going to take on the 11.50 bugs in three phases:
Stability and compatibility. We’ll start by making sure that we run Vulkan and Metal on every platform that should be able to run them, with add-ons just working in the cases where we expect them to. We’ll start by focusing on fixing crashes, black screens, device lost, unstable plugins, etc.
VRAM use. We’ve received a number of reports that make it sound like VRAM management is not working properly. Once we can run, we’ll dig into blurry textures, running out of VRAM, etc. Sidney has built some great tools to get a good picture of how VRAM is being managed. VRAM management is one of the newest and most complex parts of 11.50 so it isn’t surprising that we’ve seen things that look buggy.
Performance. Once we are running where we should and using VRAM that we should, we can look at the cases where users are not seeing performance benefits from Metal and Vulkan, as well as remaining stutters. Once again, Sidney has built some fantastic tools that should help us dig into this quite efficiently.
This is the only order in which we can reasonably approach the bugs. If the app won’t run on all qualifying hardware, we can’t test our VRAM use everywhere, and if our VRAM use isn’t correct, it can bias performance testing.