Slaying the system_server Bug

(The following is a technical glimpse at a current XDAndroid development topic aimed at intermediate or advanced users and developers.)

Chances are, if you’ve been using XDAndroid on a Rhodium, you’ve been hit by this annoying bug: the device is painfully slow from the moment the XDAndroid boot animation begins. The bug is seen mostly by Rhodium users, with varying consistency. Some people see it on nearly every boot; others see it once in a handful of boots. While it’s happening, logcat and dmesg show nothing interesting, and top shows system_server hogging at least 95% of the CPU time.

An effective workaround has been to place a short phone call to the voicemail service and hang up. For whatever reason, this would cause system_server to calm down and act normal for the rest of the XDAndroid session. A more radical (and perhaps less effective) solution was to re-enable Dalvik’s JIT execution and put up with the various bugs it would potentially reintroduce. Somehow, JIT was able to mitigate the bug to the point that casual testing did not reproduce the system_server issue.

Upon further investigation by several XDAndroid team members (most notably arrrghhh and WisTilt2), the bug was tracked down to the userland libraries used by XDAndroid for hardware support. This meant the issue was in the RIL, GPS or sensors (accelerometer) driver. After even more testing by arrrghhh, who readily volunteered to test an unfinished internal Gingerbread build, it was determined that the likely culprit was the sensors driver, which remains missing in Gingerbread.

At my request, arrrghhh performed repeated testing on our latest Froyo release (FRX04) with JIT disabled. With the sensors driver in place, the issue was reproducible on essentially every boot. After removing the sensors driver, it could not be reproduced even once. This was a very telling result, so it was time to figure out where the issue was in the sensors driver.

Finding the issue was actually relatively trivial. Such a runaway process usually indicates an uncontrolled loop. The sensors driver continuously checks for data from the sensors devices while Android is running. In our driver, the code responsible for that check is seemingly prone to infinite looping in a specific case where incomplete data is received from the sensors device. In practice, this case occurs frequently on Rhodium and sends system_server into fits. I’m guessing that making a phone call causes Android to query for a proximity sensor, which bails the sensors driver out of that loop to process the request (and then it goes back into the loop and is able to read data normally).

So, since the logic generally looked sound in the loop, the simplest and most likely solution was to add a delay while handling that corner case of incomplete data. For testing, we used an unreasonably large delay and added it in the general case for the loop, ensuring that the loop could never become a runaway in proper runtime conditions. Through more tedious testing, arrrghhh was able to confirm that the delay solved the runaway system_server issue (while making the sensors unusable).
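The idea of delaying only in the incomplete-data case can be sketched as follows. This is an illustration in Python rather than the driver's actual code: the event size, the delay value, the retry limit and the names poll_sensor and read_chunk are all assumptions made for the example.

```python
import time
from typing import Callable, Optional

EVENT_SIZE = 16  # hypothetical size of one complete sensor event, in bytes


def poll_sensor(read_chunk: Callable[[], bytes],
                sleep: Callable[[float], None] = time.sleep,
                retry_delay: float = 0.005,
                max_attempts: int = 1000) -> Optional[bytes]:
    """Return one complete event, or None if max_attempts reads were not enough."""
    buffer = b""
    for _ in range(max_attempts):
        buffer += read_chunk()
        if len(buffer) >= EVENT_SIZE:
            # General case: a full event arrived, so return it without any delay.
            return buffer[:EVENT_SIZE]
        # Corner case: incomplete data. Without this pause the loop would
        # retry immediately and could spin at full CPU speed.
        sleep(retry_delay)
    return None


# Simulated device: one empty read, then two half-size reads.
chunks = iter([b"", b"\x00" * 8, b"\x00" * 8])
pauses = []
event = poll_sensor(lambda: next(chunks), sleep=pauses.append)
print(len(event), len(pauses))
```

In this simulation the loop pauses twice (once per incomplete read) before a full event is assembled, and never pauses when a read completes an event.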

After making that loop delay much shorter and placing it in the corner case exclusively, testing continued to show success. So it seems like the system_server bug is finally defeated. We’ve already released a new rootfs with the relevant change integrated. Give it a try and let us know how it works out, via the IRC channel or the aforementioned bug. Thanks for reading!

XDAndroid 2.2.1 Build FRX04

Hello again XDAndroid users!

We’ve got a minor release for you tonight. No, it’s not Gingerbread… We’re releasing XDAndroid 2.2.1 (Froyo) build FRX04, in the forms of both an updated system.ext2 image and an update.zip OTA update package.

List of changes since FRX03:

  • Improved 3D performance, with fixes for WVGA devices (brought to you by [acl]) – IMPORTANT: this requires the accompanying rootfs release from today, 21 Jan.
  • Synced all minor upstream changes from AOSP

There is also a new rootfs available today which provides an updated library that is part of the 3D change mentioned above. Without an updated rootfs, you will not see the 3D improvements.

Those of you who read the commit logs for the rootfs releases will notice quite a few mentions of gingerbread in there. The new rootfs allows future XDAndroid gingerbread builds to boot. However, we have not yet released a gingerbread system image. Gingerbread is getting closer to a testing prerelease, but isn’t quite there yet.

Thanks for using XDAndroid! As usual, feel free to report bugs and watch development on the IRC channel or mailing list!

Ongoing Gingerbread Progress

My previous Gingerbread-related post painted a grim picture of the progress I had been making up to that point. Little work had been done on stabilizing the system. Lots of features and hardware support were broken. In general, there was a ton of work to do.

Well, that work is getting done. Slowly, I’ve been working through the long list of issues and knocking out whatever I can in reasonably short time. We’re moving towards a completely unsupported prerelease system image meant for testing, probably within the next couple of weeks. Again, that would be completely unsupported and not an official release. We would not advertise it on any official release channels (wiki or forums).

With that out of the way, here’s a list of major fixes since my last post:

  • The severe drop in 2D performance has been fixed
  • The SD card is now visible within Android again
  • Screen backlight issues that I posted about are actually kernel-related auto-backlight issues on Raph/Diam/Blac only
  • Gingerbread’s strange sleep of death has been resolved
  • UPDATE: Audio is working
  • For developers: build system changes have been pushed to the repositories to allow you to build the system (gapps needs to be updated to generate working system images, though)

For users interested in following the technical development discussion, please join our IRC channel (and #htc-linux for kernel development) and the xdandroid-dev mailing list. The bulk of technical discussion occurs in these two places. Thanks!

XDAndroid Project Mailing List

Due to a request by developers, I’ve started building up mailing list infrastructure for the project. Initially, we will have just one list, xdandroid-dev, to be used exclusively for development discussion. This discussion will include chatter regarding bug fixes, new features, and other proposed changes. The xdandroid-dev mailing list will likely have small, experimental code changes posted (in the form of source patches) and will provide a live look at development, along with the IRC channel.

Technically inclined users are encouraged to join development discussion on this list. Posting to the list is limited to members only; however, membership is open and users can register with no administrative approval. If you are interested in observing or joining development discussion, please subscribe to the list. The XDAndroid Project has a big shortage of Android developers and we’re looking for all the help we can get!

PS: There’s still no timetable for Gingerbread. Most of the issues outlined in my previous post on the subject are still valid.

Gingerbread (lack of) progress update

It’s been very slow around here lately. Being busy with work and holidays hurts productivity. However, I’m starting to get back into the groove.

Gingerbread is proving to be a beast to port. A lot has changed from Froyo and we’ll be left making square pegs fit in round holes. As usual, all old platforms have been abandoned, except for the venerable Nexus One of course. This makes it somewhat annoying to get all the older hardware working as it did in older versions.

Here’s an incomplete list of things which have broken in the upgrade to Gingerbread:

  • Severe drop in 2D graphics performance (a fix is in progress, pending testing and commit to our frameworks/base fork)
  • Existing sensors library is incompatible, so the accelerometer is not working
  • Location services crash the system during initialization (bootloop); additionally, removing Location provider causes Browser to crash at Google home page
  • Trying to activate bluetooth crashes the system
  • SD card is not visible (as mounted) in Android
  • Screen backlight stays on when it should sleep (display turns black)
  • Device hangs in sleep mode

There are other issues here and there that aren’t important or difficult enough to list here, or that I haven’t found yet. Also important to note is that I’ve been testing exclusively on a Raphael (ATT Fuze) so any issues with other platforms (Rhodium in particular) may not be found by me.

I’ll be working actively on these issues in the immediate future. Needless to say, the Gingerbread release is probably a long way off. There is a lot of work to be done, and unfortunately Gingerbread is mostly a solo effort on my part at the moment. If you’d like to help out, or if you know someone who can help, please drop by our IRC channel and join the development discussion. We need all the help we can get.

Thanks!

First Glimpses of the Gingerbread Men

As I’ve mentioned before, development on porting the newest version of Android to our devices is underway. Last night was release day, so the work that I did was all on getting the build environment ready. Now that I’m able to build images, we have to worry about bring-up on the actual hardware.

There are still some changes left to be pushed into the XDAndroid AOSP repositories, but most of the codebase that I’m using is already there. With every new version of Android, we need rootfs changes to add support, so that’s the next bit of code that has to be pushed (and it’s mostly completed now, waiting for me to push).

Today, I was able to get the system booted and semi-usable. After merging upstream configuration changes into the rootfs, the system entered boot animation without a hitch. Unfortunately the current version of Google Apps has one incompatible package (the Network Location provider) which caused a “bootloop”. Removing that package resolved the crash, and the system booted normally.

There’s still no ETA on a release, of course, since a lot of work has to be done to get everything going. In short testing, I found that WiFi is no longer functional, the window manager is very slow, and as mentioned before the Google package has to be updated. There will also be a number of (expected) crashes involved with certain hardware functions, due to changes in software interfaces between the releases.

So, now that we have a running system, how about some pictures?

XDAndroid Gingerbread Development Started

As the XDAndroid Twitter account alluded to a short time ago, we have indeed begun official development on the new version of Android for our devices, gingerbread.

As development progresses, I will be posting updates on the issues, successes and planning involved in getting gingerbread running on our devices. Keep watching for new posts. Don’t forget, release announcements will always be made via the xda-developers forums and twitter. Thanks for using XDAndroid!

XDAndroid 2.2.1 Build FRX03

Hello XDAndroid users! Been a while… but we have another release for you! Build FRX03 is now available hot off the grill. We’re changing things up a bit this time around: we have released both a traditional system image and a small OTA update file (as discussed here previously).

The recommended method is the OTA update.zip. However, for users who don’t have a completely original FRX02 to upgrade from, the system.ext2 file will be necessary.

Here’s a list of changes in the new build:

  • Disable slow background blurring for some dialogs (thanks emwe)
  • Internal improvements to auto-backlight implementation (emwe)
  • Disable JIT by default for various stability improvements
  • Updated gapps package (20101114)
  • Bugs fixed:
    • 19 – Boot loop on first boot (fresh data.img)
    • 36 – Repeated Volume button press crashes the system
    • Possibly 12 – Terminal emulator special keys/digits do not respond

Downloads are available here: system-FRX03.ext2.zip and the OTA update-FRX03.zip.

OTA Update procedure (ONLY for users with the original, unmodified signed FRX02 system image):

  • Download update-FRX03.zip to SD card in the same directory as startup.txt and haret, etc
  • Rename update-FRX03.zip to update.zip
  • Boot haret, watch the updater go
  • After updater finishes, it may either reboot or “freeze” (reboot not implemented on device), reboot manually if it freezes
  • Boot haret again, and the new system will start up
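For those who prefer to script the first two steps, here is a minimal sketch in Python. The helper name stage_update is hypothetical, and the paths in the usage note are only examples; the one thing taken from the procedure above is that the package must end up as update.zip in the same directory as startup.txt.

```python
import os
import shutil


def stage_update(package_path: str, boot_dir: str) -> str:
    """Copy an OTA package into boot_dir as update.zip and return the new path."""
    # The procedure expects the package next to startup.txt, so refuse to
    # copy into a directory that does not look like the HaRET boot directory.
    if not os.path.isfile(os.path.join(boot_dir, "startup.txt")):
        raise FileNotFoundError(f"no startup.txt in {boot_dir}")
    target = os.path.join(boot_dir, "update.zip")
    # Copy the archive byte for byte under its new name.
    shutil.copyfile(package_path, target)
    return target
```

A call might look like stage_update("update-FRX03.zip", "/path/to/sdcard/andboot"), where the second argument is wherever your startup.txt lives.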

Thanks again for using XDAndroid!

Moving Towards Over-the-Air Updates

The XDAndroid Project currently distributes system updates as compressed, whole filesystem images which the users then unpack onto their SD cards, replacing the originals. This is done for the rootfs, initramfs and system images. Additionally, the project distributes updated versions of the (re)bootloader (HaRET) configuration files and the Linux kernel used to boot into Android.

By far the largest individual piece is the system image (system.ext2). This image typically runs around 60-70MB compressed. Updating in this fashion is not a terrible inconvenience for most users; however, the files are large, require tedious uploading by the packagers (and more waiting time for the users) and suck up tons of bandwidth. This is problematic for the servers that host the images, the packagers that upload the images, and the users that download the images (particularly those downloading over the cellular network directly).

To alleviate this, I have been considering (for some time now) rolling over-the-air update packages in the same style as official Android updates from Google. These packages are zip files, signed by and authenticated as originating from the XDAndroid project. The zip files contain a series of binary patches used to update from an old build to a new build without replacing the entire system image.

Over-the-Air (OTA) update packages are useful for a number of reasons: 1. they’re much, much smaller than full filesystem images (24MB for the FRX01 to FRX02 OTA update file); 2. they’re guaranteed authentic (our installer only handles packages signed by the project); 3. they cost less for the project to distribute and the users to receive, in both data and storage space; etc. The only problem? We have no bootloader, which is what traditionally invokes the update process on a native Android device.

Well, problem solved. In the latest rootfs at the time of writing this, I have added over-the-air update support. Since we have no true bootloader, we rely on the rootfs to act as such. In a way, this works out to be a bit easier, since our rootfs has an entire embedded Linux environment to work with. All we need to do is check for an update.zip package and invoke the updater. Piece of cake!
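As a rough sketch of that check (in Python for readability; the real rootfs logic is not shown in this post, and the names find_update_package, boot, run_updater and start_android are hypothetical):

```python
import os
from typing import Callable, Optional


def find_update_package(boot_dir: str, name: str = "update.zip") -> Optional[str]:
    """Return the path of the update package in boot_dir, or None if absent."""
    path = os.path.join(boot_dir, name)
    return path if os.path.isfile(path) else None


def boot(boot_dir: str,
         run_updater: Callable[[str], None],
         start_android: Callable[[], None]) -> str:
    """Run the updater if a package is present, otherwise start Android."""
    package = find_update_package(boot_dir)
    if package is not None:
        # An update is waiting: hand it to the updater instead of booting.
        run_updater(package)
        return "updated"
    # No package found: continue with a normal boot.
    start_android()
    return "booted"
```

The two callbacks stand in for whatever actually launches the updater and the Android system; the sketch only illustrates the decision made at boot time.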

From a user’s standpoint, the update process is done like so:

  1. An OTA update package (update.zip) is downloaded from XDAndroid’s servers
  2. The update.zip file is copied (NOT extracted) to whichever directory your startup.txt file is in (e.g. /sdcard/andboot)
  3. The user boots HaRET and enters Linux
  4. The rootfs finds that update.zip and starts updating the device
  5. The updater completes (hopefully!) and reboots the device (into Windows Mobile)
  6. The user boots HaRET again and the device boots to the updated Android system.

Ultimately, this new OTA update support will allow us to roll an Android program that downloads these updates (and perhaps also kernels and rootfs and initramfs images) and places them on the SD card automatically. Until that’s ready, though, it’s just a matter of the user putting it on the card as update.zip. That is still no more difficult than a system.ext2 replacement, while being much less taxing on our Internet pipes.

So there you have it: a nice, verbose description of a neat little new feature. We look forward to (probably) distributing FRX03 as an OTA update package (as well as a system.ext2 for those of you who like to do that). And finally, since I know everybody wants to see some proof, here’s a screenshot of the updater in action on my RAPH110/ATT Fuze…

The XDAndroid Updater in action

XDAndroid 2.2.1 Build FRX02

In case you’ve missed it, the XDAndroid Project today released its newest Froyo series build, FRX02. This is the second of our official releases to have a build number. We’ve got some really nice bug fixes, improvements and features for you this time.

Here’s an abbreviated list of changes since FRX01:

  • The following bugs were fixed:
    • 2 – Talk.apk missing
    • 4 – Buttons cut off in the open call menu.
    • 14 – Startup.txt file is incorrect
    • 20 – OpenWnn IME selected by default
  • Google Apps updated to 20101020.1
  • Transitioned to hdpi graphics and fonts
  • Ambient light sensor and hardware auto brightness for RAPH and RHOD (WisTilt2)
  • Debug output for battery service emergency shutdowns (by request of camro)
  • Data roaming off by default (can be dangerous for international users) (emwe)
  • armv6j instruction support from cyanogenmod

I love the light sensor support that WisTilt2 came up with. Give him lots of thanks (and donations!) for the great work.

Check out the release from your preferred thread on XDA-Developers (I like the official Raphael thread myself) and give us some feedback on IRC or file some bugs! Thanks again for using XDAndroid!