Thunderbird: August 2008 Archives

Thunderbird code size shrinks by 1.66GB!

Tinderbox-Thunderbird-osx-mZdiff

First, that's only on OS X, so sorry to folks running it on a different platform. Second, it's a bit of a lie, as I am talking about bug 448003.

Since we switched over to building from Mercurial, the code size tests we run were somewhat off on OS X. These tests basically measure the size of the code included in the product. For instance, right now, on Linux, it's reported at 15.7MB.

However, on OS X, they had been reporting 1.56GB! Obviously, this was a bug somewhere in the computation of this number, not the actual code size, so that's why the subject of this post is appropriately misleading.

Tinderbox-SeaMonkey-osx-mZdiff

Various hypotheses were put forward, including the observation that the number looked like it was off by a factor of precisely 100, so possibly a unit conversion problem, or something similar.

I've finally gotten around to it, and after some investigation, it turns out it's more interesting than that. On OS X, we don't directly build universal binaries, but instead build once for i386 and once for ppc, then lipo the two builds together to form the final product.

To determine code size in this case, it would be misleading to count both architectures, so instead we pick one and size up that one. In our case, for historical reasons, that's the ppc version, even though our build boxes are Intel by now. Determining code size is done by parsing the output of /usr/bin/nm, an object dumping tool that nicely dumps the symbol list out of libraries and executables. Unfortunately, it dumps positions in the code, not sizes. So, to determine code size for a given function, just make sure addresses are sorted, then subtract the address of that function from the address of the next one, and you'll get the number of bytes that it takes up in the object file. Simple, right?
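The address-subtraction idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the actual Tinderbox code size script: the `Symbol` type, the function names, and the `libexample.dylib` sample lines are all made up for the example, and the line format is assumed to be `<file>: <hex address> <type> <name>`.

```python
from typing import NamedTuple


class Symbol(NamedTuple):
    address: int
    kind: str
    name: str


def parse_nm_line(line: str) -> Symbol:
    # Assumed line shape: "<file>: <hex address> <type letter> <symbol name>"
    _, rest = line.split(": ", 1)
    address, kind, name = rest.split(None, 2)
    return Symbol(int(address, 16), kind, name.strip())


def symbol_sizes(symbols):
    """Estimate each symbol's size as the gap to the next higher address."""
    ordered = sorted(symbols, key=lambda s: s.address)
    sizes = {}
    for current, following in zip(ordered, ordered[1:]):
        sizes[current.name] = following.address - current.address
    # The last symbol has no successor, so it gets no estimate here.
    return sizes


# Hypothetical sample input, deliberately out of order.
lines = [
    "libexample.dylib: 00001a7c t _second",
    "libexample.dylib: 00001000 t _first",
    "libexample.dylib: 00001b00 t _third",
]
sizes = symbol_sizes(parse_nm_line(line) for line in lines)
print(sizes)
```

Note that this estimate is only as good as the symbol list: any gap between two consecutive addresses gets attributed to the earlier symbol, whether or not it is really code.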

Well, for some reason, not yet explained, on ppc, there is a strange symbol called trampoline_size that comes first in the list of symbols and sits near address 0 (0x28 in the listing below). However, right after it is the next symbol, strangely sitting at address 0x20000000. That's roughly 536870912 bytes (512MB) further, if we believe what this is telling us. Of course, this is bogus. I am not entirely certain what this is about, but if I could venture a guess, I'd have to say it has to do with the ppc executable format, OS X, and their ability to transparently run ppc binaries on Intel. Anybody who knows OS X/PPC better than me, feel free to correct me on this.

libnspr4.dylib: 00000028 a trampoline_size
libnspr4.dylib: 20000000 t __mh_dylib_header
libnspr4.dylib: 20001a7c t dyld_stub_binding_helper

Now that the problem is somewhat understood, it was a simple matter of teaching the code size stuff to ignore symbols like that on OS X/ppc and start at the next one. As you can see from the Tinderbox boxes, it was still pretty impressive to see such a large code size decrease! Oh, and SeaMonkey got the same drop in size, for the same reason.
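To see how much a single bogus symbol skews the total, here is a rough Python sketch run against the three libnspr4.dylib lines quoted above. Skipping symbols whose nm type letter is `a` (absolute) is my assumption about one way to "ignore symbols like that"; the real fix may well key on something else, and the `parse` and `total_code_size` helpers are invented for this example.

```python
def parse(line):
    # Assumed line shape: "<file>: <hex address> <type letter> <symbol name>"
    _, rest = line.split(": ", 1)
    address, kind, name = rest.split(None, 2)
    return int(address, 16), kind, name.strip()


def total_code_size(lines, skip_absolute):
    symbols = [parse(line) for line in lines]
    if skip_absolute:
        # Assumption: drop absolute symbols (nm type 'a'/'A'), since their
        # value is not a position in the code.
        symbols = [s for s in symbols if s[1] not in ("a", "A")]
    addresses = sorted(s[0] for s in symbols)
    # Sum of the gaps between consecutive addresses.
    return sum(b - a for a, b in zip(addresses, addresses[1:]))


listing = [
    "libnspr4.dylib: 00000028 a trampoline_size",
    "libnspr4.dylib: 20000000 t __mh_dylib_header",
    "libnspr4.dylib: 20001a7c t dyld_stub_binding_helper",
]
naive = total_code_size(listing, skip_absolute=False)
fixed = total_code_size(listing, skip_absolute=True)
print(naive, fixed, naive - fixed)
```

With the filter on, the half-gigabyte gap in front of __mh_dylib_header no longer counts toward the total for this library.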

hourly_usage_200806.png

Recently, we started receiving logs of the Thunderbird start page hits. They now come to us nightly, and I also got some historical data going back to June 2008.


For the time being, I've just been feeding them all to webalizer, for lack of a better idea. There is not yet a complete plan as to what to do with all that data. Right now, it goes straight to infinite storage every evening. There is a lot of it, averaging around 8 million hits a day. The included graph is the average traffic per hour of the day. That's in the range of 300k-500k hits per hour! I am glad that this traffic is being served by the mozilla.org hardware.
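For anyone who wants to play with this kind of data without webalizer, an hour-of-day average like the one in the graph can be sketched in Python as follows. I am assuming the logs use the usual Apache-style timestamp (`[10/Jun/2008:13:55:36 -0700]`); the sample lines, the `/start/` path, and the function name are invented for illustration and are not taken from the real logs.

```python
import re
from collections import Counter

# Assumed timestamp shape: [DD/Mon/YYYY:HH:MM:SS +ZZZZ]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def average_hits_per_hour(log_lines):
    """Average number of hits for each hour of the day, across all days seen."""
    hits = Counter()
    days = set()
    for line in log_lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # skip lines that don't carry a recognizable timestamp
        day, hour = match.group(1), int(match.group(2))
        hits[hour] += 1
        days.add(day)
    day_count = len(days) or 1
    return {hour: count / day_count for hour, count in sorted(hits.items())}


# Hypothetical sample lines using documentation-range IP addresses.
sample = [
    '192.0.2.1 - - [10/Jun/2008:13:55:36 -0700] "GET /start/ HTTP/1.1" 200 2326',
    '192.0.2.2 - - [10/Jun/2008:13:58:01 -0700] "GET /start/ HTTP/1.1" 200 2326',
    '192.0.2.3 - - [11/Jun/2008:13:02:11 -0700] "GET /start/ HTTP/1.1" 200 2326',
    '192.0.2.4 - - [11/Jun/2008:14:10:45 -0700] "GET /start/ HTTP/1.1" 200 2326',
]
averages = average_hits_per_hour(sample)
print(averages)
```

The sketch only looks at timestamps and never touches the client address, which keeps it on the trends-only side of the privacy line.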


I know folks have already started to think of ways we could mine that data to learn useful things about Thunderbird and its user base, but it's only beginning. So, for the time being, I'd like to hear from anybody who can think of questions that could be answered by all that data. Of course, respecting our users' privacy is important, so I don't want anybody to know anything about specific individuals, just about trends. Are Russian users checking e-mail more often than us Canadians, for instance? (Not so sure that's a very useful thing to know.)


Got a cool idea? Have something you'd like to know? Please let me know.


<gozer at mozillamessaging dot com>


About this Archive

This page is an archive of entries in the Thunderbird category from August 2008.
