Friday, September 25, 2009

Selectively disabling Gold linker

Gold is a new, fast linker written by Ian Lance Taylor and his colleagues at Google. It has been a year since its first public release, and it is now packaged as binutils-gold in Debian, so you can easily try it. Google claims Gold is five times faster than the classic GNU linker when linking large C++ applications (of which Google has plenty), and my experience confirms this. Overall, this is a welcome development.

Alas, not everything is as rosy as it seems. The GNU linker has accumulated a lot of features over its decades-long history, and Gold, being a from-scratch reimplementation, has yet to catch up. There are bugs, incompatibilities, and missing features. The Debian package mentioned above installs Gold as /usr/bin/ld, and after installation some software may fail to build from source. Often it is clear that this is Gold's fault, but sometimes it is not. So you probably want a way to selectively disable the Gold linker and check whether Gold is to blame.

The Debian package leaves the classic GNU linker as /usr/bin/ld.single, and other systems will likely also retain the classic GNU linker under a different name. So one way to disable Gold is to point the environment variable LD at the classic GNU linker before configuring the software.

That works only if the build system of the software in question honors LD. Quite often, however, the linker is called implicitly by the compiler driver gcc. In that case telling the build system to use another linker is useless; you need to tell gcc itself.

This is where gcc's -B option comes to the rescue. You can pass "-B prefix" to gcc, and gcc will first search under prefix for the linker it uses before falling back to the default paths. So I arrived at the following solution:

$ mkdir /opt/no_gold
$ ln -s /usr/bin/ld.single /opt/no_gold/ld
$ export CC='gcc -B /opt/no_gold'
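
The same setup can be scripted. Below is a minimal Python sketch, assuming the classic linker lives at /usr/bin/ld.single as on Debian; the function name make_linker_shim is made up for illustration. It creates a directory holding an ld symlink and returns the value to put in CC.

```python
import os


def make_linker_shim(real_linker, shim_dir):
    """Create shim_dir/ld as a symlink to real_linker.

    Returns the string to export as CC so that gcc searches
    shim_dir for its linker first.
    """
    os.makedirs(shim_dir, exist_ok=True)
    link = os.path.join(shim_dir, "ld")
    # Replace a stale symlink left over from an earlier run.
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(real_linker, link)
    return "gcc -B " + shim_dir
```

To check that the shim is picked up, running gcc with the same -B option plus -print-prog-name=ld should print the path of the symlink rather than /usr/bin/ld.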


By the way, I ran into the Gold problem while compiling Midori from source. Midori is a lightweight web browser based on WebKit, with Adblock, user scripts, user styles, and other interesting features. I can do a little advertisement on my blog, can't I? Also, thanks to the people on the #midori IRC channel at Freenode for all their help and patience.

Monday, May 4, 2009

Building libFIRM from the public repository

libFIRM is yet another IR (intermediate representation) from the University of Karlsruhe. It has been publicly available for the last year, but its repository was not. Recently the repository was published on GitHub, which is good news.

On the other hand, there doesn't seem to be any documentation for building from the repository. Since the build process is not exactly straightforward, here is a little recipe.

First you want to clone the repository:

$ git clone git://github.com/MatzeB/libfirm
$ cd libfirm


Then you need to generate some source files from scripts.

$ python scripts/gen_ir.py spec ir/ir
$ python scripts/gen_ir_io.py spec ir/ir


The rest is a standard autotools build, except that there is no autogen.sh.

$ libtoolize
$ aclocal
$ autoheader
$ autoconf
$ automake --add-missing


Now configure and make.

Friday, December 19, 2008

Encoding name normalization in Python

Python provides the codecs module, which hosts the codec registry. You can query the registry with the codecs.lookup function, which takes an encoding name as its argument.

The venerable IANA, our Internet Assigned Numbers Authority, maintains official names for character sets. On the other hand, nobody really cares. According to IANA, "UTF-8" is an encoding name, with no aliases. Therefore, "UTF8", "UTF_8" or any such aliases should be invalid. That doesn't really work in the real world.

So Python normalizes encoding names passed to codecs.lookup. How exactly this is done isn't really specified. It turns out that CPython normalizes in two separate places: once in Python, in the standard library, and once in C, in the implementation. There is a normalize_encoding function in Lib/encodings/__init__.py and a normalizestring function in Python/codecs.c. Moreover, these two functions perform different normalizations.
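
To illustrate how two normalization passes can differ, here is a simplified Python sketch. The function names c_level_normalize and py_level_normalize are hypothetical, and the rules they apply (lowercasing and turning spaces into hyphens in the C pass; collapsing runs of other punctuation into a single underscore in the Python pass) are my reading of the CPython sources, not a specification. Treat it as a model, not as the actual implementation.

```python
def c_level_normalize(name):
    """Simplified model of the C pass: lowercase, spaces become hyphens."""
    return name.lower().replace(" ", "-")


def py_level_normalize(name):
    """Simplified model of the Python pass.

    Runs of characters that are neither alphanumeric nor '.' collapse
    into a single underscore; leading and trailing runs are dropped.
    """
    parts = []
    current = []
    for ch in name:
        if ch.isalnum() or ch == ".":
            current.append(ch)
        elif current:
            # A separator ends the current alphanumeric run.
            parts.append("".join(current))
            current = []
    if current:
        parts.append("".join(current))
    return "_".join(parts)


def normalize(name):
    """Apply both passes, C first and then Python (assumed order)."""
    return py_level_normalize(c_level_normalize(name))
```

Under this model, "UTF-8", "utf 8" and "UTF_8" all end up as "utf_8", which is the kind of key the encodings package can then map to a codec module.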

IronPython, being an implementation of Python that tries to be compatible with CPython, needs to cope with this. You would be surprised by the number of ways things can go wrong if you don't match exactly how this is done. But did you know that the following code works with CPython? (I don't recommend this!!!)

import codecs
codecs.lookup('utf!!!8')


Yes, those are three exclamation marks. I'm not kidding...
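
If you want to see what a given implementation accepts, a small probe helps. This is a minimal sketch; the helper name canonical_name is made up, and it relies only on codecs.lookup and the name attribute of the returned CodecInfo. Results for the odd spellings may differ between implementations and versions, which is exactly the point.

```python
import codecs


def canonical_name(spelling):
    """Return the registry's canonical codec name, or None if lookup fails."""
    try:
        return codecs.lookup(spelling).name
    except LookupError:
        return None


if __name__ == "__main__":
    spellings = ["UTF-8", "utf8", "UTF_8", "utf 8", "utf!!!8", "no-such-codec"]
    for spelling in spellings:
        print(repr(spelling), "->", canonical_name(spelling))
```

Running the same script under CPython and IronPython and comparing the output is a quick way to spot mismatches in normalization.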

Sunday, June 29, 2008

My reading list

I haven't blogged for a while, so here is a lame blog post listing the feeds I am currently subscribed to. I am a Liferea user, by the way. I tried to migrate to an online feed reader, but somehow it didn't seem more convenient.

Software projects


I am subscribed to two software project feeds. One is of course IronPython, which I mainly use to monitor the issue tracker. I would prefer mail notification for this, but it's not implemented in CodePlex.

Another is PyPy. Quoting somebody on YouTube (hah!), "PyPy is the most ambitious project any language has ever had", and I believe it. Even if you don't believe it, it's well worth subscribing if you are into programming languages. By the way, the YouTube video is PyPy - Automatic Generation of VMs for Dynamic Languages.

Programming language developers


I am subscribed to blogs of developers implementing programming languages. For IronPython, I have Dino Viehland and Martin Maly on the list. For JRuby, Charles Nutter and Ola Bini are great.

Like just about everybody else, I read John Lam to follow IronRuby. Among Mono developers, I only read Jb Evain. If the signal-to-noise ratio were a bit higher for me, I would read Miguel de Icaza -- but I don't.

For the JVM, I read Gary Benson and John Rose, although both are usually over my head. The last two blogs in this category are from GCC developers (among other things): Ian Lance Taylor and Tom Tromey. Surprisingly, Ian and Tom are usually not over my head! I thank them for their generosity -- I am entirely sure that they are capable of writing posts inscrutable to me. :)

Others


Being a science fiction fan, I read Charles Stross. There are many great science fiction writers, but that set somehow doesn't seem to intersect with the set of great bloggers a lot.

Being a fan of mathematics, I read Terence Tao. I don't pretend to understand the technical material there, but the occasional posts directed at the "public" are simply great. For computer science fans (as opposed to software engineering!) I recommend Scott Aaronson. Be sure to check out his lecture notes!

I read Alp Toker for no particular reason. His posts have always been enjoyable to me. I temporarily have Antonio Cangiano on the roll, mainly so as not to miss his Ruby shootout.

Korean blogs


That leaves some Korean blogs. Hye-Shik Chang, a Python developer and a FreeBSD port maintainer, is the best Korean blogger in his niche. Park, Seong Chan is a theoretical physicist who writes great, approachable posts on physics news. Kim Gyuhang is a progressive columnist. I sympathize with his political views. He is also my model for writing clear and affecting Korean prose.

Monday, March 24, 2008

Inlining in Mono JIT

It seems that as of Mono 1.9, the inliner in the Mono JIT compiler never inlines functions containing any branching opcode. To those in a position to know, I ask:

1. Is this true?
2. If it is true, should I manually inline functions like the following?

public static int Min(int x, int y) {
    if (x < y) return x;
    else return y;
}

Sunday, January 13, 2008

CLR Add-In

Today I came across CLR Add-In. You can start reading from the "System.AddIn Resources" link on the left sidebar.

Its discovery and adaptation model looks rather comparable to that of zope.interface, but it also deals with isolation. Reading the blog, it's interesting to see what design choices there are, and how and why the CLR Add-In team made those decisions.

Friday, January 11, 2008

Using AT-SPI from IronPython (2)

Before continuing, let me mention that all the relevant code is in the FePy repository:
https://fepy.svn.sourceforge.net/svnroot/fepy/trunk/atspi/

Now that IIOP.NET is built, it's time to compile the IDL. IDL stands for "Interface Description Language", and it is used to (surprise) describe interfaces. AT-SPI's CORBA interface is described in /usr/share/idl/at-spi-1.0/Accessibility.idl, which includes a bunch of other files. Some of these files are in different directories, so one needs to specify those directories.

The compiler that was just built is under IDLToCLSCompiler/IDLCompiler/bin. Copy these files (IIOPChannel.dll, IDLPreprocessor.dll, IDLToCLSCompiler.exe) to the current directory, and run:

$ mono IDLToCLSCompiler.exe \
-idir /usr/share/idl/bonobo-2.0 \
-idir /usr/share/idl/bonobo-activation-2.0 \
Accessibility /usr/share/idl/at-spi-1.0/Accessibility.idl


This should produce Accessibility.dll in the same directory. build.sh in the repository automates the process up to this point (download, build, and IDL compilation).

So how does one connect to the server? This is "a well known problem" that has its own FAQ entry. Basically, one obtains an IOR, "Interoperable Object Reference", by out-of-band means, just as one gets a URL from a bookmark. Once one has the first object reference, one can follow links to other objects.

It turns out that AT-SPI publishes its IOR as a property of the X root window under the name "AT_SPI_IOR". Now one could go read the X protocol specification and manually construct a GetProperty request (opcode 20), etc., but there is an easier way. The xprop utility can display X properties, so one runs "xprop -root AT_SPI_IOR" and parses the output. xprop.py in the repository implements this.
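
The parsing step can be sketched roughly as follows. This is not the xprop.py from the repository; the function names are hypothetical, and the regular expression assumes xprop prints a string property on one line in the form NAME(TYPE) = "value".

```python
import re
import subprocess

# Assumed shape of the xprop output line: AT_SPI_IOR(STRING) = "IOR:0123..."
_PROPERTY_LINE = re.compile(r'^(?P<name>\w+)\((?P<type>\w+)\) = "(?P<value>.*)"$')


def parse_xprop_line(line):
    """Extract (name, value) from one line of xprop output."""
    match = _PROPERTY_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized xprop output: %r" % line)
    return match.group("name"), match.group("value")


def get_root_property(name):
    """Run `xprop -root NAME` and return the property's string value."""
    output = subprocess.check_output(["xprop", "-root", name])
    _, value = parse_xprop_line(output.decode("ascii", "replace"))
    return value
```

With an AT-SPI registry running, get_root_property("AT_SPI_IOR") would then return the stringified IOR. When the property is missing, xprop prints a message in a different shape, and the sketch raises ValueError.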

Now, an IOR is a long sequence of hexadecimal digits, and one needs a tool to decode it. ior-decode-2 in the orbit2 package can do so. If you decode the IOR from xprop, you will notice a problem. AT-SPI (actually the CORBA implementation it uses, namely ORBit2) uses Unix domain sockets by default, but IIOP.NET can't use them. One solution is in the ORBit2 FAQ I linked above. Create a .orbitrc file containing this line:

ORBIIOPIPv4=1
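
Coming back to the IOR itself: if ior-decode-2 is not at hand, the first field is easy to peek at from Python. The sketch below assumes the usual layout of a stringified IOR, namely the prefix "IOR:" followed by a hex-encoded CDR encapsulation whose first octet is the byte-order flag and whose first field is the type ID string. The function name ior_type_id is hypothetical, and the sketch does not decode the profiles (host, port, socket path), which is where ior-decode-2 remains the right tool.

```python
import struct


def ior_type_id(ior):
    """Return the type ID string stored at the start of a stringified IOR."""
    if not ior.startswith("IOR:"):
        raise ValueError("not a stringified IOR")
    data = bytes.fromhex(ior[4:])
    # Octet 0 is the byte-order flag: 1 means little-endian, 0 big-endian.
    fmt = "<I" if data[0] == 1 else ">I"
    # The string length is a 4-byte integer aligned at offset 4.
    (length,) = struct.unpack_from(fmt, data, 4)
    raw = data[8:8 + length]
    # The stored string includes a terminating NUL byte.
    return raw.rstrip(b"\x00").decode("ascii")
```

For the AT-SPI registry one would expect a type ID naming an interface from Accessibility.idl.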


This post is already quite long, so let's quickly skim the rest. corba.py implements the necessary initializations from IIOP.NET documentation. typed.py is a workaround for IronPython's limitation (namely the lack of cast operator), first suggested by Dino Viehland. And this is the meat of cliatspi.py.

orb = corba.init()
ior = xprop.get('AT_SPI_IOR')
obj = orb.string_to_object(ior)
registry = typed.typedproxy(obj, Accessibility.Registry)


tree.py is an example AT-SPI client I wrote, which prints a tree of the accessible objects on the current desktop. It imports cliatspi on IronPython, but imports an existing AT-SPI binding on CPython. (I used the one in the Debian python-at-spi package.) As IDL is language-neutral, this script actually runs identically on both CPython and IronPython. Extending tree.py into a useful tool like UISpy on Windows is left as an exercise for the reader.
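
For a feel of what such a client looks like, here is a minimal sketch of a tree walker. It is not tree.py itself. The attribute names name, childCount and getChildAtIndex are what I recall the Accessible interface exposing, so treat them as assumptions, and FakeAccessible is a made-up stand-in that lets the walker run without AT-SPI.

```python
import sys


def running_on_ironpython():
    """IronPython reports 'cli' as its platform string."""
    return sys.platform == "cli"


def format_tree(accessible, depth=0):
    """Return the lines of an indented tree rooted at `accessible`."""
    lines = ["  " * depth + accessible.name]
    for index in range(accessible.childCount):
        child = accessible.getChildAtIndex(index)
        lines.extend(format_tree(child, depth + 1))
    return lines


class FakeAccessible(object):
    """Hypothetical stand-in so the walker can be tried without AT-SPI."""

    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)
        self.childCount = len(self._children)

    def getChildAtIndex(self, index):
        return self._children[index]
```

A real client would pick its binding based on running_on_ironpython() and pass a desktop object obtained from the registry to format_tree instead of a FakeAccessible.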