developing on windows vs developing on linux

Windows sucks! Really, it sucks for development.

OK, let me rephrase that: Windows sucks for serious development.

This comes down to the windows culture: programs are designed for humans who only use the application occasionally to get something done.

Consider this scenario: I have a large project, and I receive a report about a bug in a previous version of it. The project is actually composed of several sub-projects.

Here’s what I need to do before I even start examining the bug or doing anything:

  • Check out the proper version of each sub-project from the repository.
  • Build the sub-projects in the correct order (of dependency).

Now, in a Linux environment, where I’m using git and automake, I would simply check out the proper branch (for the version) and run make.

In the worst-case scenario, where there’s nothing linking the sub-projects at the make level, I will have a 6-line script that builds them in the correct order.

A very simple command-line setup.

Now, what if I’m on windows?

Well, chances are, windows users won’t be using git or make or anything “archaic” like that. They will have visual tools where each action takes about 10 clicks or so to complete.

So, what do we do?

Well, fire up the “visual” application that connects to the central repository and check out the specific branch of each project, one at a time, where the checkout process includes:

  • opening the project from a list of projects
  • going through menus/submenus to select the proper branch for the version we need (might require looking up the dependency list in a separate document/application)
  • checking out that version (usually a single button, or a menu item)
  • waiting for the progress bar to fill up

Repeat that process about 5 or 6 times (depending on how many sub-projects you have).

I kid you not, this is the actual process required by a certain “commercial” “enterprise-class” tool for “source configuration management” (I can’t spell out its name though).

Now, time to build:

After you’ve checked them all out, you have to open up your fancy IDE; usually that would be visual studio (fortunately, this one I can name).

Again, open up each project in the correct order and build it separately.

This is a really tedious process when you have to do it 3-5 times a day.

Compare this to: git checkout version_branch && make

I will admit, the Windows process is easier for newbies, and is bearable if you only do it once a month or so. But for serious development, this process is just unacceptable.

Now, theoretically, you can automate this process in windows using some scripts; but here’s the catch: the tools are NOT designed for scripting; yes, most of them have some kind of a command line interface, but it’s not very well tailored towards time-saving scripting magic.

And suppose you do manage to script that operation: what about all the other operations that you need to do for serious development? The whole toolchain on Windows is centered around GUI apps with lots of menus and buttons and forms that must be filled in.

So, you can still script that, but, what’s the point? If you’re gonna go command-line, go command-line for real and start using linux.

That’s what I did, and that’s why I really switched to linux.

When I started developing a real project in windows, I was browsing for files in windows explorer!! oh the horror! After a while I got really tired, and started looking for alternatives. The best thing you can find on windows is xyplorer (it’s not free, but it’s the only good option I could find).

Little by little I kept ditching all the default windows tools and switching to better tools: console2 for a terminal (the one and only free terminal emulator on windows), winscp for ssh file upload (windows doesn’t even have an ssh tool, for God’s sake!!!), I started using git (msysgit) for source code management, and got used to typing ls at the command line :)

One might say: but Linux doesn’t have a decent IDE!!!

Well my friend, Linux is an IDE

Linux (or rather Unix, which GNU/Linux happens to be a clone of) is an integrated development environment.

Scribus & GNU FriBidi

NOTE: Please do not use this as a reference or tutorial for FriBidi. It contains incorrect information; I’m keeping it unchanged just for the historical record. Some stuff here is just plain wrong.

In my previous post I talked briefly about scribus’ problem.

The result I got so far depended solely on GNU FriBidi. No HarfBuzz yet.

It’s true that HarfBuzz is the library to use for text layout, but:

http://mces.blogspot.com/2009/11/pango-vs-harfbuzz.html

HarfBuzz only does shaping (….) [it] doesn’t provide:

  • An itemizer
  • A Unicode Bidirection Algorithm implementation
  • A Unicode Line Breaking implementation
  • Glyph rasterization
  • Glyph metrics information
  • etc

So it will get us the shaping and stuff, but not the bidi ordering and line breaking.

The GNU FriBidi API is quite simple, though not in an obvious way, at least if you’re studying it for the first time without prior exposure to the bidi issue and the Unicode bidirectional algorithm.

The “core” of the API is the fribidi_get_par_embedding_levels function. Embedding levels are used to determine directional runs.

Here’s the setup code I used (roughly):

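    // One bidi type and one embedding level per input character
    embeddingLevels = new FriBidiLevel[inputLength];
    FriBidiCharType *bidi_types = new FriBidiCharType[inputLength];
    fribidi_get_bidi_types(inputString, inputLength, bidi_types);
    // Determine the paragraph's base direction from the character types
    baseDir = fribidi_get_par_direction(bidi_types, inputLength);
    // Returns the maximum level found plus one, or zero on error
    FriBidiLevel ok = fribidi_get_par_embedding_levels(bidi_types, inputLength, &baseDir, embeddingLevels);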

I’m not entirely sure whether calling fribidi_get_par_direction is actually needed. Either way, the embedding levels let you determine whether a given character is part of an RTL or LTR run: if the embedding level is even, it’s part of an LTR run; if it’s odd, it’s part of an RTL run.

    /**
        Does character at index have an RTL embedding level?
     */
    bool BidiInfo::isRtlEmbedding(int index)
    {
        return embeddingLevels[index] % 2 == 1; // odd embedding levels are part of an RTL run
    }

Then we want to get the ranges of the runs, so we just scan the text until the direction changes, and we have the start and end of a run. The way I did that is simple: nextRun(start, limit) searches for the start of the next run, beginning the search at start and ending it at limit. The usage is intended to be something like this:

    int start = 0;
    while (start < length) {
        int end = nextRun(start, length);
        // [start, end) is now a run, do something with it
        start = end;
    }
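
Here is a minimal sketch of what nextRun could look like, assuming the embedding levels are kept in a std::vector and that runs are split by direction (odd vs. even level) as described above. The Level alias, the extra levels parameter, and collectRuns are made-up names for illustration; they are neither FriBidi API nor the actual Scribus code:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for FriBidiLevel (a small signed integer type).
using Level = signed char;

// Odd embedding levels are RTL, even ones are LTR.
inline bool isRtl(Level level) { return level % 2 == 1; }

// Returns the index where the run beginning at `start` ends: the first index
// after `start` (and before `limit`) whose direction differs from that of
// `start`, or `limit` if the run extends to the end of the range.
std::size_t nextRun(const std::vector<Level>& levels, std::size_t start, std::size_t limit)
{
    assert(start < limit && limit <= levels.size());
    const bool rtl = isRtl(levels[start]);
    std::size_t i = start + 1;
    while (i < limit && isRtl(levels[i]) == rtl)
        ++i;
    return i;
}

// Collects every run in the text as a half-open (start, end) pair.
std::vector<std::pair<std::size_t, std::size_t>> collectRuns(const std::vector<Level>& levels)
{
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    std::size_t start = 0;
    while (start < levels.size()) {
        const std::size_t end = nextRun(levels, start, levels.size());
        runs.emplace_back(start, end);
        start = end;
    }
    return runs;
}
```

Because each run is half-open, the end of one run is the start of the next, which is why the loop can simply set start = end.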

With that, we check if the run is RTL, and if so, we reverse the characters in that run to get a bidirectional display of the text. The way I did the reversing is a bit too much of a detail to be included here.

I only do this stuff after the textframe layout method has done its work, and that’s for a good reason: we have to do the reordering on a per-line basis, otherwise you get problems. So we need to find out where lines start and end. What I did was “watch” the layout process as it happens and record every new line as it occurs; in other words, I injected some code everywhere I saw code that handles line breaks, which was about 4 places. This of course resulted in some duplicate code, but I tried to keep it to a minimum: 1 line.
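
To make the per-line point concrete, here is a rough sketch of reversing RTL runs one line at a time. It assumes the line start offsets have already been recorded during layout and that the text is held as a std::u32string; Level, reorderLine, reorderByLine and lineStarts are hypothetical names, not the code that went into Scribus. It also only handles the flat odd/even case, while the full Unicode algorithm reverses nested levels progressively:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Level = signed char;  // hypothetical stand-in for FriBidiLevel

// Reverses RTL (odd-level) stretches that lie inside [lineStart, lineEnd) only.
void reorderLine(std::u32string& text, const std::vector<Level>& levels,
                 std::size_t lineStart, std::size_t lineEnd)
{
    std::size_t start = lineStart;
    while (start < lineEnd) {
        const bool rtl = levels[start] % 2 == 1;
        std::size_t end = start + 1;
        while (end < lineEnd && (levels[end] % 2 == 1) == rtl)
            ++end;
        if (rtl)
            std::reverse(text.begin() + start, text.begin() + end);
        start = end;
    }
}

// `lineStarts` holds the offset of the first character of each line, as
// recorded while the layout code ran; the first entry is expected to be 0.
void reorderByLine(std::u32string& text, const std::vector<Level>& levels,
                   const std::vector<std::size_t>& lineStarts)
{
    assert(text.size() == levels.size());
    for (std::size_t i = 0; i < lineStarts.size(); ++i) {
        const std::size_t lineEnd =
            (i + 1 < lineStarts.size()) ? lineStarts[i + 1] : text.size();
        reorderLine(text, levels, lineStarts[i], lineEnd);
    }
}
```

Take the text abCDEF, where the capitals stand in for RTL letters, broken after the fourth character. Reordering per line gives abDC on the first line and FE on the second, so C and D (which come first logically) stay on the first line. Reversing the whole RTL run before breaking would put F and E on the first line instead.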

scribus, bidi, and arabic shaping

Scribus has had this bug report regarding Arabic support for five years now (it’s been open since December 2004, and now it’s December 2009): #1079

But wait, what is Scribus in the first place? Well, it’s a desktop publishing program, and it’s free (FOSS).

I’m actually not an active user of the program. I got introduced to the problem by OMLX, during the arabteam’s first programming contest. His friend Zeyad at itwadi.net had a little adventure with the problem.

Scribus uses Qt, which does support Arabic and Bidirectional text properly, but the problem lies in the textframe layout; it uses a custom layout “engine”.

This layout “engine” is mostly a monstrous 1500-line method filled with spaghetti code. No wonder that bug report has been open for 5 years.

All the pieces needed to build the support for Arabic are out there: HarfBuzz and GNU FriBidi. HarfBuzz mostly just lacks documentation, but any layout guru should be able to make use of it. It’s used in both Pango (GTK+) and Qt. It’s even used in the Linux port of Google Chrome.

I personally am not a layout guru in any way, shape, form, or sense of the word. I don’t even know the first thing about text-layout. But I know something about bidi: after all, I have created The Free Ressam (shameless self promotion). It’s a tool to do “fake” bidi and arabic-shaping at the text level; it transforms the text stream so that a layout-engine that’s not capable of bidi can be fed something that will seem as if it’s bidi. It doesn’t conform to the unicode bidirectional algorithm, and works only with Arabic, and when I did it I didn’t really know that fribidi already does a better job at it!

I tried to dig into HarfBuzz, but so far it hasn’t been very fruitful. I’ve been sorta successful with GNU FriBidi, though. I managed to get the gist of the API (it’s documented through man pages), and my attempt to integrate it into PageItem_TextFrame::layout seems to be getting somewhere.

Bidi is kinda there, but not quite. Shaping is not there yet (actually it was there in an earlier test, but it was so buggy and hackish that I had to take it away and just focus on getting bidi right first).

Look at this picture:

[image: scribus]

The text below (where it says Qt) is the story editor: it uses a plain text Qt text area, which has all the proper support for Arabic.

The text above (where it says TextFrame) is, obviously, the text frame. You can see that Arabic runs are displayed right to left, but the letters are disjoint. That’s because I’m not playing with the glyph selection process, but it’s ok, my first priority now is fixing the bugs in the bidi part of the problem.

Two problems appear in my sample text:

  • Something is wrong with line breaking: the first character in the RTL run that crosses a line break is missing.
  • Something is wrong with the end of the text: the last few characters aren’t detected properly as RTL. This one has been killing me for a while now!! It probably has to do with text length, or maybe the newlines contain \r characters that are somehow confusing the text length when I convert it from QString to FriBidiChar *. I eliminated some possible causes of this bug, but others are still open.

Also, sometimes I experience crashes with Signal #11, which I was told on the scribus-dev IRC channel is a memory access error.

So, things are still shaky, but inshalla we’re getting somewhere.

Quick Update!

Just as I was writing the last paragraph, I realized what the bugs were, and I fixed them!

Here’s an updated picture of the current state of affairs on the bidi front:

[image: bidi]

Addendum

I’m making my changes public on GitHub: http://github.com/hasenj/scribus/tree/hasen

I talked to andreas “avox” on the dev channel yesterday, and he pointed me to a series of patches by pierre that use HarfBuzz: http://bugs.scribus.net/view.php?id=4645

Those patches don’t seem related to shaping, but they do use HarfBuzz, so at least they can serve as pointers when I go back to explore HarfBuzz some more. I don’t have a log of the conversation so I forgot whether pierre was in fact working on shaping or not.

For now I still have some work to do on the bidi front: in the current state of affairs, the whole text in the text frame is treated as a single paragraph. This is bad because I’m setting the base paragraph direction according to the first character with a hard-wired direction. So if the text has 2 paragraphs, the first Arabic and the second English, then all of the text (i.e. including the English paragraph) will be treated as having an RTL base direction, which affects how neutral characters (such as punctuation marks) are positioned.
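
One way this could be fixed is to split the text on newlines and resolve a base direction for each paragraph on its own. The sketch below is only an illustration under loud assumptions: isStrongLtr and isStrongRtl are a crude classifier I made up (basic Latin letters vs. the Hebrew and Arabic blocks), standing in for what a real bidi library would do, and Paragraph, baseDirection and splitParagraphs are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Dir { LTR, RTL };

// Crude, assumed classifier: only basic Latin letters count as strong LTR and
// only U+0590..U+06FF (Hebrew and Arabic blocks) counts as strong RTL.
// Everything else is treated as neutral. A real implementation would use
// proper Unicode bidi character types instead.
bool isStrongLtr(char32_t c) { return (c >= U'A' && c <= U'Z') || (c >= U'a' && c <= U'z'); }
bool isStrongRtl(char32_t c) { return c >= 0x0590 && c <= 0x06FF; }

struct Paragraph {
    std::size_t start;  // offset of the first character
    std::size_t end;    // one past the last character (newline excluded)
    Dir baseDir;
};

// Base direction = direction of the first strong character, defaulting to LTR.
Dir baseDirection(const std::u32string& text, std::size_t start, std::size_t end)
{
    for (std::size_t i = start; i < end; ++i) {
        if (isStrongRtl(text[i])) return Dir::RTL;
        if (isStrongLtr(text[i])) return Dir::LTR;
    }
    return Dir::LTR;
}

// Splits on '\n' and resolves a base direction for each paragraph separately.
std::vector<Paragraph> splitParagraphs(const std::u32string& text)
{
    std::vector<Paragraph> paragraphs;
    std::size_t start = 0;
    while (start <= text.size()) {
        std::size_t end = text.find(U'\n', start);
        if (end == std::u32string::npos)
            end = text.size();
        paragraphs.push_back({start, end, baseDirection(text, start, end)});
        start = end + 1;
    }
    return paragraphs;
}
```

With an Arabic paragraph followed by an English one, the first paragraph comes out RTL and the second LTR, so the full stop at the end of the English sentence is no longer positioned as if it were inside RTL text.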