User Tolerant Liveware: 2010

2010/12/22

Oh no you didn't

I made donuts.

I've been wanting to since this fall. I had some pumpkin baked, pulped and frozen, waiting for the right moment. Today, with Désirée home I thought "what a great activity to keep her occupied." Mixing up the batter together was fun, but I realised I didn't have enough oil so we headed down to the store to fetch it and she fell asleep in the stroller on the way home.

Which is just as well; hot oil and 2.5 year olds probably don't mix well.

The result: donuts which taste AWESOME! However, their texture is foul.

I blame 2 things. First, letting a batter containing baking powder wait on the counter for an hour. But mostly, my deep fryer. While it claims to go up to 190C (~375F), during cooking the highest I read was 135C (~275F) on my IR thermometer. So now we know why my fries take years to cook and are never crispy enough.

So, what should I do? Shell out $$$ to buy a new better deep fryer? Alton Brown claims none of them can get high enough temps. Fry in a pot on the stove and risk a spill and oil fire? While I am paranoid enough not to take my eyes off the pot, and my stove's hood and the granite counter top would probably prevent my house from burning down, I think I'll pass.

There is of course the third, geek option: modify the fryer. The problem is that the rheostat in it is sucktastic. Maybe I could add some resistance in series so that it would have a more favourable duty-cycle.

Or I could do what this guy did: short-circuit the controls in the fryer and use an external control, say a Lutron C-2000. What's more I could repurpose this for use with the hot plate, if I ever end up building a smoker. The tech specs say "lights only, no appliances" but at 2000 watts, it should be more than enough to handle a fryer and would be WAY overrated for the hot plate, which is 600 watts according to my Kill A Watt.

Third option is of course a thermocouple, a relay and an Arduino.

Turns out a new fryer might be the cheapest option.

Now, about the title of this post. While googling I came across the KFC forum. It seems a bunch of people are obsessed with what Colonel Sanders' original 11 herbs and spices were. Very few people knew. Until the Internet, a PR goof and many (I assume) test batches of chicken. And now everybody may know.

2010/12/16

Why I hate the CPAN

(OK, actually I love the CPAN, but now that I have your attention.)

Every time I write and upload a new distro, I dream of how it's going to make some toiling programmer's life wonderful, the way POE, LWP and other bits of the CPAN give my life meaning. IKC and then POE::Component::Generic were my bids at taking over the world. Or at least a certain part of it.

However uploading a distro to the CPAN is something of a commitment. My code is going to live for several years/decades, if not forever. I also do not want to be like the CPAN authors that annoy me, with inadequate test coverage, poor doco and annoying holes in their feature set.

These three points intertwine.

I'll be revising the doco and come to a point where I write "doing X is not advised/not yet implemented." Or I'll write "doing X and Y together has not been tested."

But then I kick myself with "Why isn't it implemented? Why don't I write a unit test for that edge case?" The implementation would take 20 minutes, the test cases for that new feature 30 minutes. A new paragraph of doco and now I've lost my place in my original task of revising the documentation.

What's more, I won't do it just once, but 4 times. 8 times. Before you know it, I've spent 2 days improving a module that originally took 2 hours to write.

These aren't just random numbers. The initial draft of POEx::HTTP::Server took 2 hours Monday morning, of which 30 minutes were wasted because I didn't adequately remember how POE::Session::Multiplex works. I spent the rest of the day improving on that start, then 2 more days adding features, tweaking POE::Session::Multiplex and POE::Component::Daemon because of misfeatures I'd discovered.

But backing up further, I wrote POEx::HTTP::Server because last Friday I thought to myself "Now that Sphinx integration is nearly complete, what will I work on next?" And the answer was of course SPEED SPEED SPEED! So I went to look at lighttpd as a front end, with a FastCGI or SCGI call to the app server for dynamic content.

So while my real goal was to get version 2.5 of the Document Warehouse newly known as Quaero ready as soon as possible, I ended up spending a week implementing some technology from 1990. Or 1997 depending on how you look at it. And it's not over; today I realised there was a race condition in my code. A race that IKC probably shares. So a day or three tracking that down.

Still, Firebug shows 19ms response times from POEx::HTTP::Server. Of which only 5ms are waiting (aka POEx::HTTP::Server generating the content), the other 12 being DNS and network overhead.

Which is damn sweet.

2010/11/26

lighttpd and Thawte

Thawte have done it again: mucked about with their root cert. They did this once, years ago. You'd think they'd learn.

So, after much grief I found out how to set up a new SSL certificate in lighttpd:

domain.key is the key you signed your CSR with.
domain.cert is what you have just "Picked up" from Thawte. You want the X.509 one.

cat domain.key domain.cert >domain.pem

wget https://search.thawte.com/library/VERISIGN/ALL_OTHER/thawte%20ca/SSL123_CA_Bundle.pem

Note, please change domain to whatever FQDN your certificate is for.

You then need the following two lines in your lighttpd config file:

ssl.pemfile     = "/etc/lighttpd/domain.pem"

ssl.ca-file     = "/etc/lighttpd/SSL123_CA_Bundle.pem"

The CA Bundle is a chain of certificates. Normally, an SSL cert is signed directly by a root certificate installed with the browser. But Thawte likes doing things the hard way. So they signed a certificate with their root and now sign all new SSL certs with that intermediate certificate. So the web server has to supply both the SSL cert and the intermediate certificate to the browser. That's what SSL123_CA_Bundle.pem is. If you bought one of the more expensive options, you should download another bundle.

2010/11/24

Obsession

I do this too often.

So I want to build an Ugly Drum Smoker. Why? So I can cook pulled pork and BBQ chicken the way it's done by the pros. Also, maybe make some smoked salmon. But something that big wouldn't get used that often, so maybe I should just make a small smoker in a flower pot.

For the flower pot's burner, I'd have to get a rheostat (expensive) or a stove-top-control-the-technical-name-of-I-don't-remember (cheap, but not very precise). But I'm the hacker born! Why not use a K-type thermocouple and an SSR and control the temperature precisely from a laptop? I already have the thermocouple and can get readings from my DMM. But plugging a thermocouple into a computer is expensive; most people who want to do it need 0.1 C precision and 24/7 reliability, so one channel over USB is about 60 $.

(On a side note, why is it that a thermometer that has an LCD output is cheap, but one that would send the temperature over USB or RS-232 so expensive? I've often dreamed of a universal 8-segment to computer converter.)

Ideally, I would build a thermocouple interface with this PCB, a few parts and an Arduino board. Total of about 100$, but I get 4 ports so I could log ambient temperature and meat temperature, plus a few digital outputs to control the SSR. I'd also get to play with an Arduino. However, it means I have to give up some of my precious Copious Spare Time.

But if I just wanted to save time, I could use a PID controller, about 80 $ on eBay. No wasted time, but no computer logging and meat temp would have to be verified by eye on the DMM.

But wait a second! ALL I WANTED TO DO WAS COOK CHICKEN!

*sigh*

2010/11/18

It's time

So the time changed a week or so back. But it didn't on my cellphone. My 16h15 Désirée reminder was going off at 15h15, most annoying. Finally, I called Solo to complain. The script reader got me to do a bunch of silly things, none of which worked. A modern cellphone gets its time from the cellular network, so basically the problem was on their side.

That evening Dominique told me her work cell would have the wrong time in the morning, but the right time during the day. Faulty cell tower, I blame you!

My support call was eventually escalated to an engineering ticket, they said 24-48 hours to fix. As I was making bread just now, my cell did the looking-for-server-beep, well, several of those beeps, and now it has got the time right.

Which I guess means they just rebooted the tower.

All this sort of makes me wonder: am I the only person in North Hatley and Katevale to notice that their cell was out of sync? If not, why didn't anyone phone in before me? If I was the only one to notice, wtf you people?

2010/11/12

Server-side javascript, yep!

Turns out the solution to server-side Javascript is V8, NodeJS, node-mysql and express. When I get a few spare tuits I'm going to see how all this fits together.

EDIT: Also dnode, expresso

Woah!

So I bit the bullet and installed MSIE 8 on my Windows VM. A task that would have been made easier if the download page didn't crash MSIE 6. Brilliant move there.

Today I was poking around a bit and discovered some startling things: A JavaScript debugger, logging console, CSS and DOM viewer and more! WHO'S A HAPPY PROGRAMMER NOW?

Mind you the MSIE Dev Tools are equivalent to a 3 year old version of Firebug. And the dev tools are pretty slow, about as slow as a ... a very slow thing. I'm going to have to up the resources to that VM.

And of course, this being MS, they go and muck up the console.log API.

The following works as expected:

console.log("Something");

This next snippet however fails:

console.log.apply(console,arguments);

Someone please track down the programmer who wrote this and all the team leads and suits who signed off on it and SLAP THEM UPSIDE THE HEAD! .apply has been part of JavaScript since 1.3, which was released back in the previous millennium. So we get yet another arbitrary workaround for MSIE.

Because I can't just go sprinkle console.log() throughout the code; many/most users will not have Firebug or MSIE 8 installed.

function fb_log () {
    if( window['console'] && window['console']['log'] ) {
        if( window['console']['log']['apply'] ) {
            console.log.apply( console, arguments );
        }
        else {
            // Assume this is the MSIE 8 console
            console.log( fb_format( arguments ) );
        }
    }
}

Firebug has printf-like formatting of console.log output. For MSIE 8 we have to do it by hand:

function fb_format ( args ) {
    var N = 1;
    var string = args[0];
    if( typeof string == "object" && string !== null ) {
        return fb_object( string );
    }
    // Replace each %s, %d or %i with the next argument, in order.
    // String() copes with missing, null and non-string arguments.
    return String( string ).replace( /%([sdi])/g,
            function (str, p1) {
                return String( args[N++] );
            } );
}

Firebug also has some magic for logging objects:

function fb_object (obj) {
    var a = [];
    for( var k in obj ) {
        if( typeof obj[k] == 'string' ) {
            // Quote strings, escaping any embedded double quotes
            a.push( k+': "'+obj[k].replace(/"/g, '\\"' )+'"' );
        }
        else {
            // String() also handles null and undefined, .toString() does not
            a.push( k+': '+String( obj[k] ) );
        }
    }
    return "{ "+a.join( ', ' )+" }";
}
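To try the fallback path without an MSIE 8 at hand, something like the following should do. It is only a sketch: makeLogger and the stub console are names I made up for the example, and the stub fakes the missing .apply by masking it. It joins the arguments with spaces rather than doing the printf-like formatting.

```javascript
// Hypothetical helper: wraps any console-like object in a logging function.
function makeLogger(con) {
    return function () {
        if (!con || !con.log) {
            return;                        // no console at all: stay silent
        }
        if (con.log.apply) {
            con.log.apply(con, arguments); // Firebug-style console
        } else {
            // Console whose log() has no .apply: pass one joined string
            con.log(Array.prototype.slice.call(arguments).join(" "));
        }
    };
}

// Stub standing in for a console whose log() lacks .apply
const lines = [];
const stubLog = function (msg) { lines.push(msg); };
stubLog.apply = undefined;                 // simulate the missing method
const log = makeLogger({ log: stubLog });

log("loaded", 3, "widgets");
console.log(lines);                        // [ 'loaded 3 widgets' ]
makeLogger(undefined)("ignored");          // no console: nothing happens
```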

2010/11/05

Large projects

While chatting with some friends last weekend, someone asked "What ever happened to the gov't plan to computerise hospital records?" Or something equivalent in conversational French.

My prediction: this will take 5-10 years to happen, will cost 5 times the initial budget, will cause a huge amount of grief for doctors and nurses, will have some really stupid implementation details and will have back doors the size of semi-trailers.

I used to say that a web site can cost you 500$ if you get the boss's cousin to do it, 5000$ if you get me to do it or 500,000$ if you go like Archambault and get CGI to do it. Obviously each of us will supply different levels of support and so on. But they won't necessarily produce better websites. Where better means more profits.

The big thing distinguishing the different levels is the sales force. The boss's cousin just has to convince the boss at Christmas time that he can do the job. If I'm doing the job, you probably heard of me through word of mouth, then I show up with long hair, trimmed beard, 3 piercings in my left ear and in my normal-person-disguise. But the half-million dollar level needs some sort of bidding or negotiation. And an entire sales staff who dress to the nines, take the boss out to supper, tell all the right jokes and smile the entire time. These people cost money so out of the half million a lot of that is going to the sales staff. But of course they still hire recent grads to reinvent the wheel badly and have it run on Windows. To keep costs down you see.

Now the hospital records digitisation project is going to cost on the order of billions of dollars. This will require more than just good jokes; I will bet you any money that bribes or almost bribes or "nobody broke the law" type bribes are going to happen. Things like handing a politician a credit card or paying for his daughter's private school.

Or put it another way, if I show up to build a web site, we are roughly on the same level: Small business owners. But selling a project to Monsieur le Ministre de la Santé et des Services sociaux, Madame la Ministre déléguée and Monsieur le Sous-Ministre requires a sales force that is paid an order of magnitude more. So a lot of the money goes into suits-and-ties, fancy offices and all the overhead of "being an important business".

And there's also the Collège des médecins du Québec and the Ordre des infirmières et infirmiers du Québec, the ones who are going to actually have to use the system. And the institutional inertia of the CMQ is staggering. The number of road blocks and unnecessary requirements they are going to throw in the way of this is going to be impressive.

Now I'd love to design something like this. The security requirements and privacy requirements would be difficult. Who would be allowed to access or modify something? Availability is also going to be tough. The doctor must be able to get at the dossiers even if the Internet has failed, the power has failed or the doctor lost his crypto-token (or whatever).

What's more, getting all the current dossiers digitized will be a herculean task. And any notes the doctor writes down or dictates (you can't expect them to type, now) will have to also be digitized rapidly. Ideally not the way it is done in the USA; several levels of sub-contracting until eventually it is done by someone working for peanuts in India.

However, I wonder if this really has to be done.

When Dominique was pregnant, we initially went to a doctor, before getting a midwife. We had to transfer Dominique's dossier to the midwife. In my mind, this required getting a CD-ROM or something or maybe an rsync. I mean I've written and maintain a document archive system. I had a "WTF? Oh of course, right!" moment when I realised it meant lugging around a small armload of dead trees, something that hasn't changed much since the dawn of modern medicine.

All this is not to say I wouldn't enjoy designing such a system. In fact, at one level it resembles the large contract that Louis is trying to get us. So it wouldn't be wasted effort. Maybe in a future post.

One for the Google bot

If you get the following really strange message :

Can't coerce array into hash at /usr/lib/perl5/5.8.8/ExtUtils/Install.pm line 94.

Just do

touch Makefile.PL; make ; make install

Yes you need 2 makes: first one will cause Makefile to be rebuilt with the same params as used the first time, second one will do the install you wanted in the first place.

Now the longer question is "WHAT CHANGED?" And I have no answer to that.

2010/11/01

Programming languages

Like most programmers of my generation, I first learned BASIC. On an IBM PCjr called Tommy to be exact. November 1984 I went on a 2 week vacation of sorts in France. I took as reading material the hard cover 3-ring binder of IBM BASIC that came with the computer. While on the plane, I tried to work out the equations for projecting 3d objects onto a 2d surface (ie, the screen). I failed.

A year or so later I got my hands on TurboPascal and had great fun with that, especially the better and faster graphics handling.

My first summer job after high-school was programming dBase IV for the North Hatley Library.

In CEGEP (1987-1992), I learned Z80 and 8088 assembler, Forth and C and C++. I also pulled apart a BASIC port of the venerable Star Trek text game and ported it to QuickBASIC, expanding it and giving it a HUD as I did.

At some point I found 2 large books about AI in my father's office at the university. I read as much of them as I understood. I don't remember if I brought them to Bolivia in 1992 or not, but I did try to figure out how to encode AI into a game that would be an extension of the Star Trek text game, which would happen on a randomly generated planet surface. I never coded them up, but I still probably have the notes somewhere.

While in Bolivia, I did some dBase coding and messed around with Fortran so I could play with some 24bit graphics hardware attached to a GIS system. I learned just enough Fortran that I decided I very much disliked Fortran.

After CEGEP, I did an internship at the MTQ where I encountered Clipper, a compiled implementation of the xBase language, with some very powerful extensions. In 1995, I switched to Windows 95 because even if I disliked the GUI, compiling my projects was an order of magnitude faster.

In 1997 I decided to hitch my wagon to the rising wave (if you will permit a very mixed metaphor) of the Web in Quebec. Which at that time meant programming in Perl. A language I fell in love with very quickly and have stuck with pretty much since. I played around with Java when it first came out. It looked to be very interesting. But the overhead in terms of setting up classes was so annoying and Perl was so smooth, I gave up on Java.

I spent the week before my 30th birthday learning XML, XSLT and XPath for a project.

Between 1995 and 2005 I went from Windows 95 to dual booting Windows and Linux to using Linux 24/7. I don't remember the exact date though.

2 years ago I wrote POE::XUL, which required me to learn JavaScript properly. I'm pleased to see that it has mostly grown up into a real language, despite MS's attempts to sabotage it. The project I wrote based on POE::XUL required me to learn PRO/5 Business Basic. Writing all 3 in the same project was mind bending at times.

Of all the languages I've used, Clipper, JavaScript and Perl stand out as the most pleasing to use. And Clipper is pretty useless these days, unless someone manages to find me and gets me to maintain some legacy system. Or port it to Linux which would be a nice contract.

I have fond memories of Forth, how small and highly modular it was. But looking at Forth code now just gives me a headache.

Of all the languages I've mentioned, C++ stands out as being annoying, misguided and plain stupid at times. Which is surprising as some very smart people put their minds to creating it.

One regret I have is never learning LISP or another functional language. Erlang in particular gives me a thrill.

2010/10/25

More java

jBoss looks to be amazingly amazing. It's nearly exactly what I want in a Web Framework. In fact, it's close to what I thought JAAS would become, but never did.

It is in Java, though. And I think I'd rather stick my hand down my throat and rip out my lungs than live with Java, day in day out.

But maybe I could spend a lot of time finishing JAAS and turning it into what I wanted it to be in the first place. Or, if the big contract dream becomes the Big Signed Contract I could hire some PFYs to do it while I cracked a whip. Yeah, that would be awesome.

TODO on JAAS:

Documentation;
Work Log4perl in;
Reflex layer;
Much better object lifetime;
Much faster/better session lifetime;
Speed! (See previous 2 items);
The Widget to HTML rendering stuff needs to be simplified and dekludged.

That, as a rough beginning...

2010/10/21

IPv6 right now.

Reading slashdot I saw an article about the coming IPv4 Apocalypse. So I figured I should spend some time getting up to speed. And when I say "getting up to speed" I think I really mean "hack at it a little."

First I need some routable addresses. My ISP at home is Bell, which is probably going to be the last ISP in the universe to hand out IPv6 addresses. So I need a 6to4 tunnel. I found a handy guide for Linode which basically said "get a free tunnel from HE." Which I did.

The following was added to /etc/sysconfig/network on the VM I wanted to be my IPv6 router

IPV6_DEFAULTDEV=sit1

IPV6FORWARDING=yes

IPV6_ROUTER=yes

Next I added the following to /etc/sysconfig/network-scripts/ifcfg-sit1

DEVICE=sit1

BOOTPROTO=none

ONBOOT=yes

IPV6INIT=yes

IPV6TUNNELIPV4=IPv4

IPV6ADDR=A1:B1:C1::2

Where IPv4 is the IP Bell gave me and A1:B1:C1::2 the client IPv6 address from HE.

ifup sit1

ping6 -n ipv6.chat.freenode.net

yay! It works.

Next, I want other computers to be able to access this tunnel. This took some messing around, but by blindly stabbing at it I got it going.

Then I asked for a /48, and added this to /etc/sysconfig/network-scripts/ifcfg-eth0

IPV6INIT="yes"

IPV6ADDR=A2:B2:C2::2

Where A2:B2:C2::2 is part of my Routed /48 from HE.

Then service network restart.

Next, on my desktop computer, I added the following to /etc/sysconfig/network

NETWORKING_IPV6=yes

Then to /etc/sysconfig/network-scripts/ifcfg-eth0

IPV6INIT=yes

IPV6ADDR=A2:B2:C2::6

Then:

ifdown eth0 ; ifup eth0

ping6 A2:B2:C2::2

YAY! That works.

But... what about routing and forwarding and so on? This is where I stabbed around blindly. The solution was to use radvd.

/etc/radvd.conf:

interface eth0
{
    AdvSendAdvert on;
    MinRtrAdvInterval 30;
    MaxRtrAdvInterval 100;
    prefix A2:B2:C2::/64
    {
        AdvOnLink on;
        AdvAutonomous on;
        AdvRouterAddr on;
    };
};

Start the service, then restart eth0 on corey, and bingo! I can surf to ipv6.google.com.

Todo : Setup reverse DNS for my /48. Setup ipv6 for all my computers. Figure out how to tell HE when Bell changes my IP. They have a tool to do this but my first attempt didn't work.

And of course understand what it is I'm doing. For instance, if IPv6 uses 128bit IPs, surely the /64 would be enough?
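On that last question, the arithmetic at least is quick to check. A rough sketch of plain prefix arithmetic, nothing specific to HE or radvd; the function names are mine:

```javascript
// Prefix arithmetic for IPv6: addresses are 128 bits wide.
const ADDRESS_BITS = 128n;

// Number of addresses inside one prefix of the given length
function addressesIn(prefixLen) {
    return 2n ** (ADDRESS_BITS - BigInt(prefixLen));
}

// Number of subnets of length subLen that fit in a prefix of length prefixLen
function subnetsIn(prefixLen, subLen) {
    return 2n ** BigInt(subLen - prefixLen);
}

console.log(addressesIn(64).toString());   // 18446744073709551616
console.log(subnetsIn(48, 64).toString()); // 65536
```

So one /64 is plenty of addresses for a single LAN. The /48 only starts to matter with more than one network segment, since stateless autoconfiguration (the AdvAutonomous bit in radvd.conf) expects each segment to have its own /64.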

2010/10/20

Server-side javascript

Louis is looking into getting the contract for a big project. A really big project. Big enough that we'd have to hire more programmers, and user-friendly phone answerers.

This got me thinking about what language I'd use. In the last 15 years, I've written pretty much everything in Perl. I really like Perl. However, there are many things that annoy me about Perl. Many of these things are solved in Perl 6 and/or Moose. But Perl 6 isn't ready and never will be, and Moose.... well let's just say that I like Perl!

Also if I'm hiring programmer(s) maybe Perl isn't the best language. This project will be used for 10 (read 20) years to come, by a diverse bunch of not very bright users. So what else would I want to program in? Erlang is dead cool, but not for common mortals.

I'll ignore Java as a bad joke, C++ or C wouldn't do for something of this complexity. Ruby, Python? I'd just as soon use Perl.

So how about Javascript?

Javascript is really my new favorite language. Remove the frustration of dealing with JS in MSIE, which is really the problem of MSIE's DOM implementation, and JS is a really nice language, with associative arrays, regexes, objects, inheritance, functions as data, closures, etc. It lacks low-level data manipulation like Perl's pack/unpack, and it has some silly legacy features.

What's more, if all validation is already written in JS, you can then do client-side and server-side validation of input for free.
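A minimal sketch of what I mean, assuming the same file gets loaded with a script tag in the browser and with require() on the server. The rule names and the sample fields are made up for the example:

```javascript
// Hypothetical validation rules, written once and loaded on both sides.
const validators = {
    // Canadian postal code, e.g. "J0B 2C0"
    postalCode: function (v) {
        return /^[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d$/.test(v);
    },
    // Non-empty after trimming
    required: function (v) {
        return typeof v === "string" && v.trim().length > 0;
    }
};

// Returns the names of the fields that fail their rule.
function validate(input, rules) {
    return Object.keys(rules).filter(function (field) {
        return !validators[rules[field]](input[field]);
    });
}

// Export only when a CommonJS-style module system is present (server side)
if (typeof module !== "undefined" && module.exports) {
    module.exports = { validate: validate, validators: validators };
}

const rules = { name: "required", zip: "postalCode" };
console.log(validate({ name: "Dominique", zip: "J0B 2C0" }, rules)); // []
console.log(validate({ name: "  ", zip: "12345" }, rules));          // [ 'name', 'zip' ]
```

The browser runs validate() before submitting to save a round trip; the server runs the very same function again because the browser can't be trusted.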

So how is JavaScript on the server done? Turns out there are many ways.

yum --enablerepo=rpmforge install js will install SpiderMonkey, and I guess you could use #!/usr/bin/js in a CGI. But in 2010, you don't want to be implementing a framework from scratch, including things like MySQL access.

Jaxer is very very cool looking. It hugely shrinks the distance between the client and the server, as it were. The browser's DOM is accessible from the server. Which is a very very cool idea that I've used in the past.

Jaxer goes one further too: the border between browser and server can be blurry when you use the runat="proxy-server" feature. It 'simply' turns a function call in the browser into a synchronous XMLHttpRequest which calls the function on the server. How cool is that? I'm going to have to implement something like this for POE::XUL.
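The general idea can be sketched in a few lines. This is not Jaxer's actual implementation, just my guess at the shape of it; makeProxy, the JSON envelope and the fake transport are all invented for the example. In a browser the transport would be a synchronous XMLHttpRequest.

```javascript
// Hypothetical sketch of the idea, not Jaxer's actual implementation.
// `transport` stands in for a synchronous XMLHttpRequest: it takes a JSON
// string and returns the JSON string the server answered with.
function makeProxy(name, transport) {
    return function () {
        const request = JSON.stringify({
            fn: name,
            args: Array.prototype.slice.call(arguments)
        });
        const reply = JSON.parse(transport(request));
        if (reply.error) {
            throw new Error(reply.error);
        }
        return reply.result;
    };
}

// Fake "server" so the example runs anywhere
const serverFunctions = {
    add: function (a, b) { return a + b; }
};
function fakeTransport(body) {
    const call = JSON.parse(body);
    const fn = serverFunctions[call.fn];
    if (!fn) {
        return JSON.stringify({ error: "no such function: " + call.fn });
    }
    return JSON.stringify({ result: fn.apply(null, call.args) });
}

const add = makeProxy("add", fakeTransport);
console.log(add(2, 3)); // 5
```

The caller never sees the round trip; add() looks like any local function, which is exactly the blurriness I like.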

One drawback of Jaxer is that it seems to need Aptana to compile. Having source code isn't very useful if you can't patch and rebuild it. Especially if you want to create an RPM. Or even do some heavy lifting in C++, and pipe-fitting in JavaScript. Another drawback is that Jaxer is dependent on Aptana to survive. Will Aptana be around in another 5 years?

The Apache foundation has been around for years and isn't going anywhere. They have bsf, which allows one to embed JS in a JavaBean, which brings us back to Java. On the one hand: Java! RUN AWAY! On the other hand: hiring Java programmers should be easy. And doing any heavy lifting in Java, then doing the high-level gluing in JavaScript might be OK.

Mind you, this is all speculation; we are still a year or more away from knowing if we get the contract.

A better backup

I do backups badly. Basically, rsync to a large partition somewhere. That's not really a backup. It protects against hardware failure, yes. But not against "oops! I deleted that file 3 weeks ago." What's more I'm sure I'm not doing it as well as it could be; by backing-up to a hard disk, why not backup the entire OS, and make the hard disk bootable? Would be complicated if multiple machines backup to one backup server, but for my clients, I most often have one server which backs-up to one set of removable disks.

What's more, I moved all my VMs from Jimmy to George yesterday. When I say "move VM" I should say "moved all the services to new CentOS 5 VMs." Which sort of shows up another problem: keeping track of what you've set up where and why. Jimmy had lighttpd running on it. Why? Oh... to see the RRD graphs of the temperature and humidity in the attic. I should document all this, now that I "know it" but ideally it should be automated.

And conformance tests; a bit like unit tests, you run some scripts to see if everything in the new install is working as expected. After all was done, I realised that I hadn't copied over my subversion repositories, nor set them up.

One central issue, I suppose, is config files. Ideally, you just copy in the backed-up config file, start the service, run the test script, verify success. I notice that rpm provides a --configfiles option. Combined with rpm's verify options, maybe one could detect what config files have changed and keep a backup set of them. Of course, things like /var/spool/hylafax/etc/config.ttyS0 would have to be added by hand. As would stuff installed by hand into /opt and/or /usr/local.

And a modified config file implies that the package is being used, so the package would get flagged as important. And then, maybe once a week say, you'd get email "hey, you don't have a conformance script for package X." Or "You didn't write a changelog for the latest changes to file Y."
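As a first stab at the "which config files have changed" part, here is a sketch that picks modified config files out of rpm's verify output. It assumes the usual rpm -V line layout (a flag string, an optional attribute letter where c marks a config file, then the path); the sample lines are made up, and changedConfigFiles is a name I invented.

```javascript
// Sample `rpm -Va` output, made up for illustration.
const sample = [
    "S.5....T.  c /etc/lighttpd/lighttpd.conf",
    "..5....T.    /usr/bin/something",
    ".......T.  c /etc/radvd.conf",
    "missing    c /etc/sysconfig/example"
].join("\n");

// Returns the config files whose contents differ from what the package shipped.
function changedConfigFiles(verifyOutput) {
    const changed = [];
    verifyOutput.split("\n").forEach(function (line) {
        const m = /^(\S+)\s+(?:([cdglr])\s+)?(\/.*)$/.exec(line);
        if (!m || m[2] !== "c") {
            return;                      // not a %config file
        }
        // "5" in the flag string means the digest (file contents) changed
        if (m[1].indexOf("5") !== -1) {
            changed.push(m[3]);
        }
    });
    return changed;
}

console.log(changedConfigFiles(sample)); // [ '/etc/lighttpd/lighttpd.conf' ]
```

Feed it the real output of rpm -Va and the result is the list of files to copy into the backup set. Files that are missing or only have a new timestamp are deliberately left out.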

2010/10/16

How not to spend a friday night.

A cascade of stupidity caused me to drive to Montreal and back on a Friday night. In the dark. In heavy rain. If you know me, this isn't my idea of fun. While I am slightly to blame, most of the blame is elsewhere.

First, power outage at a client's in St-Constant at 13h00 (roughly). DAMN YOU HYDRO. Though really, black-outs are par for the course. Client has a UPS though. BUT the USB cable went missing 6 years ago, so no way for the computer to turn itself off cleanly. DAMN YOU APC. Why not just put a USB-B port on the back of your damn UPSes instead of using a secret-sacred RJ45 with 10 pins that costs way too much? Oh, yeah, that would be why. And DAMN YOU JEAN-PHILLIPP, a UPS without apcupsd (or equiv) is less than useful.

So anyway, battery eventually drains, BAM! Hard shutdown. Power comes back at some point, but I suspect not very cleanly. The on-off cycling causes the BIOS to lose its settings. BLACK EYES TO YOU ASUS. SERIOUSLY WHAT THE EF?! Also: DAMN YOU APC AGAIN! The UPS should be smart enough to wait for a few seconds of clean power before passing power through.

But, now that the BIOS has been reset to showing the SATA as IDE drives, GRUB can no longer load stage 2. Which might be damn stupid, or unavoidable. The bug report to me is "GRUB " on the screen, nothing further.

And this is where my blame comes in; I know that ASUS motherboards can reset the BIOS. But I'd just had a problem with /boot on a RAID1, so I'm thinking that was the problem. I eat supper, drive 2 hours, boot the computer with Knoppix, reinstall grub so it can find stage 2. Reboot, yay grub! But then initrd can't find md0, which has VolGroup00 on it! WHA! It's there! KNOPPIX can find it, why can't you?

Messing around for a while until the light goes on: BIOS RESET BECAUSE ASUS HATES LIFE! OK, set the SATA back to AHCI. Boot again. Still no go. FAH!

And then the other shoe drops. One of those things that if you've never pulled an initrd apart and poked at the init script inside one, you wouldn't notice: init was looking for md0, but KNOPPIX was calling it md127. And it turns out that md devices have a preferred minor device number. So when KNOPPIX was calling it md127, it was writing that to the array. Which means when init was trying to activate md0, it goes BUH CAN'T FIND IT.

BLACK EYES TO YOU, KNOPPIX FOR CHANGING THAT! Seriously, changing the preferred name of an array is really bad form.

So how to change it back. First you deactivate the arrays :

mdadm --stop /dev/md126

mdadm --stop /dev/md127

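BUH! That last doesn't work; LVM is still holding a lock on the array. I strongly suspect that deactivating the volume group with vgchange -a n would be enough to drop the lock (vgremove would too, but only by deleting the volume group), but there's no way I'm going to test that on live data.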

So reboot, don't activate mdadm-raid. Do the following

mdadm --assemble --update=name --name=0 /dev/md0 /dev/sda2 /dev/sdb2 /dev/sdc2

mdadm --detail /dev/md0

mdadm --assemble --update=name --name=1 /dev/md1 /dev/sda1 /dev/sdb1 /dev/sdc1

mdadm --detail /dev/md1

The --detail lines should show Preferred Minor : 0 or 1 depending.

Note: Those are hairy commands. They could potentially kill your arrays if you get the partitioning wrong. DO NOT JUST CUT AND PASTE THEM IF YOU RUN INTO THE SAME PROBLEM AS ME! Read the docs, understand them, then adapt the commands to your setup.

PS : you pull initrd apart with

cd /boot

mkdir t ; cd t

gzip -dc ../initrd-$(uname -r).img | cpio -i

ls

Now go poke at init

2010/10/12

Reinventing the wheel. Badly

So, I had x11vnc all nice and tested with the TightVNC viewer on Linux and RealVNC on Windows, but then I get a bug report: user couldn't log on. -unixpw was displaying Username:, the user was entering a username, but pressing Enter wouldn't move to the Password field.

Looking at /var/log/x11vnc.log I see a bunch of:

12/10/2010 15:17:39 unixpw_keystroke: bad keysym4: 0xff8d

Looking in /usr/include/X11/keysymdef.h, I see that 0xff8d is XK_KP_Enter, that is, the Enter key to the right of most number pads. And not XK_Return, the Enter key that's just to the right of the alphabet.

Looking at x11vnc's source code, I see that

x11vnc/unixpw.c only checks for XK_Return and XK_Linefeed;
what's more, it reimplements the huge bloody-effing-inputting-text-with-editing wheel;
x11vnc's -remap function doesn't happen in the code path that leads to unixpw_keystroke, so it is bloody useless for this problem.

So the solution is to use TightVNC viewer on Windows. I already know that TightVNC is the better viewer for Linux. So now that's 2 out of 3.

Of course, the other solution would be to patch x11vnc. But I already have my time fully committed to reading Irregular Webcomic!

Third point being why was it working for my test setup but not in the field? Well, I had Windows Server 2003 as a VMware Server guest, via the VMware server console running on Linux. So something somewhere was remapping something somehow. "It is always possible to add another layer of indirection."

2010/10/08

Watch that command

I was reminded today of how useful watch is. I omitted it from my previous list of important and useful commands because I rarely use it. But today I had George open on the bench and was finding out which fan connectors on the motherboard corresponded to which speed sensor, as reported by lm-sensors.

watch -n 0 "sensors w83793-i2c-0-2f | grep fan"

Then, as I plugged and unplugged a 3-wire fan here and there, I could see on screen what was going on. This is especially important, cause you don't want to unplug the CPU fan for any length of time.

FYI, the sensor:fan port mapping for a DSBV-DX is as follows:

fan1  CPU_FAN1

fan2  CPU_FAN2

fan3  FRNT_FAN1

fan4  FRNT_FAN2

fan5  FRNT_FAN3

fan6  FRNT_FAN4

fan7  REAR_FAN1

fan8  REAR_FAN2

fan9  FBD_FAN1

fan10 N/C

fan11 N/C

fan12 N/C

2010/10/07

I take it back

The war on noise continues apace. Next up, remove Jimmy from service, replacing it with George.

# fdisk -H 224 -S 56 /dev/sdd  

# fdisk -H 224 -S 56 /dev/sde

# sfdisk -l /dev/sdd



Disk /dev/sdd: 182401 cylinders, 255 heads, 63 sectors/track

Warning: The partition table looks like it was made

  for C/H/S=*/224/56 (instead of 182401/255/63).

For this listing I'll assume that geometry.

Units = cylinders of 6422528 bytes, blocks of 1024 bytes, counting from 0



   Device Boot Start     End   #cyls    #blocks   Id  System

/dev/sdd1          0+ 233598  233599- 1465132900   fd  Linux raid autodetect

/dev/sdd2          0       -       0          0    0  Empty

/dev/sdd3          0       -       0          0    0  Empty

/dev/sdd4          0       -       0          0    0  Empty

sdd and sde are a pair of WD15EARSs, with 4k sectors. Normally I use sfdisk -d to copy partitions, but that failed on 4k blocks.
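Why -H 224 -S 56? As far as I can tell, the point is that every cylinder boundary then lands on a multiple of 4096 bytes, which the default 255/63 geometry can't manage. A quick sanity check of that arithmetic (the helper names are mine):

```shell
#!/usr/bin/env bash
# Sectors per cylinder for a given heads / sectors-per-track geometry.
sectors_per_cyl () { echo $(( $1 * $2 )) ; }

# A cylinder is 4k-aligned when its size in 512-byte sectors
# is a multiple of 8 (8 * 512 = 4096).
is_4k_aligned () {
    (( $(sectors_per_cyl "$1" "$2") % 8 == 0 ))
}

for geom in "224 56" "255 63" ; do
    set -- $geom
    if is_4k_aligned "$1" "$2" ; then
        echo "H=$1 S=$2: $(( $1 * $2 * 512 )) bytes/cylinder, aligned"
    else
        echo "H=$1 S=$2: $(( $1 * $2 * 512 )) bytes/cylinder, NOT aligned"
    fi
done
```

The 224/56 case works out to the same 6422528 bytes per cylinder that sfdisk reports above.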

mdadm -C -n 2 -l 1 /dev/md2 /dev/sdd1 /dev/sde1

pvcreate /dev/md2

pvs -o name,pe_start

vgcreate -s 32M T00 /dev/md2

lvcreate -l 99%VG --name LV00 T00

mkfs -t ext4 -E stride=32 -m 1 -O extents,uninit_bg,dir_index,filetype,has_journal /dev/T00/LV00 

tune4fs -c 0 -i 0 /dev/T00/LV00

I got the formatting commands from I Do Linux. I read up on all those mkfs options. I'd never have guessed they were the "best" options to use.

The tune4fs is basically turning off the fsck that happens automatically every X days or Y reboots; fsck is SLOW, and the automatic fsck always happens when you least want it. And with journals, UPSes and so on, an unsafe shutdown isn't supposed to happen.

I left some space on the VG free so I could do snapshots.

But then I got to thinking. Blocking off one large LV means that the file server VM gets access to the entire disk and all that space can't be used by a different VM or the host for another purpose. With LVM, I can grow the LV if I need it. So why not give it 500G at a time?

e4fsck -f /dev/T00/LV00 

resize4fs -p /dev/T00/LV00 500G 

lvresize -L 500G -t -v /dev/T00/LV00 

resize4fs -p /dev/T00/LV00

This was as much a test of ext4 resizing as anything else. And it worked flawlessly. Btw, fsck on an empty FS is fast.

So what does it look like:

# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/G00-root  572G  537G  5.8G  99% /

/dev/md0              950M   31M  870M   4% /boot

tmpfs                 5.9G     0  5.9G   0% /dev/shm

/dev/mapper/T00-LV00  493G  493G     0 100% /T00

WHAT 100% FULL ALREADY?

Well, no. I wanted to make sure the 500G as resize4fs sees it and 500G as lvresize sees it were the same thing, so I filled the FS to the brim:

# dd if=/dev/zero of=/T00/HUGE

dd: writing to `/T00/HUGE': No space left on device

1031717425+0 records in

1031717424+0 records out

528239321088 bytes (528 GB) copied, 6608.14 seconds, 79.9 MB/s

See that? Bear in mind that this dd was happening during the initial RAID build. I hypothesize that the mobo in Corey (ASUS M2N-MX SE), which I was doing the previous tests on, SUCKS. And that the mobo in George (ASUS DSBV-DX) does not.

# time rm -f /T00/HUGE

real    0m31.020s

user    0m0.000s 

sys     0m20.302s
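For the record, the numbers from that dd run can be cross-checked against the 500G LV. A small sketch; bash only does integer maths, so awk handles the fractions, and the gib and rate_mb helper names are mine.

```shell
#!/usr/bin/env bash
# Bytes to GiB (2^30 bytes), one decimal.
gib () { awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 2^30 }' ; }

# Bytes and seconds to MB/s (10^6 bytes, as dd counts), one decimal.
rate_mb () { awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1e6 }' ; }

bytes=528239321088      # from dd's summary line
seconds=6608.14         # ditto
lv_bytes=$(( 500 * 1024 * 1024 * 1024 ))   # 500G, assuming LVM means GiB

echo "written: $(gib "$bytes") GiB into a $(gib "$lv_bytes") GiB LV"
echo "rate:    $(rate_mb "$bytes" "$seconds") MB/s"
```

dd's "528 GB" is decimal, so it comes out a little under 500 GiB; the gap is presumably what ext4 keeps for itself.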

2010/10/06

Neat bash tricks

One idiom I like is:

ssh $host "commands;" 2>&1 | while read line ; do

    # react to any error messages or messages from commands in $line

done

For instance, say you were running x11vnc on a remote host. x11vnc has the annoying habit of using a port other than the one you specify, if the one you want is already taken. Very annoying. So:

port_re='PORT=([[:digit:]]+)'

ssh $host "x11vnc ...." 2>&1 | while read line ; do

    # regex kept in a variable; quoted inline, newer bash matches it literally

    if [[ $line =~ $port_re ]] ; then

        port=${BASH_REMATCH[1]}

        # now set up some sort of port forwarding so that $port is a sane, known port

        ssh -L "*:9600:localhost:$port" $host

    fi

done
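The port-scraping half of the idiom can be tried without a remote host. In this sketch fake_x11vnc is a stand-in I invented for the ssh command, and 5901 is just an example port. It also feeds the loop through process substitution instead of a pipe, so $port is still set once the loop ends.

```shell
#!/usr/bin/env bash
# Stand-in for `ssh $host "x11vnc ..."`: emits the kind of lines we care about.
fake_x11vnc () {
    echo "some chatter on stderr" >&2
    echo "PORT=5901"
}

# Keeping the regex in a variable sidesteps bash's changing rules
# about quoted patterns on the right-hand side of =~.
port_re='PORT=([[:digit:]]+)'

port=""
while read -r line ; do
    if [[ $line =~ $port_re ]] ; then
        port=${BASH_REMATCH[1]}
    fi
done < <(fake_x11vnc 2>&1)

echo "x11vnc ended up on port ${port:-unknown}"
```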

This has some problems, in that the second ssh can survive x11vnc exiting. I thought "hey, how about $!" but that has other problems; say the second ssh exits before its time. The PID you saved from $! could be reused. The kill you'd want to do would provoke hilarity. While complaining about this on IRC, a wise soul suggested that I open a lock file, and then any process with that file still open must be killed. I don't need locking, and didn't want to learn about flock in bash right away, so what I roughly did was:

child_kill () {

    local pid_re='^p([[:digit:]]+)$'

    if [[ ! $LOCKFILE ]] ; then

        return 0

    fi

    # -F p asks lsof for one "p<PID>" line per process holding the file

    lsof -F p "$LOCKFILE" | while read field ; do

        if [[ $field =~ $pid_re ]] ; then

            pid=${BASH_REMATCH[1]}

            if [[ $pid != $$ ]] ; then

                kill -HUP $pid

            fi

        fi

    done

    rm -f "$LOCKFILE"

}



local LOCKFILE=$(mktemp -p /tmp)

trap "child_kill" EXIT

ssh ... | while read line ; do 

    ....

    ( exec 123>$LOCKFILE

      ssh -L ..... $host &

    )

done

child_kill

I wish there was a better way to deal with lsof's output, but this works so why complain?
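For what it's worth, lsof also has a -t flag that prints bare PIDs, which might make the parsing unnecessary; I haven't tried it in this setup. Failing that, the parsing can at least live in its own function. A sketch, with pids_from_lsof being a made-up name:

```shell
#!/usr/bin/env bash
# Read `lsof -F p` style output on stdin and print the PIDs found,
# skipping the PID given as $1 (normally our own, $$).
pids_from_lsof () {
    local self=$1 field re='^p([[:digit:]]+)$'
    while read -r field ; do
        # Process lines look like p1234; ignore every other field type.
        if [[ $field =~ $re && ${BASH_REMATCH[1]} != "$self" ]] ; then
            echo "${BASH_REMATCH[1]}"
        fi
    done
}

# Usage: lsof -F p "$LOCKFILE" | pids_from_lsof $$ | xargs -r kill -HUP
```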

What's more, I wish I could use SSH's ControlMaster to make the second connection that much faster. But the quick testing I did with 4.9p1 failed. Bugger

2010/10/04

It's a server, not a telephone!

Dear the people at Red Hat,

I'm trying to set up a VMware server on top of CentOS 5.5. I want / to be LVM on top of RAID5, /boot as RAID also. I want as few packages installed as possible, only the bare minimum services running. This seems like a simple but widespread scenario.

But why did it take me 3 hours?

Why did LVM on RAID5 require a bleeding graphical install? This is a server we're talking about.

Why did a huge pile of useless stuff like bluetooth, pcsc, xfs (?!), nfs, cups, wpa_supplicant, tux, apache, nscd, IRDA get installed? If I need these things, yum allows me to install them in seconds. Instead I spent an hour hunting down those and other useless bits.

Maybe I'm just getting old. But I pine for Red Hat 9, where everything was small, simple, understandable. Now everything is 5 levels of indirection, lots of smoke and a dash of magic.

Grrrrrr!

-Philip

2010/09/30

Bash and dialog

First, I rant about dialog: WHAT THE HELL WERE THEY ON WHEN THEY WROTE THIS THING?! First, they make keyboard interaction completely different from what normal users would expect. And then they make it very very hard to actually get the results into a shell script. What's more, every time I write a new script that uses dialog, I realise it lacks many features I need. So I have my own forked / hacked version that I deploy. Thank God for Open Source Code.

Second, I give you some code snippets that will show you how to do simple things with dialog:

Say you want a menulist and want to know which one was selected:

resp="$( dialog ..... 2>&1 >/dev/tty)"

Here we assign a value to the variable resp. The "" preserves whitespace. The $() causes the value to come from a sub command. dialog ... is the "normal" dialog invocation. 2>&1 causes stderr to be sent to stdout, so that it ends up in resp. >/dev/tty causes dialog's stdout (its original stdout, that is, not the bits of stdout that are coming from stderr) to go straight to the controlling TTY, rather than ending up in resp, which would be silly.
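The order of those two redirections is the whole trick, and it is easy to convince yourself with a toy. In this sketch fake_dialog is a stand-in I made up, and a temp file plays the part of /dev/tty so it runs anywhere.

```shell
#!/usr/bin/env bash
# Stand-in for dialog: draws its "UI" on stdout, reports the choice on stderr.
fake_dialog () {
    echo "+--- pretend menu ---+"
    echo "choice-2" >&2
}

screen=$(mktemp)    # plays the role of /dev/tty

# Redirections are processed left to right: stderr is pointed at the
# command substitution's pipe first, and only then is stdout sent elsewhere.
resp="$(fake_dialog 2>&1 >"$screen")"
shown=$(cat "$screen")
rm -f "$screen"

echo "captured:  $resp"
echo "on screen: $shown"
```

Swap the two redirections around and resp comes back empty, because stderr then follows stdout to the "screen".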

Now, say you had a shell function called do_something. This function sets several variables as a side effect and takes a while doing so. Maybe you'd want to use dialog's --progressbox or --gauge so the user has something nice to look at while work is going on. If so you might try

do_something | dialog --progressbox

And you would quickly encounter failure; do_something ends up in a sub-shell, whence it can not set the variables in the main shell.

So, you spend time with the BashFAQ and on #bash complaining trying to find an answer. It turns out that FIFOs (aka named pipes) are the only way:

fifo=$(mktemp -u)

mkfifo $fifo

trap "rm -f $fifo" EXIT

dialog --progressbox "Wait" 12 40 <$fifo &

pid=$!

do_something > $fifo

rm -f $fifo

wait $pid

So, we create a randomly named FIFO. Run dialog in the background, reading from the FIFO. Run our function in the foreground, stdout going to the FIFO. When do_something is done, we remove the FIFO and wait for dialog to exit. Which it should, after the fifo has been removed. The trap makes sure the FIFO is deleted even if we exit early.

If you ask me, there's got to be a better way, one that doesn't involve mkfifo and does involve exec. But I can't get my head around exec.

EDIT: turns out there is a better way:

do_something > >(dialog .....)
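The difference between the pipe and the process substitution is easy to see with a toy version. Here cat stands in for dialog, and the do_something body is invented for the demo.

```shell
#!/usr/bin/env bash
# Stand-in for the slow function: sets a variable and prints progress.
do_something () {
    result="set by do_something"
    echo "working..."
}

result=""
do_something | cat >/dev/null          # function runs in a sub-shell
after_pipe=$result                     # so the assignment never got here

result=""
do_something > >(cat >/dev/null)       # function runs in this shell
after_procsub=$result                  # so the assignment sticks

echo "pipe: '$after_pipe'  process substitution: '$after_procsub'"
```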

A mixed bag

I ordered a BlacX SATA/IDE/USB docking station today. Should have bought one back when I first discovered them. Would have made all the farting around with Corey a lot less painful.

In other news, I discovered that getting xterm to use TrueType fonts is as simple as -fa NAME-PT. As in:

xterm -r -fa Consolas-17

Consolas, if you didn't know, is a nice little monospaced font that MS commissioned, tuned to LCD screens. The previous link downloads a bloody useless setup.exe. So I downloaded the .ttf from elsewhere.

Also, you might be interested in this review of programming fonts from 2007. Though he seems to think that 11pt is legible.

Finally, a snippet to get the current X resolution. I needed this to adjust the font size based on resolution. My users want the terminal to fill the screen.

read prop Xsize Ysize < <(xprop -notype -root -f _NET_DESKTOP_GEOMETRY 32cc ' $0 $1\n' _NET_DESKTOP_GEOMETRY)

echo "${DISPLAY:-Display} is ($Xsize,$Ysize)"

If you didn't know about Process Substitution you do now.
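Turning the resolution into a font size is then plain integer arithmetic. The rule below is hypothetical (Consolas-17 on a 1920-pixel-wide screen, scaled linearly), not the one I actually deployed, and the xprop output is canned so the sketch runs without an X server.

```shell
#!/usr/bin/env bash
# Hypothetical rule: 17pt suits a 1920-pixel-wide screen; scale from there.
font_size_for_width () {
    local width=$1
    echo $(( width * 17 / 1920 ))
}

# Canned stand-in for the xprop call.
read -r prop Xsize Ysize < <(echo "_NET_DESKTOP_GEOMETRY 1280 1024")

echo "xterm -fa Consolas-$(font_size_for_width "$Xsize")"
```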

2010/09/28

LTSP and x11vnc

A few years back, I wrote a system that ended up replacing about 200 terminals in 5 different offices with LTSP running on VMs. 2 and a bit years later, the last bunch of users were moved from terminals to LTSP. This time, they got FIT-PC2s as opposed to diskless PCs. Lucky them; less noise. But they are complaining that they find the mouse too unwieldy. And because they have locked down desktops, they don't have a config app to tweak the mouse acceleration and threshold. So I poked around and found that the settings are in:

/home/$USER/.config/xfce4/mcs_settings/mouse.xml

Now, that is hardly exciting.

What is exciting: I then wanted to set up x11vnc so that I could see what the users were experiencing. I've used x11vnc often on the same desktop, but my first attempt at doing it with LTSP failed. So I went on #LTSP to complain, er, look for a solution, which turned out to be -noshm.

They first complained that LTSP-4 (which I'm using) is far too archaic (which it is). They then got me on track for the next little bit of bash:

#!/bin/bash



USER=$1



error () { echo "$*" >&2; exit 5; }



if [[ ! $USER ]] ; then

    error "Usage: $0 user"

fi



userpid="$(pgrep -u $USER xfwm4)"

if [[ ! $userpid ]] ; then

    error "User $USER isn't logged in"

elif [[ $userpid == *$'\n'* ]] ; then

    error "User $USER is probably logged on several times: $userpid"

fi



userenv="$(tr '\0' '\n' < /proc/$userpid/environ | egrep '^DISPLAY=|^XAUTHORITY=')"



eval "$userenv"

if [[ ! $XAUTHORITY ]] ; then

    XAUTHORITY=/home/$USER/.Xauthority

fi

export DISPLAY XAUTHORITY



echo DISPLAY=$DISPLAY

echo XAUTHORITY=$XAUTHORITY



exec x11vnc -noshm -nopw

All the error checking might be hiding the really interesting bit: I'm pulling DISPLAY and XAUTHORITY from the user's environment, via /proc. Something I'd never thought of doing before.
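The /proc trick generalises to any variable, and it can be done without eval. A sketch, Linux only; env_from_pid is a name I made up. Note that /proc/PID/environ holds the environment the process started with, not anything it exported later.

```shell
#!/usr/bin/env bash
# Print the value of variable $2 from the starting environment of PID $1.
# Returns 1 if the variable isn't there.
env_from_pid () {
    local pid=$1 name=$2 entry
    # Entries in /proc/PID/environ are NUL-separated NAME=value strings.
    while IFS= read -r -d '' entry ; do
        if [[ $entry == "$name="* ]] ; then
            printf '%s\n' "${entry#*=}"
            return 0
        fi
    done < "/proc/$pid/environ"
    return 1
}

env_from_pid "$$" HOME || echo "HOME was not in this shell's starting environment"
```

You need to be root or the process owner to read someone else's environ.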

2010/09/24

Tools

This is a short list of commands and idioms that make life at the command prompt bearable. Some of them are trivial, but if you don't know they are there, you'll never find them. Reading their man pages is recommended.

screen

perl -lane 'short script'

find [....] -print0 | xargs -0 cmd

( set -e ; cmd1; cmd2; cmd3 ) && killall some-daemon

pstree

ps -C cmd

lsof -c cmd

ssh-keygen

pidof

chkconfig

$(( arithmetic ))

${var/match/replace}

lshw

for n in [...] ; do cmd $n ; done

some-command | grep | while read n ; do cmd $n ; done

lshw combines lspci, lsusb, lshal, lspcmcia and more. Forgot what motherboard a server has? lshw knows. What size/manufacturer/number of DIMMs you installed? lshw knows. What version of BIOS? lshw might know. Under CentOS you have to get it from RPMForge.

One should learn variable expansion syntax if one spends any time with bash.

And yes, I do type while and for loops at the command line.
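A taste of the expansion and arithmetic idioms from the list, for the impatient. The file name is just an example value.

```shell
#!/usr/bin/env bash
img=initrd-2.6.18.img            # example value, not a real file

echo "${img/initrd/vmlinuz}"     # replace the first match
echo "${img%.img}"               # strip a suffix
echo "${img#initrd-}"            # strip a prefix
echo "${unset_var:-fallback}"    # default when unset or empty
echo $(( 224 * 56 * 512 ))       # integer arithmetic
```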

2010/09/23

Silence!

Newegg.ca sent me email yesterday with a tracking number. Click on it and BUH "we have no information about this package." Yes, you are idiots.

But this morning, I try again at 8h00. And BUH PACKAGE ON TRUCK IN SHERBROOKE!

So I don't go back to bed, wait impatiently for the truck to show up. Three cheers for eCommerce!

I give up

So I've worked out how to get partitions aligned on 4k boundaries (WDC WD15EARS-00Z), for ~60 MB/s. But if I add an ext3 FS on top, speed drops to 20 MB/s. With LVM or without.

This makes no sense

The great big huge 1.5 TB drive I bought for Corey is a WD15EARS. One thing all those letters mean is that it's an Advanced Format drive. Which is Western Digital Marketing Speak for "we use the new, modern 4k sector size." But of course the drive reports its block size as 512 bytes, because otherwise it would fail under Windows. So hoops must be jumped through.

And some of them make no sense.

The parable of the ax.

You know the old parable : I've got this great ax. Had it for years. I've changed the handle 5-6 times. Only changed the head twice though. Still works great!

This also applies to computers. Corey (my desktop computer WAKE UP WAKE UP! DARLING COREY! HOW CAN YOU SLEEP SO SOUND?) started life as a Compaq Presario running WinXP. I never used the original mouse or keyboard. The heatsink clips broke, so I had to replace the mobo, CPU and RAM. I've replaced the CD-R drive with a DVD-R. And now I've replaced the hard drive with one 100 times bigger and upgraded the OS from Fedora 5 to CentOS 5.5.

Basically all that's left of the original computer is the case and power supply.

So, do I change the computer's name? My feeling is "yes." Mainly because an OS upgrade is a large change in a computer's personality. But on the other hand, changing the name might mean going through a bunch of forgotten config files and changing it here and there too. Mind you, the important ones prolly just use the IP.

But on the gripping hand, I'm lazy. And I'll get to tell the Ax Parable during cocktail parties or other inappropriate moments.

2010/09/15

SSDs are not ready

The drive on Corey is/was in the process of dying. It is going on 7 years old, has been spinning at 7200rpm every hour of most days for those 7 years. I don't begrudge it failing. It did mean I had to spend 2-3 days reconfiguring my daily use computer. Annoying. More details later.

I bought a Western Digital Green 1.5TB drive. 100 CAD! Cheap! It's only 5400rpm, in the hopes it will make less noise. But it's still too noisy. Not so much rotation whine but an annoying click every 4-5 seconds. Something I had back in the day with SCSI drives, but I don't think the old IDE drive had this problem. So either I get an SSD or a fit-pc2 to put it in. Or boot LTSP from the basement.

Also, Jean-Phillipp brought up that maybe we should look into moving our clients' MySQL DBs to SSD. Even at the high prices of SSD, it would cost less than what we are currently doing, which is throwing RAM at the problem. Also, there's only so much RAM that a server can hold. And with datasets getting larger than 33 GB, that approach has reached its limit. However, it turns out that SSDs, along with being expensive, aren't quite ready for prime-time.

The name

This blog's name, User Tolerant Liveware, comes from Doonesbury, by way of Daniel. Way back in the day, there was a Doonesbury cartoon with "Ah, the User Friendly Liveware" as the punch-line. Years ago, Susan referred to me as such to Daniel. He harumpfed and responded to the effect that I was closer to "User Tolerant" than "Friendly."

Can't say I disagree.

Anyway, this is where I plan to complain about computers, so many posts will be technical. It also gives me a blogger account so that I can contradict Daniel.