FreeBSD 6.3-RELEASE on Tyan TA26 B5397 barebones

18 01 2009

I have several Tyan TA26 B5397 barebones (with the latest BIOS) that need to run FreeBSD 6.3-RELEASE (32-bit). The installation went smoothly, but after I finished the post-install setup and exited the setup program, the machine wouldn’t reboot.

Initially I thought this was a minor issue, but even after the system had booted up normally I couldn’t restart it with reboot, shutdown, or init 6. It would hang after the “Uptime: …” line and before the “Rebooting..” line. I tried 2 other barebones, thinking it might be a hardware issue, but got the same results. This looks like a bug in the FreeBSD 6.3-RELEASE kernel.

I tried installing FreeBSD 6.4-RELEASE (32-bit) and it restarted properly. However, the server has to run 6.3, as 6.4 is not acceptable yet due to software compatibility issues. I was going to update the system to 6.3-STABLE to see whether the issue had been addressed in the -STABLE branch, but when I recompiled the kernel with PAE (in addition to SMP) support, still using the -RELEASE kernel source tree, the reboot problem disappeared. YAY! I got the fix by accident!

I’m not sure what is wrong. Judging from some Google searches, similar reboot problems have been popping up at random since FreeBSD 5.X, and browsing the FreeBSD PR database didn’t help much either. If you know exactly what is wrong, please share it here to help others with a similar problem.



errno.h problem

18 01 2009

If you see an error like the following when compiling, then most likely it’s the errno.h problem:

/usr/bin/ld: errno: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in envdir.o
/lib64/libc.so.6: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [envdir] Error 1

If there is a conf-cc file, then add “-include /path/to/errno.h” to the gcc line. Normally the file is at /usr/include/errno.h.
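The edit can also be scripted. The following is only a sketch in Python, assuming the compiler command sits on the first line of conf-cc (as it does in the packages I have seen) and that errno.h lives at /usr/include/errno.h; the function name add_errno_include is made up for illustration.

```python
def add_errno_include(conf_cc_text, errno_path="/usr/include/errno.h"):
    """Append '-include <errno_path>' to the compiler line of a conf-cc file.

    Assumption: the first line of conf-cc holds the gcc command and the
    remaining lines are comments that must stay untouched.
    """
    lines = conf_cc_text.split("\n")
    flag = f"-include {errno_path}"
    # Only add the flag once, so the function is safe to run repeatedly.
    if flag not in lines[0]:
        lines[0] = f"{lines[0].rstrip()} {flag}"
    return "\n".join(lines)


# Hypothetical conf-cc content, for illustration only.
SAMPLE_CONF_CC = "gcc -O2\n\nThis will be used to compile .c files.\n"
PATCHED_CONF_CC = add_errno_include(SAMPLE_CONF_CC)
```

Write the returned text back to conf-cc and run make again.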

References:
http://cr.yp.to/docs/unixport.html#errno



Linux Source-based Routing

7 12 2008

References:
http://www.widianto.org/2006/08/18/source-based-routing/



ext3 write barriers and write caching

7 12 2008

I was reading some articles on ext4 just now, about its advantages over ext3, which is the most popular Linux file system today. Performance-wise, ext4 is reportedly faster at handling large files but does not provide significant improvements in real-world tasks. An ext4 file system can also be mounted as ext3, as long as extents are not used in that particular file system.

Of the few advantages ext4 has over ext3, one feature is very useful for ensuring the integrity of the file system: journal checksumming.

Quoting Wikipedia’s article on ext3:

Ext3 does not do checksumming when writing to the journal. If barrier=1 is not enabled as a mount option (in /etc/fstab), and if the hardware is doing out-of-order write caching, one runs the risk of severe filesystem corruption during a crash.

Consider the following scenario: If hard disk writes are done out-of-order (due to modern hard disks caching writes in order to amortize write speeds), it is likely that one will write a commit block of a transaction before the other relevant blocks are written. If a power failure or unrecoverable crash should occur before the other blocks get written, the system will have to be rebooted. Upon reboot, the file system will replay the log as normal, and replay the “winners” (transactions with a commit block, including the invalid transaction above which happened to be tagged with a valid commit block). The unfinished disk write above will thus proceed, but using corrupt journal data. The file system will thus mistakenly overwrite normal data with corrupt data while replaying the journal. There is a test program available to trigger the problematic behavior. If checksums had been used, where the blocks of the “fake winner” transaction were tagged with a mutual checksum, the file system could have known better and not replayed the corrupt data onto the disk. Journal checksumming has been added to EXT4.

The ext3 barrier option is not enabled by default on almost all popular Linux distributions, and thus most distributions are at risk. In addition, filesystems going through the device mapper interface (including software RAID and LVM implementations) may not support barriers, and will issue a warning if that mount option is used. There are also some disks that do not properly implement the write cache flushing extension necessary for barriers to work, which causes a similar warning. In these situations, where barriers are not supported or practical, reliable write ordering is possible by turning off the disk’s write cache and using the data=journal mount option.

ext3 apparently has a feature called write barriers, which can help maintain integrity without journal checksumming, but there are conflicting reports about its performance impact.

Quoted from Andreas Dilger in a mail to the ext3-users mailing list in May 2007:

Ideally, the jbd layer could be notified when the transaction blocks are flushed from device cache before writing the commit block, but the current linux mechanism to do this (write barriers) sucks perforance-wise (it sent throughput from 180MB/s to 7MB/s when enabled in our test systems). It was better to just turn off write cache entirely than to use barriers.

This contradicts the following post by specialj:

Well, I think I see where ext3 gets its reputation for slow deletes. With the write cache off the delete performance is terrible, nearly 70% lower. It’s clear that enabling write barriers does something as the numbers are lower on a number of items (though not all). However it’s clear that write barriers is minor loss of performance compared to turning the write cache off. I think this leads me to consider how many servers I can run with just ext3 and md raid1 so as to keep the write cache enabled and the filesystem safe. I’ll have to weigh the performance gains against the benefits of using LVM (especially snapshots) and dm-crypt (which might have limited benefits on a server anyway).

Maybe some improvements have been made to the write barriers feature of ext3 since Andreas Dilger’s post, but I couldn’t find anything on Google about it. I couldn’t even find a website with proper documentation on ext3’s write barriers. At least now I have specialj’s Bonnie++ test results from 2 months ago, and they are good enough to convince me to enable write barriers on all of my servers which run md raid1.
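Enabling barriers means adding barrier=1 to the mount options of each ext3 entry in /etc/fstab. Below is a rough Python sketch of that edit, not a tested admin tool: the function name enable_ext3_barriers and the sample entries are made up, and it leaves alone any entry that already carries a barrier option.

```python
def enable_ext3_barriers(fstab_text):
    """Return fstab_text with barrier=1 appended to ext3 mount options."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Real entries have at least: device, mount point, type, options.
        is_entry = len(fields) >= 4 and not fields[0].startswith("#")
        if not is_entry or fields[2] != "ext3":
            out.append(line)
            continue
        options = fields[3].split(",")
        # Respect an explicit choice such as barrier=0 or barrier=1.
        if any("barrier" in opt for opt in options):
            out.append(line)
            continue
        fields[3] = ",".join(options + ["barrier=1"])
        out.append("\t".join(fields))
    trailing = "\n" if fstab_text.endswith("\n") else ""
    return "\n".join(out) + trailing


# Hypothetical fstab for a server with / and /boot on md raid1.
SAMPLE_FSTAB = (
    "/dev/md1 / ext3 defaults 1 1\n"
    "/dev/md0 /boot ext3 defaults,barrier=0 1 2\n"
    "# a comment line\n"
    "/dev/sda3 swap swap defaults 0 0\n"
)
UPDATED_FSTAB = enable_ext3_barriers(SAMPLE_FSTAB)
```

After remounting, check the kernel log: as the Wikipedia quote notes, a warning is issued when the underlying device does not support barriers.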

Thanks, specialj! :)

References:
http://hightechsorcery.com/2008/10/evaluating-performance-ext3-using-write-barriers-and-write-caching
http://hightechsorcery.com/2008/06/linux-write-barriers-write-caching-lvm-and-filesystems
http://archives.free.net.ph/message/20070518.190346.7a4c0f9f.en.html



Painfully slow CentOS system

6 12 2008

Today I was installing CentOS 5.2 on a Tyan barebone server with an Intel Xeon X3220 processor, 2GB of RAM, and 2 Western Digital SATA II 250GB HDDs. I chose software RAID 1 for /boot and /. I didn’t create RAID 1 for the swap partition because it is useless and might in fact slow down performance. The whole installation took more than 2 hours, even with a bare-minimum package selection. Formatting the / partition alone took quite a while.

Once the installation was complete, the server rebooted into a painfully slow newly installed CentOS system. The keyboard response was as slow as if I were connected to a remote server with 1 second of latency. I wasn’t happy, because this is a quad-core server with an acceptable amount of RAM. When I looked at /proc/mdstat to see how the software RAID was doing, I noticed that the sync speed was only around 1000KB/s. I looked for solutions on Google and found a few suggestions (see the references below), but they improved neither the sync speed nor the slow response I was getting when typing on the console or remotely.

I then somehow found another possible solution; I can’t remember how I got to it (thanks to nixCraft/CyberCiti). I hadn’t noticed that the hard drives were detected as hdX (IDE/PATA) instead of sdX (SATA). This makes sense, because PATA is much slower than SATA. It turned out to be an issue with the hardware detection process: CentOS detected the HDDs and used the PATA driver instead of the SATA driver, so the devices were named hdX and treated as PATA, hence the slow sync speed. I made the BIOS changes mentioned in the solution and attempted another install. It was REALLY fast this time, and the software RAID 1 sync speed was over 70000KB/s.
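Both symptoms, the low sync speed and the hdX member names, are visible in /proc/mdstat, so a quick check could look like this Python sketch. The sample text is made up to resemble what I saw (it is not a saved copy of my output), and the function names are only for illustration.

```python
import re


def resync_speeds_kb(mdstat_text):
    """Return every resync/recovery speed found in mdstat text, in KB/s."""
    return [int(s) for s in re.findall(r"speed=(\d+)K/sec", mdstat_text)]


def legacy_ide_members(mdstat_text):
    """Return RAID members named hdX, i.e. disks on the legacy IDE driver."""
    members = re.findall(r"\b(hd[a-z]+\d*)\[\d+\]", mdstat_text)
    return sorted(set(members))


# Hypothetical /proc/mdstat content for a slow, misdetected system.
SAMPLE_MDSTAT = """Personalities : [raid1]
md1 : active raid1 hdc2[1] hda2[0]
      240121728 blocks [2/2] [UU]
      [>....................]  resync =  0.5% (1200640/240121728) finish=3982.0min speed=1000K/sec
"""

SPEEDS = resync_speeds_kb(SAMPLE_MDSTAT)
IDE_MEMBERS = legacy_ide_members(SAMPLE_MDSTAT)
```

On a real system, pass in the contents of /proc/mdstat instead of the sample. A non-empty list of hdX members on a SATA-only machine is the hint to go and check the BIOS settings.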

References:
http://www.centos.org/modules/newbb/viewtopic.php?topic_id=16178
http://lists.centos.org/pipermail/centos/2005-February/002068.html
http://www.ducea.com/2006/06/25/increase-the-speed-of-linux-software-raid-reconstruction/
http://www.cyberciti.biz/faq/linux-sata-drive-displayed-as-devhda/
http://www.nodeofcrash.com/?p=57



Hello Kitty Online on Mac OS X with CrossOver

1 11 2008

I wanted to play Hello Kitty Online (HKO) on my MacBook. I still have my Fujitsu laptop, but these days I mainly use my MacBook. Sanrio Digital, the developer of HKO, does not seem to have plans for a Mac version of the HKO client any time soon (as shown in the FAQ on HKO’s website). I saw The CodeWeavers Great American Lame Duck Presidential Challenge on one of the news websites (I forgot which), and registered on the website for a code to receive a legit unlocked copy of CrossOver Games (CXG) for Mac.

Once CXG has been installed on OS X, grab a copy of ie6setup.exe. Create a new win98 bottle in CXG, but don’t use the “Install Unsupported Software..” button, as it will automatically create a winxp bottle instead. Go to “Configure” in the menu bar and select “Manage Bottles..”. Click on the + button, assign a name for the new bottle, and set the “New bottle type” to “win98”. Click “Create” to proceed with the bottle creation. The new bottle will then appear in the list on the left-hand side of the window. Click on the newly created bottle, go to the “Applications” tab on the right-hand side, and choose “Install Software..”.

Now the installer window appears again, but this time it will not automatically create a bottle. Choose “Install Unsupported Software..” and click “Continue”. When it asks which bottle to install into, select “Other existing bottle” and click “Continue”, then “Install”. When prompted with the file dialog, choose ie6setup.exe. Follow the setup process until the “Finished” button becomes active.

IE6 needs to be installed because the HKO client requires a browser component for the Auto Updater, Item Mall, etc. Once IE6 has been installed into the bottle, the “Applications” tab will list 2 “Installed applications”: “Microsoft XML Parser (MSXML) 3.0” and “Microsoft Internet Explorer 6 SP1 and Internet Tools”. If there is nothing on the Applications tab, repeat the above steps and ensure nothing was missed or skipped.

Now to install the HKO client, click on the “Install Software..” button on the “Applications” tab. Choose the “Install Unsupported Software..”, click “Continue”, choose “Other existing bottle”, click “Continue”, and click “Install”. This time, choose the HKO installer file instead of ie6setup.exe. Follow the setup process, again up until the “Finished” button becomes active, then “Hello Kitty Online” will appear on the “Installed applications” list. To run it, choose the menu bar Programs, SanrioTown, Hello Kitty Online, Hello Kitty Online. It should start the Auto Updater and the update process commences. Now HKO is playable on Mac OS X!

Easy, huh?

Thanks a lot, CodeWeavers! :)

References:
http://forums.macrumors.com/showpost.php?p=6518773&postcount=138 (thanks for the hints, kkat69!)
http://www.macosxhints.com/article.php?story=2007012420511452



Reset MikroTik to default factory configuration

17 06 2008

Current MikroTik RouterBOARDs do not have a reset button for restoring the default factory configuration (if you find a button on your RouterBOARD, it’s not the reset-to-factory-default button). The user guide for each RouterBOARD, which MikroTik provides on routerboard.com, gives the identifier and location of the reset mechanism. Hint: look for a distinct hole on the RouterBOARD and short-circuit it.

I will post some pictures when I get the chance.

To reset RouterOS to the default configuration, execute the command /system reset-configuration. That should do the job. It backs up the current configuration prior to the reset (how convenient!), which helps in case you change your mind or reset by accident.



Taking screenshots in Mac OS X

26 05 2008

Since I don’t use them often, I keep forgetting the shortcut keys for taking screenshots in Mac OS X. Here is the list:

  • Command-Shift-3: Take a screenshot of the screen, and save it as a file on the desktop
  • Command-Shift-4, then select an area: Take a screenshot of an area and save it as a file on the desktop
  • Command-Shift-4, then space, then click a window: Take a screenshot of a window and save it as a file on the desktop
  • Command-Control-Shift-3: Take a screenshot of the screen, and save it to the clipboard
  • Command-Control-Shift-4, then select an area: Take a screenshot of an area and save it to the clipboard
  • Command-Control-Shift-4, then space, then click a window: Take a screenshot of a window and save it to the clipboard

References:
http://guides.macrumors.com/Taking_Screenshots_in_Mac_OS_X



Extending PPPoE access network with network bridge

20 05 2008

I once had to extend my PPPoE network over a wireless bridge, and I did not want to run or maintain 2 PPPoE servers. I failed to understand how PPPoE works and set up the wireless bridge without WDS. When one user logs in from the bridged side of the network, it works flawlessly. When more users try to log in, the user who logged in earlier gets disconnected. Apparently this is caused by a MAC address problem[1]. From the PPPoE server’s side, all users logging in from the bridged side of the network have the same MAC address, which is the bridge device’s MAC address. From the users’ side, the correct MAC address of every device on the other side of the network is visible. The PPPoE server gets confused when it sends PPPoE packets, because multiple users share one MAC address and the server has no way to direct a reply to an individual user.

[1]The wireless AP is connected to the side of the network where the PPPoE server is, and the wireless client is connected to the other side. Devices on the wireless AP’s side see the wireless client’s MAC address in place of the MAC address of every device behind the bridge, whereas devices behind the bridge see the real MAC address of every device on the wireless AP’s side. I believe that if I swap the wireless AP and client (so that the PPPoE server is connected to the wireless client instead), it may work properly, since the PPPoE server will then see the correct MAC address of every device over the bridge. Devices connected to the wireless AP will obviously see a single MAC address for the PPPoE server and for any PPPoE users on the wireless client’s side, but the most important thing is that the PPPoE users over the bridge can communicate properly with the PPPoE server. Since this is a one-to-many and many-to-one situation, this should work, but it would not work for a many-to-many situation. This explanation sounds quite confusing due to my limited English. If you could rewrite this part, please let me know.
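To make the problem easier to picture, here is a toy model in Python. It is only a sketch of the behaviour I observed, not how any real PPPoE server is implemented: I am assuming the server keeps one session per peer MAC address, and ToyPPPoEServer and through_bridge are made-up names.

```python
class ToyPPPoEServer:
    """Toy server that tracks one session per source MAC (an assumption)."""

    def __init__(self):
        self.sessions = {}  # source MAC address -> logged-in user

    def login(self, src_mac, user):
        """Log a user in; return the user whose session was displaced, if any."""
        displaced = self.sessions.get(src_mac)
        self.sessions[src_mac] = user
        return displaced if displaced != user else None


def through_bridge(logins, bridge_mac):
    """Model a non-WDS bridge: every source MAC becomes the bridge's MAC."""
    return [(bridge_mac, user) for _real_mac, user in logins]


# Made-up MAC addresses and users.
LOGINS = [("00:00:00:00:00:0a", "alice"), ("00:00:00:00:00:0b", "bob")]

# Users on the same segment as the server: distinct MACs, nobody is dropped.
direct = ToyPPPoEServer()
DIRECT_DISPLACED = [direct.login(mac, user) for mac, user in LOGINS]

# Users behind the bridge: both arrive with the bridge's MAC address.
bridged = ToyPPPoEServer()
BRIDGED_DISPLACED = [
    bridged.login(mac, user)
    for mac, user in through_bridge(LOGINS, "00:00:00:00:00:ff")
]
```

In the direct case nobody is displaced. In the bridged case the second login pushes out the first user, which matches what I saw.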



Windows Live Messenger/MSN Messenger sign-in problem

20 05 2008

If there is no problem connectivity-wise, check the system date and time. Most likely they are way off. Synchronize if necessary!

Third-party NTP clients (such as Automachron) are better than the one built into Windows, since they don’t need incoming port 123 (UDP) to be open.