Simulating slow WAN links with Dummynet and VMWare ESX
At work I’m developing a feature that needs to be tested across a variety of WAN links like DSL, cable modems, T1s, T3s, etc. There are commercial network impairment generators that do this for you, but they’re too expensive to be cost effective.
Fortunately I recalled that FreeBSD includes a module called dummynet, which interacts with the IP stack to queue and drop packets however you want. Using this in concert with VMWare ESX 3.5, I was able to set up an environment in which two Windows machines communicate with one another through a dummynet bridge, thus simulating any sort of network performance I want.
It was not easy to get going, particularly since much of the dummynet info on the ‘net hasn’t been updated to reflect the significant changes in FreeBSD 7. Since I might want to do this again someday, I’m recording the secret handshake required to get this working.
Setup
The initial setup was quite a PITA, primarily because FreeBSD’s bridging functionality changed drastically recently, and most of the docs on the web are still geared towards the earlier implementation. However, I finally got it going; the following describes my journey.
ESX Virtual Switches
There’s a default virtual switch, vSwitch0, on our ESX box, into which all VM NICs are plugged by default. I created another one, ‘Bandwidth Wilderness 1’, which is not associated with any physical NICs.
The Windows VM that will be on the ‘LAN’ is plugged into vSwitch0, and the Windows VM that will be on the ‘WAN’ is plugged into ‘Bandwidth Wilderness 1’ (or whatever you call yours).
Oh, and the promiscuous mode policy for both switches must be Accept, which is not the default. If you don’t set this, you’ll get some pretty fucking weird ARP resolution problems which will result in some pretty confusing behavior.
Firewalls
To simulate a secure WAN configuration, I’ve enabled the Windows Firewall on both sides, and removed all exceptions but Core Networking and the Visual Studio Debugger stuff. This will break things like file sharing and pings. Turn the firewall off temporarily if you’re having trouble with such things.
FreeBSD VM
You’ll need a VM with three NICs. They will appear in FreeBSD as em0, em1, and em2. em0 is the one the box will use to accept SSH connections and generally communicate with the network. em1 will be plugged into vSwitch0 just like em0, but it won’t have an IP address assigned. em2 will be plugged into Bandwidth Wilderness 1. Once the VM is configured right, it will bridge traffic between em1 and em2, thereby allowing any VM plugged in to ‘Bandwidth Wilderness 1’ to communicate with anything in vSwitch0 transparently, with all packets running through the FreeBSD machine. This is how dummynet is able to introduce packet losses and delays.
Custom kernel
You should start with a FreeBSD install that includes full kernel and userland sources, and a full development environment. Instructions for building a custom kernel are here. I called my custom kernel DUMMYNET instead of MYKERNEL. I used the following kernel options in the config file:
options IPFIREWALL
options IPFIREWALL_VERBOSE
options DUMMYNET
I left all the existing options values unchanged. I then built and installed the kernel per the instructions, and rebooted.
/etc/rc.conf
In FreeBSD much of the service enable/disable configuration resides in /etc/rc.conf. Here’s what I put in mine (not including default stuff that was already there):
# Enable ipfw
firewall_enable="YES"
firewall_script="/etc/rc.firewall.anelson"
# Set up a bridge between em1 and em2
ifconfig_em1="up"
ifconfig_em2="up"
cloned_interfaces="bridge0"
ifconfig_bridge0="addm em1 addm em2 up"
The above should be rather self-explanatory. The bridging bit is a little dodgy, but I pulled it right from the if_bridge man page.
/etc/rc.firewall.anelson
The firewall rules I used are stored in a new file. Make sure you chmod it +x so it’s executable; it is a shell script. All it does is run ipfw (IP FireWall) to load the rule set which routes the bridged traffic through dummynet:
#!/bin/sh
ipfw -q /etc/ipfw.rules
/etc/ipfw.rules
The actual firewall rules are stored in this file. It includes all possible bandwidth configurations, all but one of which must be commented out at all times:
#!/bin/sh
#
# These rules control the dummynet parameters which simulate shitty
# WAN links.
#
# em1 is the LAN interface, while em2 is the WAN interface
#
# This config file is set up so that bandwidth from LAN->WAN may be different
# than WAN->LAN, as in the case of an ADSL link
#
# The setup of the rules is pretty simple. Traffic that comes in on
# em1 is bridged to go out on em2, and traffic coming in on em2 is bridged
# to go out on em1. Thus, the rules for LAN->WAN traffic are attached
# to traffic coming in on em1 ('recv em1'), and the WAN->LAN rules are
# attached to traffic going out on em1 ('xmit em1'). So even though both
# sets of rules refer to em1, the LAN-side interface, they distinguish
# between LAN and WAN traffic by the traffic direction
#
# This setup is heavily influenced by the 2006-April-07 posting to
# the freebsd-ipfw list by John Nielsen titled
# "Notes on using dummynet with if_bridge"
flush
queue flush
pipe flush
#
# traffic from lan (em1) to wan (em2) goes through pipe 1
# traffic from wan (em2) to lan (em1) goes through pipe 2
#
# No delay, max bandwidth both ways
#pipe 1 config delay 0
#pipe 2 config delay 0
# SDSL
# 40ms RTT, 0.1% round-trip packet loss, 1536Kbit/s bandwidth
# symmetrical (same both ways)
#pipe 1 config delay 20ms plr 0.0005 bw 1536Kbit/s
#pipe 2 config delay 20ms plr 0.0005 bw 1536Kbit/s
# Long-distance T1
# 40ms RTT, 0.1% round-trip packet loss, 1.5Mbit/s bandwidth
# symmetrical (same both ways)
#pipe 1 config delay 20ms plr 0.0005 bw 1536Kbit/s
#pipe 2 config delay 20ms plr 0.0005 bw 1536Kbit/s
# High-latency T1
# 200ms RTT, 0.1% round-trip packet loss, 1.5Mbit/s bandwidth
# symmetrical (same both ways)
#pipe 1 config delay 100ms plr 0.0005 bw 1536Kbit/s
#pipe 2 config delay 100ms plr 0.0005 bw 1536Kbit/s
# Long-distance T3
# 40ms RTT, 0.1% round-trip packet loss, 45Mbit/s bandwidth
# symmetrical (same both ways)
#pipe 1 config delay 20ms plr 0.0005 bw 45Mbit/s
#pipe 2 config delay 20ms plr 0.0005 bw 45Mbit/s
# 100Mbit local ethernet
# 0ms RTT, 0.0% round-trip packet loss, 100Mbit/s bandwidth
# symmetrical (same both ways)
pipe 1 config bw 100Mbit/s
pipe 2 config bw 100Mbit/s
# Add firewall exceptions for localhost
# Localhost traffic not on the lo0 interface is bogus and dropped
add allow all from any to any via lo0
add deny all from any to 127.0.0.0/8
add deny all from 127.0.0.0/8 to any
# Don't firewall anything on em0 which is used to talk to this box
add skipto 60000 all from any to any via em0
# direct LAN->WAN traffic through pipe 1
add pipe 1 all from any to any out recv em1
add skipto 60000 all from any to any out recv em1
# same as above but for WAN->LAN traffic through pipe 2
add pipe 2 all from any to any out xmit em1
add skipto 60000 all from any to any out xmit em1
#The above 'add skipto 60000' are like goto statements to this rule
#which allows everything
add 60000 allow all from any to any
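The per-direction numbers on those pipe lines are just the round-trip figures from the comments split in half. Here is a minimal sketch of that arithmetic, assuming the link is symmetric and the loss rate is small enough that the losses in the two directions simply add. pipe_params is a made-up helper name of mine, not part of ipfw:

```sh
#!/bin/sh
# Hypothetical helper (not part of ipfw): turn round-trip figures into
# the per-direction arguments used on a 'pipe N config' line.
#   $1 = round-trip time in ms
#   $2 = round-trip packet loss in percent
#   $3 = bandwidth, passed through verbatim (e.g. 1536Kbit/s)
pipe_params() {
    awk -v rtt="$1" -v loss="$2" -v bw="$3" 'BEGIN {
        # Half the delay and half the loss go on each direction.
        printf "delay %gms plr %g bw %s\n", rtt / 2, loss / 100 / 2, bw
    }'
}

pipe_params 40 0.1 1536Kbit/s
```

For the SDSL case (40ms RTT, 0.1% loss) this prints delay 20ms plr 0.0005 bw 1536Kbit/s, which matches what I typed into the rules by hand.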
/etc/sysctl.conf
This file contains overrides for sysctl values, which control various kernel behaviors. This is what I added to the bottom of the file:
#Only forward IP traffic across the bridge interface
net.link.bridge.pfil_onlyip=1
#Use IPFW for layer-2 filtering
net.link.bridge.ipfw=1
# Got this from BSD list "Notes on using dummynet with if_bridge"
net.inet.ip.fw.one_pass=0
That’s pretty much it.
Testing
To test this I installed Cygwin on both the LAN and WAN Windows boxes, and built iperf from source on them. I ran this command on the WAN box:
iperf --server --len 1024K
And this command on the LAN box:
iperf --client 10.23.4.86 --time 60 --len 1024K
Where 10.23.4.86 is the IP address of the WAN box. This sort-of accurately simulates a bulk transfer and shows you how fast it is. This way you can verify that you set up the dummynet parameters correctly.
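To decide whether a given iperf result is close enough, I compare the reported rate against the bandwidth configured on the pipe. A rough sketch of that check follows. close_enough is a hypothetical helper, and the 10% tolerance in the example is my own guess at a reasonable margin, not something iperf or dummynet prescribes:

```sh
#!/bin/sh
# Hypothetical helper: succeed if a measured rate is within a given
# percentage of the configured rate. Both rates must use the same unit.
#   $1 = measured rate (e.g. Kbit/s as reported by iperf)
#   $2 = configured rate (the 'bw' value on the pipe)
#   $3 = allowed deviation in percent
close_enough() {
    awk -v m="$1" -v c="$2" -v tol="$3" 'BEGIN {
        diff = m - c
        if (diff < 0) diff = -diff
        # Exit status 0 (success) when inside the tolerance, 1 otherwise.
        exit !(diff <= c * tol / 100)
    }'
}

if close_enough 1450 1536 10; then
    echo "looks right"
else
    echo "check the pipe config"
fi
```

Expect the measured number to come in a little under the configured one, since protocol overhead eats some of the link.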
Changing the simulated bandwidth
To select the bandwidth configuration to use, edit /etc/ipfw.rules and comment out all the ‘pipe 1’ and ‘pipe 2’ lines except the two that correspond to the bandwidth config you want. Once you’ve done that, run /etc/rc.firewall.anelson as root and it will apply the new settings. Always test with iperf after you do this, to make sure you didn’t fuck something up and break network connectivity.
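The comment/uncomment dance can be scripted. The sketch below is one way to do it, assuming each configuration is introduced by a comment line that exactly matches its name (such as "# SDSL") and is followed, possibly after more comments, by its two pipe lines. select_profile is a made-up name, and it only prints the edited rules so you can eyeball them before replacing the real file:

```sh
#!/bin/sh
# Hypothetical helper: print a copy of the rules file in which only the
# two 'pipe' lines that follow the named header comment are active.
#   $1 = profile name as written in its comment line, e.g. "SDSL"
#   $2 = path to the rules file
select_profile() {
    awk -v want="# $1" '
        # The header comment of the wanted profile arms the next two pipe lines.
        $0 == want { left = 2 }
        /^#?pipe [12] config/ {
            sub(/^#/, "")            # normalise to the uncommented form
            if (left > 0) { print; left-- }
            else          { print "#" $0 }
            next
        }
        { print }
    ' "$2"
}

# Example: write the edited rules to a scratch file for review.
# select_profile "SDSL" /etc/ipfw.rules > /tmp/ipfw.rules.new
```

I’d still diff the result against the original and run iperf afterwards, for the same reason as above.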
References
- Dummynet tutorial (beware, the bridging-related info is out of date and useless)
- Dummynet home page
- FreeBSD custom kernel build guide
- “Notes on using dummynet with if_bridge” listserv post
- if_bridge man page
Ubuntu Feisty Fawn, HighPoint RocketRaid 2220, and Satan
A while back I contorted myself to get a 64-bit FreeBSD 6.0 driver for my HighPoint RocketRaid 2220 RAID controller. Now that I have a 2TB ReadyNAS box, that old 1TB FreeBSD box is falling into disuse, so I thought I’d repurpose it as a dedicated Azureus download machine.
At first, I had hoped I could install Ubuntu Feisty Fawn directly on the RAID array, but I couldn’t even get the Ubuntu live CD to boot without a litany of read errors on sdc and sdb. I gave up on that, pulled one of the five 250GB drives from the array, and hooked it up to the on-board SATA controller, unplugged the RocketRaid, and installed Ubuntu.
Once that was done, I wanted to at least get enough RocketRaid support to create a RAID 0 volume consisting of the four remaining 250GB SATA drives. Long story short, here’s what I had to do:
- Compile a custom 2.6.22 kernel, explicitly excluding the sata_mv driver, which is extremely incompatible with the RocketRaid. Adding sata_mv to the blacklist, and using the brokenmodules kernel startup parameter were not sufficient; I had to literally compile this out of the kernel.
- Download the latest HighPoint RocketRaid Linux driver source code. It may be possible to get the pre-compiled drivers to work on Feisty, but if so I don’t know how.
- Build the RocketRaid driver code per the instructions. The make install step failed towards the end, but it made it far enough to get the hptmv6 driver built and working and loading at boot time.
Once that was done, it was time to create the RAID array. As I learned when I built a BSD box around this card, the RocketRaid 2220 is what is known as a FakeRAID card, meaning it has no hardware RAID circuitry; it’s just a SATA controller with some proprietary, buggy code that emulates the various RAID levels. So, I decided against using the HighPoint RAID code, and went into the HighPoint BIOS and created one JBOD device for each disk in the array. These devices showed up at /dev/sdb through /dev/sde. I used the software RAID HOWTO to build a /dev/md0 device consisting of these four disk devices, in RAID 0.
Now, I have a 1TB RAID 0 reiserfs partition upon which to stage my ill-gotten gains, before archiving them on my 2TB dedicated NAS box.
Next time, I’ll spend the $300 and get a real, supported RAID controller card.
Maddening problem with clock slip in FreeBSD under VMWare
A few weeks ago my father pointed out that the date stamps on my blog posts were behind by a week. Upon investigation, I found that bonzo’s clock was a week behind. I updated it and declared victory.
Then, he pointed it out again a few days ago. Sure enough, it had slipped by several days. When I logged into the VMWare Console to check for options to sync the clock or whatever, I noticed a repeated error from the FreeBSD kernel that I’ve been getting on bonzo forever and always ignored:
calcru: runtime went backwards from [some big number] usec to [another] usec for pid [pid]
I googled this message, and found a whole community of FreeBSD users suffering under slipping clocks when running FreeBSD under VMWare. There’s something on the freebsd-current list, and VMWare’s own support forums.
There are a few proposed fixes, most involving the kern.timecounter.hardware sysctl. I tried changing it from its default of APIC to TSC and i8254, but neither worked.
I then ran across a post on the VMWare forums suggesting:
In FreeBSD:
'tools.timeSync = "true"' added to .vmx file
sysctl -w kern.timecounter.hardware=i8254
kldload vmmemctl (from vmware-tools) and have vmware-guestd running
add 'kern.hz="250"' to /boot/loader.conf
I don’t have APIC or ACPI disabled in my FreeBSD host either
Now, I don’t want to run the VMware tools just to keep the clock in sync, but I did put kern.hz="250" in /boot/loader.conf and kern.timecounter.hardware=i8254 in /etc/sysctl.conf, then rebooted.
It’s been several minutes now, and the clock seems to be holding. I’m afraid I don’t understand in detail why this helps, though a VMware knowledgebase article alludes to a problem of missed timer interrupts, with a fix being reducing the frequency of the timer interrupts requested by the OS. I think that’s kern.hz="250". The importance of switching the time counting method from APIC to i8254 is less clear, unless it’s just a more reliable source of ticks.
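For the record, the two settings can be added idempotently with something like the following. This is only a sketch: ensure_line is a hypothetical helper of my own, and the paths in the usage comments are the stock FreeBSD locations mentioned above.

```sh
#!/bin/sh
# Hypothetical helper: append a line to a file unless an identical
# line is already there, so re-running it never duplicates settings.
#   $1 = exact line to add
#   $2 = file to add it to
ensure_line() {
    # -x matches whole lines, -F treats the text as a fixed string.
    grep -qxF -e "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Intended use, as root (takes effect after a reboot):
# ensure_line 'kern.hz="250"' /boot/loader.conf
# ensure_line 'kern.timecounter.hardware=i8254' /etc/sysctl.conf
```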
At any rate, this problem has caused me to notice that VMWare server is in RC-2. As it’s the free successor to GSX Server 3, I really need to upgrade. Perhaps over the coming long weekend…
Upgrading Subversion from 1.2.1 to 1.3.2
I’ve been running SVN 1.2.1 forever. Now I need to rebuild it to generate updated Apache modules for Apache 2.2.
In /usr/ports/devel/subversion, I do a make deinstall to remove the old version, and a make install WITH_MOD_DAV_SVN=yes WITH_PERL=yes WITH_PYTHON=yes WITH_RUBY=yes WITH_JAVA=yes to build the Apache module that exposes the SVN repository for WebDAV over HTTP, and bindings for all the major languages.
Ack. First problem:
You should build `www/apache22' with db4 support to use subversion with it.
Please rebuild `www/apache22' with option `WITH_BERKELEYDB=(db4|db41|db42)' and try again.
Or you can disable db4 support. Only 'fs' repository backend will be available.
To disable db4 support, define WITHOUT_BDB.
My repository is an ‘fs’ type, which is more stable. So, I can live w/o BDB. I certainly have no desire to compile Apache again. So make that build command make install WITH_MOD_DAV_SVN=yes WITH_PERL=yes WITH_PYTHON=yes WITH_RUBY=yes WITH_JAVA=yes WITHOUT_BDB=yes.
make install automatically adds LoadModule statements to httpd.conf, so that was pretty much it.
UPDATE: Not so fast. When I query apocryph.org/svn/ I get Page not Found from Drupal. I’m missing something else in the configuration…
Ah, I forgot to copy this bit from my old httpd.conf:
#Expose the SVN repository via WebDAV and mod_svn
<Location /svn>
DAV svn
SVNPath /usr/local/apocryph_svn
AuthType Basic
AuthName "apocryph.org svn"
AuthUserFile /usr/local/apocryph_svn/conf/htpasswd
# For any operations other than these, require an authenticated user.
<LimitExcept GET PROPFIND OPTIONS REPORT>
Require valid-user
</LimitExcept>
</Location>
That did it. Now I can browse my SVN repository at http://svn.apocryph.org/svn/.
Installing SquirrelMail 1.4.5 in FreeBSD 5.4-Release
I’m in Baghdad now, where my only link to the ‘net is a high-latency satellite. Mozilla Thunderbird can’t hack the delays this introduces, and thus times out when attempting to fetch my IMAP mailbox on ender. Thus, rather than use one of the other shitty IMAP clients, I’ve decided to pull the trigger on SquirrelMail.
I have previously attempted to install RoundCube, but found it a bit too immature at this point, so I’ve decided to go with something tried and true: SquirrelMail.
Fortunately, the FreeBSD ports collection includes SM, and a pre-built package based on 1.4.5 is available. Unfortunately, the pre-built package uses Apache 1.3, while I run Apache 2 on bonzo. This leads to all sorts of fun:
bonzo# pkg_add ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/squirrelmail-1.4.5_2.tbz
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/squirrelmail-1.4.5_2.tbz... Done.
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/apache-1.3.34_2.tbz... Done.
pkg_add: package 'apache-1.3.34_2' conflicts with apache-2.0.54_2
pkg_add: please use pkg_delete first to remove conflicting package(s) or -f to force installation
pkg_add: pkg_add of dependency 'apache-1.3.34_2' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-4.4.1_3.tbz... Done.
pkg_add: package 'php4-4.4.1_3' conflicts with php5-5.0.4_2
pkg_add: please use pkg_delete first to remove conflicting package(s) or -f to force installation
pkg_add: pkg_add of dependency 'php4-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/mhash-0.9.2.tbz... Done.
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-xml-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-xml-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-session-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-session-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-pcre-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-pcre-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-openssl-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-openssl-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-mhash-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-mhash-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-mbstring-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-mbstring-4.4.1_3' failed!
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-stable/All/php4-gettext-4.4.1_3.tbz... Done.
pkg_add: could not find package apache-1.3.34_2 !
pkg_add: could not find package php4-4.4.1_3 !
pkg_add: pkg_add of dependency 'php4-gettext-4.4.1_3' failed!
Outstanding. Well, I’m not going back to apache 1.3, and I don’t feel like building from the port, since bonzo’s ports collection is outdated. But I really don’t want to download the sources from SM and build on FBSD either. So, I guess I have no choice but to update the ports tree:
cvsup -L 2 /root/ports-supfile
Took a while; updated.
Now doing:
make WITH_DATABASE=1
The WITH_DATABASE enables PEAR support to ensure SM can use MySQL.
That was uneventful. Next up,
make WITH_DATABASE=1 install
During the build PEAR was installed also, which yielded this:
To use PEAR you have to add the correct include path into
your ${LOCALBASE}/etc/php.ini configuration file, like:
include_path = ".:/usr/local/share/pear"
I’ll have to remember to do that…
Ugh, it’s installing the new PHP 5.1.2. This’ll take a while…
Actually wasn’t bad at all. Final output:
You now need to add an alias to apache's httpd.conf pointing to
/usr/local/www/squirrelmail in order to access SquirrelMail from
your web browser, or create a VirtualHost with DocumentRoot set
to that directory.
For SquirrelMail to work properly you will need to make sure the
following option is set in your php.ini file:
file_uploads = On
If you have problems with SquirrelMail saying "you must login" after
you just have, the following php.ini option may help:
session.auto_start = 1
In order to do your administrative configuration you need to
cd /usr/local/www/squirrelmail && ./configure
SquirrelMail will not work until this has been done.
So, first I’ll add an alias to /usr/local/etc/apache2/httpd.conf
This seemed to do the trick:
Alias /mail/ "/usr/local/www/squirrelmail/"
Ok, that ‘worked’, inasmuch as I got this response in my browser:
ERROR: Config file "config/config.php" not found. You need to configure SquirrelMail before you can use it.
And configure it I shall. As per the instructions above, a quick cd /usr/local/www/squirrelmail && ./configure should be just the thing.
The configuration utility has a numbered menu system circa 1980. I’ll just wander through the menus, specifying values for stuff as they make sense to me.
That was easy. I like the Blue Grey theme, with the Verdana 08 custom style sheet. Nice.
Extending swap space on FreeBSD 5.4
When I first created bonzo, I allocated 96MB of RAM in VMWare. As I ran Gallery 2, Drupal, ByteHoard, etc, it became clear from the out-of-memory errors that I needed to boost the memory space. I’ve since increased the allocation to 256MB, but the swap file is still only 160MB. Consequently, I’m plagued by kernel out-of-swap-space errors like:
Dec 5 18:09:18 bonzo kernel: swap_pager_getswapspace(6): failed
and
Dec 6 03:03:07 bonzo kernel: swap_pager: out of swap space
Dec 6 03:03:07 bonzo kernel: swap_pager_getswapspace(4): failed
Dec 6 03:03:07 bonzo kernel: pid 15958 (httpd), uid 80, was killed: out of swap space
I’ve expanded bonzo’s virtual disk by an additional gigabyte, using the steps from my previous post on growing VMWare disks, and now I need to get FreeBSD to use that new space as swap.
I ran sysinstall to go through the FreeBSD installer again, which is how I created the FreeBSD disk label initially. The disk label is what I as a Windows guy would think of as the partition table; it’s the list of partitions on the disk and their file system types.
To access the disk labeler again I went to ‘Expert’ mode in the installer, then chose ‘Label’. I hit ‘C’ to create a new label, and got:
Not enough space to create an additional FreeBSD partition
Sure enough, the label editor shows:
Disk: ad0 Partition name: ad0s1 Free: 0 blocks (0MB)
Clearly it doesn’t ‘see’ the 1GB I added to the end of the disk.
When I first accessed the ‘Label’ function, I got:
│WARNING: A geometry of 113179/15/63 for ad0 is incorrect. Using │
│a more likely geometry. If this geometry is incorrect or you │
│are unsure as to whether or not it's correct, please consult │
│the Hardware Guide in the Documentation submenu or use the │
│(G)eometry command to change it now. │
│ │
│Remember: you need to enter whatever your BIOS thinks the │
│geometry is! For IDE, it's what you were told in the BIOS │
│setup. For SCSI, it's the translation mode your controller is │
│using. Do NOT use a ``physical geometry''. │
However, I get that a lot for large disks or RAID arrays, not just VMware disks, so I didn’t pay it any mind. Perhaps I should.
Apparently there is a tool, growfs, which has been in the FreeBSD base install for years. From the growfs(8) man page:
The growfs utility extends the newfs(8) program. Before starting growfs
the disk must be labeled to a bigger size using bsdlabel(8). If you wish
to grow a file system beyond the boundary of the slice it resides in, you
must re-size the slice using fdisk(8) before running growfs.
Awesome. So I have to resize the slice with fdisk, label the disk to a bigger size with bsdlabel, and only then can I expand the filesystem with growfs. Hmm. I’ll try later.
UPDATE: I wimped out. I’m rather scared to death of damaging bonzo’s filesystem. Even though it’s backed up nightly to jane, that doesn’t mean I relish the prospect of rebuilding the filesystem under duress. Thus, I’ve taken the easy way out; I’ve added a new, 2GB hard drive using VMware, and will make that into a swap volume.
I ran sysinstall again, to use its GUI fdisk-er and label-er. I’ve created a single partition, ad1s1b, which is a 2GB swap partition. I’ve added it to /etc/fstab (sysinstall tried, but the kernel hadn’t picked up the newly created filesystem on /dev/ad1 yet, so it failed. Not a big deal.). A reboot, and I’ll inspect my handiwork.
Sweet:
bonzo# swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/ad0s1b        162632        0   162632     0%
/dev/ad1s1b       2097112        0  2097112     0%
Total             2259744        0  2259744     0%
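The line I added to /etc/fstab has the usual shape for a swap device. Here is a small sketch that prints such a line for any device. swap_fstab_entry is a made-up name, and I’m assuming the conventional "none swap sw 0 0" fields rather than quoting my actual file:

```sh
#!/bin/sh
# Hypothetical helper: print an /etc/fstab entry for a swap device.
# Fields: device, mount point (none), type, options, dump, pass.
#   $1 = swap device, e.g. /dev/ad1s1b
swap_fstab_entry() {
    printf '%s\tnone\tswap\tsw\t0\t0\n' "$1"
}

swap_fstab_entry /dev/ad1s1b
```

Once the entry is in place, swapon -a should pick it up without a reboot, though I went with the reboot.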
Burning CDs in FreeBSD
Because HP DVD drives suck, prospertine is temporarily without a CD/DVD burner, so I need to use the one in aenea.
I’d never burned a CD outside of Windows before, so I didn’t know where to begin. Turns out FreeBSD includes a tool, burncd, that interfaces with the burner itself. I just need to burn the Fedora Core 4 install CDs, so I already have the CD images, making it pretty straightforward.
There’s more info in the handbook, under Creating and Using Optical Media.
I’m trying this, straight from the handbook:
burncd -f /dev/acd0 data FC4-i386-disc1.iso fixate
burncd: open(/dev/acd0): Permission denied
Hmm, ok, so I need to run as root. That’s unfortunate:
aenea# burncd -f /dev/acd0 data FC4-i386-disc1.iso fixate
next writeable LBA 0
writing from file FC4-i386-disc1.iso size 649838 KB
written this track 649838 KB (100%) total 649838 KB
fixating CD, please wait..
Wow, can it really be that easy?
Um, no. I know the ISO images are correct, as I verified the SHA-1 hashes with the Fedora Core download page. And yet, when I verify the disc in the Fedora Core installer, discs 1 and 2 have failed, and I’m sure 3 and 4 will too. That said, they’re not totally bogus; I was able to boot disc 1 and get most of the way through the installer, but it destabilized and crashed during file system creation.
So, what’s wrong w/ burncd? Do I have to do something special to make it work? Maybe it’s working fine, and the FC media verifier doesn’t work right w/ 700MB CD-Rs. That seems unlikely.
I’ll try the install again…
UPDATE: It worked fine. FC4 reported each disc as failing verification, but it installed and is running fine. Oh well.
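In hindsight, a quicker way to tell whether the burn or the FC verifier was at fault would have been to read the disc back and compare it with the image. Here is a sketch of that idea, assuming the image size is a multiple of the 2048-byte CD sector size and that the drive hands back the data track as plain sectors. verify_burn is a hypothetical name, and I haven’t run this against the discs above:

```sh
#!/bin/sh
# Hypothetical helper: compare the start of a burned disc (or any file)
# with the ISO image it was burned from.
#   $1 = ISO image
#   $2 = device or file to read back, e.g. /dev/acd0
verify_burn() {
    # Number of 2048-byte sectors in the image.
    sectors=$(( $(wc -c < "$1") / 2048 ))
    # Read exactly that many sectors back and compare byte for byte.
    dd if="$2" bs=2048 count="$sectors" 2>/dev/null | cmp -s - "$1"
}

# verify_burn FC4-i386-disc1.iso /dev/acd0 && echo "disc matches image"
```

Only the first image-sized chunk is compared, so any padding the burner adds after the data doesn’t count as a mismatch.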
More trouble in paradise: software RAID controllers can suck
Yesterday while I was at work, there was a brief power fluctuation in my townhouse. Since I’m still setting up aenea, she isn’t yet in my server closet, or hooked up to a UPS. So, predictably, she lost power.
This is somewhat bad, since the Highpoint RocketRaid 2220 SATA RAID controller that powers her 1TB RAID 5 disk array does not deal at all well with unorderly shutdowns, since the RAID logic is implemented in a software driver, not hardware.
Predictably, I suffered some file system damage. I now can’t boot, because /var seems sufficiently damaged to cause a panic in some ffs_whatever module. Thankfully it was /var and not, say, /usr, but nonetheless it sucks badly.
I’ve booted the FixIt shell on the FreeBSD 6.0 install disc, and loaded the hptmv6.ko kernel module from a USB floppy, so now I’m hoping I can fsck the problem away from this shell.
First, I’m discovering that a standard fsck in the FixIt shell doesn’t recognize the /var filesystem. fsck_ufs does the trick, but when I run it with fsck_ufs /dev/da0s1d it just outputs the file system errors and calls it a day; it doesn’t fix them.
Hmm, fsck doesn’t work because it’s looking in /sbin and /usr/sbin for the fsck_* executables, but in the FixIt environment they’re in /mnt2/usr/sbin. The FixIt shell is just flaky; sometimes I’ll run a command (ls, fsck, mount, man; it doesn’t matter what) and it hangs. Over on VTTY 2 (Alt-F2) I see about 15 timeout errors from acd0 before the shell finally comes back, only to hang again on my next command.
Fortunately, I’ve read on the lists that the first thing to try when a file system is fucked is to boot in single user mode (option 4 on the boot menu iirc). That boots fine and gets me to a shell prompt.
I run
fsck -p /dev/da0s1d
Where -p is preen mode, which from the man page I gather checks for minor inconsistencies, but won’t handle major problems. All the list posts I see use this first.
From this I get:
/dev/da0s1d: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY
I gather that’s bad. I found a few things on the list:
First, this frightful message advocating I use vi on the directory to remove invalid file entries. Um, no.
Next, this USENIX paper on FreeBSD soft updates, which explains what they’re for (to allow fsck to run whilst the file system is mounted, for speedier recovery), and when it doesn’t work (when the soft update snapshot is inconsistent, eg on power failure or crash).
So, with little help from the ‘net, I went ahead with:
fsck /dev/da0s1d
And got the UNEXPECTED SOFT UPDATE INCONSISTENCY, this time with a prompt: REMOVE? [yn]. I’m going to go with ‘yes’ and hope for the best…another error, this one UNREF FILE. The prompt is RECONNECT? [yn]. I’ll go with ‘yes’ again. Another ‘yes’ to the NO lost+found DIRECTORY CREATE?
A ton more UNREF FILE msgs; ‘yes’ each time.
FREE BLK COUNT(S) WRONG IN SUPERBLK. SALVAGE? [yn] Most definitely.
SUMMARY INFORMATION BAD. SALVAGE? [yn] Sure, go ahead.
BLKS MISSING IN BIT MAPS. SALVAGE? [yn] Yeah, if you want…
And then, as if nothing had happened, FILE SYSTEM MARKED CLEAN. Yay.
Now I do:
fsck -p
To do a preening check on all the file systems. A few minor errors on /dev/da0s1f and /dev/da0s1e, but nothing fsck couldn’t handle on its own. Took a long time to scan the huge ~900GB partition…
Done now. I’ll exit this shell and proceed with the boot process, hoping for the best.
Voila! Booted fine.
So the moral(s) of the story are:
- When using a software RAID driver, you mustn’t let the power go out
- When using a BSD UFS file system, you mustn’t let the power go out
- When using UNIX in general, you mustn’t let the power go out
It’s hard for me to get used to this, as the bulk of my computer hours have been spent on Windows, where I’ve forced shutdowns countless times, and never had any serious file system damage. Needless to say, aenea is going on a UPS right now.
UPDATE: aenea sucks so much power she overloads the UPS I have on prospertine. I’ll have to move her into the server closet early, just so she’ll have an available UPS.
mod_php on Apache 2, FreeBSD 6.0 doesn't automatically process .php files
I just installed the apache2 and php5 ports on aenea, and found that accessing .php files via Apache returned the PHP source code, instead of running the PHP server-side.
I had to add the following entries into /usr/local/etc/apache2/httpd.conf in order to get mod_php to pick up the files:
#Register PHP mime types
AddType application/x-httpd-php .php
AddType application/x-httpd-php-source .phps
That worked, but index.php wasn’t run automatically if I navigate to a directory. For that I added index.php to the end of the DirectoryIndex directive:
DirectoryIndex index.html index.html.var index.php
One apachectl restart later, and all was well.
Looking for Non-Lame Open-Source UPnP Media Server
Ever since I got my Linksys wireless music system, I’ve wanted an open source UPnP music server to stream my music to it. As one might expect, it ships w/ Musicmatch Jukebox, which has a UPnP server feature, but MMJB isn’t open source and only runs on Windows. My music is stored on aenea, a FreeBSD 6.0 box, so why can’t I run a media server there, where it makes the most sense?
I’ve found three potentially non-shitty media servers thus far:
- GMediaServer
- MediaTomb
- GeeXboX uShare
As is the case with most trendy open source projects today, they’re all written for Linux. Since I use FreeBSD, there’s inevitably some degree of contortion required to make them work.
First is the installation of the FreeBSD port of libupnp, which is in /usr/ports/devel/upnp. That built and installed without incident.
MediaTomb seems the most mature and feature rich, so I’ll start there. I downloaded the source tarball, and am having quite a bit of trouble getting it to build.
configure is failing somewhat stupidly. First, it couldn’t find the iconv.h header file located in /usr/local/include, even when I point it there explicitly. I had to set the CPPFLAGS env var to -I/usr/local/include to get past that point.
Next, configure can’t find upnp/ixml.h. This one doesn’t appear to be configure’s fault; there really is no ixml.h in the port. Perhaps part of the problem is the age of the FreeBSD libupnp port; it reports itself as version 1.0.4_1; whatever that means, it doesn’t sound at all like 1.2.1, the latest Linux version. FreeBSD ports bug 82347 was submitted back in June with diffs to update to the current version, but it appears nothing’s been done with them.
I’m trying to apply the aforementioned ports bug diff, mostly successfully, but with a few problems.
Hunk #1 in Makefile failed to apply, and I can’t tell why. For testing I’ll make the changes in that hunk manually; mostly they’re changing the version number, dist name, etc.
The only other hunk that failed is in pkg-plist, and seems to reflect a significant refactoring of the libupnp code.
Hmm, no, this patch is still pretty fucked up. The post-download patches themselves are broken now. I’ll just build and install libupnp without a port.
Note to self: from the README:
For the UPnP library to function correctly, Linux networking must be configured
properly for multicasting. To do this:
route add -net 239.0.0.0 netmask 255.0.0.0 eth0
where ‘eth0’ is the network adapter that the UPnP library will use. Without
this addition, device advertisements and control point searches will not
function.
Keep this in mind if you ever get it to build.
…
Hmm, that didn’t get me very far:
$ gmake
gmake[1]: Entering directory `/home/anelson/libupnp-1.2.1/ixml'
gmake[2]: Entering directory `/home/anelson/libupnp-1.2.1/ixml/src'
gcc -Wall -I./ -I../inc -I../../pil/inc -fPIC -c -Wall -Os -DNDEBUG -I. -I../inc -Iinc -c ixml.c -o obj/ixml.o
In file included from ../inc/ixml.h:37,
from inc/ixmlmembuf.h:36,
from ixml.c:32:
/usr/include/malloc.h:3:2: #error "<malloc.h> has been replaced by <stdlib.h>"
gmake[2]: *** [obj/ixml.o] Error 1
gmake[2]: Leaving directory `/home/anelson/libupnp-1.2.1/ixml/src'
gmake[1]: *** [all] Error 2
gmake[1]: Leaving directory `/home/anelson/libupnp-1.2.1/ixml'
gmake: *** [upnp] Error 2
Ok, so those patches in the port really do matter.
The libupnp-1.2.1 makefile has changed pretty dramatically since the port (and presumably the port diff). For example, it is modified to support cross-compilation, and has removed some hard-coded paths. I’ve made some manual mods to it such as setting MAKE to gmake, and am deleting patch-makefile, in the hopes that this will be enough.
…
Def not. It’s pretty badly broken. Awesome.
Ok, so back to building without the port.
The problem above is ixml.c attempting to #include malloc.h, which is deprecated in favor of stdlib.h. I commented it out in ixml/inc/ixml.h, but there’s more. A quick grep yields a few files with this same affliction:
ixml/inc/ixml.h:/*#include <malloc.h>*/
threadutil/inc/FreeList.h:#include <malloc.h>
threadutil/src/LinkedList.c:#include <malloc.h>
threadutil/src/iasnprintf.c:#include <malloc.h>
upnp/inc/ixml.h:/*#include <malloc.h>*/
upnp/src/genlib/util/upnp_timeout.c:#include <malloc.h>
upnp/src/inc/client_table.h:#include <malloc.h>
upnp/src/inc/http_client.h:#include <malloc.h>
upnp/src/inc/service_table.h:#include <malloc.h>
upnp/src/inc/uri.h:#include <malloc.h>
I’ll fix them all then.
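Fixing ten files by hand gets old fast, so here is a rough sketch of doing it in one pass. fix_malloc_includes is a name I made up, and it assumes every offending line is spelled exactly as in the grep output above:

```sh
#!/bin/sh
# Replace '#include <malloc.h>' with '#include <stdlib.h>' in every file
# under a directory. fix_malloc_includes is a hypothetical helper name,
# not something that ships with libupnp or MediaTomb.
fix_malloc_includes() {
    root=${1:-.}
    # grep -rl lists only the files that contain the offending include.
    grep -rl '#include <malloc.h>' "$root" | while IFS= read -r f; do
        # Go through a temp file rather than 'sed -i', whose syntax
        # differs between BSD sed and GNU sed. Writing back with cat
        # keeps the original file's permissions.
        sed 's|#include <malloc.h>|#include <stdlib.h>|' "$f" > "$f.tmp" &&
            cat "$f.tmp" > "$f" &&
            rm -f "$f.tmp"
    done
}
```

Run it from the top of the source tree as fix_malloc_includes . and re-run the grep to confirm nothing is left. Lines that were already commented out get rewritten too, which is harmless.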
Next up is a pthreads issue:
ThreadPool.c: In function `SetSeed':
ThreadPool.c:344: error: invalid use of undefined type `struct pthread'
One of the makefile patches in the port had to do with the path to pthreads.
….
Ok, I’ve gone through, error by error. Most of them were already solved in the FreeBSD port patches, although due to various subtle changes in the library sources, the patches won’t work automatically. For testing, I manually made the changes in the patches, and was able to do a build and install. However, I only applied the changes necessary to build; there may still be runtime problems that I don’t know about. I also had to change the install and uninstall targets, so files go to /usr/local/whatever instead of /usr/whatever/, which complicates the MediaTomb build later.
I’m now going to try to build MediaTomb and see how it runs.
First thing I discovered is that the configure script needs a lot of help. In order to get it to find iconv.h, I had to add CPPFLAGS=-I/usr/local/include, even though /usr/local/include is supposed to be where it’s looking by default.
Then I discovered I hadn’t fully installed libupnp after all. I kept getting:
checking for upnp/upnp.h... no
configure: error: upnp/upnp.h not found. Check libupnp installation
Even though upnp/upnp.h exists in /usr/local/include. The problem with configure in general is that it does its checks by trying simple code snippets in the compiler, and if they fail, concludes the test fails. However, without seeing the compiler error message, it’s hard to debug the problem.
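Note to self (added later): autoconf-generated configure scripts normally write each test program and the compiler's output to config.log in the directory where configure ran. A rough sketch for pulling a single check out of that log follows. show_check is a made-up helper, and it assumes the usual log layout, where each test starts with a "checking for ..." line and ends with a "result:" line:

```sh
#!/bin/sh
# Print the part of config.log that belongs to one configure check: from the
# "checking for <what>" line up to and including the next "result:" line.
# show_check is a hypothetical helper; the log layout is an assumption based
# on what autoconf-generated scripts usually write.
show_check() {
    what=$1
    log=${2:-config.log}
    awk -v what="$what" '
        index($0, "checking for " what) { printing = 1 }
        printing                        { print }
        printing && /result:/           { printing = 0 }
    ' "$log"
}
```

Something like show_check upnp/upnp.h should then show the compiler command configure ran and whatever error came back, instead of just "no".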
I only figured it out when I read this in the README for MediaTomb:
Installation of this package is not straight forward.
tar zxvf libupnp-1.2.1.tar.gz
cd libupnp-1.2.1
cd upnp
make
make install
cd ../ixml
make
make install
cd ../threadutil
make
make install
cd ..
cp upnp/inc/ixml.h /usr/include/
I only did the make install (actually, gmake install) for upnp, not for the other two folders. Plus, I didn’t manually copy ixml.h. So, the configure test failed not because it couldn’t find upnp.h, but because when it tried to compile upnp.h, it failed since dependent files were missing. Lame.
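To avoid skipping a directory again, the README steps can be wrapped in a loop that stops at the first failure. This is only a sketch: install_libupnp, MAKE and INCLUDEDIR are names I picked, and defaulting the header copy to /usr/local/include is my assumption for FreeBSD (the README itself uses /usr/include):

```sh
#!/bin/sh
# Build and install each libupnp subdirectory in the order the README lists.
# install_libupnp is a hypothetical helper. MAKE defaults to gmake, as used
# on FreeBSD here; set MAKE=make on Linux.
install_libupnp() {
    src=${1:-.}
    for dir in upnp ixml threadutil; do
        # Subshell so the cd does not leak; bail out on the first failure.
        ( cd "$src/$dir" && "${MAKE:-gmake}" && "${MAKE:-gmake}" install ) || return 1
    done
    # The README also copies ixml.h by hand. The destination below is an
    # assumption for FreeBSD; override INCLUDEDIR to match the README.
    cp "$src/upnp/inc/ixml.h" "${INCLUDEDIR:-/usr/local/include}/"
}
```

Called as install_libupnp ~/libupnp-1.2.1, it returns non-zero as soon as any of the three builds or installs fails.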
Once I got configure to admit upnp.h was there, it ran to completion, with the following results:
======== CONFIGURATION SUMMARY =========
sqlite3 : missing
mysql : yes
java-script : missing
libmagic : yes
id3lib : missing
libextractor : disabled
libexif : missing
This isn’t acceptable. I want to use sqlite as the database backend, and will definitely need id3 support. java-script would be nice. I suspect there are ports I can install to address this issue.
libexif is in graphics/libexif. No brainer, that. It installed fine.
id3lib is in audio/id3lib. Also an obvious choice. Took a bit, but installed ok too.
From the configure output, it appears the JavaScript functionality is provided by the SpiderMonkey JavaScript engine. That’s in lang/p5-JavaScript-SpiderMonkey. It’s installing now, but might take a while.
While I wait, an interesting remark from the MediaTomb readme:
NOTE: Write operations to sqlite3 database is VERY SLOW, use sqlite3 only if you don’t have another choise.
This is interesting. In my experience with sqlite, there are few database engines faster. I wonder what he’s doing that makes it suck so bad. One thing that I found made a huge performance difference with sqlite is using large transactions. If you’re doing a bulk insert, you simply must do a bunch of them in a single tx; the overhead associated with starting and committing a transaction is unusually high, but the overhead associated with operations within a tx is virtually zero, so you owe it to yourself to use large transactions.
I took a peek at the database interface API; sure enough, it’s really lightweight: an exec and a query method, and just a little glue. Without transaction or bulk insert semantics, sqlite probably will suck.
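To make the large-transaction point concrete, here is a minimal sketch that turns a list of file paths into one big transaction for the sqlite3 command-line shell. It is not MediaTomb's code; the helper name, the tracks table and the path column are all invented for illustration:

```sh
#!/bin/sh
# Emit a single-transaction bulk insert: one INSERT per input line, all
# wrapped in BEGIN/COMMIT so the per-transaction overhead is paid only once.
# bulk_insert_sql, the 'tracks' table and the 'path' column are hypothetical.
bulk_insert_sql() {
    echo 'BEGIN TRANSACTION;'
    while IFS= read -r path; do
        # Double any single quotes so the value is a valid SQL string literal.
        escaped=$(printf '%s' "$path" | sed "s/'/''/g")
        printf "INSERT INTO tracks (path) VALUES ('%s');\n" "$escaped"
    done
    echo 'COMMIT;'
}
```

Usage would be something along the lines of find /music -name '*.mp3' | bulk_insert_sql | sqlite3 media.db, where /music and media.db are placeholders and the tracks table already exists. Without the BEGIN/COMMIT pair, every INSERT runs as its own transaction.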
…
Anyway, spidermonkey is done. Re-ran configure; it picks up everything I added, except SpiderMonkey. I get this:
checking for jsapi.h... yes
checking for JS_NewObject in -ljs... no
checking for JS_NewObject in -lsmjs... no
So it finds the header file, but nfi what’s the deal with this JS_NewObject crap. Actually, I’m getting a similar problem with sqlite:
checking for sqlite3.h... yes
checking for sqlite3_open in -lsqlite3... no
Of course, if the sqlite3 library doesn’t have sqlite3_open, it’s totally broken. So, again, I really wish I could see what error configure is getting to make it think that.
Well, I haven’t figured out how to do that, but I did figure out the problem with configure not seeing the libs. Just as I had to pass CPPFLAGS=-I/usr/local/include to get it to see the include files, I also had to pass LDFLAGS=-L/usr/local/lib to get it to find the libs. Now the board is all green:
$ ./configure CPPFLAGS=-I/usr/local/include LDFLAGS=-L/usr/local/lib
[snipped]
======== CONFIGURATION SUMMARY =========
sqlite3 : yes
mysql : yes
java-script : yes
libmagic : yes
id3lib : yes
libextractor : disabled
libexif : yes
Sweet. Now let’s pull the trigger on the build.
I knew it couldn’t last:
../src/string_converter.cc: In member function `zmm::String StringConverter::convert(zmm::String)':
../src/string_converter.cc:72: error: invalid conversion from `char**' to `const char**'
../src/string_converter.cc:72: error: initializing argument 2 of `size_t libiconv(void*, const char**, size_t*, char**, size_t*)'
Hmm
The line in question is:
ret = iconv(cd, input_ptr, (size_t *)&input_bytes,
output_ptr, (size_t *)&output_bytes); --right here
I suspect the problem is input_ptr, which is a char **. By convention, input strings are const in C, so I assume the iconv function expects a const char**. I’ll try to cast it explicitly, but I suspect this points to a larger problem with the compiler settings; this really should be a warning, not an error, with all due respect to John Robbins.
Well, that got me past it. Now it fails with my old friend:
In file included from ../src/zmm/stringbuffer.cc:23:
/usr/include/malloc.h:3:2: #error "<malloc.h> has been replaced by <stdlib.h>"
I had to fix a bunch of these in libupnp as well; I just replaced them with stdlib.h.
$ grep -r malloc.h *
src/zmm/stringbuffer.cc:#include <malloc.h>
src/zmm/strings.cc:#include <malloc.h>
src/zmmf/array.cc:#include <malloc.h>
That didn’t carry me much further:
../src/zmmf/exception.cc:24:22: execinfo.h: No such file or directory
../src/zmmf/exception.cc: In constructor `zmm::Exception::Exception(zmm::String)':
../src/zmmf/exception.cc:36: error: `backtrace' undeclared (first use this function)
../src/zmmf/exception.cc:36: error: (Each undeclared identifier is reported only once for each function it appears in.)
../src/zmmf/exception.cc:40: error: `backtrace_symbols' undeclared (first use this function)
Looking at the code in exception.cc, it has some conditional logic to exclude the execinfo.h and associated functions if __CYGWIN__ is defined, presumably because Cygwin doesn’t have execinfo. Well, FBSD doesn’t either, so behold my mod to exception.cc:
//anelson: FreeBSD doesn't have execinfo either, so pretend we're cygwin
#define __CYGWIN__
Sometimes I amaze myself.
Next up is a linker error in libupnp:
/usr/local/lib/libupnp.so: undefined reference to `gethostbyname_r'
/usr/local/lib/libthreadutil.so: undefined reference to `ftime'
I feel like we’re almost there. gethostbyname_r is a reentrant version of gethostbyname.
From http://www.unobvious.com/bsd/freebsd-threads.html:
Some of these functions are available, some (notably the gethost* and other name resolver functions) are not. They are implemented as part of Linuxthreads. Note that the Linuxthreads implementations are “wrappers” around the non-reentrant forms of these functions. This has two implications. First, the performance will not be as good as if they were “natively” implemented (only one call to gethostbyname_r will actually do a lookup at any given time), and second, the Linuxthreads implementation can be used with the user threads library (as they depend on POSIX mutexes to lock the critical section).
That’s awesome, except the Linuxthreads port won’t build on amd64. Doh. Maybe there’s some way to get libupnp to not try to use the reentrant functions? No; the code that makes the call has no conditional compilation. According to this list post, this is a known issue. Awesome. Thanks a lot.
I’ve added
#define gethostbyname_r gethostbyname
To upnp/src/inc/uri.h, in the hopes this will address the issue.
Hmm, no, it’s more complicated than that. The caller in uri. is passing in five or six args, but gethostbyname only takes one.
I don’t think this is going to work.
Since all the media servers I could find depend on libupnp, and libupnp is totally broken on FreeBSD, I’d say I’m pretty well fucked.
Astonishingly, it appears that there is a NetBSD pkgsrc package for libupnp12. Maybe I can use the diffs there? Then again, maybe NetBSD doesn’t suck as bad as FreeBSD, and has things like gethostbyname_r, etc.
Well, I was able to apply all the NetBSD patches without error. I had to re-fix the references to malloc.h, and remove the LD_PATH added by the NetBSD patches to upnp/src/makefile, but otherwise it worked unmodified.
Now I’m getting link errors in the libmediatomb with pthread functions; I suspect a missing -l switch. I added CXXFLAGS=-pthread to the configure call and it proceeded. Now the problem is:
/usr/local/lib/libthreadutil.so: undefined reference to `ftime'
This is libthreadutil.so, which is part of the libupnp build.
From a list post, there’s a -lcompat argument that adds legacy stuff like this. I’ll try adding that to the CC command when building libthreadutil.
Ok, when I built libupnp, I ran gmake EXTRA_LIBS=-lcompat, then did another install with gmake PREFIX=/usr/local install, and it seemed to do the trick. I was able to build mediatomb through to the end with:
./configure CPPFLAGS=-I/usr/local/include LDFLAGS=-L/usr/local/lib CXXFLAGS=-pthread
make
Now I did a make install, and that’s it.
According to the readme, the next step is tomb-install, which will create a configuration section in ~/.mediatomb. tomb-install is in /usr/local/bin:
$ /usr/local/bin/tomb-install
MediaTomb installer 0.3
Proceeding normally
Creating directories... ok
Creating database... ok
Writing configuration... ok
All done. You are now ready to launch MediaTomb!
WARNING: Because of security reasons the UI is disabled by default!
Please refer to the README file on how to enable it.
Once it is enabled:
When the server is running you can access the UI by opening
~/.mediatomb/mediatomb.html in your web browser.
The readme says it will default to sqlite3, since that can be done automatically, but that you should really use mysql instead. I don’t care; I want to see that it’s working first.
The script you’re meant to start with is mediatomb-service, but it’s based on init.d scripts in /etc/rc.d/init.d, which isn’t how FreeBSD does things.
So, I’ll try my hand at the command line:
$ mediatomb -d -u anelson -g anelson -P /home/anelson/mediatomb.pid -l /home/anelson/mediatomb.log
Pid file: /home/anelson/mediatomb.pid
That didn’t work. The process died for some reason; I imagine it doesn’t like not being run as root.
$ tail mediatomb.log
2005-10-27 13:57:35 INFO: Config: option not found: /import/metadata-charset using default value: ISO-8859-1
2005-10-27 13:57:35 INFO: checking ip..
2005-10-27 13:57:35 INFO: Config: option not found: /server/ip using default value:
2005-10-27 13:57:35 INFO: Config: option not found: /server/bookmark using default value: mediatomb.html
2005-10-27 13:57:35 INFO: Config: option not found: /server/port using default value: 0
2005-10-27 13:57:35 INFO: Config: option not found: /server/alive using default value: 180
2005-10-27 13:57:35 INFO: Config: option not found: /import/magic-file using default value:
2005-10-27 13:57:35 INFO: Configuration check succeeded.
2005-10-27 13:57:35 INFO: Config: option not found: /server/ip using default value:
2005-10-27 13:57:35 INFO: got ip: (null)
Nothing suggesting a failure…Ah, look at this from /var/log/messages:
Nov 27 13:57:35 aenea kernel: pid 95269 (mediatomb), uid 1001: exited on signal 11
Great. I’m out of my depth when it comes to debugging this stuff.
…
Time for a crash course (pun accidental) in gdb.
First, I just start gdb with the name of the image to run:
gdb mediatomb
It starts, and gives me the standard prompt:
(gdb)
I can type run followed by any arguments, and it’ll run the process under the debugger. When I do, I get this interesting output:
(gdb) run -d -u anelson -g anelson -P /home/anelson/mediatomb.pid -l /home/anelson/mediatomb.log
Starting program: /usr/local/bin/mediatomb -d -u anelson -g anelson -P /home/anelson/mediatomb.pid -l /home/anelson/mediatomb.log
warning: Unable to get location for thread creation breakpoint: generic error
[New LWP 100174]
Pid file: /home/anelson/mediatomb.pid
[New Thread 0x59e000 (LWP 100174)]
Program exited normally.
(gdb) Fatal error 'mutex is on list' at line 540 in file /usr/src/lib/libpthread/thread/thr_mutex.c (errno = 0)
My developer sense tells me the meaningful error is: Fatal error 'mutex is on list'. Google, what say you?
Nothing but complaints yet, but I learned about bt, which produces what I in my Microsoft Visual Studio nomenclature would call a ‘call stack’. It is:
(gdb) bt
No stack.
WTF?
Amazingly, the only list posting seems to be in reference to this happening whilst running the JDK, and no responses whatsoever. Outstanding.
Looking at the system source code thr_mutex.c line 540, this message is coming from an assert. From my read, it appears it happens if trying to lock a mutex that is already locked.
I’m going to go out on a limb and conclude that mediatomb isn’t using pthreads as FreeBSD expects. From this great article on programming threads in FreeBSD, I conclude that I should try to build with libc_r instead.
Sadly, I can’t figure out how to do that. I did come across a useful tool: ldd, when given the path to an image, dumps the shared libraries that image links to. mediatomb links to libpthread.so.2.
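For future reference, a tiny filter makes it quicker to see which thread library a binary picked up. thread_libs is a made-up name, and the pattern only covers the FreeBSD thread libraries I know of (libpthread, libthr, libc_r):

```sh
#!/bin/sh
# Filter ldd output (read from stdin) down to the threading libraries.
# thread_libs is a hypothetical helper; the list of library names in the
# pattern is an assumption and may need extending on other systems.
thread_libs() {
    grep -E 'lib(pthread|thr|c_r)\.so'
}
```

Used as ldd /usr/local/bin/mediatomb | thread_libs, it prints only the matching lines and exits non-zero if the binary links to none of them.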
Never mind; this 5.0 roadmap doc makes clear that libc_r is being deprecated as thread support is going into the kernel directly.
At any rate, I’m pretty clearly screwed. I’m out of my depth, and may have to slink back to Linux to run my media server.