## Thursday, December 18, 2008

### MySQL & Perl on Leopard

Mac OS X Leopard comes with Apache 2 and PHP pre-installed, but no MySQL. This is the missing link in the very useful MAMP setup. The kind people at mysql.com offer it ready-packaged for Mac users. When picking your version, MacIntel users should pick the 32-bit version, unless you want problems installing Perl's DBD::mysql. What is DBD::mysql? The very useful module that allows Perl scripts to access your MySQL databases. If you install the 64-bit version of MySQL you will hit errors when you install DBD::mysql, as a 32-bit installation of Perl is not interested in compiling against 64-bit MySQL libraries.

If you need 64-bit MySQL you could install both versions and compile DBD::mysql against the 32-bit libraries.
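If you do go down the two-installations route, DBD::mysql can be pointed at the 32-bit libraries when you build it. A sketch only: the `--mysql_config` option is part of DBD::mysql's Makefile.PL, but the 32-bit install path shown is an assumption, so adjust it to wherever your second copy lives:

```shell
# From inside an unpacked DBD::mysql source directory:
# point the build at the 32-bit MySQL's mysql_config (path is an assumption)
perl Makefile.PL --mysql_config=/usr/local/mysql-32bit/bin/mysql_config
make
make test
sudo make install
```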

### MacPorts on Leopard with a proxy

MacPorts is an excellent tool for Mac users to grab extra software without having to worry about working out compile options, reading endless instructions, etc. As of today there are 5253 ports available, for everything from apache2 to iPhone apps. The beauty of installing anything from MacPorts is that it will install any dependencies as well. I've just upgraded one of our servers to Leopard, but MacPorts didn't want to play anymore. It turns out there is a bug in the handling of proxy settings. We have to use a proxy, so this is a show stopper. It looks like they will fix it in the 1.8 release, but I can't wait!

Thankfully a simple workaround exists that requires you to edit /etc/sudoers:

    sudo visudo

(Don't run vim directly on /etc/sudoers.) In the Defaults specification section add:

    Defaults        env_keep += "http_proxy HTTP_PROXY HTTPS_PROXY FTP_PROXY"
    Defaults        env_keep += "ALL_PROXY NO_PROXY"
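With those lines in place you can check that sudo now passes the proxy settings through. This assumes you have a proxy variable such as http_proxy set in your own environment:

```shell
# Should print your proxy variables; if the output is empty,
# sudo is still stripping them from the environment.
sudo env | grep -i proxy
```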
to

    LIBS="-lmkl_intel -lmkl_sequential -lmkl_core $LIBS"

Then run:

    ./configure CC="icc" CPPFLAGS="-I/opt/intel/mkl/10.0.3.020/include" LDFLAGS="-L/opt/intel/mkl/10.0.3.020/lib/32" --with-fft=mkl
    make
    make install

You may need to alter the version number or location of your MKL files. I recommend you also download the test set to confirm the compilation. I had 4 failures, but closer inspection revealed they were not show stoppers.

## Thursday, July 31, 2008

### Under and over sampling with Weka

Weka uses the ARFF format for storing data. In the development series (3.5.x) an XML version of the ARFF format was introduced: XRFF. On the surface there is little reason to use it; the format is far more verbose, so file size quickly swells. There are 3 additional features over the ARFF format:

1. Class attribute specification
2. Attribute weights
3. Instance weights

Typically the class attribute is the last in the file, else you need to tell the classifier which attribute to use. Now you can set the class attribute to any attribute:

    <attribute class="yes" name="class" type="nominal">

Associate a weight with an attribute (within the header section) using metadata:

    <attribute name="petalwidth" type="numeric">
      <metadata>
        <property name="weight">0.9</property>
      </metadata>
    </attribute>

Associate a weight with an individual instance:

    <instance weight="0.75"><value>5.1</value><value>3.5</value><value>1.4</value><value>0.2</value><value>Iris-setosa</value></instance>

You can use the weight associated with an individual instance to simulate under and over sampling. For example, if you have 100 actives in a dataset and 1000 inactives, oversample the actives. This means training on each active 10 times, so the model is composed from 1000 actives and 1000 inactives. Granted, the same actives are used, but this technique has positive effects on skewed datasets. The weight to add for this dataset would be 10 on each active instance.
    <instance weight="10"><value>5.1</value><value>3.5</value><value>1.4</value><value>0.2</value><value>active</value></instance>

## Monday, July 28, 2008

### Weka Online

Weka is an excellent machine learning/data mining workbench from the University of Waikato. It is Java-based and available under the GNU GPL. An advantage of being in Java is that it can easily run on virtually any platform. On the flip side, Java can be limited by the amount of RAM available; this is the case with Weka, as it has been programmed with a memory-driven approach, not a disk-driven one. As data sets get larger and larger, more RAM is required to process them. Couple this with Weka not being specifically designed for large data sets and it isn't hard to exceed a 2GB RAM requirement. Now for the technical part: 32-bit hardware and operating systems (x86) can only use around 2GB RAM per single process, regardless of how much the machine actually has. To use more than that per process you need both 64-bit hardware and a 64-bit operating system (x86_64). Thankfully it is increasingly common to have 64-bit hardware as standard on new purchases. However, if you don't have new hardware another solution has recently become available: Weka Online. They allow you to submit Weka tasks from a web interface on to their 64-bit computer cluster (with 2.5-3.5GB RAM available). Alas, as I write this they have disabled submission while they bolster security following a malicious attack. Once this service is back it actually offers more than standard Weka, via their CEO framework; see more here. I've not actually tried the service myself, but the idea is certainly appealing.

## Tuesday, July 22, 2008

### Condor 7.1.1 supported ports

The development series of Condor is dropping support for several platforms from 7.1.1 onwards:

1. Red Hat 9 (suitable for openSUSE 10.x)
2. Solaris 5.8
3. Mac OS X PowerPC

RHEL 3 binaries should be fine for any Red Hat system (and presumably CentOS). Solaris 5.8 users can use the 5.9 binaries.
It should be noted they are continuing these ports for the current stable series (7.0.x). Unfortunately the RHEL 3 binaries do not work on openSUSE 10.x (well, they run, but give shadow exceptions if you try to do anything useful - like run a job!). Looks like a case for compiling from source...

UPDATE: Condor 7.1.1 has been pulled due to numerous problems; look out for 7.1.2.

## Saturday, July 19, 2008

### Perl on Eclipse 3.4 (Ganymede)

I use Eclipse every day, and the ability to use it for multiple languages is crucial. Perl is one of the languages I use and there is an excellent plugin for it: EPIC. However, after installing the recently released Ganymede (3.4) release I couldn't install it. There are multiple versions of Eclipse available to download; typically I pick Classic. However, EPIC will not install on to this version. I found using Eclipse IDE for Java Developers worked fine. Hopefully any other plugins you use won't mind you using this version!

## Friday, July 18, 2008

### Display source code in MediaWiki

You have three options by default:

1. Display code inline, like script.sh, by using <code>script.sh</code>.
2. Blocks of code can be wrapped with <pre>insert your code here</pre>. This works multi-line but doesn't allow formatting.
3. Indent your text to enable a <pre>-like block, then apply standard '''bold''' and ''italic'' formatting.

The above options generally work quite well, but if you end up with lots of code from different languages and more than a few lines, it would be handy to have syntax highlighting. Thankfully an extension to MediaWiki can do this. Use SyntaxHighlight_GeSHi to colour away; it is also used on Wikipedia. You will need root access to your server and subversion installed, then follow the simple instructions on the extension website. Download the extension and GeSHi, then add it to your LocalSettings.php.
Now you have a fourth option: wrap your code with <source lang="X">code here</source>, where X is php, java, bash, ruby or one of the other nearly 50 supported languages!

## Thursday, July 17, 2008

### Thumbnails and TeX support for MediaWiki on Mac OS X

After you have set up a new MediaWiki installation you will likely want to enable some extra functionality, which requires additional software. First off, to create thumbnails of images you need to install ImageMagick, which gives you the incredibly handy convert program. Second, you can add maths support using TeX; for this you need ocaml, latex and dvipng. Grab the required programs from MacPorts (they are available from Fink as well):

• sudo port install ImageMagick
• sudo port install ocaml
• sudo port install tetex
• sudo port install ghostscript

Texvc will convert your TeX into whatever MediaWiki wants to display (HTML, MathML or PNG), but it needs to know where several programs are. Given you are probably running your webserver as the www user, who has no $PATH settings, how do you tell Texvc where to find the files? In an unusual move, hardcode them! Edit <mediawiki>/math/render.ml, prefixing /opt/local/bin to the four commands:
    let cmd_dvips tmpprefix = "/opt/local/bin/dvips -R -E " ^ tmpprefix ^ ".dvi -f >" ^ tmpprefix ^ ".ps"
    let cmd_latex tmpprefix = "/opt/local/bin/latex " ^ tmpprefix ^ ".tex >/dev/null"
    let cmd_convert tmpprefix finalpath = "/opt/local/bin/convert -quality 100 -density 120 " ^ tmpprefix ^ ".ps " ^ finalpath ^ " >/dev/null 2>/dev/null"
    let cmd_dvipng tmpprefix finalpath = "/opt/local/bin/dvipng -gamma 1.5 -D 120 -T tight --strict " ^ tmpprefix ^ ".dvi -o " ^ finalpath ^ " >/dev/null 2>/dev/null"
Then recompile texvc with make; ocaml will take over here.

Tell ImageMagick where gs is by editing /opt/local/lib/ImageMagick-X.X.X/config/delegates.xml, where X.X.X is the version number. Replace every "gs" with "/opt/local/bin/gs"; only edit the "gs" entries, of which there are about half a dozen.
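If you'd rather not do the replacements by hand, a sed one-liner along these lines should work. The in-place `-i ''` syntax shown is for the BSD sed that ships with Mac OS X, and the directory is an assumption (fill in your X.X.X version):

```shell
# Back up first, then replace the bare command name "gs" with its full path.
cd /opt/local/lib/ImageMagick-X.X.X/config
sudo cp delegates.xml delegates.xml.bak
sudo sed -i '' 's|"gs"|"/opt/local/bin/gs"|g' delegates.xml
```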

Finally, tell MediaWiki to use TeX by editing your LocalSettings.php:

    $wgUseTeX = true;

Now math support should be good to go. Thanks to this compilation of advice for assistance. So now this wikitext will produce the following:

    == Magical latex in action ==
    <math>\left \{ \frac{a}{b} \right \} \quad \left \lbrace \frac{a}{b} \right \rbrace</math>
    <math>x \implies y</math> (an AMS command)
    <math>f(n) = \begin{cases} n/2, & \mbox{if }n\mbox{ is even} \\ 3n+1, & \mbox{if }n\mbox{ is odd} \end{cases}</math>

    == Image thumbnail ==
    [[Image:OpenSUSE.png|frame|Full size|center]]
    [[Image:OpenSUSE.png|thumb|A thumbnail|center]]

## Wednesday, July 16, 2008

### Solubility Challenge

UCC has launched a competition in conjunction with JCIM. Essentially their article (DOI: 10.1021/ci800058v), which appeared on ASAP yesterday, details 132 druglike molecules. They report the solubility for 100 molecules and challenge you to predict the other 32 using whatever method you choose. Submit your predictions by 15th September 2008, upon which the best submissions will be invited to detail their models as JCIM articles. Full details are on the Goodman group website, including machine-readable files.

## Tuesday, July 8, 2008

### Subversion with CruiseControl

As we use subversion for our version control we need to do an extra step, as CruiseControl only has limited subversion support (e.g. it can't check out a project; I'm sure it should, but it has never worked for me). To give it the power to do so you need to download SvnAnt. Copy the three jars from the lib folder into the lib folder in your installation: /cruisecontrol/apache-ant-1.7.0/lib. This way everything that keeps CruiseControl happy is in one place. Now you need to define a property file giving the location of SvnAnt, something like this file, svn-build.props:

    svnant.version=1.0.0
    lib.dir=../apache-ant-1.7.0/lib
    svnant.jar=${lib.dir}/svnant.jar
    svnClientAdapter.jar=${lib.dir}/svnClientAdapter.jar
    svnjavahl.jar=${lib.dir}/svnjavahl.jar
You need to ensure the lib.dir value is valid depending on where you call this file from (in this example /cruisecontrol/project). As you will see, we make a wrapper script to grab the code from the repo before launching the project ant script. The wrapper script may be in /cruisecontrol/project, but defines its basedir as /cruisecontrol/checkout.

A sample script can be found here (Blogger doesn't want to display it). To use it, save it to /cruisecontrol/project and edit the sample_project and subversion paths. Your project will be checked out to /cruisecontrol/checkout, where it is built, tested, compiled etc. The first time, I had to check out manually, otherwise CruiseControl would kick up a fuss.
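The linked sample isn't reproduced here, but a wrapper along these lines gives the idea. This is a hypothetical sketch, not the author's actual script: the repository URL and target names are invented, and it assumes SvnAnt 1.0's standard typedef resource name:

```xml
<project name="sample_project_wrapper" default="build" basedir="../checkout">
  <!-- svn-build.props defines the SvnAnt jar locations (see above) -->
  <property file="../project/svn-build.props"/>

  <path id="svnant.classpath">
    <pathelement location="${svnant.jar}"/>
    <pathelement location="${svnClientAdapter.jar}"/>
    <pathelement location="${svnjavahl.jar}"/>
  </path>
  <typedef resource="org/tigris/subversion/svnant/svnantlib.xml"
           classpathref="svnant.classpath"/>

  <target name="build">
    <!-- grab a fresh copy of the code, then hand over to the project's own build -->
    <svn>
      <checkout url="http://svn.example.org/repos/sample_project/trunk"
                destPath="sample_project"/>
    </svn>
    <ant antfile="sample_project/build.xml"/>
  </target>
</project>
```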

In your main config.xml call the new wrapper script (/cruisecontrol/project/sample_project.xml) in the schedule section. This way a fresh copy of the code is checked out before CruiseControl commences the build.
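For reference, the project entry in config.xml might look something like this. The project name and the 300-second polling interval are assumptions:

```xml
<project name="sample_project">
  <schedule interval="300">
    <!-- run the wrapper script, which checks out and then builds -->
    <ant anthome="apache-ant-1.7.0"
         buildfile="project/sample_project.xml"/>
  </schedule>
</project>
```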

## Thursday, July 3, 2008

### Start condor on boot with Mac OS X

Once you have condor running on your clients you will want it to load by default when booting. The condor distribution includes linux-based startup scripts; however, there are none for the Mac. Looking through the mailing list there is a suggestion of scripts to use, but they use Panther (10.3) based technologies, which are not recommended in Tiger (10.4) and not available in Leopard (10.5).

Delving a bit further I found another way to start condor by using cron.

Create a script to start condor: sudo vim /usr/sbin/start_condor

Enter these contents, and customise to your installation:
    #!/bin/bash
    # Ensure network is all setup
    sleep 100
    # Ensure condor environment is loaded
    source /opt/condor/condor.sh
    # Start condor
    /opt/condor/sbin/condor_master
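The script also needs to be executable before cron can run it (path as above):

```shell
sudo chmod +x /usr/sbin/start_condor
```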

Our condor installation is actually stored on an NFS drive, so the 100 second sleep is to ensure the NFS drives have mounted before the rest of the script runs. I handle the path settings for $CONDOR_CONFIG, $PATH & $MANPATH in a separate script (condor.sh); alternatively you could specify $CONDOR_CONFIG in this script.
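As an illustration, a minimal condor.sh might look like the following. The paths are assumptions based on the /opt/condor installation used above, so adjust them to your site:

```shell
# Hypothetical /opt/condor/condor.sh -- sourced by the startup script.
# Tells the condor tools where the config lives and puts them on the PATH.
export CONDOR_CONFIG=/opt/condor/etc/condor_config
export PATH=/opt/condor/bin:/opt/condor/sbin:$PATH
export MANPATH=/opt/condor/man:$MANPATH
```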

Tell cron about your script, and that it should be run on boot:

    echo "@reboot root /usr/sbin/start_condor" | sudo tee -a /etc/crontab

(A plain sudo echo ... >> /etc/crontab won't work, because the redirection happens in your own, unprivileged shell.)
I have the condor daemons run as root, hence the root user mentioned in this crontab entry.

Test the script by running it directly from the command line first; if it runs, then you shouldn't have trouble when rebooting.

## Monday, June 30, 2008

### Pretty URL for MediaWiki

MediaWiki is a very popular open source platform for wikis, the most notable user (and actual developer) being the famous Wikipedia.

As I have previously mentioned, we run a MediaWiki-based wiki within my research group. As a purely aesthetic touch I chose to use a pretty URL. This essentially means the URL I use to access the wiki is shorter:

    www.domain.com/wiki/Chemistry

rather than:

    www.domain.com/wiki/index.php/Chemistry
It turns out it is quite fiddly to get the setup sorted, but the end result is definitely preferable.

N.B. These instructions only work when using domain.com/wiki/Chemistry. If you use wiki.domain.com/Chemistry that is more tricky, and domain.com/Chemistry is not recommended; see here for info on those scenarios. In addition, I assume you have root access to your webserver, that you are using MediaWiki 1.9.x or later, and that you are running an Apache webserver.

In this example I assume your DocumentRoot for html files is /srv/htdocs. Move your unpacked mediawiki files to /srv/htdocs/w. We will use apache to make wiki valid.

Edit your httpd.conf to enable /srv/htdocs to read .htaccess files: alter AllowOverride None to AllowOverride FileInfo.
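In context, the relevant httpd.conf stanza might look like this (the Directory path follows this example's DocumentRoot):

```
<Directory /srv/htdocs>
    AllowOverride FileInfo
</Directory>
```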

Now create the file /srv/htdocs/.htaccess, with the contents:

    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    # Make the wiki appear to come from wiki/
    RewriteRule ^wiki/?(.*)$ /w/index.php?title=$1 [L,QSA]
In your LocalSettings.php set:

    $wgScriptPath  = "/w";
    $wgArticlePath = "/wiki/$1";

Now when you access domain.com/wiki/Chemistry, Apache will actually request domain.com/w/index.php?title=Chemistry, although you will not see this in the address bar unless you edit a page. If you edit a page but it wants to edit a page called Index.php, something has gone wrong! Do check here for other information; perhaps the details there are sufficiently up to date to get this working - they were not when I originally tried. There is one restriction with this method: article names must not include a question mark (ampersands, &, are fine).

## Friday, June 27, 2008

### Deploy your software with IzPack

In terms of free software packagers there are two contenders for me: NSIS and IzPack. NSIS is restricted to Windows, so on its own isn't suitable for my needs. IzPack is Java-based, so instantly suitable for every major platform. Granted, it is ideal for Java software, but it is not at all restricted to it. Obviously you need a Java runtime environment (JRE) installed to run it, but that is common enough nowadays. A run down of the features:

• Open source
• Cross-platform
• Fully customisable
• Native integration (shortcuts for Windows and Linux)
• Ant integration
• Uninstaller
• Unattended mode
• User input
• Translations

Just like ant, all your settings are stored in an XML file and parsed to create your customised package. One of the advantages of IzPack being Java-based is that you can add it directly to our continuous integration environment, using the Ant integration. That of course runs using CruiseControl; as the final step after compiling and testing our code (courtesy of JUnit), IzPack can step in to package my software. So from each build I can take away a ready-to-deploy package.

## Thursday, June 26, 2008

### Subversion integration with Eclipse

The latest version of Eclipse, Ganymede, has just been released.
I'm a keen Eclipse user; I like having all my programming needs met by one application (mainly thanks to the many language plugins available: Perl, Shell, LaTeX etc.). I'm not going to detail why you might want to upgrade to Ganymede, as there is plenty on that already. All I will say is that, as with previous releases, only some (in this case 23) of the ~90 projects that make up Eclipse are actually releasing new milestones in this release, so your favourite subproject may not be updated at the current time.

For me the most important plugin is subversion (although, personally, I feel this should be built in like CVS support). Previously I have used Subclipse, but I thought I'd try Subversive this time around. Installation is fairly straightforward:

1. Open Software Update
2. Select the Ganymede update site
3. From Collaborative Tools pick SVN Team Provider

This doesn't include an SVN connector, which is a show stopper!

1. Add http://www.polarion.org/projects/subversive/download/eclipse/2.0/update-site/ as a new remote site in Software Update
2. For future reference, get the latest update site from http://www.eclipse.org/subversive/downloads.php
3. Install SVNKit 1.x Implementation, or 2.x if you want to try the beta.

Now you should be good to go. Select SVN from the New Project Wizard, and explore repositories from the SVN Repository Exploring perspective (Window > Open Perspective > Other...).

Subversion users should also note that version 1.5 has also recently been released, and accordingly you will want your clients to run this version. If Subversive isn't for you, Subclipse works fine in Ganymede too.

## Friday, June 20, 2008

### Condor on openSUSE 11.0

Having just installed my first openSUSE 11.0 (32-bit) machine, I've thrown condor (7.1.0 binary: RH9, x86, dynamic) on to see if it runs fine. As with openSUSE 10.x you need to install compat-libstdc++ first. As root, run:

    zypper in compat-libstdc++
Update: Although it installed, it doesn't behave correctly when running jobs, producing lots of shadow exceptions.

## Thursday, June 19, 2008

### Setup network install for openSUSE 11.0

We run openSUSE on all our linux machines. Therefore, the quickest and easiest way to install on lots of machines and ensure quick access to updates is to maintain a local copy of the core repositories onsite. As we have the entire installation repository we only need the network install discs, which contain a setup program; you then select whatever packages you want from the repository.

First, download the repositories. There are 3 core repositories you need:

1. Installation
2. 3rd party add-on software
3. Updates

Find a local mirror and use wget to grab the repositories. The mirror I use in the UK has rsync support, which is very handy for keeping the repositories up to date by only downloading content that has altered:

    rsync -Pvptrl --delete rsync://rsync.mirrorservice.org/sites/ftp.opensuse.org/pub/opensuse/distribution/11.0/repo/oss/ /www/suse/SUSE11.0-INSTALL/
    rsync -Pvptrl --delete rsync://rsync.mirrorservice.org/sites/ftp.opensuse.org/pub/opensuse/distribution/11.0/repo/non-oss/ /www/suse/SUSE11.0-ADDON/
    rsync -Pvptrl --delete rsync://rsync.mirrorservice.org/sites/ftp.opensuse.org/pub/opensuse/update/11.0/ /www/suse/SUSE11.0-UPDATE/

The installation and addon repositories are static, but updates will need updating. Let cron take care of that for you by running the command above once a day. Next grab the network install CDs, http://download.opensuse.org/distribution/11.0/iso/cd (don't forget to use a mirror). We make the repositories available via a local apache web server. So when running the setup program just point to the IP and folder on the web server. During the installation add the addon and update repos, and hey presto: fast install with up to date repositories!
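The daily update run can be handled with a crontab entry along these lines; the 3am schedule is an assumption, and the rsync command is the update one from above:

```
# /etc/crontab -- refresh the update repository every night at 3am
0 3 * * * root rsync -Pvptrl --delete rsync://rsync.mirrorservice.org/sites/ftp.opensuse.org/pub/opensuse/update/11.0/ /www/suse/SUSE11.0-UPDATE/
```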
## Wednesday, June 18, 2008

### Compile condor 7.1.0 on openSUSE 10.3

Condor has traditionally only been available as binaries; most of the time this is fine. We successfully run the Mac OS X PowerPC & Intel binaries, and for openSUSE 10.x we use the Red Hat 9/dynamic binaries. However, for 64-bit openSUSE the binary (RH5) doesn't really work; it struggles to start up. Since the release of condor 7.x they have included the source, so that seemed a sensible avenue to explore. It does compile, but it is a bit more involved than just configure; make; make install! These instructions should hold true for 7.0.0, 7.0.1 & 7.0.2 as well.

Grab the source for the latest stable or development release (I personally go with the development release). First off, glance through the README and check you have all the prerequisites. I also found I needed to add termcap, terminfo, ncurses-devel and flex (grab them from yast). Now let's start configuring:

    cd src
    ./build_init
    ./configure --disable-gcc-version-check --disable-full-port --without-classads

The configure flags mean:

1. Our gcc version is newer than the built-in checks.
2. No standard universe/checkpointing. It is standard for a new OS port not to have these.
3. ClassAds not yet supported (no condor_q -better-analyze).

You will get an error:

    configure: error: Condor does NOT know what glibc external to use with glibc-2.6.1

To get around this, edit configure.ac with your favourite editor. Around line 2500 add this option to the case statement:

    "2.6.1" )
        # openSUSE 10.3
        including_glibc_ext=NO
        ;;

Now run build_init and configure again as above. If that runs you can then execute make. Sit back, as this may take a while! If there is a problem, typically in the externals part, view the log that will be indicated. You may need to install something (such as the packages I mentioned earlier). Compilation is finished when something like this pops up (and no obvious errors):

    make[1]: Nothing to be done for 'all'.
    make[1]: Leaving directory '/home/build/condor-7.1.0/src/condor_examples'

Everything is compiled, so now prepare the release:

    make release   (output to release_dir, dynamically linked with debugging, ready for testing)
    make public    (output to ../public, adds stripped dynamic/static linked binaries and no debugging)

Find your final installation bundle in ../public as condor-7.1.0---dynamic.tar.gz. Unpack and use it as you normally would. condor_version will reveal your custom compile:

    $CondorVersion: 7.1.0 May 21 2008 CondorPlatform: X86_64-LINUX_SuSE_UNKNOWN $

Need help/advice? There is an excellent presentation which covers this, as well as the users mailing list, which is both active and helpful.

Hopefully in the future condor will have better support for openSUSE so that a full port (standard universe/checkpointing) and ClassAds will be available.

Good luck!

## Tuesday, June 17, 2008

### Keep up with your literature

A colleague of mine recently blogged on useful websites and applications for keeping up with new journal articles and reference managers: How I keep track of scientific literature...

The popular social bookmarking service del.icio.us gets a mention. It has been under discussion on the CHMINF-L mailing list recently where Egon Willighagen recommends using Faviki (currently in beta). Similar idea, but using tags from wikipedia as well. I wonder if this will be the next big social bookmarker?

Image courtesy of Papers.

### openSUSE 11.0 this week

The new version of openSUSE lands this week. It looks like it will be an excellent release building on the successful 10.2 and 10.3 releases.

Find out more: sneak peeks, screenshots and the wiki.

## Monday, June 16, 2008

### Condor for number crunching

We have a need for HPC within our group (quantum chemistry calculations, machine learning, molecular dynamics simulations & analysis, etc.). To fulfil this need we have several SGE-based clusters within our department and the university. Our local clusters were in need of a refresh (multiple dated OS's - Red Hat 7!) and ideally needed to be unified somehow. It became tiresome having multiple clusters to pick from. Which one has the most free slots, or the shortest queue? If one is full you would have to move all your data and get set up to run on a different cluster. We needed something to maximise our use of the compute nodes but simplify the submission process to avoid wasting time.

We opted for a more grid-based solution: condor. The reasons for this were:
• All our local clusters are now combined into one condor pool.
• It removes the need for multiple head nodes, as users can submit direct from their desktops.
• Cross-platform so you can use with Windows, Linux & Mac.
• Grid approach means we take advantage of our desktop computers as well.
We still use the university's central SGE cluster, it is an invaluable resource. However, condor allows us to make the most of our local resources which are exclusively for our use.

Find out more about condor here: http://www.cs.wisc.edu/condor/. The annual Condor Week now has videos of some of the tutorials (as well as slides), so check out what it is all about.

Image courtesy of Wikipedia.

## Friday, June 13, 2008

### Post one

It was recently recommended that I start my own blog, so here it is! I imagine it will be fairly technical, detailing the adventures I've had as sys admin within my research group at Nottingham. Alas, everything currently resides on our intranet-based wiki. Rest assured I'm not going to cut and paste the entire wiki here. I plan to share some of the solutions to problems I've had and other titbits of interest. Who knows where it might lead, certainly a novel venture for me to try.

The over-riding theme will be in silico chemistry, but that covers quite a lot!