Sunday, December 15, 2013

Install NumPy and SciPy without Fortran

NumPy and SciPy are two great Python packages for scientists, as is the popular Matplotlib. However, installing NumPy and SciPy is not for the faint-hearted if you install your Python packages via pip. Even assuming you already have Fortran, BLAS, LAPACK and ATLAS installed, the installation is slow, especially for SciPy: NumPy took 46 seconds to install, whereas SciPy took 6 minutes and 50 seconds on my MacBook Pro. So what if you install once and forget? There are two problems with that. First, I use mktmpenv when debugging issues. Second, I use tox to test against multiple versions of Python and/or Django. All of a sudden 6 build configurations means roughly 41 minutes of SciPy compilation!

Let's not forget Windows users. A Fortran compiler on Windows? I don't think so. They should be able to enjoy pip and virtualenv as much as any other Python developer.

The obvious solution is for SciPy to be packaged as a wheel, the new Python binary distribution format. I appreciate that would be very hard for the authors, but hopefully it will happen one day.

In the meantime Anaconda might be of interest. It is like apt-get/yum for scientific Python, and a new feature has just been announced: you can pip install conda (Anaconda's package manager) and then take advantage of the binary distributions it provides.
So try this (assuming you have pip, virtualenv and virtualenvwrapper installed):

$ mktmpenv
$ pip install conda
$ conda init
$ conda install scipy

SciPy plus NumPy and numerous dependencies are installed in under a minute! Obviously, you cannot convert this to a requirements.txt per se, but using Fabric you can write a task that installs conda and then the conda packages, all with a one-liner.
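
The Fabric idea can be sketched roughly as follows. This is only an illustration using the standard library, not a finished fabfile: the function names conda_bootstrap_commands and install_conda_packages are my own inventions, and the commands simply mirror the steps listed above (including conda init as it behaved at the time of writing).

```python
import subprocess


def conda_bootstrap_commands(packages):
    """Return the commands from the steps above, in order.

    Hypothetical helper: `packages` is a list of conda package names.
    """
    if not packages:
        raise ValueError("at least one conda package is required")
    return [
        ["pip", "install", "conda"],
        ["conda", "init"],
        # --yes skips conda's interactive confirmation prompt
        ["conda", "install", "--yes"] + list(packages),
    ]


def install_conda_packages(packages, runner=subprocess.check_call):
    """Run each command with `runner`, stopping at the first failure.

    `runner` is injectable so a Fabric task (or a test) can supply its own.
    """
    for command in conda_bootstrap_commands(packages):
        runner(command)
```

Inside an activated virtualenv, install_conda_packages(["scipy"]) would run the three steps. In a Fabric 1.x fabfile you could probably wrap it in a function decorated with @task and pass runner=lambda c: local(" ".join(c)), but I have not polished that part.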



Tuesday, November 19, 2013

New comments, but not like YouTube

I have switched the comments on this blog to Disqus (pronounced "discuss" as their engineers have corrected me on numerous occasions). I have nothing against the former Blogger comments or more recent Google+ comments, but Disqus is leagues ahead as a commenting platform. Plus, it is built on some of my favorite technologies: Python & Django.


Image credit: http://flic.kr/p/7oqZs2

Monday, November 18, 2013

GDG DevFest now includes Albuquerque

Courtesy of http://www.gdgabq.com/
Albuquerque just had its first GDG DevFest (translation - Google Developer Group meetup), where a selection of Google employees and enthusiasts met to share their experiences of and insight into a few Google products (let's be honest, there are quite a few now). Google Glass was out in force, six pairs, which is presumed to be the greatest concentration of them in New Mexico! I attended the following talks:

Google Drive Realtime API

The challenge for Drive is collaboration. Everything is stored as structured data - JSON. Because the data is structured, each edit can be recorded as a mutation, which describes a specific change rather than the data itself. Mutations are kept until the file is deleted. In fact the mutations make up the file; a snapshot summarizes the changes so far to give the current file, which saves having to process all the mutations, of which there could be many. Mutations are stored on both the server and the client (your browser), otherwise collisions could easily occur when collaborating. Having the transformation manager on the client allows incoming mutations to be reconciled with your local mutations, which are then pushed back to the server and on to the other collaborators. This technique is also how you can work offline, as the mutations are stored and reconciled later.
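
To make the mutation and snapshot idea concrete, here is a toy sketch in Python. It is not the Realtime API (which is JavaScript and far richer); the Insert class, the site field used for tie-breaking and the transform function are all assumptions of mine, and only text insertion is covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical mutation: insert `text` at `index`, authored by `site`."""
    index: int
    text: str
    site: int


def apply_mutation(doc, mutation):
    """Apply a single mutation to a document string."""
    return doc[:mutation.index] + mutation.text + doc[mutation.index:]


def replay(mutations, snapshot=""):
    """Rebuild a document by applying mutations on top of a snapshot."""
    doc = snapshot
    for mutation in mutations:
        doc = apply_mutation(doc, mutation)
    return doc


def transform(op, applied):
    """Adjust `op` so it can be applied after the concurrent `applied`.

    If `applied` landed earlier in the text (or at the same index but from
    a lower-numbered site), `op` shifts right by the inserted length.
    """
    shifts = applied.index < op.index or (
        applied.index == op.index and applied.site < op.site
    )
    if shifts:
        return Insert(op.index + len(applied.text), op.text, op.site)
    return op


# The full log and "snapshot plus recent mutations" give the same document.
log = [Insert(0, "Hello", 1), Insert(5, " world", 1), Insert(5, ",", 2)]
snapshot = replay(log[:2])
assert replay(log[2:], snapshot) == replay(log)

# Two collaborators insert at the same position and still converge.
a, b = Insert(1, "X", 1), Insert(1, "Y", 2)
site_one = replay([a, transform(b, a)], "abc")
site_two = replay([b, transform(a, b)], "abc")
assert site_one == site_two
```

The snapshot only saves work; the log remains the source of truth. The tie-break on site is what keeps both collaborators in agreement when they edit the same spot.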

Welcome to Android

I have never done any Android programming, but suffice it to say it looks like a regular Java project with a touch of HTML (e.g. storing multiple resolutions of your images etc). Interestingly, the IDE of choice is shifting from Eclipse with a plug-in to Android Studio. This is based on JetBrains IntelliJ - perhaps the gold standard of Java IDEs (in cost as well), but free for Android developers.

ChromeCast - In Love

The instantly successful ChromeCast (nothing to do with that free Netflix subscription I'm sure) has an API available now. The ChromeCast is a receiver which runs web pages, and you build the sender into your app (desktop, mobile, browser etc) to establish a connection with the ChromeCast. It looks surprisingly simple (famous last words).

AngularJS - Life changing tech

AngularJS is Google's latest JavaScript library, and it is receiving much love both within and outside the company. While many Google products are built on Closure, it seems new sites are being built with AngularJS. AngularJS lets you extend the HTML vocabulary and is a powerful MVC aid.

HTML5 in the Movies

Since ABQ is now the center of the film universe, directors look to local talent to fill various roles, which now include computer props. We saw numerous easy-to-use demos (single click for the actors) that make it appear as if the actors are typing emails, sending them and receiving replies, all with a few random clicks. No surprise that making the demos foolproof and loop-able was highly desirable, as was avoiding Windows for stability.

Startup Weekend Panel

For the final session we got a taste of what the Startup Weekend events are like. A surprising number of these events are cropping up around the state, not just in ABQ. We were given two disconnected words and had 10 minutes to come up with a business and a 1-minute pitch, certainly a good ice breaker!

Summary

Overall it was an enjoyable day. My thanks to GDG ABQ for taking the time to organize it, and to the sponsors (who doesn't want to see a 3D printer in action?) for making it possible. I think almost everyone left with a prize as well; I won an O'Reilly book in the raffle (worth more than the cost of the ticket). I was really impressed with the ChromeCast API and think I have found my next Hack Day project for work.




Wednesday, September 25, 2013

My first WordCamp

ABQ WordCamp is not the sort of conference I normally attend (PyCon, DjangoCon and AWS re:Invent are my regular haunts). Apart from WordPress being written in PHP, its community is more diverse because you can use and develop with it out of the box, whereas with Django you need to build your site before it starts to feel like a CMS.

We have adopted WordPress for our company blog, so it seemed sensible to meet some WordPress folk. I followed the user/publisher track even though I'm more of a developer, and it was refreshing to get more of an SEO and social media slant. Some top tips:

  • WordPress handles most of your SEO for you
  • Building your blog is an active role: find like-minded individuals/communities and get involved (comment on their blogs)
  • Comments are hugely important for an active and engaging blog
  • Disqus is the best commenting platform (plus it is powered by Django)
  • If traffic spikes for a particular post then repeat/stick with that topic
  • Introduce series for these popular topics
  • Be smart on social media, there are numerous WordPress plug-ins to make sharing easy
  • Google+ is the single best thing for SEO, even if you don't use it, post to it for instant indexing
  • bit.ly is perhaps the only URL shortener that is indexed by Google
  • Getting to page one on Google requires time and potentially money (AdWords) via a trial and error approach: pick keywords, evaluate, update, then repeat, repeat, repeat
  • You must have frequent new content (easier said than done) to get high search rankings

During the Developer Diversity Panel a common theme in tech came up: the representation of women. Having heard similar talks at PyCon, I think WordPress is indeed well ahead of the curve in terms of the number of women present, which is great to see.

Yesterday I found out that Google are running DevFest ABQ, so maybe I'll be attending more conferences in ABQ in the future.


Sunday, May 5, 2013

Book chapter on the use of open source software in the pharmaceutical industry


During my final year at AstraZeneca I was asked to contribute a chapter to "Open source software in life science research: Practical solutions to common challenges in the pharmaceutical industry and beyond". Given that this work would be very hard to publish in JCIM, JMC etc, and that I had never written a book chapter before, it was the perfect opportunity. The chapter was entitled "Design Tracker: an easy to use and flexible hypothesis tracking system to aid project team working". It was coauthored with Martin Harrison, who wrote Design Tracker. The abstract best sums up what it covers:
Design Tracker is a hypothesis tracking system used across all sites and research areas in AstraZeneca by the global chemistry community. It is built on the LAMP (Linux, Apache, MySQL, PHP/ Python) software stack, which started as a single server and has now progressed to a six-server cluster running cutting-edge high availability software and hardware. This chapter describes how a local tool was developed into a global production system.
Design Tracker has been mentioned in a few external presentations before, but I believe these are the first firm details about it. We talk about its use and how it grew from a prototype at one site into a global chemistry tool. As the book topic suggests, we also cover the open source technologies we used to power it. While LAMP is not new, it is not exactly mainstream in the corporate environment of many pharmaceutical companies. We had to harden our setup to make it suitable for 24/7 use, so in addition to the LAMP regulars we added Red Hat Cluster Suite, Continuent Tungsten and NGINX. We also took the opportunity to move away from Apache/mod_python to Apache/mod_wsgi. The end result was a service which is available 24/7 and future-proofed compared to our previous solution.
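
For readers who have not met mod_wsgi, the unit it serves is a plain Python callable. The following is a generic minimal sketch, not Design Tracker code: mod_wsgi looks for a callable named application by default, and the response body here is a placeholder of my own.

```python
def application(environ, start_response):
    """Minimal WSGI callable of the kind Apache/mod_wsgi can serve."""
    # Placeholder response body; a real application would route on
    # environ["PATH_INFO"] or hand off to a framework such as Django.
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Apache is then pointed at the file containing this callable (typically with a WSGIScriptAlias directive). Because the interface is a standard, the same callable also runs under other WSGI servers, which is much of what makes the move away from mod_python future-proof.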

The world's most dynamic and frequently visited websites are powered by similar technologies, so they have clearly proven themselves suited to the relatively modest needs of a single pharmaceutical company.

The book is available on Amazon UK & US and probably your favourite book reseller as well (ISBN: 978-1907568978). I hope you enjoy our chapter and the many other interesting topics covered.

Saturday, March 9, 2013

Getting ready for PyCon 2013

PyCon US 2013 is nearly upon us. This is my first PyCon and completes my attendance at both major US Python conferences (after attending DjangoCon last September). So thanks to my employer for letting me attend.

I haven't left for Santa Clara yet but I already feel part of the conference because of several factors:

  • The conference schedule has been made available via Guidebook and Lanyrd. Guidebook is an excellent medium for keeping track of your schedule across tutorials, talks, poster sessions and more. Plus the organizers can update the information (and already have), making it the go-to source of information. The only gripe I have is that I cannot find a way to sync my schedule between multiple iOS devices.
  • I'll check back with Lanyrd later to collect the slides from SlideShare for talks I attended and importantly those I couldn't.
  • Various events outside the regular talks have been posted on Eventbrite (which also features Passport integration on iOS now).
  • Of course you can follow everything on Twitter via the #pycon hashtag and @pycon feed.

Attending a conference with thousands of attendees feels much more personal when these tools are available. I may not be able to catch speakers after their talks (or even attend the talk at all), but I can tweet them to seek answers and potentially arrange to meet, or I can grab their slides from SlideShare. I would encourage all speakers to share a contact medium of their preference and post their slides online.

All that is left for me to do now is try and pick my schedule from the plethora of interesting talks and other sessions running!