Producing sequence alignments programmatically

Aligning two protein sequences is part of my day-to-day work. So the other day I decided to write a small script to save the time I lose opening the browser, typing the address, pasting the sequences, running the program, getting the results, and copying them back to my computer in a decent format.

To keep it simple, I figured EBI (I’m using EMBOSS:NEEDLE to make the alignments) surely had some kind of webservice for this, so I would just have to write 10 lines of code to pass the sequences to the webservice and voilà. However, all the documentation they have for that particular webservice is in either Perl or Java, both of which I don’t understand. Okay, I do understand them, but to be honest, I couldn’t make enough sense of the examples to “translate” them to Python.

Anyway, after a day spent trying to figure out what went where, I finally got my script to work.

The code turned out to be rather simple; it was just a matter of knowing where to pass the arguments. Now I just need another 10 lines of code to convert the output to either FASTA or PIR format, and I’ll have achieved my time-saving goal ;)

#!/usr/bin/python

import time

from SOAPpy import WSDL

# WSDL description of the EBI EMBOSS webservice.
wsdlLink = "http://www.ebi.ac.uk/Tools/webservices/wsdl/WSEmboss.wsdl"

server = WSDL.Proxy(wsdlLink)

# Job parameters: which EMBOSS tool to run and who is submitting the job.
needle_params = {
  'tool': 'needle',
  'email': 'email@user.com'    # User e-mail address
}

# e1.seq and e2.seq are FASTA-formatted files in the same directory
# as this Python script.
with open('e1.seq') as handle:
    seq1 = handle.read()
with open('e2.seq') as handle:
    seq2 = handle.read()

# needle takes two inputs: 'asequence' and 'bsequence'.
needle_data = [
  {'type': 'asequence', 'content': seq1},
  {'type': 'bsequence', 'content': seq2}
]

# Submit the job; the service returns an identifier used for polling.
jobId = server.run(params=needle_params, content=needle_data)

# Check the job every 15 seconds until it is no longer pending or running.
while True:
    status = server.checkStatus(jobId)
    print(status)
    if status not in ('PENDING', 'RUNNING'):
        break
    time.sleep(15)

# Fetch and print the alignment produced by the tool.
result = server.poll(jobId, 'tooloutput')
print(result)
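
As for the "another 10 lines" to get FASTA or PIR output, here is a minimal sketch of what the formatting half could look like. It assumes the two aligned sequences (with their gap characters) have already been pulled out of the needle output as plain strings; that parsing step is not shown. The function names `wrap`, `to_fasta` and `to_pir`, the record names and the example sequences are all made up for illustration, and the PIR description line simply repeats the name as a placeholder.

```python
def wrap(sequence, width=60):
    """Split a sequence string into chunks of at most `width` characters."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


def to_fasta(records, width=60):
    """Format (name, aligned_sequence) pairs as FASTA text."""
    lines = []
    for name, sequence in records:
        lines.append(">" + name)
        lines.extend(wrap(sequence, width))
    return "\n".join(lines) + "\n"


def to_pir(records, width=60):
    """Format (name, aligned_sequence) pairs as PIR/NBRF text.

    Each entry is a '>P1;name' line, a free-text description line
    (here just the name again, as a placeholder), and the sequence
    terminated by '*'.
    """
    lines = []
    for name, sequence in records:
        lines.append(">P1;" + name)
        lines.append(name)  # placeholder description line
        lines.extend(wrap(sequence + "*", width))
    return "\n".join(lines) + "\n"


# Hypothetical aligned pair; in practice these strings would be
# extracted from the needle result.
aligned = [("e1", "MKT-AYIAKQR"), ("e2", "MKTLAYI-KQR")]
print(to_fasta(aligned))
print(to_pir(aligned))
```

Keeping the formatters separate from the webservice call means they can be reused with alignments from any source, not just this script.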

2 thoughts on “Producing sequence alignments programmatically”

  1. Hello!

    It connects to the “old” version. I tried having a look at soaplab but it was too complicated (or at least seemed so) to work with just for what I wanted so I stuck with this one. I added small details in the parameters dict (such as aformat) to be able to work with it. (Check the newest post and the function parseNeedle.)

    You can use it as you wish, no license at all afaik :D

    Regards!

  2. Hello,

    Nice code! Does this version connect to the old or the new webservice? I mean old or new with regard to your email on the Biopython mailing list, where you mentioned a webservice using an old EMBOSS version (2.9.0).
    Can I use your code? Under what license?
