Downloading FASTQ Files Quickly with IBM’s Aspera Connect on a Linux or Mac Machine

Please note that this guide is based on this BioStars thread and this GitHub repository.


Procedure:


  1. Download the latest version of Aspera Connect from the IBM “featured client software” section (you may need to install a browser extension as well).


  2. Download the file “ena-fast-download.py” from wwood’s GitHub repository.


  3. Open the script in your favorite Python editor and scroll down to the bottom. In the line that builds cmd, put the full path to the ascp binary in front of “ascp” (the macOS path shown below is an example; replace USER with your own user name):


aspera_commands = []
for url in ftp_urls:
    quiet_args = ''
    if args.quiet:
        quiet_args = ' -Q'
    # Full path to the ascp binary (macOS example; replace USER with your user name).
    # The doubled backslash passes a literal "\ " to the shell, escaping the space in the path.
    cmd = "/Users/USER/Applications/Aspera\\ Connect.app/Contents/Resources/ascp{} -T -l 300m -P33001 {} -i {} era-fasp@fasp.sra.ebi.ac.uk:{} {}".format(
        quiet_args,
        args.ascp_args,
        ssh_key_file,
        url.replace('ftp.sra.ebi.ac.uk',''), output_directory)
    logging.info("Running command: {}".format(cmd))
    subprocess.check_call(cmd,shell=True)

logging.info("All done.")
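The space in “Aspera Connect.app” is what makes the path awkward to paste into a shell string. As a rough sketch of an alternative (not what the original script does), the same command can be assembled as an argument list, so the path needs no escaping. The function name build_ascp_command, the key file name, and the example URL are hypothetical placeholders; the ascp flags are the ones from the script above.

```python
import shlex


def build_ascp_command(ascp_path, ssh_key_file, url, output_directory,
                       quiet=False, extra_args=""):
    """Return the ascp invocation as a list of arguments.

    Passing a list to subprocess avoids shell quoting, so a path that
    contains spaces can be given as-is.
    """
    cmd = [ascp_path]
    if quiet:
        cmd.append("-Q")
    # Same transfer options as the script: -T, 300m rate limit, port 33001
    cmd += ["-T", "-l", "300m", "-P33001"]
    # Extra user-supplied ascp arguments, split the way a shell would
    cmd += shlex.split(extra_args)
    cmd += ["-i", ssh_key_file]
    # Strip the FTP host so only the remote file path remains
    remote_path = url.replace("ftp.sra.ebi.ac.uk", "")
    cmd.append("era-fasp@fasp.sra.ebi.ac.uk:" + remote_path)
    cmd.append(output_directory)
    return cmd


# Hypothetical inputs, for illustration only
example = build_ascp_command(
    "/Users/USER/Applications/Aspera Connect.app/Contents/Resources/ascp",
    "example_key.openssh",
    "ftp.sra.ebi.ac.uk/vol1/fastq/EXAMPLE/EXAMPLE_1.fastq.gz",
    ".",
)
# Show the command as it would look if typed into a shell
print(shlex.join(example))
```

The resulting list could then be passed to subprocess.check_call(cmd) without shell=True. Editing the path string in place, as described in this step, remains the smaller change to the script.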


  4. Save the file and move it to the directory where you want the FASTQ files to be saved. Run the following command in a terminal for each accession number (example below):


./ena-fast-download.py ERR1739691 --ssh_key osx 


Note: To download several accessions in sequence, loop over a text file of accession numbers or chain commands with the ; operator (e.g. ./ena-fast-download.py key1 --ssh_key osx ; ./ena-fast-download.py key2 --ssh_key osx, where key1 and key2 are accession numbers).
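The looped text file call could look roughly like the following in Python. This is a minimal sketch under some assumptions: the list file holds one accession per line, blank lines and lines starting with # are skipped (a convention chosen here, not part of the original script), and the function names read_accessions and download_all are made up for illustration.

```python
import subprocess
from pathlib import Path


def read_accessions(list_file):
    """Return the accession numbers in a text file, one per line."""
    accessions = []
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (assumed convention)
        if line and not line.startswith("#"):
            accessions.append(line)
    return accessions


def download_all(list_file, script="./ena-fast-download.py", ssh_key="osx",
                 runner=subprocess.check_call):
    """Run the download script once per accession, in file order."""
    for accession in read_accessions(list_file):
        # Equivalent to: ./ena-fast-download.py <accession> --ssh_key osx
        runner([script, accession, "--ssh_key", ssh_key])
```

Calling download_all("accessions.txt") from the directory that holds the script would then run one download per line, where accessions.txt is a hypothetical file name. Unlike chaining with ; this stops at the first failed download, because subprocess.check_call raises an error on a non-zero exit status.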


Note 2: A bash terminal is recommended for this method.
