
Canadian Bioinformatics Help Desk Software Repository


Software Name
Description
align_learn.pl
align_learn.pl converts a multiple sequence alignment into a format that can be readily analyzed using common machine learning algorithms.
annotator.pl
Reads multiple sequences in FASTA format from a file and submits each to local BLAST. The complete BLAST results are written to a file, and the best match is sent as an Entrez query to NCBI.
Batch PSORT
This program sends protein sequences to a PSORT server, parses the response, and writes the results to a text file.
batch_bind_blast.pl
This script reads multiple FASTA sequences from a file and submits each to BIND BLAST.
BLAST Hit Table Extender
This script uses the identification number to retrieve a more detailed description of the hit sequence from NCBI.
blast_client3_2.pl
This script performs BLAST searches against NCBI's nr database. It prompts the user for a BLAST search type and an input file of FASTA-formatted sequences. An optional 'limit by Entrez query' value can be supplied to restrict the search. The script then submits each sequence to BLAST and retrieves the results. For each hit, the script retrieves a detailed title by performing a separate query of NCBI's databases. Each BLAST hit and its descriptive title are written to a single tab-delimited output file.
blastn_client3_1.pl
This script reads one or more DNA sequences in FASTA format from a file and submits each to NCBI BLAST using the blastn program.
blastx_client3_1.pl
This script reads one or more DNA sequences in FASTA format from a file and submits each to NCBI BLAST using the blastx program.
Clickable Sequence Features
Clickable Sequence Features is an object-oriented program that converts GenBank, EMBL, FASTA, or RAW sequence files into an HTML figure showing the DNA sequence and translations described in the sequence record.
Codon Usage
Codon Usage accepts a DNA sequence and returns the number and frequency of each codon type.
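As an illustration of the kind of calculation described above, here is a minimal Python sketch (not the Codon Usage program itself). It assumes the sequence is read in frame 1, that non-ACGT characters are discarded, and that "frequency" means the fraction of all complete codons; the actual tool may define frequency differently. The function name `codon_usage` is hypothetical.

```python
from collections import Counter


def codon_usage(dna):
    """Return {codon: (count, fraction_of_all_codons)} for frame 1.

    Assumption: characters other than A, C, G, T are dropped, and a
    trailing partial codon is ignored.
    """
    seq = "".join(c for c in dna.upper() if c in "ACGT")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = len(codons)
    return {codon: (n, n / total) for codon, n in sorted(counts.items())}


if __name__ == "__main__":
    for codon, (count, freq) in codon_usage("ATGATGTAA").items():
        print(codon, count, round(freq, 3))
```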
compare_library.pl
This script accepts two files (i and j) containing multiple DNA sequences in FASTA format. Each sequence in file i is compared using local BLAST (bl2seq) to each sequence in file j, and an HTML table is generated to display a summary of the findings.
DNA Stats
DNA Stats returns the number of occurrences of each residue in the sequence you enter.
EMBOSS - User Interface
This software package generates interfaces for the EMBOSS suite of programs.
Extract FASTA Headers
Given a file containing multiple FASTA-formatted entries, this script outputs a file containing only the FASTA headers.
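The header extraction described above can be sketched in a few lines of Python. This is an illustrative sketch, not the distributed script; it assumes that a header is any line beginning with `>`, and the name `fasta_headers` is hypothetical.

```python
def fasta_headers(lines):
    """Yield FASTA header lines (those starting with '>') without newlines."""
    for line in lines:
        if line.startswith(">"):
            yield line.rstrip("\r\n")


if __name__ == "__main__":
    example = [">seq1 first record\n", "ACGT\n", ">seq2\n", "GGCC\n"]
    for header in fasta_headers(example):
        print(header)
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so large files need not be loaded into memory.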
evolving_peptide_search.pl
This script reads multiple protein sequences (in FASTA format) from a file and then searches each for a peptide sequence. The search is repeated using increasingly degenerate versions of the peptide until the maximum allowed number of matches is obtained. This script can be used to find peptides with a primary sequence close to a peptide of interest.
feature_parse.pl
This script reads a genomic sequence in FASTA or RAW format from a file and writes out the features that are described in a feature position file. The extracted features are written in FASTA format to the specified output file.
fetch_protein_v_2.pl
This script accepts a list of Swiss-Prot IDs or Swiss-Prot names. The sequence record corresponding to each ID is retrieved from ExPASy and written to a separate file in the output directory you specify. Records can be written in FASTA format or in Swiss-Prot format.
fetch_swissprot_using_id.pl
This script accepts a list of Swiss-Prot IDs. The sequence and title corresponding to each ID are retrieved from ExPASy and written to a file in FASTA format.
Filter DNA
Filter DNA removes non-DNA characters from text. Use this program when you wish to remove digits and blank spaces from a sequence to make it suitable for other applications.
Filter Protein
Filter Protein removes non-protein characters from text. Use this program when you wish to remove digits and blank spaces from a sequence to make it suitable for other applications.
FlexArray
FlexArray is a Microsoft Windows software package for statistical analysis of microarray data. FlexArray combines ease of use with a comprehensive set of statistical utilities. It offers a wide variety of useful visualization options, a rich interactive environment, full analysis history, a plug-in interface for algorithms and plots, analysis protocol support, and more. FlexArray currently supports Affymetrix expression GeneChip© microarrays and Illumina expression BeadChip© arrays. FlexArray is free to academic researchers and was created with funding from Genome Quebec.
GenBank Feature Extractor
GenBank Feature Extractor accepts a GenBank file as input and reads the sequence feature information described in the feature table, according to the rules outlined in the GenBank release notes. The program concatenates or highlights the relevant sequence segments and returns each sequence feature in FASTA format.
GenBank Trans Extractor
GenBank Trans Extractor accepts a GenBank file as input and returns each of the protein translations described in the file in FASTA format.
genbank_to_cgview.pl
genbank_to_cgview.pl converts a GenBank or EMBL sequence record into an XML document for the CGView genome visualization software (http://wishart.biology.ualberta.ca/cgview/index.html).
generic_ncbi_data_fetcher.pl
This script uses NCBI's Entrez Programming Utilities to perform searches of NCBI databases. This script can return either the complete database records, or the IDs of the records (recommended). It is up to you to know how to handle the IDs and records. The results are written to a single output file. For additional information on NCBI's Entrez Programming Utilities see: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html.
genome_search.pl
Genome Search reads a genomic sequence in FASTA format from a file and searches for the patterns you specify using regular expressions.
genome_search_parse_results.pl
Reads the results from genome_search.pl (see the description for genome_search.pl) and generates a summary for each match.
go_fish_source.pl
This Perl script assigns Gene Ontology (GO) numbers and descriptions to BLAST results generated by annotator.pl.
Hydrophobicity Profiler
This Perl script reads a FASTA-formatted protein sequence file and returns the hydrophobicity profile of the input sequence according to the user-specified window size and hydrophobicity scale.
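A sliding-window hydrophobicity profile of the kind described above can be sketched as follows. This Python sketch is not the Hydrophobicity Profiler script: it assumes the Kyte-Doolittle scale and a default window of 9 purely for illustration, and it reports the mean scale value of each full window. The names `KYTE_DOOLITTLE` and `hydrophobicity_profile` are hypothetical.

```python
# Kyte-Doolittle hydropathy values (one assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(protein, window=9, scale=KYTE_DOOLITTLE):
    """Return the mean scale value of every full window along the sequence.

    Raises KeyError for residues missing from the scale. Returns an
    empty list when the window does not fit in the sequence.
    """
    values = [scale[residue] for residue in protein.upper()]
    if window < 1 or window > len(values):
        return []
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    for position, score in enumerate(hydrophobicity_profile("MKTLLVAGAI", 5), 1):
        print(position, round(score, 2))
```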
local_blast_client.pl
This script performs BLAST searches against a local BLAST database. It prompts the user for a BLAST search type and an input file of FASTA-formatted sequences. The script then submits each sequence to BLAST and retrieves the results. For each hit, the script retrieves a detailed title by performing a separate query of NCBI's databases. Each BLAST hit and its descriptive title are written to a single tab-delimited output file.
microarray_randomizer.pl
This script accepts a file consisting of tab-delimited microarray data. Numerical values, except for those in the first column, are replaced with pseudo-random values greater than or equal to the lower limit you specify, and less than the upper limit you specify.
Multiple Align Show
Multiple Align Show accepts a group of aligned sequences (in FASTA or GDE format) and formats the alignment to your specifications.
Multi Rev Trans
Multi Rev Trans accepts a protein alignment and uses a codon usage table to generate a graph that can be used to find regions of minimal degeneracy at the nucleotide level.
new_psort.pl
new_psort.pl sends sequences to a PSORT server and parses and saves the results.
ORF Finder
ORF Finder searches for open reading frames (ORFs) in the DNA sequence you enter. The program returns the range of each ORF, along with its protein translation.
Pearson Correlation Coefficient Parser
Given a single Excel file listing multiple genes and their intensities, this Perl script calculates the Pearson correlation coefficient and, if the coefficient is above 0.6 or below -0.6, writes the results to two Excel files, Detail_Over.xls and Detail_Under.xls.
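The calculation behind this entry can be illustrated with a short Python sketch. It is not the distributed script: it assumes that the coefficient is computed for every pair of genes, that the 0.6 cutoff applies to the coefficient itself, and it returns two in-memory lists instead of writing Excel files. The names `pearson` and `split_by_threshold` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / norm


def split_by_threshold(intensities, threshold=0.6):
    """Compare every gene pair; return (over, under) lists of (gene1, gene2, r).

    Assumption: 'over' holds pairs with r > threshold and 'under' holds
    pairs with r < -threshold; all other pairs are dropped.
    """
    over, under = [], []
    for (g1, v1), (g2, v2) in combinations(intensities.items(), 2):
        r = pearson(v1, v2)
        if r > threshold:
            over.append((g1, g2, r))
        elif r < -threshold:
            under.append((g1, g2, r))
    return over, under


if __name__ == "__main__":
    data = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 3, 2, 1]}
    over, under = split_by_threshold(data)
    print("over:", over)
    print("under:", under)
```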
Perl BLAST Client
Reads a text file containing multiple sequences in FASTA format and submits each sequence to NCBI's BLAST server using QBLAST's URL API.
pI/MW batch analysis tool
This Perl program creates a .txt file containing the sequence name, length, predicted molecular weight, and predicted isoelectric point of the protein sequences it receives.
Programming in Perl - Part 1
This collection of simple programs is intended to introduce the Perl programming language to students with little or no programming experience (part one of two).
Programming in Perl - Part 2
This collection of simple programs is intended to introduce the Perl programming language to students with little or no programming experience (part two of two).
Protein Molecular Weight
Protein Molecular Weight accepts a protein sequence and calculates the molecular weight. You can append copies of commonly used epitopes and fusion proteins using the supplied list.
Protein Stats
Protein Stats returns the number of occurrences of each residue in the sequence you enter. Percentage totals are also given for each residue, and for certain groups of residues.
Random DNA Sequence
Random DNA Sequence generates a random sequence of the length you specify. Random sequences can be used to evaluate the significance of sequence analysis results.
Random Protein Sequence
Random Protein Sequence generates a random sequence of the length you specify. Random sequences can be used to evaluate the significance of sequence analysis results.
random_seq_sample.pl
This script accepts a file consisting of multiple FASTA formatted sequence records. It then randomly selects sequences from the file, without replacement.
range_extract.pl
Reads a genomic sequence in FASTA or RAW format from a file and writes out the range of bases between the supplied start and stop positions to a file.
Reformat PDB
A script to reformat unusual PDB files into a more standard PDB format. This script (1) re-orders the atoms within each residue into a 'standard' order, (2) renames atoms to a 'standard' format, e.g. HD23 becomes 3HD2, (3) renames certain residues, e.g. 'HSD' or 'HID' become 'HIS', (4) preserves only one location for each atom, for atoms that have alternate location codes.
remote_blast_client.pl
This script performs BLAST searches against NCBI's sequence databases. It prompts the user for a BLAST search type and an input file of FASTA-formatted sequences. An optional 'limit by Entrez query' value can be supplied to restrict the search. The script then submits each sequence to BLAST and retrieves the results. For each hit, the script retrieves a detailed title by performing a separate query of NCBI's databases. Each BLAST hit and its descriptive title are written to a single tab-delimited output file.
remove_duplicate_seqs.pl
Reads multiple sequence records in FASTA format from a file. If two or more sequences match, only the first record in the matching group is written to the output file.
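The "keep the first record of each matching group" behaviour can be sketched in Python as shown here. This is an illustration only, not the distributed script; it assumes that two sequences "match" when they are identical after ignoring letter case, which may differ from the script's actual rule. The names `parse_fasta` and `remove_duplicate_seqs` are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (title, sequence) tuples."""
    records, title, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def remove_duplicate_seqs(records):
    """Keep only the first record for each distinct sequence.

    Assumption: sequences match when identical ignoring case.
    """
    seen, kept = set(), []
    for title, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            kept.append((title, seq))
    return kept


if __name__ == "__main__":
    fasta = ">a\nACGT\n>b\nacgt\n>c\nGGCC\n"
    for title, seq in remove_duplicate_seqs(parse_fasta(fasta)):
        print(f">{title}\n{seq}")
```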
remove_duplicates.pl
Reads multiple sequence records in FASTA format from a file and removes duplicate records (based on sequence title).
remove_near_duplicates.pl
This script reads multiple sequence records in FASTA format from a file. If two or more sequences match, only the first record in the matching group is written to the output file. The names of the removed records are written to a log file.
remove_x.pl
Reads multiple sequence records in FASTA format from a file and removes X's and x's from the sequences.
Restriction Summary
Restriction Summary accepts a DNA sequence and returns the number and positions of restriction endonuclease cut sites.
Retrieve_Entrez_Gene_Info.pl
This script uses NCBI's Entrez Programming Utilities URL API to submit batch requests to NCBI Entrez. It retrieves gene information for an organism such as Gene ID, Gene name, Gene description, Gene synonyms, Location, HGNC ID, HPRD ID, MIM ID, phenotype[MIM ID], KEGGPathways, ConserveDomains and Unigene ID information from NCBI's Entrez gene database.
retrieve_seq.pl
This script uses NCBI's Entrez Programming Utilities URL API to submit batch requests to NCBI Entrez. It can be used, for example, to download all the sequences in an NCBI database that were obtained from a particular species.
retrieve_seq_v2.pl
This script uses NCBI's Entrez Programming Utilities to perform batch requests to NCBI Entrez. It can be used, for example, to download all the sequences in an NCBI database that were obtained from a particular species. This version has been customized for retrieval of 16S rRNA sequences.
Reverse Complement
Reverse Complement converts a DNA sequence into its reverse, complement, or reverse-complement counterpart.
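The three conversions named above can be sketched in Python as follows. This is an illustrative sketch rather than the web program's code; it assumes only the unambiguous bases A, C, G, and T (in either case) need complementing and leaves any other character unchanged. The function names are hypothetical.

```python
# Translation table for unambiguous bases; other characters pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(dna):
    """Return the complement, preserving order and case."""
    return dna.translate(_COMPLEMENT)


def reverse(dna):
    """Return the sequence in reverse order."""
    return dna[::-1]


def reverse_complement(dna):
    """Return the reverse complement."""
    return complement(dna)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # prints GCAT
```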
seqsee
SEQSEE is a comprehensive protein sequence analysis package commercialized by BioTools Inc.
Sequence Extractor
Sequence Extractor accepts a DNA sequence along with a set of primer sequences and returns a textual map showing the annealing positions of the primers, restriction cut sites, and protein translations.
Sequence Manipulation Suite
The Sequence Manipulation Suite is a collection of web-based programs for analyzing and formatting DNA and protein sequences (version 1).
Sequence Manipulation Suite 2
The Sequence Manipulation Suite version 2 is much faster than the previous version and contains several new programs and enhancements. It can be used to perform much of the simple sequence formatting and analysis done in molecular biology labs, and as a teaching aid when introducing students to DNA and protein sequences.
Shuffle DNA
Shuffle DNA randomly shuffles a DNA sequence. Shuffled sequences can be used to evaluate the significance of sequence analysis results, particularly when sequence composition is an important consideration.
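Shuffling preserves composition while destroying order, which is why shuffled sequences serve as a composition-matched control. A minimal Python sketch of the idea follows; it is not the Shuffle DNA program itself, and the name `shuffle_sequence` and its optional `rng` parameter are hypothetical conveniences for reproducible runs.

```python
import random


def shuffle_sequence(seq, rng=None):
    """Return a random permutation of seq with identical composition.

    Pass a seeded random.Random instance as rng for reproducible output.
    """
    rng = rng or random.Random()
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


if __name__ == "__main__":
    original = "ATGGCCATTGTAATGGGCCGC"
    shuffled = shuffle_sequence(original, random.Random(42))
    print(shuffled)
    # Composition is unchanged, only the order differs.
    print(sorted(shuffled) == sorted(original))
```

The same function works for protein sequences, since it treats the input as a plain string of residues.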
Shuffle Protein
Shuffle Protein randomly shuffles a protein sequence. Shuffled sequences can be used to evaluate the significance of sequence analysis results, particularly when sequence composition is an important consideration.
split_fasta.pl
This script accepts a file consisting of multiple FASTA formatted sequence records. It splits the file into multiple new files, each consisting of a subset of the original records.
summary_adder_2.pl
This script obtains summary information from NCBI and adds it to the output of earlier versions of the blast_client.pl scripts (versions 1.2 and earlier).
three_frames.pl
This script converts a FASTA-formatted DNA sequence file into a new file containing all six protein translations of each supplied DNA sequence.
Translate
Translate accepts a DNA sequence and converts it into a protein using the reading frame you specify.
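Translation in a chosen reading frame can be sketched in Python as shown here. This is an illustration, not the Translate program: it assumes the standard genetic code, forward frames numbered 1 to 3, `*` for stop codons, and `X` for any codon containing a character other than A, C, G, or T. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna, frame=1):
    """Translate dna in forward reading frame 1, 2, or 3.

    Stop codons become '*'; unrecognized codons become 'X'; a trailing
    partial codon is ignored.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    seq = dna.upper()[frame - 1:]
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))  # prints MA*
```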
XALIGN (version 5)
XALIGN is a graphical X-windows program for multiple sequence alignment based on sequence homology and secondary structure (version 5, Linux binary).
XALIGN (version 6)
XALIGN is a graphical X-windows program for multiple sequence alignment based on sequence homology and secondary structure (version 6, source code).
   
   
Copyright ©2003-2007 Canadian Bioinformatics Help Desk
 
The Canadian Bioinformatics Help Desk, as part of the Integrated and Distributed Bioinformatics Platform for Genome Canada, is supported by Genome Alberta, in partnership with Genome Canada and other co-funding partners. Genome Canada is a not-for-profit corporation that is leading Canada's national strategy on genomics with $600 million in funding from the federal government.

This site is maintained by Canadian Bioinformatics Help Desk and was last updated on December 12 2007 22:19:06.
