OpenShade: An Open-Source Multiple Sequence Alignment Shading and Editing Utility

Faculty Mentor: Sudhir Nayak

Student: Peter Swetits

Protein sequence alignments allow researchers to quickly identify regions of similarity between proteins and provide important clues about the nature of the proteins under study. When working with sequence alignments, researchers often need to shade or edit them quickly. However, the most widely used shading program, BoxShade, is difficult to use, does not allow editing, and offers a limited number of output options. OpenShade is being developed as an open-source application that addresses these limitations. It accepts multiple sequence alignments in all popular formats, including FASTA, ALN, MSF, and PHYLIP. Once imported, an alignment can be dynamically shaded for identities and similarities, with the consensus either calculated automatically or defined by the user. The user can specify the criteria used to form the consensus, change the scoring matrix, and set the minimum score required to shade identical or conserved residues. After shading, the user can edit individual amino acids, entire columns of amino acids, or a selected section of the alignment. The shaded alignment can be exported in PDF, PNG, or RTF format. OpenShade also supports pattern matching with regular expressions: the user enters a string of amino acids, and the program highlights every occurrence of that string in each protein sequence, regardless of its position. The basic graphical interface and the shading algorithms have been completed. We anticipate completing the project within the next year.
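The shading and pattern-matching steps described above can be illustrated with a minimal Python sketch. It is not OpenShade's code: the function names `shade_alignment` and `find_motif`, the `SIMILAR_GROUPS` table (a crude stand-in for a real scoring matrix), the majority-fraction consensus rule, and the choice to ignore gap characters during pattern matching are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical similarity groups; a crude stand-in for a real scoring matrix.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG"]


def _group(residue):
    """Return the similarity group containing the residue, or None."""
    for group in SIMILAR_GROUPS:
        if residue in group:
            return group
    return None


def shade_alignment(seqs, threshold=0.5):
    """Classify every position of an alignment against a column consensus.

    Returns one string per sequence: 'I' = identical to the consensus,
    'S' = similar to the consensus, '.' = unshaded.
    A column has a consensus only if its most common residue occurs in
    at least `threshold` of all sequences (assumed rule).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for col in range(length):
        residues = [s[col] for s in seqs if s[col] != "-"]
        if not residues:
            consensus.append(None)
            continue
        residue, count = Counter(residues).most_common(1)[0]
        consensus.append(residue if count / len(seqs) >= threshold else None)

    shaded = []
    for s in seqs:
        codes = []
        for residue, cons in zip(s, consensus):
            group = _group(residue)
            if cons is None:
                codes.append(".")
            elif residue == cons:
                codes.append("I")
            elif group is not None and cons in group:
                codes.append("S")
            else:
                codes.append(".")
        shaded.append("".join(codes))
    return shaded


def find_motif(aligned_seq, pattern):
    """Find regex matches in one sequence, ignoring gap characters.

    Returns (first_column, last_column) pairs in alignment coordinates,
    so that a match can be highlighted even when gaps interrupt it.
    """
    columns = [i for i, c in enumerate(aligned_seq) if c != "-"]
    ungapped = "".join(aligned_seq[i] for i in columns)
    return [
        (columns[m.start()], columns[m.end() - 1])
        for m in re.finditer(pattern, ungapped)
        if m.end() > m.start()  # skip empty matches
    ]
```

For example, `shade_alignment(["MKV-LS", "MKIALS", "MRVALT"])` marks the isoleucine in the second sequence as similar to the consensus valine, and `find_motif("MK-V-LS", "KVL")` reports a match spanning alignment columns 1 through 5. A production tool would more likely score similarity with a substitution matrix such as BLOSUM62 and a user-set minimum score than with fixed residue groups.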