---------------------------------------------------------------------
Release Notes
stackPACK v2.2
---------------------------------------------------------------------
Release date: November 2002
INTRODUCTION
This version of stackPACK replaces stackPACK v2.1.1 and includes many
new features requested by customers. StackPACK v2.2 contains significant
improvements over stackPACK v2.1.1, and focuses mainly on increasing
the accuracy of consensus sequences and improving the ability to handle
large datasets. Enhancements made towards these goals include:
- The incremental addition of new data to existing clusters,
maintaining the cluster history.
- The inclusion of phred quality scores contributing to accurate
consensus sequence generation.
- The ability to assemble and generate consensus sequences with
original data, whilst still retaining the benefits of accurate
clustering using masked data.
Continued improvements in terms of parameterization, data management,
viewing, and extraction functions enable more rapid assessment and
manipulation of alignments and alignment analyses. For example, users
can now output alignments in ACE format for improved editing in
external programs such as consed.
SYSTEM INFORMATION
1. StackPACK v2.2 is available for the following platforms:
Hardware                 OS Version
-----------------------  -----------------
Compaq                   Tru64 UNIX 4.0
Compaq                   Tru64 UNIX 5.1
Intel-based PC           Linux Red Hat 7.3
Silicon Graphics         Irix 6.5.x
Sun Microsystems         Solaris 8
2. StackPACK requires the following third-party software:
Software                 Version              Location
-----------------------  -------------------  ----------------------------------
d2_cluster and CRAW(1)   latest               Academic: Biotique Systems
                                              Academic: University of Houston
                                              Commercial: Electric Genetics
Phrap and Cross_Match    1996 or 1999         Academic: http://www.phrap.org
                                              Commercial: CodonCode Corporation
                                              Commercial: Geospiza
RepBase                  user's choice        Academic: GIRI
(optional)                                    Commercial: Geospiza
                                              Commercial: GIRI
RepeatMasker             April 1999 or newer  Academic: University of Washington
(optional)                                    Commercial: Geospiza
Apache                   1.3 or newer         Irix: sgi freeware
                                              Solaris: Sunfreeware.com
                                              All platforms: The Apache Software Foundation
MySQL(2)                 3.23.27 or newer     All platforms: MySQL
MySQLdb(3)               0.9.1                All platforms: MySQL
Python(4)                2.2.1                All platforms: python
NOTES:
(1) Commercial customers do not need to obtain d2_cluster and CRAW
separately; both are included in the stackPACK distribution file.
Academic customers: When requesting precompiled d2_cluster and CRAW
binaries, please state clearly which of the supported platforms you
need them for, and use the following as the e-mail subject:
"Precompiled d2_cluster and CRAW for <your platform of choice>."
(2) Although MySQL binaries can be obtained from the sgi freeware site,
the Red Hat CDs, and the Sunfreeware site, we strongly recommend
the binaries available from MySQL, as they are the best supported.
Irix, Solaris and Tru64: Please be sure to download and install
the latest 3.23 binaries provided under the relevant platform
headings on the MySQL site as per the instructions provided in the
downloaded archive.
Linux: The MySQL binaries packaged with RPM have been tested. In
order to be able to later install MySQLdb it is necessary to
install MySQL-devel-3.23.xx and MySQL-client-3.23.xx in addition
to MySQL-3.23.xx where "xx" is the latest minor version number.
(3) In order to build MySQLdb it may be necessary to manually edit
setup.py to correctly specify the MySQL and Python header and
library files as described in the README file provided in the
archive.
(4) Irix, Solaris and Tru64: Use Python 2.2.1 compiled with the
--with-threads option.
Linux: Install both the python2-2.2.1 and the python2-devel-2.2.1
RPMs. The latter RPM is to allow the installation of MySQLdb.
Solaris: Do NOT use the precompiled binaries from the Sunfreeware
site.
WHAT'S NEW IN THIS RELEASE
StackPACK v2.2 contains many improvements and new features, most of
which were requested by our academic and commercial customers.
1. StackPACK v2.2 focuses on enhancing the quality of consensus
sequences that are generated:
- Inclusion of phred quality scores either from the web interface or
from the command line.
- Ability to assemble sequences and generate consensus sequences with
original unmasked data, whilst still retaining the benefits of
accurate clustering using masked data. Alignments, alignment
analyses and consensus sequences can then be viewed in the web
interface in their original format.
2. Several enhancements have been made to improve the ability of
stackPACK to handle large datasets:
- Incremental addition of new data to existing clusters, maintaining
the cluster history. This can be done either from the web interface
or from the command line.
- Reduction of memory usage for both the web interface and all steps
in the pipeline.
- Significant speed improvements to the clonelinking algorithm.
3. New viewing and extraction functions enable rapid, simplified
analysis and manipulation of alignments and alignment analyses.
Data exchange with third-party programs is also simplified,
resulting in easier assessment of highlighted areas of potential
interest.
New viewing functions include:
- Viewing of the intermediate phrap consensus sequence.
- Improved parsing and viewing of sequence annotation.
- Display of phrap singletons in the cluster family tree view.
- Viewing of previous cluster versions in cases where clusters
have been deprecated or changed due to the incremental addition
of new data to existing clusters.
New reporting and output functions include:
- Output of all constituent sequences for a particular consensus,
contig and/or clonelink.
- Extraction of all original unmasked input sequences in a project,
in FASTA format.
- Extraction of intermediate and final alignments in MSF and
ClustalW format, either for a single alignment or for a project.
- Extraction of phrap alignments in ACE format, either for a
particular cluster or for a project.
- Extraction of the Alignment Analysis CRAW logs for a particular
contig or for a whole project.
- Restriction of the non-redundant output report to clonelink
consensus sequences, contig consensus sequences and/or singleton
sequences.
- Extraction of all phrap singletons for a project, in FASTA format.
The singleton output can be filtered by sequence size.
All these new reporting functions can be run either from the web
interface or from the command line.
4. Several improvements in terms of ease of use, flexibility and
parameterization have been implemented, giving users the freedom to
optimize their clustering results and adapt the system according to
their needs. These include:
- Numerous enhancements ensuring a more robust automated installation.
- Creation of multiple projects with the same name provided that they
are owned by different users.
- Implementation of a configuration flag option for all pipeline steps
enabling use of multiple customized configuration files in any
location for different steps in the pipeline and/or for different
projects.
- Implementation of progress indication for all steps within the
stackPACK pipeline.
- Context-sensitive on-line support.
5. Significant improvements to data management functions have been
implemented in this release, including:
- Project filtering by name, owner or description within
WebProjectManager.
- Project description editing at any time from the WebProjectManager.
- Multiple project deletion from within WebProjectManager.
- Project search functions within WebProbe and WebReport.
- Project filtering by owner and/or description from the command line.
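The size filter for singleton FASTA output listed under item 3 can also
be approximated on an already exported file. The following is a minimal
sketch only, not stackPACK's own implementation: the function names
(read_fasta, filter_by_length, write_fasta) and the min_len/max_len
parameters are hypothetical, and the code assumes a current Python 3
interpreter rather than the Python 2.2.1 that stackPACK itself requires.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the one collected so far.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=0, max_len=None):
    """Keep records whose sequence length lies within [min_len, max_len]."""
    for header, seq in records:
        if len(seq) < min_len:
            continue
        if max_len is not None and len(seq) > max_len:
            continue
        yield header, seq


def write_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequence lines at `width`."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)
```

To use it, open an exported singleton file, pass the file object to
read_fasta, apply filter_by_length with the desired bounds, and write the
result of write_fasta to a new file.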
KNOWN PROBLEMS
1. Some of the new features in stackPACK v2.2 are not compatible with
projects that have been created with stackPACK v2.1.1 and converted
to stackPACK v2.2 format. These include:
- Output of alignments in ACE format.
- Output of sequences in their original unmasked format.
- Usage of the --use-unmasked stack_Assemble option.
2. The 'Last Modified' date displayed in both the command line and
WebProjectManager summary reports is not updated when projects are
modified from the command line.
3. NCBI entries that have been deleted and replaced by a new entry
with a different accession number may have the old accession number
appended to the new accession number within the NCBI GenBank
ACCESSION field, e.g. U15570 L36804. In these cases stackPACK strips
everything after the space and keeps only the first accession number.
4. StackPACK writes out temporary files to the location specified for
STACKPACK_TMP under the [STACKPACK] heading in the stackpack
configuration file. These temporary files are generally deleted as
the data is processed. If the processing pipeline is interrupted
for some reason, the temporary files may fail to delete and must
be deleted manually. When this temp directory becomes full, some of
the steps in the pipeline may not complete. It is important to
inspect this location periodically for accumulated files, and to
ensure that enough disk space is allocated for STACKPACK_TMP.
5. The hierarchical navigational icons that represent the various
cluster consensus and alignment views within WebProbe may become
misaligned when using certain font settings on Netscape under Linux.
This can be rectified by setting the Netscape variable width font
to 14 in Edit: Preferences: Font.
6. Anomalous behavior may be experienced when using Netscape FastTrack
as the web-server. This does not affect the integrity of the data,
and can be avoided by using Apache as the web-server.
7. Browser limitations may restrict the size or length of clusters
that can be viewed in WebProbe. This can be rectified by increasing
your browser timeout value.
8. If the MySQL database is overloaded or if the maximum number of
connections is exceeded, stackPACK may lose its connection with
the database. Processes using stackCORBAd, such as WebProbe, will
terminate with a database error and must be run again. This can be
rectified by restarting stackCORBAd. stackCORBAd restarts itself
automatically when it is killed as follows:
killall -9 stackCORBAd
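Known problem 4 above recommends inspecting the STACKPACK_TMP location
periodically. The following is a minimal sketch of such a check, not a
tool shipped with stackPACK: the function names (stale_files,
free_fraction) and the 24-hour age threshold are hypothetical, the
directory must be passed in explicitly, and the code assumes a current
Python 3 interpreter. It only reports; it deletes nothing.

```python
import os
import shutil
import time


def stale_files(tmp_dir, max_age_hours=24.0, now=None):
    """Return sorted paths under tmp_dir not modified within max_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for root, _dirs, files in os.walk(tmp_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    stale.append(path)
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return sorted(stale)


def free_fraction(tmp_dir):
    """Return the fraction of free space on the filesystem holding tmp_dir."""
    usage = shutil.disk_usage(tmp_dir)
    return usage.free / usage.total
```

Call stale_files with the directory configured as STACKPACK_TMP to list
candidates for manual deletion, and free_fraction to see how much space
remains. A long-running pipeline step may legitimately own files older
than the threshold, so confirm that no stackPACK process is active
before removing anything.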