----------------------------------------------------------------------

                        Release Notes
                       stackPACK v2.2.1

----------------------------------------------------------------------

Release date: February 2003


INTRODUCTION

This version of stackPACK replaces stackPACK v2.1.1 for HP Tru64 only 
and includes many new features requested by customers. StackPACK v2.2.1 
for HP Tru64 has the same functionality as stackPACK v2.2 for 
Red Hat Linux on Intel-based PCs, Silicon Graphics Irix, and Sun 
Solaris, and is a special release that addresses Tru64-specific issues 
encountered during development. 

StackPACK v2.2.1 contains significant improvements over stackPACK 
v2.1.1, and focuses mainly on increasing the accuracy of consensus 
sequences and improving the ability to handle large datasets. 
Enhancements made towards these goals include:

- The incremental addition of new data to existing clusters, 
  maintaining the cluster history.

- The inclusion of phred quality scores, contributing to accurate 
  consensus sequence generation.

- The ability to assemble and generate consensus sequences with 
  original data, whilst still retaining the benefits of accurate 
  clustering using masked data.

In addition, continued improvements to parameterization, data 
management, viewing, and extraction functions enable more rapid 
assessment and manipulation of alignments and alignment analyses. 



SYSTEM INFORMATION

1. StackPACK v2.2.1 is available for the following platforms:

   Hardware                   OS Version
   -----------------------    -----------------
   HP                         Tru64 UNIX 4.0F
   HP                         Tru64 UNIX 5.1A
 

2. StackPACK requires the following third-party software:

   Software               Version                    Location
   ---------------------  -------------------------  ----------
   d2_cluster and CRAW(1) latest                     Academic: Biotique Systems: bpoh@biotiquesystems.com
                                                     Academic: University of Houston: bsmalley@uh.edu
                                                     Commercial: Electric Genetics: support@egenetics.com

   Phrap and Cross_Match  1996 or 1999               Academic: http://www.phrap.org
                                                     Commercial: http://www.codoncode.com/
                                                     Commercial: http://www.geospiza.com/products/index.htm

   RepBase                user's choice              Academic: http://www.girinst.org/index.html
   (optional)                                        Commercial: http://www.geospiza.com/products/index.htm
                                                     Commercial: http://www.girinst.org/index.html

   RepeatMasker           April 1999 or newer        Academic: http://repeatmasker.genome.washington.edu/
   (optional)                                        Commercial: http://www.geospiza.com/products/index.htm

   Apache                 1.3 or newer               http://www.apache.org

   Python(2)              python2-2.2.1              http://www.python.org/2.2.1/

   MySQL(3)               3.23.27 or newer           http://www.mysql.com/downloads/mysql-3.23.html  
                          - Server 
                            (MySQL 3.23.xx)
                          - Libraries and Header 
                            files for development 
                            (MySQL-devel-3.23.xx)
                          - Client programs  
                            (MySQL-client-3.23.xx)  

   MySQLdb (4)            0.9.1                      http://www.mysql.com/downloads/api-python.html


NOTE:
(1) Commercial customers do not need to obtain d2_cluster and CRAW 
    separately - they are included in the stackPACK distribution file.
    Academic customers: Please state clearly for which of the supported 
    platforms you would like precompiled d2_cluster and CRAW binaries, 
    and use the following as the e-mail subject: "Precompiled 
    d2_cluster and CRAW for <your platform of choice>."

(2) Use Python-2.2.1 and compile using the --with-threads option.
 
(3) Although MySQL binaries can be obtained elsewhere, we strongly 
    recommend the binaries available from 
    http://www.mysql.com/downloads/mysql-3.23.html as these are the 
    best supported.
    MySQL-devel-3.23.xx and MySQL-client-3.23.xx are required for 
    installation of MySQLdb.

(4) In order to build MySQLdb it may be necessary to manually edit 
    setup.py to correctly specify the MySQL and Python header and 
    library files as described in the README file provided in the 
    archive.
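
Once Python and MySQLdb have been installed as described in notes (2) 
and (4), a quick check that the interpreter can locate the modules can 
save time before running the stackPACK installation. The following is 
a minimal sketch only and is not part of the stackPACK distribution. 
It assumes a current Python 3 interpreter (importlib.util is not 
available in Python 2.2.1, where a plain "import" inside try/except 
would serve the same purpose), and the function name missing_modules 
is hypothetical.

```python
import importlib.util

# Modules implied by notes (2) and (4): thread support and MySQLdb.
REQUIRED_MODULES = ("threading", "MySQLdb")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

An empty result only shows that the modules can be found; it does not 
confirm that MySQLdb was built against the correct MySQL libraries.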



WHAT'S NEW IN THIS RELEASE

StackPACK v2.2.1 contains many improvements and new features, most of 
which were requested by our academic and commercial customers.

1. StackPACK v2.2.1 focuses on enhancing the quality of consensus 
   sequences that are generated:
 - Inclusion of phred quality scores either from the web interface or 
   from the command line.
 - Ability to assemble sequences and generate consensus sequences with 
   original unmasked data, whilst still retaining the benefits of 
   accurate clustering using masked data. Alignments, alignment 
   analyses and consensus sequences can then be viewed in the web 
   interface in their original unmasked format.

2. Several enhancements have been made to improve the ability of 
   stackPACK to handle large datasets:
 - Incremental addition of new data to existing clusters maintaining 
   the cluster history. This can be done either from the web interface 
   or from the command line.
 - Reduction of memory usage for both the web interface and all steps 
   in the pipeline.
 - Significant speed improvements to the clonelinking algorithm.

3. New viewing and extraction functions enable rapid, simplified 
   analysis and manipulation of alignments and alignment analyses. 
   Data exchange with third-party programs is also simplified, 
   making it easier to assess highlighted areas of potential 
   interest. 

   New viewing functions include:
   - Viewing of the intermediate phrap consensus sequence.
   - Improved parsing and viewing of sequence annotation.
   - Display of phrap singletons in the cluster family tree view.
   - Viewing of previous cluster versions in cases where clusters 
     have been deprecated or changed due to the incremental addition 
     of new data to existing clusters.

   New reporting and output functions include:
   - Output of all constituent sequences for a particular consensus, 
     contig and/or clonelink. 
   - Extraction of all original unmasked input sequences in a project, 
     in FASTA format.
   - Extraction of intermediate and final alignments in MSF and 
     ClustalW format, either for a single alignment or for a whole 
     project. 
   - Extraction of phrap alignments in ACE format, either for a 
     particular cluster or for a whole project. 
   - Extraction of the Alignment Analysis CRAW logs for a particular 
     contig or for a whole project.
   - Restriction of the non-redundant output report to clonelink 
     consensus sequences, contig consensus sequences and/or singleton 
     sequences. 
   - Extraction of all phrap singletons for a project, in FASTA format. 
     The singleton output can be filtered by sequence size.

   All of these new reports can be generated either from the web 
   interface or from the command line. 

4. Several improvements in terms of ease of use, flexibility and 
   parameterization have been implemented, giving users the freedom to 
   optimize their clustering results and adapt the system according to 
   their needs. These include:
 - Numerous enhancements ensuring a more robust automated installation. 
 - Creation of multiple projects with the same name provided that they 
   are owned by different users. 
 - Implementation of a configuration flag option for all pipeline steps 
   enabling use of multiple customized configuration files in any 
   location for different steps in the pipeline and/or for different 
   projects. 
 - Implementation of progress indication for all steps within the 
   stackPACK pipeline.
 - Context-sensitive on-line support.

5. Significant improvements to data management functions have been 
   implemented in this release, including:
 - Project filtering by name, owner or description within 
   WebProjectManager. 
 - Project description editing at any time from the WebProjectManager.
 - Multiple project deletion from within WebProjectManager.
 - Project search functions within WebProbe and WebReport.
 - Project filtering by owner and/or description from the command line.



KNOWN PROBLEMS

1. Tru64-specific known problems within stackPACK v2.2.1 include:
   - stackCORBAd does not spawn multiple threads during processing, 
     which degrades performance when too many users access the web 
     interface. The results are not affected.
   - ACE files generated during the stack_Assemble step are corrupted 
     when they are inserted back into the database. These corrupt 
     files do not affect the pipeline processing or the integrity of 
     the cluster results, and will only be observed when users output 
     data using the stack_ReportAlignment.py report with the 
     --Format=ACE option. 
     This is due to a memory allocation error within the ODBC driver, 
     myODBC, required to connect to the MySQL database. Electric 
     Genetics is working with the MySQL support engineers to address 
     this issue. 

     The intact ACE files can be captured before they are inserted back 
     into the database by specifying the --leavefiles option at the 
     end of the stack_Assemble command as follows: 
     stack_Assemble <project> --leavefiles. 
     The location of these ACE files is reported when the 
     stack_Assemble application has finished processing.

The remaining known problems apply to stackPACK v2.2 and v2.2.1 on 
all platforms:

2. Some of the new features in stackPACK v2.2.1 are not compatible with 
   projects that have been created with stackPACK v2.1.1 and converted 
   to stackPACK v2.2.1 format. These include:
   - Output of alignments in ACE format.
   - Output of sequences in their original unmasked format.
   - Usage of the --use-unmasked stack_Assemble option.

3. The stack_Link algorithm does not take singleton sequences into 
   account and will only link clusters with shared clone ID 
   information. Note that even if the redundancy parameter equals 1, 
   singleton sequences with matching clone IDs will not be linked, 
   and only clusters will be considered during the stack_Link step.

4. The 'Last Modified' date displayed in both the command line and 
   WebProjectManager summary reports is not updated when projects are 
   modified from the command line. 

5. NCBI entries that have been deleted and replaced by a new entry 
   with a different accession number may have the old accession number 
   appended to the new accession number within the NCBI GenBank 
   ACCESSION field, e.g. U15570 L36804. In these cases stackPACK will 
   strip everything after the space, keeping only the first accession 
   number (U15570 in this example).

6. StackPACK writes temporary files to the location specified for 
   STACKPACK_TMP under the [STACKPACK] heading in the stackpack 
   configuration file. These temporary files are generally deleted as 
   the data is processed. If the processing pipeline is interrupted, 
   the temporary files may not be deleted and must then be removed 
   manually. If this temporary directory becomes full, some of the 
   steps in the pipeline may not complete. It is important to 
   inspect this location periodically for accumulated files, and to 
   ensure that enough disk space is allocated for STACKPACK_TMP.

7. The hierarchical navigational icons that represent the various 
   cluster consensus and alignment views within WebProbe may become 
   misaligned when using certain font settings on Netscape under Linux. 
   This can be rectified by setting the Netscape variable width font 
   to 14 in Edit: Preferences: Font. 

8. Anomalous behavior may be experienced when using Netscape FastTrack 
   as the web server. This does not affect the integrity of the data, 
   and can be avoided by using Apache as the web server. 

9. Browser limitations may restrict the size or length of clusters 
   that can be viewed in WebProbe. This can be rectified by increasing 
   the browser timeout value.

10. If the MySQL database is overloaded or if the maximum number of 
    connections is exceeded, stackPACK may lose its connection to 
    the database. Processes using stackCORBAd, such as WebProbe, will 
    terminate with a database error and must be run again. This 
    can be rectified by restarting stackCORBAd, which restarts 
    itself automatically when it is killed as follows: 
    killall -9 stackCORBAd
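
The periodic inspection of STACKPACK_TMP recommended in item 6 can be 
scripted. The following is a minimal sketch only and is not part of 
the stackPACK distribution. It assumes a current Python 3 interpreter 
and that the stackpack configuration file can be read as an INI-style 
file by the standard configparser module; the function names and the 
7-day age threshold are hypothetical. The script only lists candidate 
files and reports free space; it deletes nothing.

```python
import configparser
import os
import shutil
import time


def read_tmp_dir(config_path):
    """Return the STACKPACK_TMP value from the [STACKPACK] section."""
    # Assumption: the configuration file is INI-style.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    with open(config_path) as handle:
        parser.read_file(handle)
    return parser["STACKPACK"]["STACKPACK_TMP"]


def stale_files(tmp_dir, max_age_days=7, now=None):
    """List files under tmp_dir not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for root, _dirs, files in os.walk(tmp_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)


def free_megabytes(tmp_dir):
    """Return the free space on the filesystem holding tmp_dir, in MB."""
    return shutil.disk_usage(tmp_dir).free // (1024 * 1024)
```

Before removing any file reported by stale_files, confirm that no 
pipeline step is still running, since files belonging to a long-running 
step may also be older than the chosen threshold.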



DO YOU STILL HAVE ANY QUESTIONS?

Please do not hesitate to contact Electric Genetics with any questions, 
comments or suggestions for improvement at:

Tel:	+27 (21) 959 3964
Fax:	+27 (21) 959 2512
E-mail:	support@egenetics.com