Version 1
Genome Sciences Centre, BCCA
Chinook User Guide
Genome Sciences
Suite
570 West 7th Ave
3.0.3.1 Using Chinook in a p2p Environment
3.0.4.1 Setting up the Chinook Bioperl Environment
4.0.1 Using the Resources directory
4.0.1.1 Configuring static services
4.0.1.3 Configure server information
5.0.1 Running a job with Chinook (using the GUI)
5.0.3 Setting up a Chinook server node
5.0.4 Running a batch Perl job
Chinook is a peer-to-peer (P2P) bioinformatics platform. Its goal is to facilitate the exchange of analysis techniques within a local community or worldwide. Chinook operates by turning command-line applications into services that are broadcast over a virtual network. Multiple analysis services are currently accessible through Chinook, ranging from alignment to regulation prediction algorithms. Chinook is also designed to make adding new services easy: a service is described in XML (at the time of writing, a GUI for service configuration is under development).
Chinook clients can be operated from Java, Perl, or within applications like Sockeye (and soon Pegasys at the Ouellette Lab in
Component | Feature | Supported
Server | Integrates command-line applications in XML | ✓
Server | Allow multiple input/output files | ✓
Server | STDERR/STDOUT Previewing | ✓
Server | WSDL (web services) | ✓
Server | Ant startup | ✓
Server | Add new services (GUI supported) | ✓
Server | Add static services (GUI supported) | ✓
Server | Provide access for server-side databases (e.g. Ensembl) | ✓
Server | File Storage memory allocation is user configurable | ✓
Client | Run and kill services (GUI supported) | ✓
Client | Run batch files | ✓
Client | Download results | ✓
Client | Specify filters to narrow search | ✓
Client | Allow preview of service webpage | ✓
Client | Contain server information dialogs (GUI supported) | ✓
P2P | Discover nodes | ✓
P2P | Advertise services | ✓
P2P | Multiple clients and servers supported | ✓
P2P | Auto-configuration supported for JXTA | ✓
Perl | Run batch job through command-line | ✓
Perl | Submit batch jobs to queue | ✓
Perl | Monitor service and job information | ✓
Perl | Download results | ✓
Bioinformatics techniques are used to identify complex, recurring relationships in biological data. Genome sequencing projects and high-throughput expression analyses have contributed large amounts of data, both complicating analysis and demanding higher-level coordination of computational resources. Furthermore, the variety of available bioinformatics tools and algorithms, and their diverse modes of usage, leave most users unsure where to invest their time and resources. Chinook addresses these issues by creating a virtual network for bioinformatics analyses. A user can dynamically discover available bioinformatics services (algorithms) over the Internet or a local network, validate a server's authenticity, and submit analyses to peers that publish their ability to perform the desired services. Information such as bandwidth, jobs in queue, and the location of Chinook services is reported to clients to aid job submission. A user can also visit the service creator's website to learn what a particular service does. A service provider creates a new service by editing an XML file; as long as the new service has a standard output format, no additional programming is required. The Chinook server runs over the JXTA peer-to-peer network in both Java Remote Method Invocation (
Chinook is currently designed to work only with command-line applications that take sequence as input. Chinook groups these applications together and provides a GUI from which users can run them easily. The input sequence formats currently supported are Fasta and Multi-fasta. Work is in progress to allow Chinook to access files uploaded from clients. Furthermore, Chinook has been designed to allow servers to ‘plug in’ multiple databases, giving clients access to protein,
There are three ways to install Chinook: you can use the installer, download a tarred and zipped release, or check out the source from CVS. This section describes how to do each of these tasks. For detailed information on setting up a server, see the Walkthrough in Section 5.
The advantage of installing Chinook from the installer is that it is very straightforward.
Installers are available at www.bcgsc.bc.ca/chinook/install.htm. Download the latest release from this site; the download location need not be the directory where you want to install Chinook.
Once the download completes, run the installer and follow the instructions to install Chinook. (In Windows, you can double-click the downloaded installer. In Linux, you can run the installer from a shell.)
Because the installer is interactive, a non-interactive release build is also available. This is more suitable for setting up servers on remote nodes.
Release builds are available at www.bcgsc.bc.ca/chinook/install.htm. Download the latest version and unpack it in the directory where you want to install Chinook.
The Chinook code is available over CVS for GSC network users. (NOTE: For more information on what CVS is, read https://www.cvshome.org/docs/manual/.) Currently, only release builds are available for installation. For the latest version of the Chinook code, e-mail chinook@bcgsc.bc.ca. (If enough interest emerges, we will move the CVS repository to an external home).
For GSC Network users:
To login to the
cvs -d :pserver:USERNAME@triton.bcgsc.bc.ca:/home/cvs login
Replace USERNAME with your user name, and type in your password when prompted. Then type the following command to check out Chinook.
cvs checkout -d chinook -r chinook
You now have a working copy of Chinook, which you can customize or compile.
For more information on how to connect to CVS from a GUI or from an IDE like Eclipse, please refer to the Chinook Developer Guide.
There are several ways to run Chinook. You can run Chinook as a client only. If you want to share your computer resources and services, you can run Chinook as a server. You can also run Chinook as both a server and a client. Whichever you choose, if you want to enable the P2P functionality, you need to run the ChinookP2PNode before you start the client or server.
Running the Chinook Client is quite simple, and there are several ways to do it. As a client, you can run services locally, or run services provided by others if your computer is connected to the P2P network. Chinook is designed to make running services simple and intuitive, so that users can focus on the data and results rather than on the command line needed to run each service.
This is the easiest way to start Chinook
Client. Under Windows, just go to the <$INSTALL_
Ant is a Java-based build tool developed by
the Apache Software Foundation (http://ant.apache.org). If
you are familiar with the Make Utility, then you already know the principles of
ant client
Ant will use the default build.xml file in the installation directory and run the client target. The Client will start automatically.
If you are interested in customizing Chinook for your own purpose, you can download the Chinook code and modify it, and then run or build the Chinook Server and the Chinook Client from the code.
Note: Ensure that all the dependency jars in the Chinook/lib/ folder are added to the CLASSPATH before you compile any Chinook code. The Chinook/resources/ folder must also be added to the CLASSPATH; it contains all the user-configurable Chinook files.
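As a rough illustration of the note above, the following Python sketch assembles a CLASSPATH value from a Chinook directory that contains lib/ and resources/ folders. The function name and the demo jar names are hypothetical; only the two folder names come from this guide.

```python
import os
import tempfile
from pathlib import Path


def build_classpath(chinook_home: str) -> str:
    """Return a CLASSPATH string: every jar in lib/ followed by resources/."""
    home = Path(chinook_home)
    jars = sorted(str(jar) for jar in (home / "lib").glob("*.jar"))
    # resources/ is added as a directory, not as individual files.
    return os.pathsep.join(jars + [str(home / "resources")])


if __name__ == "__main__":
    # Demo on a throwaway tree; the jar names here are made up.
    with tempfile.TemporaryDirectory() as demo_home:
        (Path(demo_home) / "lib").mkdir()
        (Path(demo_home) / "resources").mkdir()
        for name in ("example-b.jar", "example-a.jar"):
            (Path(demo_home) / "lib" / name).touch()
        print(build_classpath(demo_home))
```

The result can be exported as the CLASSPATH environment variable or passed to the Java compiler's classpath option. os.pathsep is used so that the separator matches the platform (":" on Linux, ";" on Windows).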
To run the Chinook Client from the code, you need to run ChinookP2PNode first if you want to connect to other computers through the peer-to-peer network. ChinookP2PNode requires no arguments; its main method is located at ca.bcgsc.chinook.p2p.ChinookP2PNode.java. The Chinook Client also requires no arguments; its main method is located at ca.bcgsc.chinook.client.exec.ChinookClient.java. The procedure depends on which IDE you are using to customize the Chinook project. For example, in Eclipse you can go to the Run menu and click the Run… menu item. A window similar to the following will appear.
Figure 3.1 Run window. Select which main method to run.
Click Java Application in the Configuration window, then click the New button under it. In the Name text field, type in “ChinookP2PNode”. If there is anything in the text field beside the Search button, clear it, then click the Search button and the following window will appear.
Figure 3.2 Choose Main Type window. It displays all the main methods available.
Select ChinookP2PNode and then click the OK button to return to the Run window. Click the Apply button, and then the Run button. The ChinookP2PNode will start.
Figure 3.3 Run window. Running ChinookP2PNode.
Follow the same procedure to run the ChinookClient; the only difference is that you select ChinookClient in the Choose Main Type window.
Alternatively, you can run the Chinook Client from Webstart. Go to http://www.bcgsc.ca/chinook, click Latest Download in the right column, and then click the Chinook Client Web Start link to open the web page containing the Chinook Client start. Click on the corresponding link to start the Chinook Client. Webstart is suitable for people who just want to run services and do not care where the services are located. One advantage of running the Chinook Client from Webstart is that the user always runs the latest version of Chinook and does not need to worry about maintaining the software.
Running the Chinook Server can be a little tricky for first-time users. There are two ways you can provide services to clients. One is through Web services. The other is through
ant server-start
The server will start to run. To stop the server type:
ant server-stop
If you want to run Chinook through Web Services, you need to configure a Tomcat server correctly. You also need to edit chinookImplAdvWSDL.xml and chinookSpecAdvWSDL.xml. For an example of how to modify the XML files, see the advertisement section, 3.0.3.1.2.
Providing services through
If you are running the Chinook Server from
code, you need to set
-Djava.rmi.server.codebase=file:///home/smontgom/jbproject/chinook/classes/
file:///home/smontgom/jbproject/chinook/lib/filewire.jar
-Djava.security.policy=/home/smontgom/jbproject/Chinook/resources/chinookRMI.policy
You need to customize these parameters to
your own Chinook installation.
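One way to customize these parameters is to derive them from a single installation path. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes the installation keeps the classes/, lib/filewire.jar and resources/chinookRMI.policy layout shown in the example above.

```python
from typing import List


def rmi_jvm_args(chinook_home: str) -> List[str]:
    """Build the two JVM properties shown above for a given Chinook directory."""
    home = chinook_home.rstrip("/")
    # The codebase is a space-separated list of URLs; directories end with "/".
    codebase = f"file://{home}/classes/ file://{home}/lib/filewire.jar"
    return [
        f"-Djava.rmi.server.codebase={codebase}",
        f"-Djava.security.policy={home}/resources/chinookRMI.policy",
    ]


if __name__ == "__main__":
    for arg in rmi_jvm_args("/home/smontgom/jbproject/chinook"):
        print(arg)
```

Because the codebase value contains a space, it is kept as a single list element here; if the arguments are typed into a shell or an IDE field instead, that value needs to be quoted.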
Chinook is designed to be a peer-to-peer application that facilitates the exchange of bioinformatics utilities. Users do not need to have all services provided locally; they can run services advertised by other computers.
Running Chinook in a P2P environment using Ant is very similar to running the Chinook Client. At the prompt, type:
ant p2p-start
The P2P node will start automatically. To stop the P2P node, type:
ant p2p-stop
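The start-up order described in this section (P2P node first, then server or client) can be summarized in a short Python sketch. The Ant target names are the ones used in this guide; the function itself, the role names, and the choice to list the server before the client when running both are assumptions made for illustration.

```python
from typing import List

# Ant targets named in this guide.
P2P_START = ["ant", "p2p-start"]
SERVER_START = ["ant", "server-start"]
CLIENT = ["ant", "client"]


def startup_commands(role: str, p2p: bool = True) -> List[List[str]]:
    """Return the Ant invocations for a role, with the P2P node first if enabled."""
    if role not in ("client", "server", "both"):
        raise ValueError("role must be 'client', 'server' or 'both'")
    commands = [P2P_START] if p2p else []
    if role in ("server", "both"):
        commands.append(SERVER_START)
    if role in ("client", "both"):
        commands.append(CLIENT)
    return commands


if __name__ == "__main__":
    for command in startup_commands("both"):
        print(" ".join(command))
```

Each returned command is meant to be run from the Chinook installation directory, where the default build.xml lives. The sketch only lists the commands; whether a given target returns immediately or blocks while the program runs is not covered here.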
Chinook servers publish advertisements using the JXTA protocol (for more information about the JXTA protocol, visit http://www.jxta.org/). The client peer receives these advertisements and displays the services for which jobs can subsequently be run. The next section describes how Chinook advertisements are made and how you can edit them for your services.
An advertisement is an XML document that describes a particular JXTA message, whether that is a peer, peer group or service. These messages are discovered and then cached locally. (To see your cache, go to your own .jxta/ directory, which is created when you first run Chinook.) As a service provider, you are interested in only two types of messages, the ModuleSpecAdvertisement (
The Chinook ModuleSpecAdvertisement
Chinook has two ModuleSpecAdvertisements in its advertisements/ folder. One for the
chinookSpecAdvRMI.xml
<?xml version="1.0"?>
<!DOCTYPE jxta:
<jxta:
<MSID>urn:jxta:uuid-72CE4F415C994ADBB5BCB897E6BBB3D0EB39B9952C0D4D79BAD5</MSID>
<Name>JXTASPEC:Chinook-
<Crtr>smontgom@bcgsc.bc.ca</Crtr>
<SURI>http://www.bcgsc.ca/Chinook/</SURI>
<Vers>1.0</Vers>
<Desc>A Chinook DBAS
</jxta:
The ModuleSpecAdvertisement is simple in nature. The <MSID> tag holds a unique id that identifies this service; for your purposes it can be any valid JXTA id (see the JXTA Protocol Specification, http://spec.jxta.org/nonav/v1.0/docbook/JXTAProtocols.html). The <Name> tag should not be changed: Chinook searches for Spec advertisements based on it, so if the <Name> tag is changed, your advertisement will not be discovered. The <Crtr> tag specifies who the publisher of this service is (I always use my own name; it is shown to Chinook clients, who can use it to contact me). The <SURI> tag points to the documentation for Chinook, but it can be changed to point to any service provider's relevant service documents. The <Vers> tag specifies which version of Chinook is being used. Finally, the <Desc> tag describes the service.
The Chinook ModuleImplAdvertisement
Chinook has two ModuleImplAdvertisements in its advertisements/ folder. These correspond to specific implementations of the ModuleSpecAdvertisements. Any service provider MUST change these advertisements to reflect the appropriate server information (the
chinookImplAdvRMI.xml
<?xml version="1.0"?>
<!DOCTYPE jxta:MIA>
<jxta:MIA xmlns:jxta="http://jxta.org">
  <MSID>urn:jxta:uuid-72CE4F415C994ADBB5BCB897E6BBB3D0EB39B9952C0D4D79BAD5</MSID>
  <Comp>
    <Efmt>JDK1.4</Efmt>
    <ChinookImpl>1.0</ChinookImpl>
  </Comp>
  <Code>//localhost:1099/ApplicationServerImpl</Code>
  <PURI>Not yet available</PURI>
  <Prov>smontgom@bcgsc.bc.ca</Prov>
  <Desc> </Desc>
</jxta:MIA>
The ModuleImplAdvertisement has only a few things that must be noted. The <MSID> tag must match that of the corresponding ModuleSpecAdvertisement. The <Comp> tag specifies compatibility information; here it states that users must have at least JDK1.4 and the 1.0 implementation of the Chinook interface. The <Code> tag points to the relevant
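Since a mismatched <MSID> is an easy mistake to make when editing these files by hand, a quick check can help. The Python sketch below reads a trimmed copy of the chinookImplAdvRMI.xml example (the DOCTYPE line and the optional tags are left out for brevity) and compares its <MSID> with the one from a Spec advertisement. The function names are hypothetical, and this is not how Chinook itself performs the comparison.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the chinookImplAdvRMI.xml example from this guide.
IMPL_ADV = """<?xml version="1.0"?>
<jxta:MIA xmlns:jxta="http://jxta.org">
  <MSID> urn:jxta:uuid-72CE4F415C994ADBB5BCB897E6BBB3D0EB39B9952C0D4D79BAD5 </MSID>
  <Comp>
    <Efmt> JDK1.4 </Efmt>
    <ChinookImpl> 1.0 </ChinookImpl>
  </Comp>
  <Code>//localhost:1099/ApplicationServerImpl</Code>
</jxta:MIA>"""


def read_impl_adv(xml_text: str) -> dict:
    """Extract the tags discussed above, with all whitespace stripped out."""
    root = ET.fromstring(xml_text)

    def text(path: str) -> str:
        return "".join((root.findtext(path) or "").split())

    return {
        "msid": text("MSID"),
        "jdk": text("Comp/Efmt"),
        "chinook_impl": text("Comp/ChinookImpl"),
        "code": text("Code"),
    }


def msid_matches(spec_msid: str, impl: dict) -> bool:
    """True when the Spec advertisement's MSID equals the Impl one."""
    return "".join(spec_msid.split()) == impl["msid"]


if __name__ == "__main__":
    impl = read_impl_adv(IMPL_ADV)
    print(impl["code"], msid_matches(impl["msid"], impl))
```

Whitespace is removed before comparing because the ids in hand-edited files are often padded or wrapped across lines.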
Chinook has been integrated with Perl to allow the automated discovery and execution of services from scripts. The Perl code is packaged for Bioperl and will likely be made part of the Bioperl distribution when Chinook is widely released. This section describes how to set up your environment to run Perl scripts for Chinook. For a walkthrough of creating a Perl script for Chinook, see also the Walkthrough in Section 5.
Perl needs to know the location of the Chinook Perl modules. The modules for Chinook are installed under the perl/modules/ directory in the Chinook installation directory. To point Perl at these modules, you can set the PERL5LIB environment variable for your shell by issuing the following commands.
If you are using tcsh/csh shells:
setenv PERL5LIB ${PERL5LIB}:${CHINOOK_HOME}/perl/modules
In bash, the equivalent command is:
export PERL5LIB=${PERL5LIB}:${CHINOOK_HOME}/perl/modules
Where ${CHINOOK_HOME} is the location of your Chinook installation.
NOTE: We typically write these commands to the .bashrc file in our home directory so that we do not have to retype them every time we want to use the Chinook Perl modules.
Alternatively, you can use the Perl pragma ‘use lib’ at the top of your scripts to point to the location of the Perl modules you wish to use:
use lib '/home/chinook_install_directory/perl/modules';
Where /home/chinook_install_directory/ is the installation directory for Chinook.
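When Perl scripts are launched from another program rather than from an interactive shell, the same PERL5LIB setting can be prepared in code. The Python sketch below is a minimal illustration under that assumption; the function name is hypothetical, and it uses the ":" separator from the shell commands above, so it applies to Unix-like systems.

```python
import os
from typing import Dict, Optional


def with_chinook_perl5lib(chinook_home: str,
                          env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the Chinook modules on PERL5LIB."""
    result = dict(os.environ if env is None else env)
    modules = chinook_home.rstrip("/") + "/perl/modules"
    # Drop empty entries so an unset PERL5LIB does not leave a stray ":".
    entries = [p for p in result.get("PERL5LIB", "").split(":") if p]
    if modules not in entries:
        entries.append(modules)
    result["PERL5LIB"] = ":".join(entries)
    return result


if __name__ == "__main__":
    print(with_chinook_perl5lib("/home/chinook_install_directory", {})["PERL5LIB"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting a Perl script. Unlike the plain shell commands, it does not add the directory twice if it is already present.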
To discover services and run analyses, an instance of the Chinook Client must be running in batch mode with a port open for communication with Perl scripts. The Perl scripts connect to this port to determine what services are available and to send requests for execution.
To set up the Chinook Client for execution in batch mode (for Perl):
1) Open the batch-config.xml file in the resources/ directory of your Chinook installation directory.
2) There are several tags that need to be set.
a. <batch_directory> specifies the directory where information about discovered services is written (batch files). Whenever a new service is discovered, a batch file is written to this directory describing the service, its location, and the parameters required for execution.
b. <batch_queue_directory> specifies the directory where completed batch files are stored (batch_queue files). A completed batch file has its parameters and data set and is ready for execution. It also has a batch_queue id attached to it to identify downstream output files. The Chinook Client can be notified to read all the files in this directory and process them. (Alternatively, it can be given the location of a batch_queue file directly.)
c. <batch_reporting_directory> specifies the directory where completed report information is written (batch_reports). The Chinook Perl code usually polls this directory for reports matching a specific batch_queue file id.
d. <batch_machine_name> specifies where the Client is executing. Usually localhost is sufficient. On NFS-mounted systems, an explicit machine name is required. The Chinook Perl code needs to know this location to be able to connect to the Chinook Client.
e. <batch_port> specifies the port on which the Chinook Client receives incoming requests from Perl scripts. The default port is 7999. This can be any valid open port number, but Perl clients will need to know it, in addition to the <batch_machine_name>, in order to connect to the Chinook Client.
f. <batch_socket_conns> specifies the maximum number of open socket connections that can be made to the Chinook Client at a time. This number should usually be greater than <batch_receiver_thread_queue_size>.
g. <batch_receiver_thread_queue_size> specifies the maximum number of concurrent processing requests that scripts can make of the client. Further requests block until the pending requests are finished. This should be a low number to prevent excessive memory use, but it should be in line with the number of concurrent requests your Chinook Client receives.
3) Once the batch-config.xml file has been configured to your desired settings, the Chinook Client can be started with batch-mode enabled. This starts a small server inside the Client that will manage incoming requests from Perl scripts.
4) To start the Chinook Client in batch mode, start the Client as normal but with the following flag set.
./ChinookClient -batch
The batch flag ensures that batch mode is activated. If you do not want the GUI to appear, you can call the ChinookClient with:
./ChinookClient -batch -nogui
This is ideal for running the ChinookClient on remote machines.
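Before starting the client in batch mode, it can be useful to sanity-check the settings from step 2. The Python sketch below does this for a sample configuration. The tag names come from the list above, but the root element name <batch-config>, the directory paths, and the function names are assumptions made for illustration; check them against your own batch-config.xml.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

# Sample values only; the root element name and the paths are assumed.
SAMPLE_CONFIG = """<batch-config>
  <batch_directory>/tmp/chinook/batch</batch_directory>
  <batch_queue_directory>/tmp/chinook/batch_queue</batch_queue_directory>
  <batch_reporting_directory>/tmp/chinook/batch_reports</batch_reporting_directory>
  <batch_machine_name>localhost</batch_machine_name>
  <batch_port>7999</batch_port>
  <batch_socket_conns>10</batch_socket_conns>
  <batch_receiver_thread_queue_size>4</batch_receiver_thread_queue_size>
</batch-config>"""

TEXT_TAGS = ("batch_directory", "batch_queue_directory",
             "batch_reporting_directory", "batch_machine_name")
INT_TAGS = ("batch_port", "batch_socket_conns",
            "batch_receiver_thread_queue_size")


def load_batch_config(xml_text: str) -> Dict[str, Union[str, int]]:
    """Read the seven tags described above; fail if one is missing or empty."""
    root = ET.fromstring(xml_text)
    config: Dict[str, Union[str, int]] = {}
    for tag in TEXT_TAGS + INT_TAGS:
        value = (root.findtext(tag) or "").strip()
        if not value:
            raise ValueError(f"missing or empty <{tag}>")
        config[tag] = int(value) if tag in INT_TAGS else value
    if not 1 <= config["batch_port"] <= 65535:
        raise ValueError("batch_port must be a valid TCP port")
    return config


def config_warnings(config: Dict[str, Union[str, int]]) -> List[str]:
    """Flag settings that go against the recommendations in this guide."""
    warnings = []
    if config["batch_socket_conns"] <= config["batch_receiver_thread_queue_size"]:
        warnings.append("batch_socket_conns should usually be greater than "
                        "batch_receiver_thread_queue_size")
    return warnings


if __name__ == "__main__":
    cfg = load_batch_config(SAMPLE_CONFIG)
    print(cfg["batch_machine_name"], cfg["batch_port"], config_warnings(cfg))
```

The machine name and port read here are the two values a Perl script needs in order to reach the batch-enabled client.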
For examples and more information on running Perl scripts once the environment has been configured and batch-enabled client has been started, see the Perl Walkthrough in Section 5.0.4.
Chinook is an open source project. This section guides you through customizing the Chinook platform for your own installation, including configuring static services, adding new services, and configuring server information.
When Chinook starts, it uses XML files in the resources directory to configure itself. You can customize Chinook by editing some of these files.
You can add static services to Chinook by editing the static-services.xml file in the resources directory. Each time Chinook starts, it tries to connect to the servers listed in static-services.xml first, before trying to discover services through the P2P node. The static-services.xml file is very simple. It may look like the following.
<?xml version="1.0" encoding="ISO-8859-1"?>
<static-services>
<staticservice>
<
<mode>
</staticservice>
<staticservice>
<
<mode>WSDL</mode>
</staticservice>
</static-services>
The <
You can also edit the static-services.xml file through the GUI. After you start the Chinook Client, click on the Tools menu, then click the Static Services… menu item. A window similar to the following will appear.
Figure 4.1 Static Services window. It displays all static services currently available.
You can add new static services by clicking the Add button. After you click the Add button, the following window will appear.
Figure 4.2 Static Services Editor window. It is used to add or edit static services.
Type in the server location and choose the type of the service, then click the Test Connection button. If the client is able to connect to the server, the red square will turn green and the OK button will become enabled. If it cannot connect, the red square will remain red and you cannot add the static service to the static-services.xml file.
Figure 4.3 Static Services Editor window. How to add a new static service.
After you’ve finished, click the OK button on the Static Services Editor window, and then click the OK button on the Static Services window. All the static services you just added will be written to the static-services.xml file. Editing an existing static service is similar. You can delete a static service by selecting it in the Static Services window and then clicking the Delete button.
Each time the Chinook server starts, it checks all the services specified in the applications.xml file, which is in the resources directory, and then publishes them. To add a new service to your Chinook Server, you need to describe it using XML tags. For a more detailed example, see 5.0.2 Adding a new Service.
If you want to run as a Chinook Server, you need to customize the server-info.xml file, which is in the resources directory. This allows clients to obtain information about your server. The server-info.xml file is very simple.
<server-info>
  <location>Genome Sciences Centre</location>
  <description>This is the description of the server</description>
  <contact>chinook@bcgsc.bc.ca</contact>
</server-info>
The <location> tag defines your server location; it could be the URL of your website.
The <description> tag specifies the description of the server.
The <contact> tag defines the contact information for the server (typically a maintainer’s email address).
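A small check can confirm that an edited server-info.xml is still well-formed and fills in all three tags. The Python sketch below is only an illustration; the function name is hypothetical, and the sample text mirrors the example file in this section.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SERVER_INFO = """<server-info>
  <location>Genome Sciences Centre</location>
  <description>This is the description of the server</description>
  <contact>chinook@bcgsc.bc.ca</contact>
</server-info>"""


def read_server_info(xml_text: str) -> Dict[str, str]:
    """Parse server-info.xml text and return its three documented fields."""
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    if root.tag != "server-info":
        raise ValueError("root element must be <server-info>")
    info = {}
    for tag in ("location", "description", "contact"):
        value = (root.findtext(tag) or "").strip()
        if not value:
            raise ValueError(f"<{tag}> is missing or empty")
        info[tag] = value
    return info


if __name__ == "__main__":
    print(read_server_info(SERVER_INFO))
```

A missing closing bracket, such as an unterminated </contact tag, makes ET.fromstring raise a ParseError, which is the kind of slip this check is meant to catch before the server is started.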
In this section, we guide you through several common features available in Chinook, including running a job using the GUI, adding a new service, and running a batch Perl job.
Chinook is designed to facilitate a bioinformatician’s work. Most jobs can be accomplished through a GUI environment. In this example, we will show you how to run a job in Chinook.
Step 1: Starting Chinook Client.
After you start the Chinook Client, the following window will appear (see Figure 5.1). There are five main parts in the Chinook Client:
1) The Chinook Menubar (top of the window, under the title bar).
2) The Service Type and Filter Panel (upper-left panel), which displays all the available service types and lets you specify filters to narrow the search for the services you want.
3) The Discovered Services Panel (upper-right panel), which displays the services matching the service type chosen in the Service Type and Filter Panel or the filters you specified.
4) The Job Status Panel (lower-left panel), which displays the status of currently running jobs.
5) The Lightweight Web Browser (lower-right panel), which displays the original service developer’s web page.
Figure 5.1 Chinook client starts up.
Step 2: Specifying a filter.
You can select the services you want to run by clicking the service name in the Service Types pane. For example, if you click ALIGNMENT in the left panel, all the servers providing ALIGNMENT services will appear in the Discovered Services Panel. If too many servers provide the service, you can narrow the list with filters. To do so, click the Filters tab in the Service Type and Filter Panel. Type each filter into its text field, then check the checkbox next to that field. The services that satisfy the filters will be displayed in the Discovered Services Panel immediately.
Figure 5.2 Specify filters to facilitate finding a service.
Step 3: Running a service on a server.
Running a service on Chinook is quite straightforward. Select the service you want to run in the Discovered Services panel by clicking its name, then click the Run service button. Alternatively, right-click the service you want to run, then select Run job on server in the popup menu. For example, select MLAGAN in the Discovered Services panel, then click the Run service button. A window like the following will appear on the screen.
Figure 5.3 Configuring Service window. It is used to edit data and supply parameters.
Step 4: Editing data.
4.1 In the above window, the red dot to the right of the Edit data button indicates that the data is currently invalid. You can specify the data by clicking the Edit data button. A window like the following will appear.
Figure 5.4 Enter data window. It is used to enter all the data the service needs.
4.2 In the above window, the red square to the right of the data box indicates that the data is currently invalid. To edit the data, select one of the data boxes by clicking it, then click the Edit button. A window like the following will appear.
Figure 5.5 Enter data window. It is used to enter one sequence used by the service.
4.3 The red dot to the right of each text field and the red square at the bottom of the window indicate that the data is invalid. Point the mouse at the red square and a tool tip will tell you which part of the data is invalid. Once you have specified all the data and it is valid, the red square will change to a green square. You can then click the OK button to return to the previous window and edit another data box, if there is one. After you finish editing all the data boxes, click the OK button to return to the Configure Service window. The red dot to the right of the Edit data button should now be green, indicating that the data is valid and that you are ready to submit the job to the server.
Figure 5.6 Enter data window. Enter the valid data.
Figure 5.7 Enter data window. Add all valid data.
4.4 You can specify the parameters needed by the service by modifying the Parameter entry panel. Click the OK button on the Configure Service window to run the service on the server.
Figure 5.8 Configure Service window. Specify parameters and submit the job.
Figure 5.9 Running job. The job is running on the server.
4.5 You can run more jobs simultaneously by following the same procedure from 4.1 to 4.4. If you want to cancel a job you have submitted, select it by clicking it in the Job Status Panel, and then click the Discard Job button. The job will stop running on the server and will be removed from the table. You can do the same thing by right-clicking the job and then clicking Discard job in the popup menu.
4.6 You can view the job status by right-clicking the job and then clicking View result files in the popup menu. A window like the following will appear.
Figure 5.10 Running job window.
4.7 After the job is completed on the server, the Job status and Result files panel will be updated like this.
Figure 5.11 Job finished window. The job is finished on the server.
4.8 One table entry in the Result files is the standard error message. If an error occurs while the server is running the job, you can download the message to see what caused the error. If the job completed successfully, you can download the result by selecting the other table entry and then clicking the download button.
Figure 5.12 Downloading report window. The report is downloaded from the server to the local hard drive.
Step 5: Exiting Chinook
You can go to the File menu and click Exit to exit the Chinook Client, or you can click the close button located on the right of the title bar of the Chinook Client.
Currently, there are over 10 analysis
services that have been made "Chinook-ready". These range from
The following example shows how to add a new service to the Chinook Server, using LAGAN as an example.
<application>
  <name>LAGAN</name>
  <type>ALIGNMENT</type>
  <path>/opt/mlagan</path>
  <executable>lagan.pl</executable>
  <format>exe_path/executable dna_sequence parameter</format>
  <allow_stderr_preview>true</allow_stderr_preview>
  <allow_stdout_preview>true</allow_stdout_preview>
  <results_written_to_stdout>true</results_written_to_stdout>
  <parsing_class>ca.bcgsc.chinook.server.runner.alignment.Lagan</parsing_class>
  <output_path>/tmp</output_path>
  <description>Lagan is developed at Stanford by Mike Brudno</description>
  <creator>http://lagan.stanford.edu</creator>
  <version>1</version>
  <data_entry_set>
    <name>dna_sequence</name>
    <maximum_count>2</maximum_count>
    <minimum_count>2</minimum_count>
    <data_entry_type_name> <data_entry_type_name>
    <set_output_class_name>ca.bcgsc.chinook.parsing.setoutput.impl.DataEntrySetOutputterImpl</set_output_class_name>
  </data_entry_set>
  <parameter>
    <descriptor>chaos_STRING</descriptor>
    <regex_format>["]([.]+)["]</regex_format>
    <description>The contents of this string will be passed as arguments to chaos</description>
    <user_defined>true</user_defined>
  </parameter>
  <parameter>
    <descriptor>order_STRING</descriptor>
    <regex_format>"-gs ([0-9]+) -gc ([0-9]+) -mt ([0-9]+) -ms ([0-9]+)"</regex_format>
    <description>The contents of this string will be passed as arguments to order</description>
    <user_defined>true</user_defined>
  </parameter>
  <parameter>
    <descriptor>recurfl_STRING</descriptor>
    <regex_format>"(\([0-9]+,[0-9]+,[0-9]+,[0-9]+\),)+"</regex_format>
    <description>Used in recursive anchoring</description>
    <user_defined>true</user_defined>
  </parameter>
  <parameter>
    <descriptor>translate_BOOLEAN</descriptor>
    <description>Use translated anchoring</description>
    <user_defined>true</user_defined>
  </parameter>
  <parameter>
    <descriptor>bin_BOOLEAN</descriptor>
    <description>Output in binary format</description>
    <user_defined>false</user_defined>
    <on>false</on>
  </parameter>
  <parameter>
    <descriptor>mfa_BOOLEAN</descriptor>
    <description>Output in multifasta format</description>
    <user_defined>false</user_defined>
    <on>true</on>
  </parameter>
  <parameter>
    <descriptor>rc_BOOLEAN</descriptor>
    <description>Reverse complement the second sequence before alignment</description>
    <user_defined>true</user_defined>
  </parameter>
  <parameter>
    <descriptor>fastreject_BOOLEAN</descriptor>
    <description>Abandon alignment if homology looks weak</description>
    <user_defined>true</user_defined>
  </parameter>
</application>
All new application specs are defined between <application> and </application> tags. The <name> tag defines the application that is being run; it can be anything. The <type> tag is more important: it is an ontological definition that marks the class of services you belong to. It is planned that the website will carry a dictionary of widely used terms. For now, the well-defined terms are ALIGNMENT, VARIATION, MOTIF DISCOVERY and PATTERN DISCOVERY. The <path> tag simply points to the directory that the main service is in, and the <executable> tag holds the name of the application that will be run.
In the <format> tag, several terms are special and are replaced by appropriate values when the script is run.
1) exe_path is replaced by the contents of the <path> tag.
2) executable is replaced by the contents of the <executable> tag.
3) dna_sequence is replaced by the location of the sequence files.
4) parameter is replaced by the service-specific parameters.
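The substitution described above can be pictured with a small Perl sketch. The format string, the paths, and the helper name fill_format are hypothetical and chosen only for illustration; the actual <format> syntax and the way Chinook performs the replacement may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces the four special terms in a format string.
# This only illustrates the idea; it is not Chinook's own implementation.
sub fill_format {
    my ($format, %values) = @_;
    # Build one alternation of all term names, longest first.
    my $terms = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } keys %values;
    # A single pass, so text that was just inserted is never substituted again.
    $format =~ s/\b($terms)\b/$values{$1}/g;
    return $format;
}

my $command = fill_format(
    'exe_path/executable dna_sequence parameter',   # assumed format string
    exe_path     => '/opt/lagan',                   # assumed <path> contents
    executable   => 'mlagan',                       # assumed <executable> contents
    dna_sequence => 'seq1.fa seq2.fa',              # assumed sequence file locations
    parameter    => '-translate',                   # assumed service parameters
);
print "$command\n";
```

With these assumed values the sketch prints /opt/lagan/mlagan seq1.fa seq2.fa -translate.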
The <allow_stderr_preview> tag defines whether previewing standard error is allowed. The <allow_stdout_preview> tag defines whether previewing standard output is allowed. The <set_output_class_name> tag defines the class name used to set up the output. The <parsing_class> tag defines which class will parse the output file. Currently supported parsing classes are: ca.bcgsc.chinook.server.runner.alignment.Lagan
for Fasta, ca.bcgsc.chinook.server.runner.alignment.clustalw
for
The <output_path> tag points to your temporary directory for formatting files. The <description> tag points to a description of the service. The <creator> tag is the website of the original author of the service's implementation (not the service provider). So in the case of Lagan, it points to the site at Stanford. The <version> tag is the version of the application.
The <data_entry_set> tag defines the data formats. Its first tag, <name>, defines the name of the data, which is used in the <format> tag. The <maximum_count> tag defines the maximum number of sequences; if there is no maximum number of sequences that can be supplied to the application, this tag does not need to be defined. The <minimum_count> tag defines the minimum number of sequences that must be supplied to the application. The <data_entry_type_name> tag defines the name used to get the sequences. The <set_output_class_name> tag defines what class is used to set up the output.
The <parameter> tag is another special tag in the XML description of your service. These tags define what parameters you want your client to input, what parameters you'd prefer they didn't, and what default values you'd like to maintain. The <parameter> tag offers extensive control over how your client uses your service.
The first part of the <parameter> tag is the <descriptor> tag. This tag defines what type of parameter it is. Fundamentally, there are two types: STRING and empty (boolean). When defining the <descriptor> tag, give the name and then the type. For example, tree_STRING tells Chinook that you have a parameter called tree that needs a string, whereas translate_ (written as translate_BOOLEAN in the example listing) tells Chinook that you have a parameter that is either there or it isn't; for instance,
mlagan -tree "mytree" -translate
The <regex_format> tag describes the regular expression that you want your input string data to match. This is a security feature: users are prevented from entering parameters that don't match, so you can guarantee that parameters arrive in the form you expect. This tag is optional.
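As a rough illustration of this kind of check, the Perl sketch below tests two candidate values against the pattern from the recurfl_STRING example. The helper name matches_format is hypothetical, and requiring the whole string to match is an assumption; how Chinook itself applies the expression is not described here.

```perl
use strict;
use warnings;

# Pattern from the recurfl_STRING example, with [0-9] in every field.
my $recurfl_format = qr/(\([0-9]+,[0-9]+,[0-9]+,[0-9]+\),)+/;

# Hypothetical check: accept a value only if the whole string matches.
# Whether Chinook anchors the pattern this way is an assumption.
sub matches_format {
    my ($value, $format) = @_;
    return $value =~ /\A(?:$format)\z/ ? 1 : 0;
}

# One well-formed value and one that tries to smuggle in extra text.
for my $value ('(12,10,25,0),(7,10,20,0),', '(12,10,25,0); rm -rf /') {
    my $verdict = matches_format($value, $recurfl_format) ? 'accepted' : 'rejected';
    printf "%-28s %s\n", $value, $verdict;
}
```

Anchoring the pattern at both ends is what keeps a valid prefix followed by arbitrary text from slipping through.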
The <description> tag describes to the user what this parameter does. The <user_defined> tag tells Chinook whether you want the user to be able to change this parameter; it has two options, true or false. The <use_equals> tag tells Chinook whether the parameter is of the form -tree=value or -tree value. It also takes two options, true or false. The <on> tag tells Chinook whether this boolean parameter is active or not by default. Finally, the <default_value> tag holds the data that you want this parameter to input, i.e. the default string data. As a service provider, it is quite possible to use these tags in a way that doesn't make sense. Be very careful about what you want users to do and what the default parameters are. If you find that something is missing to fully describe your input parameters, e-mail me at smontgom@bcgsc.bc.ca
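To make the interplay of these tags concrete, here is a minimal Perl sketch that turns a few parameter definitions into command-line arguments. The hash layout, the helper name render_parameters, and the single-dash prefix are assumptions made for illustration; this is not how Chinook stores or renders parameters internally.

```perl
use strict;
use warnings;

# Hypothetical in-memory form of three <parameter> entries. The hash keys
# mirror the tag names, but the structure itself is invented for this sketch.
my @parameters = (
    { descriptor => 'tree_STRING', use_equals => 'false', default_value => '"mytree"' },
    { descriptor => 'translate_BOOLEAN', on => 'true' },
    { descriptor => 'bin_BOOLEAN',       on => 'false' },
);

sub render_parameters {
    my @args;
    for my $p (@_) {
        # Split "name_TYPE" into its two parts; skip malformed descriptors.
        my ($name, $type) = $p->{descriptor} =~ /\A(.+)_([A-Z]+)\z/
            or next;
        if ($type eq 'BOOLEAN') {
            # A boolean flag appears only when it is switched on.
            push @args, '-' . $name if ($p->{on} // 'false') eq 'true';
        }
        elsif (defined $p->{default_value}) {
            # <use_equals> selects "-name=value" versus "-name value".
            my $sep = ($p->{use_equals} // 'false') eq 'true' ? '=' : ' ';
            push @args, '-' . $name . $sep . $p->{default_value};
        }
    }
    return join ' ', @args;
}

print render_parameters(@parameters), "\n";
```

With these definitions the sketch prints -tree "mytree" -translate, which matches the mlagan example; setting use_equals to true would give -tree="mytree" instead.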
After you have defined your new service, you should be able to share it with the world using Chinook. Visit our online applet to see if our discovery service has picked it up (in development).
This walkthrough will guide you through setting up a Chinook server.
To make Chinook available to Internet users, you will need the following prerequisites:
1) A computer where bioinformatics applications can be installed and run (i.e. you will not be able to provide ClustalW analyses if your computer cannot run it as is)
2) Open required ports on the computer. You will need ports 9700 and 9701 open for JXTA communication. You will also need either port 1099 (for RMI mode) or port 8080 (for Web Services mode). These are the default ports; other ports can be selected in place of these in case of overlapping services (you will need to edit the advertisement and resource files if you are not using a default port).
3) A working directory. Ideally, your computer will have at least 100MB of hard-drive space to store temporary files.
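Before choosing between the default ports and alternatives, it can help to see whether another local service already occupies them. The Perl sketch below is one possible check using the core IO::Socket::INET module; the helper name port_is_free is hypothetical. It only tests whether a listening socket can be opened on this machine, so it says nothing about firewalls, and a port will be reported as in use if Chinook or Tomcat is already running on it.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Returns 1 if a listening TCP socket can be opened on the port, 0 otherwise.
sub port_is_free {
    my ($port) = @_;
    my $socket = IO::Socket::INET->new(
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 1,
    );
    return 0 unless $socket;
    close $socket;
    return 1;
}

# Default ports named above: JXTA (9700, 9701), RMI (1099), Web Services (8080).
for my $port (9700, 9701, 1099, 8080) {
    printf "port %-5d %s\n", $port, port_is_free($port) ? 'free' : 'already in use';
}
```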
To set-up a Chinook server node:
1) Ensure you have the right hardware dependencies (see above).
2) Install Java.
a. Go to http://java.sun.com
b. Look for Downloads.
c. Download the 1.4.x version of the J2SE JDK.
d. Set the JAVA_HOME environment variable to the installation directory of Java (e.g. export JAVA_HOME=/usr/lib/java). You may want to do this in a configuration script so that this environment variable is preserved.
3) Install Chinook (see Section 2).
4) If you are planning to use the Web Services version of Chinook, you will need to install Tomcat and Apache Axis. Read the inset for instructions on how to do this.
Installing Tomcat and Apache Axis:

5) For the Web Services version of Chinook, you can now deploy the Chinook code. This is done by running the Ant task deploy from the Chinook installation directory. If the CATALINA_HOME environment variable has not been set, this task will fail.
a. Run the command: ant deploy
b. If Ant is not installed, follow the installation instructions below.
Installing ANT

6) Before you run the server, in either RMI or Web Services mode, you will need to configure the server for your machine. Go to the Chinook installation directory.
a. Editing the files in the resources/ directory
i. The most important file to edit is applications.xml
ii. You will want to comment out the protocol block you are not using and uncomment the protocol block you are using. For instance, in RMI mode the following would appear:

iii. IMPORTANT: Change the name of the uri to your machine name. Do not use localhost.
iv. Change the publisher information to reflect your user information. This will allow users of your server to contact you. This information should also be set in more detail in the server-info.xml file in the same directory.
v. IMPORTANT: We have not installed any new services into Chinook. When new services are added, they are described in the applications.xml file. If this file contains services, you should comment them out as your server would end up advertising services that do not exist at your location.
b. Editing the files in the advertisements/ directory
i. The advertisements that Chinook uses are specified in advertisement-config.xml in the resources/ directory.
ii. To ensure that you are advertising the right endpoint for your services, edit the advertisement implementation files. Change it from localhost to your machine name.
7) That is all there is to it. You MUST still install services. To run Chinook, run ant p2p-start, wait until the p2pNode has found a rendezvous (about 10-15 seconds), and then run ant server-start. NOTE: Check the Axis configuration page to ensure that your service was deployed. If it was not, first try stopping and restarting the Tomcat server.
8) Test out your services by running a client.
Troubleshooting:
If you are not able to see the deployed services, look in the Tomcat log/ directory to determine the source of the error.
The server-config.wsdd file is not found.
Copy this file from elsewhere in your Tomcat installation (it will likely be in the work/ directory).
I get a 401 error from ANT and the log says: - Rejected remote access from host /0:0:0:0:0:0:0:1
The server-config.wsdd file needs to allow remote administration. Set <parameter name="enableRemoteAdmin" value="true"/> for the AdminService.
Other error
Send the last lines of your log files and the ant execution information to chinook@bcgsc.bc.ca.
The Perl interface to Chinook allows you to add analysis capabilities directly into your scripts. This walkthrough outlines how to find services, submit jobs, and read reports using the Perl interface.
Follow these steps:
1) Configure your Perl environment. (see 3.0.4.1)
2) Start the Chinook Client in batching mode (see 3.0.4.2).
3) The example Perl scripts for Chinook batching are located in the perl/t/ directory from your Chinook installation directory.
4) How to Find Chinook Services using Perl. One of the first programmatic tasks is to discover what services are currently available for running over Chinook. There are two ways to bring discovered services into your Perl script: by parsing the batch directory using the Bio::Tools::Run::Chinook::BatchDirectory module, or by asking the Chinook Client directly using the Bio::Tools::Run::Chinook::ChinookManager module.
a. To parse the batch directory:
|
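use Bio::Tools::Run::Chinook::BatchDirectory;

# Parse the service descriptions stored in the batch directory.
my $batch_directory = Bio::Tools::Run::Chinook::BatchDirectory->new(
    batch_directory => "/home/smontgom/batch/batch");
my $services_ref = $batch_directory->getAllServices(); |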
b. To use the Bio::Tools::Run::Chinook::ChinookManager:
|
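use Bio::Tools::Run::Chinook::ChinookManager;

# Ask the running Chinook Client for its current list of services.
my $chinook_man = Bio::Tools::Run::Chinook::ChinookManager->new(
    machine_name => "localhost",
    port         => "7999");
my $services_ref = $chinook_man->getServices(); |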
Both of these methods get the services that are currently available (each returns a list of Bio::Tools::Run::Chinook::Service objects). However, only the Bio::Tools::Run::Chinook::ChinookManager is guaranteed to be current, because service descriptions stay in the batch directory until they are manually removed. The recommended method is to use the Bio::Tools::Run::Chinook::ChinookManager to get current information whenever possible.
5) Once a service has been selected, you will need to access a batch file (create a Bio::Tools::Run::Chinook::Batch object) for that service to get the required parameters and supported databases.
6) How to Get Information about Required Data and Parameters using Perl. To create a Bio::Tools::Run::Chinook::Batch object, call the following (the filename of the batch file can be accessed from the Bio::Tools::Run::Chinook::Service object; if the Bio::Tools::Run::Chinook::ChinookManager doesn't provide the information, it will have to be discovered in the batch directory).
|
use Bio::Tools::Run::Chinook::Batch;

my $batch = Bio::Tools::Run::Chinook::Batch->new(
    batch_filename => $batch_filename); |
Once you have a Bio::Tools::Run::Chinook::Batch object, you can interrogate the DEOSets (Data Entry Object Sets) that are required to run the Service.
NOTE: Data Entry Object Sets (DEOSets). DEOSets are the data requirements for various services. A DEOSet contains information about what types of data objects a server is expecting, how many of them it requires, and the data itself (in the form of
a
|
my $deo = $chinook_manager->getDEO(" |
7) Fill in the required DEOSets. The process to do this is in the test scripts in the perl/t/ directory. Essentially, first get the DEOSet objects from the Bio::Tools::Run::Chinook::Batch object. Iterate through each DEOSet and determine how many DEOs are required and what the allowed
8) Once the data has been filled in and set, the required parameters need to be set.
9) How to Set Required Parameters using Perl. The parameters that are required for any given service are accessible from the Bio::Tools::Run::Chinook::Batch object. An example of how to set all the Boolean parameters to false is described below.
|
my $parameters_ref = $batch->getParameters();
my @parameters = @$parameters_ref;

# Switch every Boolean parameter off.
foreach my $parameter (@parameters) {
    if ($parameter->getType eq "BOOLEAN") {
        $parameter->setValue("false");
    }
} |
10) How to Run a Chinook Job using Perl. To run a job in Chinook using the Perl interface, you finally need to create a Bio::Tools::Run::Chinook::QueueBatch object. Once the parameters and DEOSets have been set, this can be done as below:
|
my $queue_batch = Bio::Tools::Run::Chinook::QueueBatch->new(
    deosets    => $deosets,
    parameters => \@parameters,
    batch      => $batch,
    filename   => "/home/smontgom/batch/batch_queue/chinook.LAGAN"); |
The Bio::Tools::Run::Chinook::QueueBatch object requires a filename which will become the prefix for the XML file containing all the information required to run the job over Chinook. This file is written by calling:
|
my $id = $queue_batch->writeQueueBatch(); |
Then the ChinookManager is used to point the Chinook Client at the batch queue file for processing on the server. An example of this is below:
|
$chinook_man->processQueueBatch($queue_batch); |
11) Accessing Reports using Perl. The ID that was returned when the Bio::Tools::Run::Chinook::QueueBatch object was written to file links that file to the resulting report files, which are written to the batch reporting directory. To monitor completion of reports, you can periodically poll the reporting directory (substituting your reporting directory in place of the one provided below). An example of this is below.
|
my $filename = $chinook_man->isReportReady(
    $id, "/home/smontgom/batch/batch_reporting/"); |
12) If the filename is defined, the report has been written to the file.
13) Getting the Bio::Tools::Run::Chinook::Report object. To get the Report object, once the filename has been defined, call:
|
my $report = Bio::Tools::Run::Chinook::Report->new(
    report_filename => $filename); |
14) From the report object, you can get information about warnings, errors, or the information required to download the result files from the server.
15) Downloading Results from the Report. To download the results from the report file, first determine the canonical output file. This is the file that the results are written to (you can also access other files that describe how the job was run and what was available on various streams). To download results to a sample file, follow the example below.
|
my $report_file = $report->getCanonicalReportFile();
my $sample_file = "/tmp/results.out.steve.3";

# $chinook_man is the ChinookManager object created in step 4b.
$chinook_man->downloadFile(
    $report_file->getFileId(),
    $report->getServiceLocation(),
    $sample_file); |
16) Congratulations. That should cover the basic process for running jobs via Chinook. Many steps are still required, and we hope to reduce that complexity. There is also a lot of functionality not covered here that can be explored by looking at the modules and reading the associated Perldocs. But, with a little bit of effort, your scripts and users can access state-of-the-art algorithms without downloading them.
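The periodic polling suggested in step 11 could be wrapped in a small loop such as the sketch below. The helper name wait_for_report, the interval, and the attempt limit are hypothetical, and the check is passed in as a code reference so that the sketch runs without a Chinook client; here a stand-in check reports a made-up filename on its third call.

```perl
use strict;
use warnings;

# Calls $check->() every $interval seconds until it returns a defined value
# or $max_attempts is reached. Returns that value, or undef on timeout.
sub wait_for_report {
    my ($check, $interval, $max_attempts) = @_;
    for my $attempt (1 .. $max_attempts) {
        my $filename = $check->();
        return $filename if defined $filename;
        sleep $interval if $attempt < $max_attempts;
    }
    return;
}

# Stand-in for the real check: reports a (made-up) filename on the third call.
my $calls = 0;
my $fake_check = sub { return ++$calls >= 3 ? 'report.xml' : undef };

my $ready_file = wait_for_report($fake_check, 1, 10);
print defined $ready_file ? "report ready: $ready_file\n" : "gave up waiting\n";
```

In a real script the check would presumably wrap the isReportReady call from step 11, for example sub { $chinook_man->isReportReady($id, $reporting_dir) }, where $reporting_dir is your own reporting directory. Capping the number of attempts keeps a script from waiting forever if a job fails on the server.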
Chinook is funded by Genome
The Chinook mailing list is a low-volume regulated list that broadcasts weekly development announcements. We recommend that you sign up for the mailing list at http://www.bcgsc.ca/mailman/listinfo/chinook. You will get the latest information about Chinook (including upgrades and bug fixes). You can also view the archive at http://www.bcgsc.ca/pipermail/chinook/.
Montgomery SB, Fu T, Guan J, Jones
Chinook Internal:
Chinook Service Developers: Monica Sleumer, Keven Lin, Tamara Astakhova, Jun Guan, Maik Hassel, James Kennedy, Eddy Tsang, Yvonne Li, Tony Fu
Other thanks: Asim Siddiqui, Misha Bilenky, Gordon Robertson
Chinook External:
Jonathan Lim, Wyeth Wasserman, David He, Sohrab Shah, Francis Ouellette
1. Web pages may not be displayed properly in the Lightweight Web Browser. This is because Chinook is a Java application, and JEditorPane, which is used to display web pages, currently supports only HTML 3.2. Web pages written in a later version of HTML will likely not be displayed properly.
Chinook is licensed under a Creative Commons License.