# Use the Transfer Tool

The following commands are available:

- `./spsp compress <folder>` - compresses a folder into a tar.gz archive
- `./spsp encrypt <file>` - encrypts a file with the `gpg` command and the SPSP public key (which must be in your own GPG key list)
- `./spsp hash <file>` - generates the SHA-256 hash of a file
- `./spsp transfer <file>` - transfers a file to the SPSP server over SFTP (your SSH key must be validated by SPSP before you can use this command)
- `./spsp auto` - runs the Transfer Tool automatically (meant to be combined with a CRON task, see below for more information); add `--no-archive` or `-NA` to not keep the sent files
- `./spsp help` - displays the help

## How to prepare FASTQ files with the metadata file

This step assumes that you have already followed the guide on [spsp.ch](https://spsp.ch/); it only explains how to organize your files so that the Transfer Tool works properly.

- Start by **identifying** whether your sequences come from **viruses** or **bacteria** (if they are mixed, you need to separate them)
- Make sure that the sequences described in your metadata file **match** the FASTQ files
- **Create a subfolder** named after the current date (for example: 26-06-12) **inside the `bacteria` or `viruses` directory**, depending on the sequence type
- **COPY** your FASTQ files and the metadata file into the newly created folder

**IT IS VERY IMPORTANT TO ALWAYS PUT YOUR FASTQ FILES AND METADATA FILE INSIDE A SUBFOLDER OF THE BACTERIA/VIRUSES DIRECTORY, OR THE TRANSFER TOOL WILL IGNORE THE FILES.**

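
The preparation steps above could be scripted roughly as follows. This is only a sketch: `prepare_batch` is a hypothetical helper that is not part of the Transfer Tool, and the `DD-MM-YY` folder name is an assumption based on the example dates on this page, so adjust it to the convention you actually use.

```sh
#!/bin/sh
# Hypothetical helper, NOT part of the Transfer Tool.
# Usage: prepare_batch <bacteria|viruses> <source_dir> [tool_dir]
prepare_batch() {
    pb_kind=$1
    pb_src=$2
    pb_tool_dir=${3:-.}

    case "$pb_kind" in
        bacteria|viruses) ;;
        *) echo "kind must be 'bacteria' or 'viruses'" >&2; return 1 ;;
    esac

    # Assumed folder name format: DD-MM-YY (for example 26-06-20).
    pb_dest="$pb_tool_dir/$pb_kind/$(date +%d-%m-%y)"
    mkdir -p "$pb_dest" || return 1

    # Copy (do not move) the sequences and the metadata file.
    for pb_file in "$pb_src"/*.fastq "$pb_src"/*.fastq.gz "$pb_src"/*.xlsx; do
        [ -e "$pb_file" ] || continue   # skip patterns that matched nothing
        cp -- "$pb_file" "$pb_dest/" || return 1
    done

    # Print the folder that was filled.
    echo "$pb_dest"
}
```

Run from the Transfer Tool directory, a call such as `prepare_batch bacteria /data/run42` (the source path is a placeholder) would fill `bacteria/<today>` with the FASTQ and metadata files found in that folder.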
## How to transfer files easily

If you want to quickly and easily send a batch of FASTQ files with their metadata, follow these steps:

- Follow the instructions in [How to prepare FASTQ files with the metadata file](#how-to-prepare-fastq-files-with-the-metadata-file)
- Launch the pipeline by typing `./spsp auto`, which triggers the **automatic** mode
- Once the transfer is over, you should find the sent files (encrypted archive and hash file) in the `sent` folder

Before the transfer, your directory should look like this:

- /bacteria
  - /26-06-20
    - sequence1.fastq
    - sequence2.fastq
    - sequence3.fastq
    - metadata-file.xlsx
- /viruses
- /sent
- /logs
- spsp
- README.md

After the transfer, it should look like this:

- /bacteria
- /viruses
- /sent
  - 26-06-20.tar.gz.gpg
  - 26-06-20.tar.gz.sha256
- /logs
- spsp
- README.md

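
As a quick sanity check after a transfer, one could verify that every encrypted archive in `sent` has its hash file next to it. The sketch below assumes the file naming shown in the listing above (`<name>.tar.gz.gpg` and `<name>.tar.gz.sha256`); `check_sent` is a hypothetical helper, not a command of the Transfer Tool.

```sh
#!/bin/sh
# Hypothetical check, NOT part of the Transfer Tool.
# Usage: check_sent [sent_dir]   (defaults to ./sent)
check_sent() {
    cs_dir=${1:-sent}
    cs_missing=0
    for cs_archive in "$cs_dir"/*.tar.gz.gpg; do
        [ -e "$cs_archive" ] || continue      # no archive at all
        cs_hash="${cs_archive%.gpg}.sha256"   # x.tar.gz.gpg -> x.tar.gz.sha256
        if [ ! -f "$cs_hash" ]; then
            echo "missing hash file for $cs_archive" >&2
            cs_missing=1
        fi
    done
    return "$cs_missing"
}
```

The function returns a non-zero status when at least one archive has no matching `.sha256` file, so it can be used in a condition such as `check_sent sent || echo "incomplete batch"`.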
## Use the automatic mode in combination with a CRON task

If you want to use the automatic mode on a daily basis, you need to set up a [CRON](https://en.wikipedia.org/wiki/Cron) task.

We recommend the following settings:

```
0 5 * * * /path/to/spsp/spsp auto >> /path/to/spsp.log
```

This will launch the Transfer Tool in automatic mode at 5 AM every day and append the output to a file called `spsp.log` (this will be the main log file).

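
Note that `>>` redirects standard output only. If the script prints some of its errors on standard error (an assumption, since this page does not say which stream it uses), a variant of the recommended line that also appends error output to `spsp.log` could look like this:

```
# Same schedule as the recommended setting; 2>&1 sends error output to spsp.log too
0 5 * * * /path/to/spsp/spsp auto >> /path/to/spsp.log 2>&1
```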

In order, this is what happens:

1) Checks that the `.outbox`, `sent`, `viruses`, `bacteria` and `.logs` folders exist
2) Creates a log file named after the current date inside the `.logs` directory
3) Checks that the connection to SPSP works
4) Scans the `viruses` and `bacteria` directories for subfolders; if one is found, checks that it contains at least `.fastq` or `.fastq.gz` and `.xlsx` files
5) Compresses the folder to tar.gz, moves it to the `.outbox` directory, then deletes the initial folder
6) Then, for every file inside `.outbox`, generates the SHA-256 hash of the file
7) Encrypts the file using the SPSP public key and deletes the initial unencrypted compressed file
8) Transfers the `*.sha256` (hash) and `*.gpg` (encrypted tar.gz) files to the corresponding subdirectory (`viruses` or `bacteria`) on the remote server
9) (Optional) If you used the automatic mode with the `--no-archive` option, the sent files will not be moved to the `sent` folder and **will be erased**

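
To make steps 5 to 7 more concrete, here is a minimal sketch of the same operations using standard tools (`tar`, `sha256sum`, `gpg`). It is not the code of the `spsp` script: the function names are hypothetical, the recipient key ID is a placeholder, and `sha256sum` is assumed to be available (GNU coreutils).

```sh
#!/bin/sh
# Sketch of steps 5 to 7 with standard tools. This is NOT the spsp script itself.

# Step 5: compress a batch folder into <outbox>/<folder-name>.tar.gz.
# Unlike the real tool, this sketch leaves the source folder in place.
compress_batch() {
    cb_batch=$1
    cb_outbox=$2
    cb_name=$(basename "$cb_batch")
    mkdir -p "$cb_outbox" || return 1
    tar -czf "$cb_outbox/$cb_name.tar.gz" \
        -C "$(dirname "$cb_batch")" "$cb_name" || return 1
    echo "$cb_outbox/$cb_name.tar.gz"
}

# Step 6: write the SHA-256 digest of the archive to <archive>.sha256.
hash_archive() {
    (
        cd "$(dirname "$1")" &&
        sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
    )
}

# Step 7: encrypt the archive for a recipient, then delete the plaintext copy.
# $2 is a key ID or e-mail address already present in your GPG key list.
encrypt_archive() {
    gpg --batch --yes --encrypt --recipient "$2" --output "$1.gpg" "$1" &&
        rm -- "$1"
}

# Example (the recipient is a placeholder):
#   archive=$(compress_batch bacteria/26-06-20 .outbox)
#   hash_archive "$archive"
#   encrypt_archive "$archive" "SPSP-KEY-ID"
```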

If any error occurs during the process, the script writes the error to the log file inside the `.logs` directory and stops automatically to avoid any further errors.

Keep in mind that the CRON task redirects the output of the automatic mode to a file called `spsp.log`. This should be your starting point to check whether any error occurred. Then, you can check the log file inside the `.logs` folder for more information.

Also, make sure that any copy of `fastq` or `fastq.gz` files into the directory is complete before 5 AM (based on the recommended settings), or the script will send incomplete files.

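
One way to reduce the risk of a half-copied batch being picked up is to copy the files into a staging folder first and only rename that folder into place once the copy has finished. The sketch below illustrates the idea; `publish_batch` is a hypothetical helper, and it assumes the staging folder lies outside `bacteria` and `viruses` but on the same filesystem, so that the final `mv` is a plain rename.

```sh
#!/bin/sh
# Hypothetical helper, NOT part of the Transfer Tool.
# Usage: publish_batch <source_dir> <target_dir> <staging_dir>
#   target_dir : final batch folder, e.g. /path/to/spsp/bacteria/26-06-20
#   staging_dir: temporary folder outside bacteria/ and viruses/,
#                on the same filesystem as target_dir
publish_batch() {
    pub_src=$1
    pub_target=$2
    pub_staging=$3

    # Refuse to overwrite an existing batch or reuse a leftover staging folder.
    [ ! -e "$pub_target" ] || { echo "already exists: $pub_target" >&2; return 1; }
    [ ! -e "$pub_staging" ] || { echo "already exists: $pub_staging" >&2; return 1; }

    mkdir -p "$pub_staging" || return 1
    # The slow part: copy everything into the staging folder.
    cp -R -- "$pub_src"/. "$pub_staging"/ || return 1
    # The fast part: rename the complete folder into place.
    mv -- "$pub_staging" "$pub_target"
}
```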

Finally, as files may be quite large (several GB per file), it is up to each institution to decide whether all the archives should be kept inside the `sent` folder (default behavior) or not (use the `--no-archive` option).

[Debugging →](Debugging)