Commit 3f30bb06 authored by Robin Engler

Add a frequently-made mistakes section to the README.md file

parent dd897bf6
......@@ -18,7 +18,7 @@ AUTHORIZED_TISSUES <<- c('stroma', 'tumor', 'dermis', 'epidermis', 'melano
AUTHORIZED_COMPARTMENTS <<- c('nucleus', 'membrane', 'cytoplasm', 'entire_cell')
AUTHORIZED_STROMA_VALUES <<- c('DAPI', 'stroma', 'other')
AUTHORIZED_TUMOR_VALUES <<- c('CK', 'tumor')
AUTHORIZED_MARKERS <<- c('CAL', 'CD3', 'CD4', 'CD8', 'CD11C', 'CD15', 'CD20', 'CD56', 'CD68',
AUTHORIZED_MARKERS <<- c('CAL', 'CD3', 'CD4', 'CD8', 'CD11c', 'CD15', 'CD20', 'CD56', 'CD68',
'CD103', 'CD163', 'CD206', 'FOXP3', 'GB', 'gH2AX', 'gH2AXN', 'IDO',
'IL10R', 'Keratin', 'KI67', 'PD1', 'PDL1', 'PERFORIN', 'SOX10',
'WT1', 'CK', 'VISTA')
......
......@@ -81,7 +81,7 @@ rename_samples <- function(sample_rename, root_dir){
load_sample_rename_file <- function(input_file){
# Load file content by line. Lines starting with # are ignored.
file_content = read_file_as_vector(input_file)
file_content = read_file_as_vector(input_file, ignore_comments=TRUE, ignore_empty_line=TRUE)
if(length(file_content) < 2) raise_error(
msg = 'Sample renaming files must contain at least 2 lines: header + one sample.',
file = input_file)
......
# Post-inForm
Post-process cell immunofluorescence data produced by the inForm software.
**Important:** for frequently asked questions, please see the **Frequently-made mistakes** section
further down in this document.
##### Dependencies
The following R libraries are needed to run Post-inForm:
* openxlsx
* zip
* checkmate
These can be installed with the following R command: `install.packages(c("openxlsx", "zip", "checkmate"))`
These can be installed with the following R command: `install.packages(c("openxlsx", "zip", "checkmate", "stringi"))`
<br>
<br>
### Examples of how to run Post-inForm
## Examples of how to run Post-inForm
##### Load Post-inForm source code:
Download or clone the project's directory to your local machine. Then set `POSTINFORM_ROOT` to the
`postinform` directory on your local machine and run the `source()` command in your R console, as
......@@ -24,8 +30,8 @@ source(file.path(POSTINFORM_ROOT, 'R', 'config.R'), chdir=TRUE)
##### Process samples:
Post-inForm can be run with 3 different commands:
* `check`: check input data only. Does not produce any output.
* `reduce`: reduce size of input data by deleting all unecessary data from input. For standard
inForm data, the reduction in size is approximatively of a factor 10.
* `reduce`: reduce size of input data by deleting all unnecessary data from input. For standard
inForm data, the reduction in size is approximately a factor of 10.
* `process`: run the full Post-inForm data processing pipeline.
```
......@@ -40,11 +46,11 @@ postinform(input_file_or_dir=input_file, command='process')
* **compress_output**: if `TRUE`, the output is compressed to a .zip file. If `FALSE` the output
remains in an uncompressed directory.
* **allow_overwrite**: if `TRUE`, pre-existing files and directories with the same name as an
output file or directory are silently and mercylessly deleted. Leave this to `FALSE` to avoid
output file or directory are silently and mercilessly deleted. Leave this to `FALSE` to avoid
accidental deletions.
* **output_suffix**: suffix to be appended to the input file or directory name to form the output
name. E.g. if the input file is named `Test_session.zip` and `output_suffix` is set to
`processed`, then the ouput will be named `Test_session_processed`. By default, the suffix
`processed`, then the output will be named `Test_session_processed`. By default, the suffix
value is set to `reduced` when running the "reduce" command, and `processed` when running the
"process" command.
* **immucan_output**: if `TRUE`, produces IMMUCAN compatible outputs.
......@@ -62,10 +68,11 @@ This command will produce an output file named "Test_session_random_suffix.zip".
postinform(input_file_or_dir="Test_session.zip", command='process', output_suffix="random_suffix",
compress_output=TRUE, immucan_output=TRUE, allow_overwrite=FALSE)
```
<br>
<br>
### Post-inForm input parameter file format.
## Post-inForm input parameter file format.
The list of samples, tissues, markers and marker combinations to process are passed to post-inForm
via a single plain text file that must be named `parameters.txt` and be located at the root of the
input directory.
......@@ -103,6 +110,65 @@ scored_markers:
# Marker combinations to test
marker_combinations: all
```
<br>
<br>
## Frequently-made mistakes
If your Post-inForm analysis fails with an error, please verify the following points:
* The input `parameters.txt` and `sample_rename.txt` files do not contain any spaces in their
file names. Ideally, you will name these files simply `parameters.txt` and
`sample_rename.txt` but names such as `SessionName_parameters.txt` and
`SessionName_sample_rename.txt` are also allowed.
However, `SessionName Sample Rename.txt` is not a valid name because it contains spaces.
* The input file `parameters.txt` is based on the template file provided
[here](tests/parameters.txt). In particular, make sure that there are no quotation marks around
any line of the file.
Here is an example of a subset of a **non-valid** `parameters.txt` file and how it should be
corrected:
```
"# List of phenotyped and scored markers."
"phenotyped_markers: CD3, CD20, CD15, CD11c"
```
Should be:
```
# List of phenotyped and scored markers.
phenotyped_markers: CD3, CD20, CD15, CD11c
```
* The input file `sample_rename.txt` is based on the template file provided
[here](tests/sample_rename.txt).
* The `parameters.txt` and `sample_rename.txt` files are encoded in UTF-8 format.
* If your input contains multiple files (marker groups) to be merged, make sure that:
* The names of all files to be merged are the same (except for the marker names).
* The list of markers in the file names are separated with `_`.
Here is an example of **correct** file names:
```
IMI2_Test_CD3CD15_merge_cell_seg_data
IMI2_Test_CKCD11c_merge_cell_seg_data
```
And here are examples of **wrong** file names:
* In the first file name, the "CD3CD15" marker names are not properly separated with
`_` characters.
* In the second file name, a "_2" was added to the second file name that is not present in the
first file name. Therefore post-inForm will not be able to identify that these two files
should be merged.
```
IMI2_TestCD3CD15_merge_cell_seg_data
IMI2_Test_2_CKCD11c_merge_cell_seg_data
```
* `Split by coordinate failed` error: getting this error means that the splitting + renaming of the
specified sample failed because it could not be automatically split (vertically) in two distinct
samples. Please check the following:
* The sample really contains 2 samples that must be split. Very often, this error occurs
because a sample that is indicated as having to be split (i.e. it has 2 `new_name` values
in the `sample_rename.txt` file), is in fact only a single sample and should not be split.
**Solution:** edit the `sample_rename.txt` file so that only one "new name" value is present.
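
Several of the checks listed above (no spaces in file names, no quotation marks around lines, UTF-8 encoding) can be run before launching an analysis. The following is a minimal sketch in base R, not part of Post-inForm: the function name `find_input_file_problems` and the wording of its messages are assumptions made for illustration only.

```r
# Hypothetical helper (not part of Post-inForm): returns a character vector
# describing the problems found in a parameters or sample rename file.
# An empty vector means that none of the checked problems were found.
find_input_file_problems <- function(path) {
    problems <- character(0)

    # File names must not contain spaces.
    if (grepl(" ", basename(path), fixed = TRUE)) {
        problems <- c(problems, "file name contains spaces")
    }

    # Read the file line by line and check that its content is valid UTF-8.
    lines <- readLines(path, warn = FALSE)
    if (!all(validUTF8(lines))) {
        problems <- c(problems, "file is not valid UTF-8")
    }

    # No line should be wrapped in quotation marks.
    # useBytes = TRUE avoids warnings when a line is not valid UTF-8.
    if (any(grepl('^\\s*".*"\\s*$', lines, useBytes = TRUE))) {
        problems <- c(problems, "some lines are wrapped in quotation marks")
    }

    problems
}
```

Calling `find_input_file_problems("parameters.txt")` on a valid file should return an empty character vector. The sketch does not check the file content against the templates, so passing these checks does not guarantee that Post-inForm will accept the file.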
# Values can be separated by spaces or tabs.
# If a sample should be split, two space separated values should be provided under new_name.
# Please remove the first 3 comment lines of this file before using it.
old_name new_name
SAMPLE_OLD_NAME SAMPLE_NEW_NAME
SAMPLE_OLD_NAME SAMPLE_NEW_NAME_1 SAMPLE_NEW_NAME_2