This document examines how to use RELION 4.0 (beta2 as of writing) with Warp 1.09 and M 1.09 for single particle analysis.
Special thanks to Alister Burt, Pranav Shah and Dimitry Tegunov for discussion on this Twitter thread.
We use the RELION tutorial dataset (beta-galactosidase collected on JEOL CRYO ARM 200, a subset of EMPIAR-10204).
wget ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion30_tutorial_data.tar
tar xvf relion30_tutorial_data.tar
cd relion30_tutorial
The extracted directory contains the Movies directory.
We rename it to RawMovies and make a new Movies directory.
This is because Warp creates a lot of files in the same directory as movies.
We don't want to mix precious raw data with derived files, which can be regenerated.
mv Movies RawMovies
mkdir Movies
cd Movies
ln -s /path/to/RawMovies/* . # make symbolic links by absolute paths!!
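If you would rather not type the absolute path by hand, the linking step can be wrapped in a small function. This is only a sketch: link_movies is a hypothetical helper name, not part of Warp or RELION, and it assumes the raw movie directory contains only files you want to link.

```shell
# Hypothetical helper (not part of Warp or RELION):
# link every file in a raw movie directory into a working directory,
# always using absolute link targets.
link_movies() {
  raw_dir=$(cd "$1" && pwd) || return 1   # resolve to an absolute path
  mkdir -p "$2" || return 1
  ln -s "$raw_dir"/* "$2"/
}

# Example, run from the project directory right after "mv Movies RawMovies":
#   link_movies RawMovies Movies
```

Because the link targets are resolved before linking, the function can be called with a relative path and the links will still be absolute.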
By the way, we assume this folder is located on a Linux system and exported to Windows by SAMBA. Do not make symbolic links on NTFS using WSL. Native Windows applications (i.e. Warp and M) don't recognize symbolic links on NTFS created by WSL.
Now the directory structure should look like:
- relion30_tutorial: this will be RELION's project directory
  - RawMovies: this contains the TIFF files
  - Movies: this contains symbolic links to the above TIFF files. We will run Warp here
Switch to a Windows machine and start Warp.
We start with CTF and motion correction.
- Movie parameters:
  - Pixel size: 0.885 Å/px
  - Dose: 1.277 e/Å^2/frame
  - Gain reference: gain.mrc without transformations
- CTF parameters:
  - Window: 512 px
  - CTF range: 20 - 3.5 Å
  - Acceleration voltage: 200 kV
  - Cs: 1.4 mm
  - Amplitude contrast: 0.1
  - Defocus: 0.6 - 3.0 µm
  - No phase shift
  - Don't use Movie sum
  - Defocus grid: 5 x 5 x 1
- Motion parameters:
  - Motion: consider 35.4-7.1 Å
  - Weigh by B = -500 Å^2
  - Motion grid: 5 x 5 x 20
- Don't pick particles (yet)
Click Start Processing. Once a few micrographs are processed, Stop Processing.
The first image was reported as an outlier in astigmatism, but the CTF fit looked OK.
So I included it by increasing the filter's sigma threshold to 10.0.
Test picking parameters on a few micrographs.
Go to the Real Space tab and click Pick with BoxNet2Mask_20180918.
Activate Show particles from BoxNet2Mask_20180918 with 200 Å to check particle distributions.
For this dataset, a score threshold of at least 0.450 led to a reasonable result.
Enable Pick Particles:
- Use BoxNet2Mask_20180918
- Diameter: 200 Å
- Score: 0.45
- cryo particles
- Maintain a minimum distance of 0 Å from junk, i.e. don't use this filter
We want to extract particles at a large pixel size (3-4 Å/px) first to run Class2D and then re-extract only good particles at a smaller pixel size. Unfortunately, however, Warp does not allow particle down-sampling at this point. If you enable binning, the output micrographs are also down-sampled. Thus, you cannot re-extract particles at a smaller pixel size later. So we don't use binning in Warp and use RELION to extract down-sampled particles. Nevertheless, we must let Warp extract something to get a STAR file. To minimize storage waste, we extract dummy 2 px particles in Warp. This is indeed very inconvenient!
- Extract 2 px boxes, which is the smallest size allowed in Warp
- Don't invert, normalize, or maintain a separate list (we don't use these particles anyway, so save time by skipping these operations)
Click Start Processing to process all movies.
Upon completion, save the settings as warp-config.settings, which will be used by M later.
Click [Overview]-[Export Micrograph List] and specify the following options.
- Generate files for RELION's polishing
- Make paths relative to STAR location
- Write micrographs.star to the top of the RELION project, that is, outside the Movies directory.
The output is written in the RELION 3.0 format. We convert this to the RELION >= 3.1 format with an optics group table by:
relion_convert_star --i micrographs.star --o micrographs-3.1.star --Cs 1.4 --Q0 0.1
Q0 stands for amplitude contrast.
Warp writes a list of good particles to Movies/goodparticles_BoxNet2Mask_20180918.star.
First, we convert this to the RELION >= 3.1 format by:
cd Movies
relion_convert_star --i goodparticles_BoxNet2Mask_20180918.star --o ../goodparticles-3.1.star
cd ..
We have to run the conversion inside the Movies directory, because the paths are relative to that directory.
relion_convert_star needs to inspect at least one MRCS file to find out the particle dimensions.
In addition to the wrong origin of the relative paths, this STAR file has another problem: the rlnMicrographName column contains movie names, instead of micrograph names.
See the particles table.
data_particles
loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnPhaseShift #3
_rlnDefocusU #4
_rlnDefocusV #5
_rlnDefocusAngle #6
_rlnCtfMaxResolution #7
_rlnImageName #8
_rlnMicrographName #9
_rlnOpticsGroup #10
833.380000 190.870000 0.000000 11131.900000 10704.500000 68.000000 4.000000 0000001@particles/20170629_00021_frameImage_BoxNet2Mask_20180918.mrcs 20170629_00021_frameImage.tiff 1
3181.92
Using a text editor on Linux, rename the 9-th column to rlnMicrographMovieName and add an 11-th column label, rlnMicrographName.
Windows uses CR+LF as a new line symbol, while Linux uses LF alone.
Editing a STAR file in Windows can cause problems later.
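If a STAR file has already passed through a Windows editor, the line endings can be checked and repaired on Linux. The sketch below uses only standard tools; has_crlf and strip_cr are hypothetical helper names introduced here for illustration.

```shell
# Hypothetical helpers (names are made up for this example).

# Succeeds if the file contains at least one carriage return (CR).
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# Write a copy of the file with every CR removed, leaving LF-only line endings.
strip_cr() {
  tr -d '\r' < "$1" > "$2"
}

# Example:
#   has_crlf goodparticles-3.1.star && strip_cr goodparticles-3.1.star goodparticles-3.1-lf.star
```

Writing to a new file rather than editing in place keeps the original around in case something else was wrong with it.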
Then use awk to make the movie paths relative to the top of the project directory and fill the 11-th column.
awk '{if (NF==10) {fn=$9; $9="Movies/"$9; sub(/\.tiff$/,"",fn); $11="Movies/average/"fn".mrc"} print}' goodparticles-3.1.star > goodparticles-3.1-fixed.star
Now the repaired table should look like:
data_particles
loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnPhaseShift #3
_rlnDefocusU #4
_rlnDefocusV #5
_rlnDefocusAngle #6
_rlnCtfMaxResolution #7
_rlnImageName #8
_rlnMicrographMovieName #9
_rlnOpticsGroup #10
_rlnMicrographName #11
833.380000 190.870000 0.000000 11131.900000 10704.500000 68.000000 4.000000 0000001@particles/20170629_00021_frameImage_BoxNet2Mask_20180918.mrcs Movies/20170629_00021_frameImage.tiff 1 Movies/average/20170629_00021_frameImage.mrc
We don't care about the rlnImageName column, because we will re-extract particles in RELION anyway.
TODO: Check what happens to rlnCoordinateX/Y when super-resolution movies are binned.
After making sure your current directory is correct (outside Movies!), launch RELION.
We extract particles with strong down-sampling.
- Micrograph STAR file: micrographs-3.1.star
- Input coordinates: (leave empty)
- OR re-extract refined particles: Yes
- Refined particles STAR file: goodparticles-3.1-fixed.star
- Particle box size: 280 px
- Diameter background circle: 200 px
- Rescale particles: Yes
- Re-scaled size: 64 px (3.872 Å/px)
Perform 200 iterations of VDAM Class2D into 25 classes, select good classes (I got 4283 particles) and generate an initial model. I don't describe the details of RELION processing, since this is explained in the RELION tutorial.
Re-extract chosen particles with less down-sampling. This time I extracted 320 px particles into a 200 px box (1.416 Å/px). Run Refine3D.
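Both pixel sizes quoted above follow from the same relation: rescaled pixel size = original pixel size x extracted box size / re-scaled box size. A quick way to check the numbers before running a job is sketched below; rescaled_apix is a hypothetical helper name, not a RELION command.

```shell
# Hypothetical helper: pixel size after RELION rescales an extracted box.
#   $1: original pixel size (Å/px)
#   $2: extracted box size (px)
#   $3: re-scaled box size (px)
rescaled_apix() {
  LC_ALL=C awk -v apix="$1" -v box="$2" -v out="$3" \
    'BEGIN { printf "%.3f\n", apix * box / out }'
}

rescaled_apix 0.885 280 64    # prints 3.872
rescaled_apix 0.885 320 200   # prints 1.416
```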
Make a mask.
In later steps, M needs a non-soft (binary) mask, so create two masks, with and without Add a soft-edge of this many pixels: 5 px.
I got 3.2 Å in PostProcess.
I proceeded as follows:
- CtfRefine (up to 4-th order aberrations)
- CtfRefine (per-particle defocus and per-micrograph astigmatism)
- 2nd Refine3D with a mask using solvent flattened FSC (3.1 Å)
- Bayesian Polishing
- 3rd Refine3D (3.0 Å)
- CtfRefine (up to 4-th order aberrations)
- CtfRefine (per-particle defocus and per-micrograph astigmatism)
- 4th Refine3D (3.0 Å)
Once we perform Refine3D, we can switch to M. Here we take the result of the first Refine3D job (3.2 Å) and continue refinement in M.
Since M cannot handle STAR files from RELION >= 3.1, we have to revert the file format.
Copy Refine3D/first/run_data.star to Refine3D/first/run_warp_data.star and edit it as follows on Linux.
- Remove the optics group table.
- Swap the rlnMicrographName and rlnMicrographMovieName column labels.
- Rename the rlnOriginXAngst column to rlnOriginX, and the rlnOriginYAngst column to rlnOriginY.
The beginning of the file after modification should look like:
data_particles
loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnPhaseShift #3
_rlnDefocusU #4
_rlnDefocusV #5
_rlnDefocusAngle #6
_rlnCtfMaxResolution #7
_rlnImageName #8
_rlnMicrographName #9
_rlnOpticsGroup #10
_rlnMicrographMovieName #11
_rlnGroupNumber #12
_rlnAngleRot #13
_rlnAngleTilt #14
_rlnAnglePsi #15
_rlnOriginX #16
_rlnOriginY #17
_rlnClassNumber #18
_rlnNormCorrection #19
_rlnLogLikeliContribution #20
_rlnMaxValueProbDistribution #21
_rlnNrOfSignificantSamples #22
_rlnRandomSubset #23
692.920000 557.320000 0.000000 11086.200000 10658.900000 68.000000 4.000000 000001@Extract/job005/Movies/average/20170629_00021_frameImage.mrcs Movies/20170629_00021_frameImage.tiff 1 Movies/average/20170629_00021_frameImage.mrc 1 105.714139 71.436277 156.476795 0.856420 1.327240 1 0.632212 1.499089e+05 0.225804 22 2
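The three manual edits described above could also be scripted. The following is a minimal sketch, assuming the input is a RELION >= 3.1 particle STAR file whose particle table is named data_particles; revert_star_for_m is a hypothetical helper name, and the output should still be inspected by eye before it is given to M.

```shell
# Hypothetical helper: print a RELION >= 3.1 particle STAR file with
#   - everything outside the data_particles table dropped (optics table, version comments),
#   - the rlnMicrographName / rlnMicrographMovieName labels swapped,
#   - rlnOriginXAngst / rlnOriginYAngst renamed to rlnOriginX / rlnOriginY.
revert_star_for_m() {
  awk '
    /^data_/ { in_particles = ($1 == "data_particles") }  # track the current table
    !in_particles { next }                                 # skip all other content
    $1 == "_rlnMicrographName"      { $1 = "_rlnMicrographMovieName"; print; next }
    $1 == "_rlnMicrographMovieName" { $1 = "_rlnMicrographName";      print; next }
    $1 == "_rlnOriginXAngst"        { $1 = "_rlnOriginX";             print; next }
    $1 == "_rlnOriginYAngst"        { $1 = "_rlnOriginY";             print; next }
    { print }
  ' "$1"
}

# Example:
#   revert_star_for_m Refine3D/first/run_data.star > Refine3D/first/run_warp_data.star
```

Only the labels are renamed; the shift values themselves are left untouched, so they remain in Å.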
Launch M.
Create a New Population named Tutorial in a new M folder.
Go to [Manage Data Sources]-[Add local] and select warp-config.settings in the Movies folder.
Name it TutorialDataset.
Use only the first 24 frames (i.e. all frames) in this case.
Click the big + symbol to add species from scratch.
- Name it betaGalactosidase
- Diameter 200 Å
- 540 kDa (this value is not used at the moment)
- D2 symmetry
- 2 temporal poses
Select Refine3D/first/run_half1_class001_unfil.mrc and Refine3D/first/run_half2_class001_unfil.mrc at 1.4160 Å/px as references.
Select the binary mask, because M adds soft edges itself.
Select Refine3D/first/run_warp_data.star.
- Coordinates use 0.885 Å/px
- Shifts use 1.000 Å/px
- Use TutorialDataset for name matching.
It is important to specify 1.000 Å/px as shifts regardless of the down-sampled pixel size used in the Refine3D job.
This is because we generated the rlnOriginX and rlnOriginY columns by simply renaming the rlnOriginXAngst and rlnOriginYAngst columns, so their values are still in Å rather than in pixels.
Make sure all particles are matched to available data sources.
Creating a new species takes 15-20 minutes because M trains a map denoiser.
First, start with pose refinement in case some parameters went a bit off due to rounding errors during conversion.
- Image warp grid: 3x3
- Particle poses
- 1 sub-iterations
I got 3.1 Å (Species version TPOHfPsN and data source version w3RsVznGO8gdTyrNFzPSpT6Krrs).
Then perform defocus refinement.
- Defocus
- No grid search
- 1 sub-iterations
The resolution remained 3.1 Å (Species version K4QhbqFv; the data source is not modified).
Finally perform full iterations.
- Image warp grid: 3x3
- Particle poses
- Doming
- Defocus
- No grid search
- Odd Zernike orders 3 and 5
- Even Zernike orders 2 and 4
- 3 sub-iterations
Since some parameters are correlated, it is worth performing multiple sub-iterations when refining many parameters. Note that the reference is not updated between sub-iterations.
The resolution increased to 3.0 Å (Species version -ZrWgEbl and data source version qMqnBvIDNWfzFSw5oGZrXyr0RO0).
This is the same as the final resolution obtained in RELION.
Sometimes refinement can go wrong and the resolution can degrade.
To revert to an earlier state, we have to revert both the data source (XML files in the Movies directory) and the species (maps and denoising models in the M/species directory).
Older versions of the files are stored under Movies/versions/ and M/species/SPECIES_ID/versions/, identified by hash strings.
It is worth recording the version strings at each refinement step as shown above (yours might differ from mine).
NOTE: When a version string starts with the - symbol (e.g. -ZrWgEbl), you have to add -- to Linux commands to prevent it from being parsed as a command-line option (e.g. cd -- -ZrWgEbl, rsync -avuP -- -ZrWgEbl backup).
As an example, let's revert to the state after the second refinement.
The target is species version K4QhbqFv and data source version w3RsVznGO8gdTyrNFzPSpT6Krrs.
cd M/species/ae3003ba
rsync -avP versions/K4QhbqFv/ .
cd ../../../Movies
rsync -avP versions/w3RsVznGO8gdTyrNFzPSpT6Krrs/ .
Besides, the path to the gain reference in Movies/TutorialDataset.source has to be repaired.
Change
<Param Name="GainPath" Value="../../gain.mrc" />
to
<Param Name="GainPath" Value="gain.mrc" />
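The same repair can be made from the command line instead of a text editor. This is a sketch only: fix_gain_path is a hypothetical helper name, and it assumes the GainPath line in your file looks exactly like the one shown above.

```shell
# Hypothetical helper: rewrite the GainPath value in a data source file.
# The original file is kept next to it with a .bak suffix.
fix_gain_path() {
  sed -i.bak \
    's|\(Name="GainPath" Value="\)\.\./\.\./gain\.mrc"|\1gain.mrc"|' \
    "$1"
}

# Example:
#   fix_gain_path Movies/TutorialDataset.source
```

If the file does not contain the expected line, sed leaves it unchanged, so it is worth checking the result with grep GainPath afterwards.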
Depending on the number of flips performed during image collection, M's assumption about the Ewald sphere curvature can be wrong.
To test the opposite hand, edit betaGalactosidase.species in the species folder and change
<Param Name="EwaldReverse" Value="True" />
to
<Param Name="EwaldReverse" Value="False" />
Note that DoEwald (False by default) determines whether Ewald sphere curvature is considered during refinement. Reconstruction is always performed with the Ewald sphere curvature. See Dimitry's tweet.
Refine again. In this case, the resolution remained the same (Species version Mkith-F4 and data source version s7aA0uKBgp2IA0GjNtYcNR7_1JU), as expected for this resolution.
There is another way to export particles from Warp, [Overview]-[Export particles].
This allows you to down-sample particles and Make paths relative to STAR.
However, this has its own problems in addition to the aforementioned rlnMicrographName bug.
First, Warp extracts particles from raw movie frames even when aligned and averaged images are present.
So this is very slow.
You can choose to Only write STAR to save time, but then relion_convert_star complains that particle dimensions cannot be determined.
Next, when down-sampling is applied, the rlnCoordinateX and rlnCoordinateY columns are wrongly scaled.
If you down-sample by four, i.e., from 0.885 Å/px to 3.54 Å/px, Warp divides the rlnCoordinateX and rlnCoordinateY values by four.
However, this is wrong!
These columns represent particle coordinates in the micrograph coordinate system, so they should be invariant under down-sampling during particle extraction.
You have to use awk to repair this problem.
Because of these issues, I don't recommend this feature.
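If you do use this export anyway, the coordinate repair mentioned above could look roughly like the following. It is a sketch under assumptions: rescale_coords is a hypothetical helper name, the input is the RELION 3.0-style STAR file written by Warp, and the factor is the down-sampling factor you chose at export.

```shell
# Hypothetical helper: multiply rlnCoordinateX/Y by a down-sampling factor.
#   $1: factor (e.g. 4 when going from 0.885 Å/px to 3.54 Å/px)
#   $2: STAR file written by Warp
rescale_coords() {
  LC_ALL=C awk -v f="$1" '
    $1 == "_rlnCoordinateX" { cx = substr($2, 2) + 0 }   # "#1" -> 1
    $1 == "_rlnCoordinateY" { cy = substr($2, 2) + 0 }
    # data rows: both columns known, row wide enough, not a header or keyword line
    cx && cy && NF >= cx && NF >= cy && $1 !~ /^(_|#|data_|loop_)/ {
      $cx = sprintf("%.6f", $cx * f)
      $cy = sprintf("%.6f", $cy * f)
    }
    { print }
  ' "$2"
}

# Example:
#   rescale_coords 4 exported.star > exported-fixed.star
```

The column positions are read from the header rather than hard-coded, so the function does not depend on the coordinates being the first two columns.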
- EER