@biochem-fan
Last active July 22, 2024 12:03
Warp-RELION4-M Protocol

RELION interoperability with Warp and M

This document examines how to use RELION 4.0 (beta2 as of writing) with Warp 1.09 and M 1.09 for single particle analysis.

Special thanks to Alister Burt, Pranav Shah and Dimitry Tegunov for discussion on this Twitter thread.

Download the movies

We use the RELION tutorial dataset (beta-galactosidase collected on JEOL CRYO ARM 200, a subset of EMPIAR-10204).

wget ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion30_tutorial_data.tar
tar xvf relion30_tutorial_data.tar
cd relion30_tutorial

The extracted directory contains the Movies directory. We rename it to RawMovies and make a new Movies directory. This is because Warp creates a lot of files in the same directory as movies. We don't want to mix precious raw data with derived files, which can be regenerated.

mv Movies RawMovies
mkdir Movies
cd Movies
ln -s /path/to/RawMovies/* . # make symbolic links by absolute paths!!

By the way, we assume this folder is located on a Linux system and exported to Windows via Samba. Do not make symbolic links on NTFS using WSL: native Windows applications (i.e. Warp and M) don't recognize symbolic links on NTFS created by WSL.

Now the directory structure should look like:

  • relion30_tutorial: this will be RELION's project directory
    • RawMovies: this contains TIFF files
    • Movies: this contains symbolic links to the above TIFF files. We will run Warp here
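
Before switching to Windows, it may be worth confirming that every link in Movies is absolute and resolves. The following is a minimal sketch, not part of Warp or RELION; check_links is a hypothetical helper name, and it assumes readlink is available on the Linux host.

```shell
# Sketch: report symlinks in a directory that are broken or use a relative target.
# check_links is a hypothetical helper name, not part of Warp or RELION.
check_links() {
    dir=${1:-.}
    bad=0
    for f in "$dir"/*; do
        [ -L "$f" ] || continue              # only look at symbolic links
        target=$(readlink "$f")
        case $target in
            /*) ;;                           # absolute target: fine
            *)  echo "relative link: $f -> $target"; bad=1 ;;
        esac
        [ -e "$f" ] || { echo "broken link: $f -> $target"; bad=1; }
    done
    return "$bad"
}
```

Run it as check_links Movies from the project directory. No output and a zero exit status mean all links are absolute and point to existing files.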

Pre-processing in Warp

Switch to a Windows machine and start Warp.

CTF & Motion

We start with CTF and motion correction.

  • Movie parameters:

    • Pixel size: 0.885 Å/px
    • Dose: 1.277 e/Å^2/frame
    • Gain reference: gain.mrc without transformations
  • CTF parameters:

    • Window: 512 px
    • CTF range: 20 - 3.5 Å
    • Acceleration voltage: 200 kV
    • Cs: 1.4 mm
    • Amplitude contrast: 0.1
    • Defocus: 0.6 - 3.0 µm
    • No phase shift
    • Don't use Movie sum
    • Defocus grid: 5 x 5 x 1
  • Motion parameters:

    • Motion: consider 35.4-7.1 Å
    • Weigh by B = -500 Å^2
    • Motion grid: 5 x 5 x 20
  • Don't pick particles (yet)

Click Start Processing. Once a few micrographs are processed, Stop Processing.

The first image was reported as an outlier in astigmatism, but CTF fit looked OK. So I included it by increasing the filter's sigma threshold to 10.0.

Picking & dummy-extraction

Test picking parameters on a few micrographs. Go to the Real Space tab and click Pick with BoxNet2Mask_20180918. Activate Show particles from BoxNet2Mask_20180918 with 200 Å to check particle distributions. For this dataset, a score threshold of 0.45 or higher gave reasonable results.

Enable Pick Particles:

  • Use BoxNet2Mask_20180918
  • Diameter: 200 Å
  • Score: 0.45
  • cryo particles
  • Maintain a minimum distance of 0 Å from junk, i.e. don't use this filter

We want to extract particles at a large pixel size (3-4 Å/px) first to run Class2D and then re-extract only good particles at a smaller pixel size. Unfortunately, Warp does not allow particle down-sampling at this point. If you enable binning, the output micrographs are also down-sampled, so you cannot re-extract particles at a smaller pixel size later. We therefore don't use binning in Warp and use RELION to extract down-sampled particles. Nevertheless, we must let Warp extract something to get a STAR file. To minimize storage waste, we extract dummy 2 px particles in Warp. This is indeed very inconvenient!

  • Extract 2 px boxes, which is the smallest size allowed in Warp
  • Don't invert, normalize, maintain a separate list (we don't use these particles anyway, so save time by skipping operations)

Start Processing to process all movies. Upon completion, save settings as warp-config.settings, which will be used by M later.

Export micrographs to RELION

Click [Overview]-[Export Micrograph List] and specify the following options.

  • Generate files for RELION's polishing
  • Make paths relative to STAR location
  • Write micrographs.star to the top of the RELION project, that is, outside the Movies directory.

The output is written in the RELION 3.0 format. We convert this to the RELION >= 3.1 format with an optics group table by:

relion_convert_star --i micrographs.star --o micrographs-3.1.star --Cs 1.4 --Q0 0.1

Q0 stands for amplitude contrast.

Export particles to RELION

Warp writes a list of good particles to Movies/goodparticles_BoxNet2Mask_20180918.star. First, we convert this to the RELION >= 3.1 format by:

cd Movies
relion_convert_star --i goodparticles_BoxNet2Mask_20180918.star --o ../goodparticles-3.1.star
cd ..

We have to run the conversion inside the Movies directory, because the paths are relative to the directory. relion_convert_star needs to inspect at least one MRCS file to find out the particle dimensions.

In addition to the wrong origin of the relative paths, this STAR file has another problem: the rlnMicrographName column contains movie names, instead of micrograph names.

See the particles table.

data_particles

loop_
_rlnCoordinateX #1 
_rlnCoordinateY #2 
_rlnPhaseShift #3 
_rlnDefocusU #4 
_rlnDefocusV #5 
_rlnDefocusAngle #6 
_rlnCtfMaxResolution #7 
_rlnImageName #8 
_rlnMicrographName #9 
_rlnOpticsGroup #10 
  833.380000   190.870000     0.000000 11131.900000 10704.500000    68.000000     4.000000 0000001@particles/20170629_00021_frameImage_BoxNet2Mask_20180918.mrcs 20170629_00021_frameImage.tiff            1
 3181.92

Windows uses CR+LF as the newline sequence, while Linux uses LF alone, so editing a STAR file on Windows can cause problems later. Using a text editor on Linux, rename the 9th column to rlnMicrographMovieName and add an 11th column, rlnMicrographName. Then use awk to make the movie paths relative to the top of the project directory and fill the 11th column.

awk '{if (NF==10) {fn=$9; $9="Movies/"$9; sub(/\.tiff$/,"",fn); $11="Movies/average/"fn".mrc"} print}' goodparticles-3.1.star > goodparticles-3.1-fixed.star
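
The manual header edit described above can also be scripted. This is a minimal sketch under the assumption that the header uses exactly the column numbering shown in the table above (rlnMicrographName as #9, rlnOpticsGroup as #10); fix_header is a hypothetical helper name. It also strips a trailing CR from each line in case the file passed through a Windows editor.

```shell
# Sketch: rename column 9 and append an 11th column label in the particles header.
# fix_header is a hypothetical helper; it assumes the column numbering shown above.
fix_header() {
    awk '
        { sub(/\r$/, "") }                       # drop a trailing CR, if any
        $1 == "_rlnMicrographName" && $2 == "#9" {
            print "_rlnMicrographMovieName #9"   # column 9 actually holds movie names
            next
        }
        { print }
        $1 == "_rlnOpticsGroup" && $2 == "#10" {
            print "_rlnMicrographName #11"       # new label right after column 10
        }
    ' "$1"
}
```

For example, fix_header goodparticles-3.1.star > tmp.star && mv tmp.star goodparticles-3.1.star. Data rows are passed through unchanged, so the awk command above is still needed to fill the new column.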

Now the repaired table should look like:

data_particles

loop_ 
_rlnCoordinateX #1 
_rlnCoordinateY #2 
_rlnPhaseShift #3 
_rlnDefocusU #4 
_rlnDefocusV #5 
_rlnDefocusAngle #6 
_rlnCtfMaxResolution #7 
_rlnImageName #8 
_rlnMicrographMovieName #9 
_rlnOpticsGroup #10 
_rlnMicrographName #11
833.380000 190.870000 0.000000 11131.900000 10704.500000 68.000000 4.000000 0000001@particles/20170629_00021_frameImage_BoxNet2Mask_20180918.mrcs Movies/20170629_00021_frameImage.tiff 1 Movies/average/20170629_00021_frameImage.mrc

We don't care about the rlnImageName column, because we will re-extract particles in RELION anyway.
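
Since the whole point of the repair is that the paths resolve from the top of the project directory, a quick existence check can catch mistakes before launching RELION. The sketch below is an illustration only; list_missing is a hypothetical helper name, and it assumes the particle rows live in a block named data_particles.

```shell
# Sketch: list paths in one column of the particles table that do not exist on disk.
# list_missing is a hypothetical helper; run it from the top of the RELION project.
list_missing() {
    # $1 = STAR file, $2 = 1-based column number that holds a path
    awk -v col="$2" '
        /^data_/ { inpart = ($1 == "data_particles") }   # ignore other tables
        inpart && NF >= col && $1 !~ /^(_|#|data_|loop_)/ { print $col }
    ' "$1" |
        sort -u |
        while IFS= read -r path; do
            [ -e "$path" ] || printf 'missing: %s\n' "$path"
        done
}
```

list_missing goodparticles-3.1-fixed.star 9 checks the movie paths and list_missing goodparticles-3.1-fixed.star 11 checks the micrograph paths. Empty output means every referenced file was found.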

TODO: Check what happens to rlnCoordinateX/Y when super-resolution movies are binned.

Processing in RELION

After making sure your current directory is correct (outside Movies!), launch RELION.

Down-sampled particle extraction and Class2D

We extract particles with strong down-sampling.

  • Micrograph STAR file: micrographs-3.1.star
  • Input coordinates: (leave empty)
  • OR re-extract refined particles: Yes
  • Refine particle STAR file: goodparticles-3.1-fixed.star
  • Particle box size: 280 px
  • Diameter background circle: 200 px
  • Rescale particles: Yes
  • Re-scaled size: 64 px (3.872 Å/px)

Perform 200 iterations of VDAM Class2D into 25 classes, select good classes (I got 4283 particles) and generate an initial model. I don't describe the details of RELION processing, since this is explained in the RELION tutorial.

First refinement in RELION

Re-extract the chosen particles with less down-sampling. This time I extracted 320 px particles and rescaled them to a 200 px box (1.416 Å/px). Run Refine3D.
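
The pixel sizes quoted for the two extraction jobs follow from (box size × original pixel size) ÷ rescaled box size. A minimal sketch of that arithmetic, in case you choose different box sizes; rescaled_apix is a hypothetical helper name, not a RELION command.

```shell
# Sketch: pixel size after RELION rescales an extracted box.
# rescaled_apix is a hypothetical helper.
# Arguments: box size (px), original pixel size (A/px), rescaled box size (px).
rescaled_apix() {
    awk -v box="$1" -v apix="$2" -v new="$3" \
        'BEGIN { printf "%.3f\n", box * apix / new }'
}

rescaled_apix 280 0.885 64     # Class2D extraction: prints 3.872
rescaled_apix 320 0.885 200    # extraction for the first Refine3D: prints 1.416
```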

Make a mask. In later steps, M needs a non-soft (binary) mask, so create two masks: one with Add a soft-edge of this many pixels set to 5 px, and one without a soft edge.

I got 3.2 Å in PostProcess.

Further refinements in RELION

I proceeded as follows:

  • CtfRefine (up to 4th order aberrations)
  • CtfRefine (per-particle defocus and per-micrograph astigmatism)
  • 2nd Refine3D with a mask using solvent flattened FSC (3.1 Å)
  • Bayesian Polishing
  • 3rd Refine3D (3.0 Å)
  • CtfRefine (up to 4th order aberrations)
  • CtfRefine (per-particle defocus and per-micrograph astigmatism)
  • 4th Refine3D (3.0 Å)

Processing in M

Once we perform Refine3D, we can switch to M. Here we take the result of the first Refine3D job (3.2 Å) and continue refinement in M.

Import run_data.star to M

Since M cannot handle STAR files from RELION >= 3.1, we have to revert the file format. Copy Refine3D/first/run_data.star to Refine3D/first/run_warp_data.star and edit as follows in Linux.

  • Remove the optics group table.
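  • Swap the rlnMicrographName and rlnMicrographMovieName column labels (only the labels, not the data), so that rlnMicrographName again points to the movie file.
  • Rename the rlnOriginXAngst column to rlnOriginX and the rlnOriginYAngst column to rlnOriginY.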

The beginning of the file after modification should look like:

data_particles
  
loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnPhaseShift #3
_rlnDefocusU #4
_rlnDefocusV #5
_rlnDefocusAngle #6
_rlnCtfMaxResolution #7
_rlnImageName #8
_rlnMicrographName #9
_rlnOpticsGroup #10
_rlnMicrographMovieName #11
_rlnGroupNumber #12
_rlnAngleRot #13
_rlnAngleTilt #14
_rlnAnglePsi #15
_rlnOriginX #16
_rlnOriginY #17
_rlnClassNumber #18
_rlnNormCorrection #19
_rlnLogLikeliContribution #20
_rlnMaxValueProbDistribution #21
_rlnNrOfSignificantSamples #22
_rlnRandomSubset #23
  692.920000   557.320000     0.000000 11086.200000 10658.900000    68.000000     4.000000 000001@Extract/job005/Movies/average/20170629_00021_frameImage.mrcs Movies/20170629_00021_frameImage.tiff            1 Movies/average/20170629_00021_frameImage.mrc            1   105.714139    71.436277   156.476795     0.856420     1.327240            1     0.632212 1.499089e+05     0.225804           22            2
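
The three manual edits listed above can be sketched as a single awk filter. This is an illustration under stated assumptions, not a tool shipped with RELION, Warp or M: star_to_warp is a hypothetical name, and the sketch assumes the input has a data_optics block followed by a data_particles block with the labels named above.

```shell
# Sketch: revert a RELION >= 3.1 run_data.star to the layout described above.
# star_to_warp is a hypothetical helper name.
star_to_warp() {
    awk '
        /^data_/ { keep = ($1 == "data_particles") }    # drop the optics table
        !keep { next }                                  # also drops leading comments
        # Swap the two labels only; the data columns stay where they are.
        $1 == "_rlnMicrographName"      { $1 = "_rlnMicrographMovieName"; print; next }
        $1 == "_rlnMicrographMovieName" { $1 = "_rlnMicrographName"; print; next }
        # The values stay in Angstrom; only the labels change.
        $1 == "_rlnOriginXAngst"        { $1 = "_rlnOriginX"; print; next }
        $1 == "_rlnOriginYAngst"        { $1 = "_rlnOriginY"; print; next }
        { print }
    ' "$1"
}
```

For example, star_to_warp Refine3D/first/run_data.star > Refine3D/first/run_warp_data.star. Compare the header of the result with the listing above before importing it into M.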

Setup species

Launch M.

Create New Population with the name Tutorial in a new M folder.

[Manage Data Sources]-[Add local] and select warp-config.settings in the Movies folder. Name it TutorialDataset. Use only the first 24 frames (i.e. all frames in this case).

Click the big + symbol to add species from scratch.

  • Name it betaGalactosidase
  • Diameter 200 Å
  • 540 kDa (this value is not used at the moment)
  • D2 symmetry
  • 2 temporal poses

Select Refine3D/first/run_half1_class001_unfil.mrc and Refine3D/first/run_half2_class001_unfil.mrc at 1.4160 Å/px as references.

Select the binary mask, because M adds soft edges itself.

Select Refine3D/first/run_warp_data.star.

  • Coordinates use 0.885 Å/px
  • Shifts use 1.000 Å/px
  • Use TutorialDataset for name matching.

It is important to specify 1.000 Å/px as shifts regardless of the down-sampled pixel size used in the Refine3D job. This is because we generated the rlnOriginX and rlnOriginY columns by simply renaming the rlnOriginXAngst and rlnOriginYAngst columns.

Make sure all particles are matched to available data sources.

Creation of a new species takes 15-20 minutes to train a map denoiser.

Refinement

First, start with pose refinement in case some parameters went a bit off due to rounding errors during conversion.

  • Image warp grid: 3x3
  • Particle poses
  • 1 sub-iteration

I got 3.1 Å (Species version TPOHfPsN and data source version w3RsVznGO8gdTyrNFzPSpT6Krrs).

Then perform defocus refinement.

  • Defocus
  • No grid search
  • 1 sub-iteration

The resolution remained 3.1 Å (Species version K4QhbqFv. The data source is not modified).

Finally perform full iterations.

  • Image warp grid: 3x3
  • Particle poses
  • Doming
  • Defocus
  • No grid search
  • Odd Zernike orders 3 and 5
  • Even Zernike orders 2 and 4
  • 3 sub-iterations

Since some parameters are correlated, it is worth performing multiple sub-iterations when refining many parameters. Note that the reference is not updated between sub-iterations.

The resolution increased to 3.0 Å (Species version -ZrWgEbl and data source version qMqnBvIDNWfzFSw5oGZrXyr0RO0). This is the same as RELION.

Tips: how to revert to an earlier state

Sometimes refinement can go wrong and the resolution can degrade. To revert to an earlier state, we have to revert both the data source (XML files in the Movies directory) and the species (maps and denoising models in the M/species directory). Older versions of the files are stored under Movies/versions/ and M/species/SPECIES_ID/versions/, identified by hash strings. It is worth recording the version strings at each refinement step as shown above (yours might differ from mine).

NOTE: When a version string starts with the - symbol (e.g. -ZrWgEbl), you have to add -- to Linux commands to prevent it from being parsed as a command-line option (e.g. cd -- -ZrWgEbl, rsync -avuP -- -ZrWgEbl backup).

As an example, let's revert to the state after the second refinement. The target is a species version K4QhbqFv and a data source version w3RsVznGO8gdTyrNFzPSpT6Krrs.

cd M/species/ae3003ba
rsync -avP versions/K4QhbqFv/ .
cd ../../../Movies
rsync -avP versions/w3RsVznGO8gdTyrNFzPSpT6Krrs/ .

Besides, the path to the gain reference in Movies/TutorialDataset.source has to be repaired. Change

<Param Name="GainPath" Value="../../gain.mrc" />

to

<Param Name="GainPath" Value="gain.mrc" />
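
If you revert often, the GainPath edit can be scripted instead of done by hand. The sketch below is an illustration only; set_param is a hypothetical helper name, and it assumes each Param element sits on a single line in the form shown above and that the new value contains no & or backslash characters (both are special in awk replacements).

```shell
# Sketch: set the Value attribute of one <Param Name="..."> line in an XML settings file.
# set_param is a hypothetical helper; it writes the edited file to standard output.
set_param() {
    # $1 = file, $2 = parameter name, $3 = new value
    awk -v name="$2" -v value="$3" '
        index($0, "<Param Name=\"" name "\"") {
            sub(/Value="[^"]*"/, "Value=\"" value "\"")   # replace the whole attribute
        }
        { print }
    ' "$1"
}
```

For example, set_param Movies/TutorialDataset.source GainPath gain.mrc > tmp.source && mv tmp.source Movies/TutorialDataset.source. Lines for other parameters pass through unchanged.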

Tips: try the opposite Ewald curvature

Depending on the number of flips performed during image collection, M's assumption about the Ewald sphere curvature can be wrong. To test the opposite hand, edit betaGalactosidase.species in the species folder and change

<Param Name="EwaldReverse" Value="True" />

to

<Param Name="EwaldReverse" Value="False" />

Note that DoEwald (False by default) determines whether Ewald sphere curvature is considered during refinement. Reconstruction is always performed with the Ewald sphere curvature. See Dimitry's tweet.

Refine again. In this case, the resolution remained the same (Species version Mkith-F4 and data source version s7aA0uKBgp2IA0GjNtYcNR7_1JU), as expected for this resolution.

Miscellanea

Export particles to RELION (another method)

There is another way to export particles from Warp: [Overview]-[Export particles]. This allows you to down-sample particles and Make paths relative to STAR. However, this has its own problems in addition to the aforementioned rlnMicrographName bug.

First, Warp extracts particles from raw movie frames even when aligned and averaged images are present. So this is very slow. You can choose to Only write STAR to save time, but then relion_convert_star complains that particle dimensions cannot be determined.

Next, when down-sampling is applied, the rlnCoordinateX and rlnCoordinateY columns are wrongly scaled. If you down-sample by four, i.e., from 0.885 Å/px to 3.54 Å/px, Warp divides the rlnCoordinateX and rlnCoordinateY values by four. However, this is wrong! These columns represent particle coordinates in the micrograph coordinate system, so they should be invariant under down-sampling during particle extraction. You have to use awk to repair this problem.
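
A minimal sketch of such an awk repair follows. It multiplies the two coordinate columns by the down-sampling factor and leaves everything else alone. rescale_coords is a hypothetical helper name; the sketch assumes a single table whose header declares rlnCoordinateX and rlnCoordinateY before the data rows, and it rewrites the coordinates with six decimals.

```shell
# Sketch: undo the coordinate scaling applied during down-sampled export.
# rescale_coords is a hypothetical helper; $1 = STAR file, $2 = down-sampling factor.
rescale_coords() {
    awk -v factor="$2" '
        # Read the column numbers from the header labels, e.g. "_rlnCoordinateX #1".
        $1 == "_rlnCoordinateX" { cx = substr($2, 2) + 0 }
        $1 == "_rlnCoordinateY" { cy = substr($2, 2) + 0 }
        # Data rows: both columns known, row wide enough, not a label or keyword.
        cx && cy && NF >= cx && NF >= cy && $1 !~ /^(_|#|data_|loop_)/ {
            $cx = sprintf("%.6f", $cx * factor)
            $cy = sprintf("%.6f", $cy * factor)
        }
        { print }
    ' "$1"
}
```

For a four-fold down-sampled export, rescale_coords particles.star 4 > particles-fixed.star restores micrograph-space coordinates; the file names here are placeholders.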

Because of these issues, I don't recommend this feature.

TODO

  • EER