In addition to this README, this torrent contains 4 datasets:
Name | Image size (px) | Number of scenes | Compressed size (B) | Total size (B) |
---|---|---|---|---|
64.tar.xz | 64x64 | 80K | 9.8G | 19G |
128.tar.xz | 128x128 | 20K | 7.1G | 12G |
256.tar.xz | 256x256 | 3.2K | 5.0G | 8.5G |
512.tar.xz | 512x512 | 3.2K | 19G | 33G |
Each dataset consists of 16 folders.
Each of these folders has a metadata.json file that describes the content of the folder:
{
"args": {},
"scenes":[],
"fov": 90,
"scenes_nb": 5000,
"resolution":[64,64]
}
- fov is in degrees (it's 90 throughout all the different datasets)
- resolution is in pixels
- args is a dump of the settings used when the scenes were generated
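
As an illustration, here is a minimal sketch of reading one folder's metadata.json with Python's standard library. The helper names `load_metadata` and `summarize` are hypothetical, and the sketch assumes the file sits directly at the root of the folder.

```python
import json
from pathlib import Path


def load_metadata(folder):
    """Read metadata.json from one dataset folder (hypothetical helper)."""
    with open(Path(folder) / "metadata.json", encoding="utf-8") as f:
        return json.load(f)


def summarize(meta):
    """Collect the top-level fields described above into a small dict."""
    return {
        "fov_deg": meta["fov"],                   # field of view, in degrees
        "resolution": tuple(meta["resolution"]),  # image size, in pixels
        "scenes_nb": meta["scenes_nb"],           # scene count stored in the file
        "scenes_listed": len(meta["scenes"]),     # entries actually in the list
    }
```

Calling `summarize(load_metadata(folder))` on an extracted folder gives a quick view of what that folder declares, and comparing `scenes_nb` with `scenes_listed` is a cheap sanity check (that the two should agree is an assumption, not something this README states).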
Each element of the list of scenes has the same structure
{
"depth": [],
"imgs":[],
"length":10,
"speed": [x,y,z],
"time_step":0.1
}
- depth and imgs are lists of file paths. Both should have the length specified in length, and the nth element of depth should match the nth element of imgs
- speed is a 3D vector. Coordinates are in m/s, defined in the [Right Up Forward] system, relative to the camera
- time_step is the time between each frame

To get the 3D displacement between two frames, compute displacement = shift * time_step * speed, where shift is the number of frames separating them.
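
The consistency rule and the displacement formula can be sketched in Python as follows. `check_scene` and `frame_displacement` are hypothetical helper names. The sketch assumes that `shift` is the number of frames separating the two frames, and that the speed is constant over a scene, since each scene stores a single speed vector.

```python
def check_scene(scene):
    """Return True if depth and imgs both contain `length` entries."""
    return len(scene["depth"]) == len(scene["imgs"]) == scene["length"]


def frame_displacement(scene, shift):
    """3D displacement in meters, in [Right, Up, Forward] axes,
    between two frames that are `shift` frames apart."""
    dt = shift * scene["time_step"]  # elapsed time in seconds
    return [dt * v for v in scene["speed"]]
```

For example, with a time_step of 0.1 and a speed of [0, 0, 2], frames 3 steps apart are separated by roughly 0.6 m along the forward axis.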
args has this structure (among other options):
{
"clip": [0.1, 200],
"meshes_nb": 20,
"meshes_var": [4.0, 15.0],
"texture_ratio": 0.5
}
- clip is the clipping distance range: objects nearer than 0.1m or farther than 200m won't appear
- meshes_nb is the number of shapes in each scene. You may not see all of them at once in the frames
- meshes_var is the variation in size and position of the meshes of the scene, both in meters
- texture_ratio is the ratio of textured shapes. Other shapes have a uniform color texture
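
As a rough sketch of how these settings might be used, the Python helpers below test a distance against the clipping range and estimate how many shapes are textured. Both function names are hypothetical. The sketch assumes that clip is ordered as [near, far] in meters and that texture_ratio applies to the meshes_nb shapes of a scene; neither detail is spelled out beyond the descriptions above.

```python
def within_clip(distance_m, args):
    """True if a distance in meters falls strictly inside the clipping range."""
    near, far = args["clip"]  # assumed order: [near, far]
    return near < distance_m < far


def expected_textured(args):
    """Expected number of textured shapes per scene, under the assumption
    that texture_ratio is the fraction of the meshes_nb shapes."""
    return args["meshes_nb"] * args["texture_ratio"]
```

With the example values above (20 shapes, ratio 0.5), about half of the shapes in a scene would be textured on average.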