Given a panorama image, our system runs object detection on slices sampled from the panorama.
The objects in each slice are marked with a bounding box and a confidence score.
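Below is a minimal sketch of the slicing step, assuming the panorama is simply cut into 10 equal-width vertical strips; the actual sampling used by our system (field of view, overlap, projection) may differ, and the file name is only an example.

from PIL import Image

def slice_panorama(path, frame_count=10):
    # Cut the panorama into frame_count equal-width vertical strips.
    pano = Image.open(path)
    width, height = pano.size
    step = width // frame_count
    return [pano.crop((i * step, 0, (i + 1) * step, height))
            for i in range(frame_count)]

slices = slice_panorama("panorama/scene_0001.jpg")  # hypothetical file name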
Click on a thumbnail to see the full image in a new tab.
[Table of example scenes with columns ID, Scene, and Samples; more than 1,300 scenes in total.]
Please download the results to obtain all examples.
***NOTE: We run our approach on 1305 images, but due to copyright issues we only provide the URLs to download the Flickr images instead of hosting them on our website.***
Results:
The results file includes 2 folders:
panorama: includes all the panorama images we captured ourselves, plus the file downloadLinks.txt, which contains the download links to the Flickr images.
data: includes all the raw data and the result data generated by our system for each scene.
In the data folder, each sub-folder is associated with a panorama image in the panorama folder.
scene: includes the visualization of the object detection for the slices of the panorama image.
data.ini: the raw data of scene classification, object detection, and object recognition.
fullSound.ini: the result data generated by our system.
Format of downloadLinks.txt:
For each line, we have:
Image name,
License type,
The exact download URL of the image,
The original webpage where the image was posted.
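A minimal sketch for reading downloadLinks.txt, assuming the four fields are comma-separated in the order listed above (the exact delimiter is not specified):

import csv

with open("panorama/downloadLinks.txt", newline="") as f:
    for row in csv.reader(f):
        if len(row) < 4:
            continue  # skip empty or malformed lines
        name, license_type, url, source_page = (field.strip() for field in row[:4])
        print(name, license_type, url, source_page)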
The data.ini file contains the raw data of scene classification, object detection, and object recognition.
We use the section-key-value-comment pair to organize this file.
The format of the pair is: [section] key = value ; comment.
*** Note: anything after the semicolon ";" is a comment, and the comments do not provide any useful information. ***
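The file can be loaded with Python's standard configparser; the sketch below uses inline_comment_prefixes to drop everything after the semicolon, and the scene folder name is only an example.

import configparser

config = configparser.ConfigParser(inline_comment_prefixes=(";",))
config.read("data/scene_0001/data.ini")  # hypothetical scene folder name

for section in config.sections():
    for key, value in config[section].items():
        print(section, key, value)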
Base Section:
The Base section provides overall information about the scene classification, object detection, and object recognition.
frameCount: the number of slices of the panorama image; it is always 10.
objectCount: the total number of detected objects. It includes the duplicated objects, which are removed later.
objectCatalogCount: the number of object catalogs detected in this scene.
objectCatalogList: shows the list of object catalogs.
sortedDescription: shows the list of scene catalogs sorted by their scores.
sortedScore: shows the list of scores of the scene catalogs.
unDuplicatedObjectIds: the ids of the unduplicated objects.
unDuplicatedObjectIdsCount: the # of the unduplicated objects.
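As a sketch, the Base section can be read like this (section and key spellings are taken from the description above; list-valued keys such as unDuplicatedObjectIds are assumed to be comma-separated, which is not stated explicitly):

import configparser

config = configparser.ConfigParser(inline_comment_prefixes=(";",))
config.read("data/scene_0001/data.ini")  # hypothetical path

base = config["Base"]
frame_count = base.getint("frameCount")    # always 10
object_count = base.getint("objectCount")  # including duplicates
kept_ids = [s.strip() for s in base["unDuplicatedObjectIds"].split(",") if s.strip()]
print(frame_count, object_count, kept_ids)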
Frame Section:
The format of the Frame section is [frame_X] where X is the id of the frame.
cameraEulerAngle: the Euler angle of the camera when shooting this frame.
imageWidth & imageHeight: the size of the frame.
file: the path of the screenshot of the frame.
objList: the list of IDs of the objects detected in this frame. (Includes the duplicated objects.)
objCount: the # of objects. (Includes the duplicated objects.)
description: the list of scene classification labels for the frame.
score: the list of scores associated with the description.
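A sketch of iterating over the frame sections and reporting the highest-scoring scene label for each slice; frame ids are assumed to start at 0, and description and score are assumed to be comma-separated, index-aligned lists:

import configparser

config = configparser.ConfigParser(inline_comment_prefixes=(";",))
config.read("data/scene_0001/data.ini")  # hypothetical path

for x in range(config["Base"].getint("frameCount")):
    frame = config[f"frame_{x}"]
    labels = [s.strip() for s in frame["description"].split(",")]
    scores = [float(s) for s in frame["score"].split(",")]
    best = max(zip(labels, scores), key=lambda pair: pair[1])
    print(frame["file"], best)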
Object Section:
The format of the Object section is [Object_X] where X is the id of the object.
frame: indicates which frame the object belongs to.
tag: the tag of this object.
minY,minX,maxY,maxX: the coordinates of the bounding box.
center,leftTop,leftBottom,rightTop,rightBottom: the Euler angles of this object. The angle of the center can be converted into the direction of the object relative to the origin of the virtual world.
depth: the depth of the object based on the reference object.
action: the action tag detected by the object recognition. Only unduplicated person objects have this key.
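A sketch of collecting the bounding boxes of the unduplicated objects; the numeric coordinate format of minX/minY/maxX/maxY and the comma-separated id list are assumptions based on the key descriptions above:

import configparser

config = configparser.ConfigParser(inline_comment_prefixes=(";",))
config.read("data/scene_0001/data.ini")  # hypothetical path

kept_ids = [s.strip() for s in config["Base"]["unDuplicatedObjectIds"].split(",") if s.strip()]
for object_id in kept_ids:
    obj = config[f"Object_{object_id}"]
    box = tuple(obj.getfloat(k) for k in ("minX", "minY", "maxX", "maxY"))
    print(obj["tag"], "in frame", obj["frame"], "box:", box)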
Format of fullSound.ini:
The fullSound.ini file is generated by our system based on our approach.
The system uses this file to place the sounds in the scene.
Like data.ini, this file uses the section-key-value-comment pair format [section] key = value ; comment, where anything after the semicolon ";" is a comment and does not provide any useful information.
Base Section:
The Base section provides overall information about the sounds placed in this scene.
soundCount: the # of sounds in this scene.
Sound Section:
The format of the Sound section is [Sound_X] where X is the id of the sound.
tag: the tag of this sound.
soundFile: the sound file the system used.
bObject: indicates whether this sound comes from an object or from the background. True means object and False means background.
location: the location of this sound.
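A sketch of reading fullSound.ini and separating object sounds from background sounds; sound ids are assumed to start at 0, and the path is only an example:

import configparser

config = configparser.ConfigParser(inline_comment_prefixes=(";",))
config.read("data/scene_0001/fullSound.ini")  # hypothetical path

object_sounds, background_sounds = [], []
for x in range(config["Base"].getint("soundCount")):
    sound = config[f"Sound_{x}"]
    entry = (sound["tag"], sound["soundFile"], sound["location"])
    # getboolean parses "True"/"False" case-insensitively.
    (object_sounds if sound.getboolean("bObject") else background_sounds).append(entry)

print(len(object_sounds), "object sounds,", len(background_sounds), "background sounds")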
Sound Database:
The soundDatabase folder includes two folders: background and soundableObject.
The sub-folders of these two folders are named based on the tags for scene classification and object recognition.
The mp3 files in those sub-folders are the sound sources we used in our system.
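As a sketch, a tag can be resolved to its candidate sound files like this (the folder layout follows the description above; the tag "dog" is only an example):

from pathlib import Path

def find_sounds(database_root, tag, is_object=True):
    # Object sounds live under soundableObject/<tag>/, background sounds under background/<tag>/.
    folder = Path(database_root) / ("soundableObject" if is_object else "background") / tag
    return sorted(folder.glob("*.mp3")) if folder.is_dir() else []

print(find_sounds("soundDatabase", "dog"))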