Ambient dataset

Audible Panorama: Automatic Spatial Audio Generation for Panorama Imagery

Project Page

Examples of object detection.

Given a panorama image, our system runs object detection on slices sampled from that image.
Each object detected in a slice is marked with a bounding box and a confidence score.
Click on a thumbnail to see the full image in a new tab.

[Table of examples with columns ID, Scene, and Samples; rows 1 to 3 are shown as image thumbnails.]
... more than 1300 scenes in total. Please download the results to obtain all examples.

Download the results and the sound database.

*** NOTE: We ran our approach on 1305 images, but due to copyright issues, we provide only the URLs for downloading the images instead of hosting the images on our website. ***

Results:

The results file includes 2 folders.
In the data folder, each sub-folder corresponds to one panorama image in the panorama folder.

Format of downloadLinks.txt:

For each line, we have:
Example: ZZZ25086550514, license:1, https://farm2.staticflickr.com/1629/25086550514_240a1a97c4_o.jpg, https://www.flickr.com/photos/24128368@N00/25086550514/
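
As a rough illustration of how such a line might be read, the Python sketch below splits the example into its four comma-separated fields. The field names (image_id, license_type, image_url, page_url) and the helper parse_link_line are assumptions made for this sketch; they are inferred from the example line and are not defined by the file itself.

```python
from typing import NamedTuple


class LinkEntry(NamedTuple):
    # Field names are assumptions inferred from the example line.
    image_id: str
    license_type: int
    image_url: str
    page_url: str


def parse_link_line(line: str) -> LinkEntry:
    """Split one line of downloadLinks.txt into its four fields."""
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated fields, got {len(parts)}")
    image_id, license_field, image_url, page_url = parts
    # The second field looks like "license:1"; keep only the number.
    key, _, value = license_field.partition(":")
    if key.strip() != "license":
        raise ValueError(f"unexpected license field: {license_field!r}")
    return LinkEntry(image_id, int(value), image_url, page_url)


EXAMPLE = (
    "ZZZ25086550514, license:1, "
    "https://farm2.staticflickr.com/1629/25086550514_240a1a97c4_o.jpg, "
    "https://www.flickr.com/photos/24128368@N00/25086550514/"
)
entry = parse_link_line(EXAMPLE)
print(entry.image_id, entry.license_type)
```

Splitting on commas is enough here because the URLs in the example contain no commas; a line that breaks this assumption raises a ValueError instead of being silently misread.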

License type:

Format of data.ini:

The data.ini file contains the raw data of scene classification, object detection, and object recognition.

We organize this file as section-key-value-comment entries.
The format of each entry is: [section] key = value ; comment.
*** Note: anything after the semicolon ";" is a comment, and comments carry no useful information. ***
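
One possible way to read a file in this format is Python's standard configparser module, sketched below. The section contents used here (the scene key and its value) are hypothetical placeholders, not actual keys from data.ini, and load_ini is a helper name chosen for this sketch.

```python
import configparser


def load_ini(text: str) -> configparser.ConfigParser:
    """Parse INI text, discarding anything after a ';' inline comment."""
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    # Keep key names exactly as written instead of lowercasing them.
    parser.optionxform = str
    parser.read_string(text)
    return parser


# Hypothetical sample; the real keys in data.ini may differ.
SAMPLE = """
[Base]
scene = beach ; this comment is ignored
"""

config = load_ini(SAMPLE)
print(config["Base"]["scene"])
```

Note that configparser strips an inline comment only when the semicolon is preceded by whitespace, as in the format shown above.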

Base Section:

The Base section provides overall information about the scene classification, the object detection, and the object recognition.

Frame Section:

The format of the Frame section is [frame_X] where X is the id of the frame.

Object Section:

The format of the Object section is [Object_X] where X is the id of the object.
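
The section naming above suggests a simple way to collect all frames and objects from a parsed file. The following Python sketch groups sections by their prefix and id; the helper group_sections and the keys in the sample text are hypothetical, and ids are kept as strings because their exact form is not specified here.

```python
import configparser
import re

# Matches section names such as "frame_3" or "Object_12".
SECTION_RE = re.compile(r"^(frame|object)_(.+)$", re.IGNORECASE)


def group_sections(parser: configparser.ConfigParser) -> dict:
    """Return {"frame": {id: {...}}, "object": {id: {...}}}."""
    grouped = {"frame": {}, "object": {}}
    for name in parser.sections():
        match = SECTION_RE.match(name)
        if match is None:
            continue  # e.g. the Base section
        kind, section_id = match.group(1).lower(), match.group(2)
        grouped[kind][section_id] = dict(parser[name])
    return grouped


# Hypothetical sample; real key names may differ.
SAMPLE = """
[Base]
note = placeholder
[frame_0]
key = a ; comment
[Object_0]
key = b
[Object_1]
key = c
"""

parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
parser.read_string(SAMPLE)
grouped = group_sections(parser)
print(sorted(grouped["object"]))
```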

Format of fullsounds.ini:

The fullsounds.ini file is generated by our system based on our approach.
The system uses this file to place the sounds in the scene.

We organize this file as section-key-value-comment entries.
The format of each entry is: [section] key = value ; comment.
*** Note: anything after the semicolon ";" is a comment, and comments carry no useful information. ***

Base Section:

The Base section provides overall information about the scene classification, the object detection, and the object recognition.

Sound Section:

The format of the Sound section is [Sound_X] where X is the id of the sound.

Sound Database:

The soundDatabase file includes 2 folders: background and soundableObject.
The sub-folders of these two folders are named based on the tags for scene classification and object recognition.
The mp3 files in those sub-folders are the sound sources we used in our system.
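
Given this layout, the sound sources can be indexed by tag with a short directory walk. The Python sketch below is one way this could look; the function name index_sound_database and the example root path are assumptions, and only the two top-level folder names come from the description above.

```python
from pathlib import Path


def index_sound_database(root) -> dict:
    """Map each top-level folder to {tag: [mp3 paths]}."""
    root = Path(root)
    index = {}
    for category in ("background", "soundableObject"):
        tags = {}
        category_dir = root / category
        if category_dir.is_dir():
            # Each sub-folder name is treated as one tag.
            for tag_dir in sorted(p for p in category_dir.iterdir() if p.is_dir()):
                tags[tag_dir.name] = sorted(tag_dir.glob("*.mp3"))
        index[category] = tags
    return index


if __name__ == "__main__":
    # "soundDatabase" is an assumed location of the extracted download.
    for category, tags in index_sound_database("soundDatabase").items():
        for tag, files in tags.items():
            print(category, tag, len(files))
```

A missing top-level folder yields an empty mapping rather than an error, which keeps the sketch usable on a partially extracted download.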