Neural rendering technology has recently evolved to learn an internal representation of arbitrarily observed scenes by synthesizing novel views using deep generative models. However, it has been evaluated only on simple synthetic scenes with a few objects in small spaces and limited spatial variability. In this research, we extend existing neural scene representation and rendering to more complex scenes, such as typical indoor environments, by proposing an attention mechanism that leverages multimodal scene components, including color images, depth images, and semantic masks. To this end, we created a large-scale dataset consisting of 120 million indoor scene images from 26,000 virtual houses. Experimental results show that our method significantly improves on the baseline method both quantitatively and qualitatively.
Object slip perception is essential for mobile manipulation robots to perform manipulation tasks reliably in the dynamic real world. Traditional approaches to slip perception for robot arms use tactile or vision sensors. However, mobile robots must also cope with sensor noise caused by the robot’s own movement in a changing environment. To solve this problem, we present an anomaly detection method that applies a deep autoencoder model to multisensory data. The proposed framework integrates heterogeneous data streams collected from various robot sensors, including RGB and depth cameras, a microphone, and a force-torque sensor. The integrated data is used to train a deep autoencoder to construct latent representations of the multisensory data that indicate the normal status. Anomalies are then identified by error scores, measured as the difference between the trained encoder’s latent values for the input and the latent values of the reconstructed input data. To evaluate the proposed framework, we conducted an experiment that mimics object slip on a mobile service robot operating in a real-world environment with diverse household objects and different moving patterns. The experimental results verified that the proposed framework reliably detects anomalies in object slip situations across various object types and robot behaviors, even with visual and auditory noise in the environment. [Poster] [Video]
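The latent-space error score described above can be sketched as follows. This is a minimal illustration, not the framework's implementation: `toy_encode` and `toy_decode` are hypothetical stand-ins for a trained encoder and decoder, and the mean-plus-three-standard-deviations threshold is an assumed calibration rule that the abstract does not specify.

```python
import math


def latent_anomaly_score(x, encode, decode):
    """Euclidean distance between the latent code of x and the latent
    code of its reconstruction, i.e. ||E(x) - E(D(E(x)))||."""
    z = encode(x)
    z_rec = encode(decode(z))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_rec)))


def calibrate_threshold(normal_samples, encode, decode, margin=3.0):
    """Assumed rule: mean + margin * std of the scores on normal data."""
    scores = [latent_anomaly_score(x, encode, decode) for x in normal_samples]
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean + margin * math.sqrt(var)


# Hypothetical stand-ins for a trained model. The decoder saturates
# outside [0, 1], mimicking a network that can only reconstruct
# inputs resembling the normal data it was trained on.
def toy_encode(x):
    return [sum(x) / len(x)]


def toy_decode(z, size=4):
    v = min(max(z[0], 0.0), 1.0)
    return [v] * size


normal = [[0.2, 0.4, 0.3, 0.1], [0.5, 0.5, 0.6, 0.4]]
threshold = calibrate_threshold(normal, toy_encode, toy_decode)
slip_like = [3.0, 3.0, 3.0, 3.0]  # far outside the normal range
is_anomaly = latent_anomaly_score(slip_like, toy_encode, toy_decode) > threshold
```

In a real system the encoder and decoder would be the trained network applied to the fused sensor features, and the threshold would be tuned on held-out recordings of normal operation.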
We develop service robots that work in homes and in domestic service areas such as restaurants and convenience stores. To this end, we implement a robust, integrated perception-action-learning system for mobile manipulation robots. State-of-the-art deep learning techniques are incorporated into each module, which significantly improves performance on various service tasks. Our team and robots won 1st place in RoboCup@Home SSPL 2017, held in Nagoya, and 2nd place in RoboCup@Home DSPL 2019, held in Sydney. [Related Article 1] [Related Article 2] [Poster] [Presented Video] [More Videos]
Spatial perception is a fundamental ability that autonomous mobile robots need to move robustly and safely in the real world. Recent advances in SLAM have enabled single-camera systems to build 3D maps of the world while concurrently tracking their own location and orientation. However, such systems often lose track of themselves within the map and fail to recognize previously visited places because they lack reliable descriptions of the observed scenes. We present a spatial perception framework that uses an object-aware visual scene representation to enhance these spatial abilities. The proposed representation compensates for aberrations of conventional geometric scene representations by fusing those representations with semantic features extracted from perceived objects. We implemented this framework on a mobile robot platform to validate its performance in home situations. Further evaluations were conducted with the ScanNet dataset, which provides large-scale, photo-realistic 3D indoor scenes. Extensive tests show that our framework generates maps more reliably by reducing tracking failures and better recognizes overlap in the map. [Full Article] [Presented Poster]
Episodic memory formation is associated with large-scale neuronal activity distributed across the cortex. Decades of neuroimaging and patient lesion studies have linked specific brain structures to episodic memory retrieval. Distributed, coordinated, and synchronized activities across brain regions have also been investigated. However, the neuronal mechanisms, based on effective connectivity, that coordinate this anatomically distributed information processing into introspectively coherent cognition have remained largely unknown. Here we investigate the information flow network of the human brain during episodic memory retrieval. We estimated local oscillation amplitudes and asymmetric inter-areal synchronization from EEG recordings in individual cortical anatomy, using source reconstruction techniques and an effective connectivity method. The strength and spectro-anatomical patterns of these inter-areal interactions at sub-second time scales reveal that episodic memory retrieval involves increased information flow and densely interconnected networks between the prefrontal cortex, the medial temporal lobe, and some subregions of the parietal cortex. Interestingly, in this network the superior frontal gyrus (SFG) acted as a hub, globally interconnected across broad brain regions. [Full Article] [Presented Slides]
Understanding episodic memory formation of real-world events is essential for the investigation of human cognition. Most studies have focused on delimiting the upper boundaries of this memory by using memorization tasks with conditional experimental paradigms, rather than on the performance of everyday tasks. However, naturally occurring sensory stimuli are multimodal and dynamic. To investigate the encoding and retrieval of episodic memory under more naturalistic and ecological conditions, we present a memory experiment that employs audio-visual movies as naturalistic stimuli. Electroencephalography measurements were used to analyze neural activations during memory formation. We found that oscillatory activities in the theta frequency band over the left parietal lobe and in the gamma frequency band over the temporal lobes are related to overall memory formation. Theta and gamma power in the frontal lobes and gamma power in the occipital lobes all increased during retrieval tasks. Furthermore, subjects’ memory retrieval performance on the query task was used to clarify our experimental results. A correlation between behavioral differences and neural activation was observed in the same regions. Our results extend previous neurocognitive findings on memory formation by combining naturalistic stimuli, neural oscillations, and behavioral analysis. [Full Article] [Presented Poster]
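As a rough illustration of the theta and gamma power comparison mentioned above, the sketch below estimates band power for a single EEG channel with a plain discrete Fourier transform. The band limits (theta 4 to 8 Hz, gamma 30 to 50 Hz), the sampling rate, and the synthetic signal are assumptions chosen for the example; the study's actual spectral estimation method is not described here.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes (scaled by 1/n^2) over the
    positive-frequency bins whose frequency lies in [f_lo, f_hi)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centered)
            )
            power += abs(coeff) ** 2 / n ** 2
    return power


# Synthetic 2-second channel sampled at 128 Hz (assumed values),
# containing only a 6 Hz oscillation, which falls in the theta band.
fs = 128
samples = [math.sin(2 * math.pi * 6 * t / fs) for t in range(2 * fs)]

theta = band_power(samples, fs, 4.0, 8.0)    # assumed theta band
gamma = band_power(samples, fs, 30.0, 50.0)  # assumed gamma band
```

For real recordings, a windowed estimator such as Welch's method, applied per electrode and per task condition, would be a more typical choice than a single raw DFT.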
Context awareness is necessary for diverse user-oriented services. In particular, place awareness (logical location context awareness) is essential for the location-based services (LBS) that are widely provided to smartphone users. However, GPS-based place awareness is valid only when the user is outdoors where the GPS signal is strong enough and when the relation between a physical location and a symbolic place is known. Here we propose a novel place awareness method that uses visual information as well as GPS data. The proposed method predicts the user's current place from GPS-tagged scene photographs taken with the smartphone's built-in camera. Our method uses a modified support vector machine (SVM) in which scene images are given as the instances and GPS coordinates as the weight parameters of the model, to classify the place from the scene. We evaluate our method on approximately 4,000 photographs of four places: hallway, classroom, restaurant, and outdoor. Experimental results show that the proposed method can precisely recognize the place from scene photos alone when GPS information is not available, and that accuracy improves further when GPS is available. Furthermore, we demonstrate a smartphone application that uses the proposed place awareness method based on vision-GPS data. [Full Article] [Presented Poster]
In this paper, we propose multi-layer structural wound synthesis on a 3D face. The underlying model of facial skin is derived from the structure of the tissue, which is composed of the epidermis, dermis, and subcutis. The approach first defines a facial tissue depth map to measure details at various locations on the face. Each layer of skin in a wound image is then determined by hue-based segmentation. In addition, we employ disparity parameters to realise 3D depth so that the wound model becomes volumetric. Finally, we validate our methods using 3D wound simulation experiments. [Full Article] [Presented Slides]
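The hue-based segmentation step can be illustrated with a small sketch that bins each pixel by its HSV hue. The hue boundaries below are hypothetical placeholders, and the association of a bin index with a particular skin layer is assumed for illustration only; the paper's actual thresholds are not given here.

```python
import bisect
import colorsys


def segment_by_hue(pixels, hue_edges):
    """Assign each (r, g, b) pixel (0-255 per channel) to a bin index
    according to where its hue (0.0-1.0) falls between the sorted edges.
    Hue wrap-around near 1.0 is not handled in this sketch."""
    labels = []
    for r, g, b in pixels:
        hue, _sat, _val = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        labels.append(bisect.bisect_right(hue_edges, hue))
    return labels


# Hypothetical hue boundaries: bin 0 below 0.05, bin 1 from 0.05 up to
# (but excluding) 0.12, bin 2 from 0.12 upward. Each bin would stand
# for one skin layer.
HUE_EDGES = [0.05, 0.12]

wound_pixels = [(255, 0, 0), (255, 128, 0), (255, 255, 0)]
layers = segment_by_hue(wound_pixels, HUE_EDGES)
```

The resulting per-pixel labels could then be combined with the tissue depth map to decide how deep each wound region should extend in the 3D model.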