Ullman - AI Conference Trip Report 
The conference offered little of interest except for the talk by Professor Ullman, who
concentrated on the study of deep learning by means of visual routines. A visual routine is a process
of extracting information from a visual scene. Ullman suggests that human perception of visual
scenes is divided into two stages: a “bottom-up” stage, at which early perceptions are formed, and a
“top-down” stage, at which specific information is extracted from visual objects and scenes.
It is worth noting that Ullman identifies retinotopic maps: special representations in the
human recognition system. They encode such object properties as direction and speed of motion,
color, and edge orientation. However, these properties reflect only the general operations
performed on the visual input. Accordingly, they do not draw on the specific areas of object
recognition, namely task-specific knowledge, object-specific knowledge, etc.
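
To make the idea of a property map concrete, the sketch below computes one such property, edge
orientation, at every interior location of a tiny grayscale grid. It is only a toy illustration
and not Ullman's model or a description of the visual system: the function name
`edge_orientation_map`, the central-difference gradient, and the sample image are all assumptions
introduced here.

```python
import math

def edge_orientation_map(image):
    """Return a grid with the same layout as `image`.

    Each interior cell holds (gradient magnitude, gradient direction in degrees).
    Border cells, and cells where the gradient is zero, get a direction of None.
    The visible edge runs perpendicular to the gradient direction.
    """
    rows, cols = len(image), len(image[0])
    result = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the brightness gradient.
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            magnitude = math.hypot(gx, gy)
            direction = math.degrees(math.atan2(gy, gx)) if magnitude > 0 else None
            result[r][c] = (magnitude, direction)
    return result

# Hypothetical input: dark on the left, bright on the right (a vertical edge).
vertical_edge = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
orientation = edge_orientation_map(vertical_edge)
print(orientation[1][1])  # gradient points along the x axis, so the edge is vertical
```

The output keeps the spatial layout of the input, which is the sense in which such a map is
location-based, and nothing in it depends on what the object is or what the viewer is trying to do.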
In light of this, Ullman presents visual routines as complex processes that extract particular
spatial information from the observed scene. The process involves recognizing the basic features
of the scene and then structuring the information about an object by means of elementary visual
operations. Specifically, visual operations differ from other psychological processes in that they
concentrate on particular objects in the scene rather than on the whole area.
Professor Ullman singles out several examples of visual operators: focusing on different
objects, identifying the salient object to focus on, shifting focus across the area of interest,
exploring the limits of the area, and marking an object for further reference. Combined, these
operators allow humans to perform sophisticated spatial tasks, for example, stating the shape of
an object or counting the number of objects in the scene.
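
As a rough illustration of how such operators might combine into a counting task, the sketch below
counts the objects in a small binary grid. It is a toy under stated assumptions, not the method
presented in the talk: the scene is assumed to be a grid of 0s and 1s, an "object" is assumed to
be a group of 1-cells connected horizontally or vertically, and the name `count_objects` is
hypothetical. The scan loosely stands in for shifting focus, the spreading step for exploring the
limits of an area, and the `marked` grid for marking an object for further reference.

```python
from collections import deque

def count_objects(scene):
    """Count groups of 4-connected 1-cells in a binary grid."""
    rows = len(scene)
    cols = len(scene[0]) if rows else 0
    # Marking: remember which locations already belong to a counted object.
    marked = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):          # shifting focus across the scene
        for c in range(cols):
            if scene[r][c] != 1 or marked[r][c]:
                continue
            count += 1             # an unmarked object has been found
            marked[r][c] = True
            queue = deque([(r, c)])
            while queue:           # spread outward until the object's limits are reached
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and scene[ny][nx] == 1 and not marked[ny][nx]:
                        marked[ny][nx] = True
                        queue.append((ny, nx))
    return count

# Hypothetical scene containing three separate objects.
scene = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_objects(scene))  # prints 3
```

The same elementary steps could be reused for other spatial tasks, such as testing whether two
marked points lie on the same object, which is why a small set of operators can cover many tasks.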
Ullman's studies have been applied by camera developers to recognize motion, such as a person
pointing at a certain object to focus on. The gaming industry and artificial maps have likewise
become fields in which the study was developed further. The professor pointed out that the study
of visual recognition provides a distinct basis for deep learning and artificial intelligence.
Although human psychology, and recognition processes in particular, have been studied widely,
the visual features the human brain uses have not been researched adequately. As a consequence,
the ability of the brain to recognize objects remains a subject of interest. Nevertheless, related
studies in neurobiology have advanced far enough to outline the prospects for deep machine
learning, even for tasks that are challenging for the human brain. It is worth noting that the
basis of machine learning is visual recognition, as the models are trained on images to draw
conclusions and extract information. At this point, a psychological problem arises: it is still an
open question whether the human brain applies the same techniques as machines do to learn. Taking
this into account, Ullman presented evidence to conclude that the processes are different.
Moreover, the same features that the human brain relies on to make judgements about an object
appear to be of little value for machine learning. Interestingly, the studies showed that even a
minute change in the images used for learning may lead to a drastic change in how the object is
perceived, by humans and by machines alike. As expected, machines could not match the ability of
the human mind to adapt quickly to a changing environment and still make accurate statements about
the properties of an object. It proved impossible for the machines to be as sensitive as the human
brain and to recognize small images. At this point, every image must possess some distinguishable
features for the machine to recognize it and extract the necessary information. Conclusively, human