Ullman - AI Conference Trip Report
The conference offered little of interest apart from the talk by Ullman. Professor Ullman
concentrated on the study of deep learning through visual routines. A visual routine is a process
for extracting information from a visual scene. He suggests that human perception of visual
scenes is divided into two stages: a “bottom-up” stage, in which early perceptions are formed,
and a “top-down” stage, in which specific information is extracted from visual objects and
scenes.
It is worth noting that Ullman identifies retinotopic maps: special representations in the
human recognition system. They encode such object properties as direction and speed of motion,
color, and edge orientation. However, these properties relate only to the general operations
performed on the visual input. Accordingly, they do not cover the more specific areas of object
recognition, such as task-specific knowledge and object-specific knowledge.
In light of this, Ullman presents visual routines as complex processes that identify particular
spatial information in the observed scene. The process involves recognizing the basic features of
the scene and then structuring information about an object by means of elementary visual
operations. Specifically, visual operations differ from other psychological processes in that
they concentrate on particular objects in the scene rather than on the whole area.
Professor Ullman singles out several examples of visual operations: focusing on different
objects, identifying the salient object to focus on, shifting focus across the area of interest,
exploring the limits of the area, and marking an object for further reference. Combined, these
operations carry out the sophisticated spatial tasks that humans are able to accomplish, for
example determining the shape of an object or counting the objects in a scene.
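As a rough illustration of how such operations might combine into a counting routine, here is a
minimal Python sketch. It is not taken from the talk: the representation of the scene as a binary
grid, the treatment of an "object" as a 4-connected group of filled cells, and the function names
next_unmarked, spread_from and count_objects are all assumptions made for the example.

```python
from collections import deque


def next_unmarked(scene, marked):
    """Shift focus: return the first filled cell that is not yet marked, or None."""
    for r, row in enumerate(scene):
        for c, value in enumerate(row):
            if value and (r, c) not in marked:
                return (r, c)
    return None


def spread_from(scene, start):
    """Explore the limits of one object: collect every filled cell that is
    4-connected to the starting cell (a breadth-first flood fill)."""
    rows, cols = len(scene), len(scene[0])
    region = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and scene[nr][nc] and (nr, nc) not in region:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def count_objects(scene):
    """Count objects by repeating: focus on an unmarked object, explore it,
    mark it for further reference, then move on."""
    marked = set()
    count = 0
    while True:
        focus = next_unmarked(scene, marked)
        if focus is None:
            return count
        marked |= spread_from(scene, focus)
        count += 1


# Hypothetical scene: 1 marks a filled cell, 0 marks background.
scene = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
print(count_objects(scene))  # prints 3
```

The sketch only shows the compositional idea: a task such as counting is expressed as a sequence
of simpler operations applied to one object at a time rather than to the whole scene at once.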
Ullman's studies have been applied by camera developers to recognize motion and to detect a
person pointing at a certain object in order to focus on it. The gaming industry and artificial
maps have also become fields in which the work has been developed further. Professor Ullman
pointed out that the study of visual recognition provides a distinct basis for deep learning and
artificial intelligence.
Although human psychology, and recognition processes in particular, have been studied widely,
the visual features that the human brain uses have not been researched adequately. As a
consequence, the ability of the brain to recognize objects remains a subject of interest.
Nevertheless, related studies in neurobiology have advanced far enough to indicate the prospects
for developing deep machine learning, even to the point of challenging the human brain. It is
worth noting that visual recognition serves as the basis of machine learning, since the models are
trained on images to draw conclusions and to extract information. At this point a psychological
question arises: it is still unclear whether the human brain uses the same mechanisms to learn as
machines do. Taking this into account, Ullman presented evidence-based findings to conclude that
the processes are different.
Moreover, the features that the human brain relies on to make judgements about an object
appear to be of little value for machine learning. Interestingly, the studies showed that even a
minute change in the images used may lead to a drastic change in how the object is perceived, by
humans and by machines alike. As expected, machines could not match the ability of the human mind
to adapt quickly to a changing environment and still make fair statements about the properties of
the object. It appeared impossible for the machines to be as sensitive as the human brain and to
recognize small images. Every image therefore has to possess some distinguishable features for
the machine to recognize it and to extract the necessary information. Conclusively, human