In this article, Uri Guterman, Head of Product & Marketing for Hanwha Techwin Europe, and Alessia Saggese, Ph.D., Co-Owner at A.I. Tech and Assistant Professor at the University of Salerno, provide an overview of Deep Learning and how the two companies are working together to introduce solutions which will add to the value of video surveillance systems by offering real-life practical benefits.
Progress in the development of artificial intelligence and computer vision has been advancing at such a dramatic rate that doyens of the technologies, such as computer scientist Yann LeCun, have been known to jokingly refer to work done on the subjects before 2012 as ‘prehistoric’. In terms of object recognition, the algorithms available at that time were only 75% accurate. Nowadays, thanks to a deep-learning-based approach, we can expect accuracy to be much higher. In fact, the advances made in the last 12 months mean that we are close to being in a position to seriously consider incorporating the technologies into the majority of video surveillance systems.
It would be wrong to look at Deep Learning as some kind of advanced video analytics software platform, as it represents a paradigm shift within the security sector as to how incidents can be detected and responded to.
What is Deep Learning?
Unlike most forms of video analytics, the Deep Learning application developer does not have to write complicated algorithms for recognising objects. Instead, a Deep Learning solution has the ability to ‘learn from examples’. During an initial training phase, the application is supplied with large amounts of data representing correctly solved examples of the challenge at hand, e.g. classifying a person by age or gender.
A deep network analyses the relationship between the input data and the expected output, such as the gender of a person, and learns how to solve the problem by analogy. As an example, being able to correctly establish the gender of a person requires an AI expert to design, train and validate a deep network which, during the training stage, uses a database of millions of suitably selected faces, each of which is tagged with its known true gender. After several days of learning, the neural network is ready to be put to work and is likely to have an accuracy of approximately 98%, which is about the same as the ability of human beings to do the same thing.
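For readers who would like to see the ‘learn from examples’ idea in concrete terms, the following is a minimal sketch in Python. It is not the method used by A.I. Tech or Hanwha Techwin. It assumes a toy dataset of two-dimensional points in place of face images, and a single-layer logistic classifier in place of a deep network. All function names and parameters are hypothetical and chosen purely for illustration. What it does share with the process described above is the workflow: labelled examples go in, the model adjusts itself to reduce its mistakes, and its accuracy is then measured on examples it has never seen.

```python
import math
import random


def make_examples(n, seed=0):
    """Generate toy labelled examples (assumption: 2-D points, not faces)."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 if label == 1 else -2.0
        features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        examples.append((features, label))
    return examples


def predict_probability(weights, bias, features):
    """Return the model's estimated probability that the label is 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=50, learning_rate=0.1):
    """Fit a single-layer logistic model by stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in examples:
            # Difference between the prediction and the known true label.
            error = predict_probability(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def accuracy(weights, bias, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(
        (predict_probability(weights, bias, features) >= 0.5) == bool(label)
        for features, label in examples
    )
    return correct / len(examples)


# Training phase: the model only ever sees correctly labelled examples.
training_set = make_examples(500, seed=1)
# Validation: accuracy is measured on examples held back from training.
held_out_set = make_examples(200, seed=2)

weights, bias = train(training_set)
print(f"held-out accuracy: {accuracy(weights, bias, held_out_set):.2%}")
```

A real deep network differs in scale rather than in principle: it has many layers and millions of parameters instead of three, which is why training takes days and demands substantial computing resources.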
The Challenge
Deep Learning needs the expertise of machine learning experts together with massive computing resources, as the application needs to be able to cope with ‘in the wild’ conditions, such as changing lighting, shadows, the position of a face, etc. As a result, anything other than the most basic of Deep Learning solutions will need to be run on servers which have adequate amounts of computing power and memory. For Deep Learning to be a practical addition to the majority of video surveillance systems, it is generally accepted that it will require an optimised software architecture so that it can be run at the edge. By this we mean onboard cameras, in the same way that apps are run on smartphones and tablets.
The Hanwha Techwin Partnership
Reducing the processing requirements of Deep Learning so that it can be operated at the edge is no mean feat, which is why Hanwha Techwin is championing the concept of manufacturers of video surveillance solutions working closely with experts in this specialised field, in order to have access to the latest innovations and research.
For Hanwha Techwin, this means working in partnership with A.I. Tech – a spinoff of the Computer Engineering Department – DIEM of the University of Salerno (UNISA), which has a dedicated ‘Intelligent Machines for Video Recognition’ Lab research group. A.I. Tech’s CEO, Mario Vento, is listed as one of the top ranked Italian scientists in engineering and is also among the most cited authors in Italy in the field of Computer Vision and Artificial Intelligence.
A.I. Cameras
Hanwha Techwin is also working on introducing new Wisenet cameras during the latter part of 2019 which will incorporate a computer vision chipset allowing Deep Learning applications to be run onboard the cameras. These new 4K and 5MP cameras, which will be additions to our existing Wisenet P premium camera series, will initially offer more accurate forms of existing types of video analytics. However, they will also provide a platform for our technology partners to use our APIs to introduce groundbreaking Deep Learning applications that integrate seamlessly with the new cameras.