publication

Mobile, CPU-only Deep Learning Inference: How fast can it be?

Speed metrics of a super-optimal multi-core, multi-cluster CPU-only implementation on several mobile devices

Deep Learning is a word these days frequently spoken by executives and engineers in numerous technology fields ranging from mobile systems to home appliances and automotive. Deep learning systems, although unprecedented in inference accuracy, they introduce a high computational complexity, questioning their usability in systems featuring a limited computational capacity such as in the case of embedded systems used in many markets today such as smartphones, IoT, Home Appliances etc. A possible workaround to this problem, is the use of heterogeneous computing: This involves the exploitation of every computing resource present on an embedded system (CPU, GPU, DSP) to which a part of the load is off-loaded increasing this way the overall computational capacity and thus processing speed. There are however cases, where a multicore CPU is the only available resource on an embedded system. So, there is a reasonable question in this case: Is it possible to have a decent inference speed?

In Irida Labs, we specialize on solving problems of such a kind. To this end, we took a step forward and we evaluated the performance of a SqueezeNet CNN model on a multi-core multi-cluster CPU.

The SqueezeNet Model.

The SqueezeNet 1.1 model is able to achieve similar levels of classification accuracy on ImageNet, to the baseline AlexNet architecture, using 50 times fewer coefficients. The smart combination of small convolutional kernels and a complex architecture that enables information to flow through different paths facilitates the construction of sufficiently highorder image representations that are suitable for a large variety of applications. A coefficients’ size of 3 MB, easily reduced further by a factor of 5 via model-compression techniques, makes SqueezeNet a very appealing architecture for embedded implementations.

Download the full-length Paper

Continue reading about the Food Recognition use case, Inference Speed results of SqueezeNet 1.1 model and more.

This paper has been published on March, 2017.

Cookie	Duration	Description
__cf_bm	30 minutes	This cookie, set by Cloudflare, is used to support Cloudflare Bot Management.
bcookie	1 year	LinkedIn sets this cookie from LinkedIn share buttons and ad tags to recognize browser ID.
bscookie	1 year	LinkedIn sets this cookie to store performed actions on the website.
lang	session	LinkedIn sets this cookie to remember a user's language setting.
li_gc	5 months 27 days	Linkedin set this cookie for storing visitor's consent regarding using cookies for non-essential purposes.
lidc	1 day	LinkedIn sets the lidc cookie to facilitate data center selection.
UserMatchHistory	1 month	LinkedIn sets this cookie for LinkedIn Ads ID syncing.

Cookie	Duration	Description
_clck	1 year	Microsoft Clarity sets this cookie to retain the browser's Clarity User ID and settings exclusive to that website. This guarantees that actions taken during subsequent visits to the same website will be linked to the same user ID.
_clsk	1 day	Microsoft Clarity sets this cookie to store and consolidate a user's pageviews into a single session recording.
_ga	2 years	The _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_ga_W6E27R14NE	2 years	This cookie is installed by Google Analytics.
_gat_UA-156119957-1	1 minute	A variation of the _gat cookie set by Google Analytics and Google Tag Manager to allow website owners to track visitor behaviour and measure site performance. The pattern element in the name contains the unique identity number of the account or website it relates to.
_gcl_au	3 months	Provided by Google Tag Manager to experiment advertisement efficiency of websites using their services.
_gid	1 day	Installed by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
AnalyticsSyncHistory	1 month	Linkedin set this cookie to store information about the time a sync took place with the lms_analytics cookie.
attribution_user_id	1 year	This cookie is set by Typeform for usage statistics and is used in context with the website's pop-up questionnaires and messengering.
CLID	1 year	Microsoft Clarity set this cookie to store information about how visitors interact with the website. The cookie helps to provide an analysis report. The data collection includes the number of visitors, where they visit the website, and the pages visited.
CONSENT	2 years	YouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.
nQ_cookieId	1 year	Albacross sets this cookie to help identify companies for better lead generation and more effective ad targeting.
undefined	never	Wistia sets this cookie to collect data on visitor interaction with the website's video-content, to make the website's video-content more relevant for the visitor.

Cookie	Duration	Description
ANONCHK	10 minutes	The ANONCHK cookie, set by Bing, is used to store a user's session ID and also verify the clicks from ads on the Bing search engine. The cookie helps in reporting and personalization as well.
MUID	1 year 24 days	Bing sets this cookie to recognize unique web browsers visiting Microsoft sites. This cookie is used for advertising, site analytics, and other operations.
test_cookie	15 minutes	The test_cookie is set by doubleclick.net and is used to determine if the user's browser supports cookies.
VISITOR_INFO1_LIVE	5 months 27 days	A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSC	session	YSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devices	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-id	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt.innertube::nextId	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requests	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.

Cookie	Duration	Description
_bc_uuid	10 years 3 months 16 days 18 hours	No description available.
AWSALBTG	7 days	No description available.
AWSALBTGCORS	7 days	No description available.
debug	never	No description available.
DEVICE_INFO	5 months 27 days	No description
ln_or	1 day	No description
loglevel	never	No description available.
nQ_userVisitId	30 minutes	No description available.
prism_610420756	1 month	No description
rl_anonymous_id	never	No description available.
rl_user_id	never	No description available.
session_referrer	30 minutes	No description
SM	session	No description available.
tf_respondent_cc	6 months	No description

publication

Mobile, CPU-only Deep Learning Inference: How fast can it be?

Speed metrics of a super-optimal multi-core, multi-cluster CPU-only implementation on several mobile devices

The SqueezeNet Model.​

Download the full-length Paper

The SqueezeNet Model.