Jekyll2019-05-06T14:37:33+02:00http://www.arc2019.tu-darmstadt.de/feed.xmlARC201915th International Symposium on Applied Reconfigurable ComputingTaPaSCo Tutorial2019-04-11T13:15:00+02:002019-04-11T13:15:00+02:00http://www.arc2019.tu-darmstadt.de/session/2019/04/11/tutorial1<p>TBD.</p>Lunch2019-04-11T12:15:00+02:002019-04-11T12:15:00+02:00http://www.arc2019.tu-darmstadt.de/session/2019/04/11/lunch3<p>Session description goes here. It’s a regular post so feel free to do use markup or markdown.</p>Conference Closing2019-04-11T12:00:00+02:002019-04-11T12:00:00+02:00http://www.arc2019.tu-darmstadt.de/session/2019/04/11/closingSession 8 - Convolutional Neural Networks2019-04-11T10:30:00+02:002019-04-11T10:30:00+02:00http://www.arc2019.tu-darmstadt.de/sessions/session8<h3 id="filter-wise-pruning-approach-to-fpga-implementation-of-fully-convolutional-network-for-semantic-segmentation">Filter-wise Pruning Approach to FPGA Implementation of Fully Convolutional Network for Semantic Segmentation</h3>
<p><em>Masayuki Shimoda, Youki Sada, Hiroki Nakahara</em></p>
<p>This paper presents a hardware-aware sparse fully convolutional network (SFCN)
for semantic segmentation on an FPGA. Semantic segmentation attracts interest
because self-driving cars must recognize roads and obstacles at the pixel level.
However, it is hard to implement such a system on embedded devices, since the
SFCN has so many weights that they cannot be stored in the limited on-chip
memory. To realize a good trade-off between speed and accuracy, we construct an
AlexNet-based SFCN which has no skip connections or deconvolution layers,
reducing the computation cost and the latency. Furthermore, we propose a
filter-wise pruning technique that sorts the weights of each filter by their
absolute values and prunes a preset percentage of them, filter by filter,
starting from the smallest. This is more suitable for hardware implementation,
since the number of computations per filter becomes equal. We trained the
AlexNet-based SFCN on the CamVid image dataset and implemented it on a Xilinx
ZCU102 evaluation board. The results show that the FPGA system is 10.14 times
faster than a mobile GPU implementation, and its performance per power
consumption is 24.49 times higher than that of the GPU counterpart.</p>
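<p>The filter-wise pruning step described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names <code>prune_filter</code> and <code>prune_filterwise</code>, the flat-list representation of a filter, and the example weights are all assumptions made for this example.</p>

```python
def prune_filter(weights, prune_ratio):
    """Zero out the smallest-magnitude fraction of one filter's weights.

    `weights` is a flat list of floats for a single filter (an assumed,
    simplified layout); `prune_ratio` is the preset fraction to prune.
    """
    n_prune = int(len(weights) * prune_ratio)
    # Indices of the weights, ordered by ascending absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def prune_filterwise(filters, prune_ratio):
    """Apply the same pruning ratio to every filter independently."""
    return [prune_filter(f, prune_ratio) for f in filters]


# Hypothetical example: two filters with four weights each, 50 % pruned.
filters = [
    [0.9, -0.1, 0.4, -0.05],
    [0.02, -0.7, 0.3, 0.6],
]
pruned = prune_filterwise(filters, 0.5)

# Every filter keeps the same number of non-zero weights, which is the
# property the abstract describes as convenient for hardware.
nonzeros = [sum(1 for w in f if w != 0.0) for f in pruned]
```

<p>Because the ratio is applied per filter rather than across the whole layer, each filter ends up with the same number of remaining weights, so the work per filter is balanced.</p>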
<h3 id="exploring-data-size-to-run-convolutional-neural-networks-in-low-density-fpgas">Exploring Data Size to Run Convolutional Neural Networks in Low Density FPGAs</h3>
<p><em>Ana Goncalves, Tiago Peres, Mário Véstias</em></p>
<p>Convolutional Neural Networks (CNNs) obtain very good results in several
computer vision applications at the cost of high computational and memory
requirements. Therefore, CNNs typically run on high-performance platforms.
However, CNNs can be very useful in embedded systems, and their execution right
next to the source of data has many advantages, such as avoiding the need for
data communication and enabling real-time decisions, turning these systems into
smart sensors. In this paper, we explore data quantization for fast CNN
inference in low-density FPGAs. We redesign LiteCNN, an architecture for
real-time inference of large CNNs in low-density FPGAs, to support hybrid
quantization. We study the impact of quantization on the area, performance and
accuracy of LiteCNN. LiteCNN with improved quantization of activations and
weights improves on the best state-of-the-art results for CNN inference in
low-density FPGAs. With our proposal, it is possible to infer an image in
AlexNet in 7.4 ms on a ZYNQ7020 and in 14.8 ms on a ZYNQ7010 with 3% accuracy
degradation. Other delay-versus-accuracy trade-offs were identified, permitting
the designer to choose the most appropriate one.</p>
<h3 id="faster-convolutional-neural-networks-in-low-density-fpgas-using-block-pruning">Faster Convolutional Neural Networks in Low Density FPGAs using Block Pruning</h3>
<p><em>Tiago Peres, Ana Goncalves, Mário Véstias</em></p>
<p>Convolutional Neural Networks (CNNs) are achieving promising results in
several computer vision applications. Running these models is computationally
very intensive and needs a large amount of memory to store weights and
activations. Therefore, CNNs typically run on high-performance platforms.
However, the classification capabilities of CNNs are very useful in many
applications running on embedded platforms close to data production, since this
avoids data communication for cloud processing and permits real-time decisions,
turning these systems into smart embedded systems. In this paper, we improve the
inference of large CNNs in low-density FPGAs using pruning. We propose block
pruning and apply it to LiteCNN, an architecture for CNN inference that achieves
high performance in low-density FPGAs. With the proposed LiteCNN optimizations,
we have an architecture for CNN inference with an average performance of
275 GOPs for 8-bit data on an XC7Z020 FPGA. With our proposal, it is possible to
infer an image in AlexNet in 5.1 ms on a ZYNQ7020 and in 13.2 ms on a ZYNQ7010
with only 2.4% accuracy
degradation.</p>Coffee Break2019-04-11T10:00:00+02:002019-04-11T10:00:00+02:00http://www.arc2019.tu-darmstadt.de/session/2019/04/11/break5<p>Session description goes here. It’s a regular post so feel free to do use markup or markdown.</p>Session 7 - Safety and Security2019-04-11T09:00:00+02:002019-04-11T09:00:00+02:00http://www.arc2019.tu-darmstadt.de/sessions/session7<h3 id="leveraging-the-partial-reconfiguration-capability-of-fpgas-for-processor-based-fail-operational-systems">Leveraging the Partial Reconfiguration Capability of FPGAs for Processor-Based Fail-Operational Systems</h3>
<p><em>Tobias Dörr, Timo Sandmann, Florian Schade, Falco K. Bapp, Jürgen Becker</em></p>
<p>Processor-based digital systems are increasingly being used in safety-critical
environments. To meet the associated safety requirements, these systems are
usually characterized by a certain degree of redundancy. This paper proposes a
concept to introduce a redundant processor on demand by using the partial
reconfiguration capability of modern FPGAs. We describe a possible
implementation of this concept and evaluate it experimentally. The evaluation
focuses on the fault handling latency and the resource utilization of the
design. It shows that an implementation with 32 KiB of local processor memory
handles faults within 0.82 ms and, when no fault is present, consumes less than
46 % of the resources that a comparable static design occupies.</p>
<h3 id="recofuse-your-prc-or-lose-security-finally-reliable-reconfiguration-based-countermeasures-on-fpgas">(ReCo)Fuse Your PRC or Lose Security: Finally Reliable Reconfiguration-based Countermeasures on FPGAs</h3>
<p><em>Kenneth Schmitz, Buse Ustaoglu, Daniel Große, Rolf Drechsler</em></p>
<p>Partial reconfiguration is a powerful technique to adapt the functionality of
Field Programmable Gate Arrays (FPGAs) at run time. When performing partial
reconfiguration, a dedicated Intellectual Property (IP) component of the FPGA
vendor, i.e. the Partial Reconfiguration Controller (PRC), has to be used among
a wide range of IP components. While ensuring the functional safety of FPGA
designs is well understood, ensuring hardware security is still very
challenging. This applies in particular to reconfiguration-based
countermeasures, which are intensively used to form a moving target for the
attacker. However, from the system security perspective, a critical component is
the above-mentioned PRC, as noted by many papers implementing
reconfiguration-based countermeasures against SCA/DPA attacks. In this work, we
leverage a newly proposed safety mechanism, which creates a container around an
IP, to encapsulate and thereby protect and observe the PRC of an FPGA. The
proposed encapsulation scheme results in an architecture consisting of so-called
ReCoFuses (RCFs), each capturing a specific protective goal which has to be
fulfilled at any time during PRC operation. The terminology follows the
classical electrical installation, including a fuse box. In our scheme we employ
formal verification to guarantee the correctness in detecting a security
violation. Only after successful verification are the RCFs integrated into the
ReCoFuse Container. Experimental results demonstrate the advantage of our
approach by preventing attacks on the
PRC of a system secured by reconfiguration.</p>Registration & Welcome2019-04-11T08:00:00+02:002019-04-11T08:00:00+02:00http://www.arc2019.tu-darmstadt.de/session/2019/04/11/registration3<p>Session description goes here. It’s a regular post so feel free to do use markup or markdown.</p>Social 2 - Visit to ESOC & Dinner2019-04-10T16:45:00+02:002019-04-10T16:45:00+02:00http://www.arc2019.tu-darmstadt.de/sessions/social2<h2 id="esoc-guided-tour">ESOC guided tour</h2>
<p>The tour includes a short introductory film and a visit to <a href="http://www.esa.int/About_Us/ESOC">ESOC’s</a> operations facilities, e.g. the Main Control Room (MCR).</p>
<p>Participants can also take a look at the Rosetta engineering model and further mission-specific control rooms.</p>
<h2 id="dinner">Dinner</h2>
<p>After the guided tour, we will have dinner in the <a href="https://www.comedyhall.de/comedy-hall.de/">Comedy Hall</a>. A bus will take all participants from the conference to ESOC and, after the guided tour, on to the dinner location. If you do not participate in the guided tour, please come directly to Comedy Hall no later than 7:30 pm.</p>
<iframe src="https://www.google.com/maps/embed?pb=!1m18!1m12!1m3!1d2572.3163480186663!2d8.644205415940114!3d49.85530193794861!2m3!1f0!2f0!3f0!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x47bd7a86b7140fad%3A0x4e1d308934ce193f!2sKikeriki+Theater!5e0!3m2!1sde!2sde!4v1554456624258!5m2!1sde!2sde" width="600" height="450" frameborder="0" style="border:0" allowfullscreen=""></iframe>
<p>Use tram lines 6, 7 or 8 towards Eberstadt/Alsbach and get off at stop <em>Bessunger Straße</em>.</p>Invited Talk: Third Party CAD Tools for FPGA Design - A Survey of the Current Landscape2019-04-10T16:15:00+02:002019-04-10T16:15:00+02:00http://www.arc2019.tu-darmstadt.de/sessions/invited-talk<p>The FPGA community is at an exciting juncture in the development of 3rd party CAD tools for FPGA design. Much has been
learned in the past decade in the development and use of 3rd party tools such as RapidSmith, Torc, and IceStorm.
New independent open-source CAD tool projects are emerging which promise to provide alternatives to existing vendor tools. The recent release of the RapidWright tool suggests that Xilinx itself is interested in enabling the user community to develop new use cases and specialized tools for FPGA design.
This talk provides a survey of the current landscape, discusses parts of what has
been learned over the past decade in the author’s work with 3rd party
CAD tool development, and provides some thoughts on the future.</p>
<p><a href="https://ece.byu.edu/faculty/brent_nelson" target="_blank">Brent Nelson</a> is department chair and a professor in the Department of Electrical and Computer Engineering at Brigham Young University. He received his PhD in computer science in 1984 from the University of Utah in the area of VLSI CAD. His current research interests focus on CAD tools for the design of digital electronic systems (especially FPGA-based systems) and high-performance computing applications using FPGAs and GPGPU devices.</p>Session 6 - Design Frameworks and Methodology2019-04-10T15:15:00+02:002019-04-10T15:15:00+02:00http://www.arc2019.tu-darmstadt.de/sessions/session6<h3 id="hybrid-prototyping-for-manycore-design-and-validation">Hybrid Prototyping for Manycore Design and Validation</h3>
<p><em>Leonard Masing, Fabian Lesniak, Jürgen Becker</em></p>
<p>The trend towards more parallelism in information processing is unbroken.
Manycore architectures provide both massive parallelism and flexibility, yet
they raise the level of complexity in design and programming. Prototyping of
such architectures helps in handling this complexity by evaluating the design
space and discovering design errors. Several system simulators exist, but they
can only be used for early software development and interface specification.
FPGA-based prototypes, on the other hand, are restricted by the available FPGA
resources or require expensive multi-FPGA prototyping platforms. We present a
hybrid prototyping approach for manycore systems that consists of an FPGA part
and a virtual part of the architecture on a host system. Hybrid prototyping
requires fewer FPGA resources while retaining its speed advantage and enabling
flexible modeling in the virtual platform. We describe the concept, provide an
analysis of timing accuracy and synchronization of the FPGA with the Virtual
Platform (VP), and show an example in which the hybrid prototype is used for
feature development and evaluation of a scientific manycore architecture. The
hybrid prototype allows us to evaluate a 7x7 architecture on a Virtex-7
XC7VX485T FPGA board which otherwise could only fit a reduced 2x2 design of our
architecture.</p>
<h3 id="evaluation-of-fpga-partitioning-schemes-for-time-and-space-sharing-of-heterogeneous-tasks">Evaluation of FPGA Partitioning Schemes for Time and Space Sharing of Heterogeneous Tasks</h3>
<p><em>Umar Ibrahim Minhas, Roger Woods, Georgios Karakonstantis</em></p>
<p>Whilst FPGAs have been integrated in cloud ecosystems, strict constraints on
mapping hardware to a spatially diverse distribution of heterogeneous resources
at run time make their utilization for shared multitasking challenging. This
work aims at analyzing the effects of such constraints on the achievable compute
density, i.e., the efficiency in utilization of available compute resources. A
hypothesis is proposed that uses static off-line partitioning and mapping of
heterogeneous tasks to improve space sharing on FPGAs. The hypothetical approach
allows the FPGA resource to be treated as a service from a higher level and
supports multi-task processing without the need for low-level infrastructure
support. To evaluate the effects of existing constraints on our hypothesis, we
implement a relatively comprehensive suite of ten real high-performance
computing tasks and produce multiple bitstreams per task for fair evaluation of
the various schemes. We then evaluate and compare our proposed partitioning
scheme to previous work in terms of achieved system throughput. The simulated
results for large queues of mixed intensity (compute and memory) tasks show that
the proposed approach can provide higher than 3× system speedup. The execution
on the Nallatech 385 FPGA card for selected cases suggests that our approach can
provide on average 2.9× and 2.3× higher system throughput for compute
and mixed intensity tasks while 0.2× lower for memory intensive tasks.</p>