SASIMI 2018: THE 21ST WORKSHOP ON SYNTHESIS AND SYSTEM INTEGRATION OF MIXED INFORMATION TECHNOLOGIES
PROGRAM FOR MONDAY, MARCH 26TH

09:20-10:20 Session K: Keynote Speech
Chair:
Mineo Kaneko (Japan Advanced Institute of Science and Technology, Japan)
09:20
Peter Marwedel (TU Dortmund, Germany)
Cyber-Physical Systems: Opportunities, Challenges, and (Some) Solutions
SPEAKER: Peter Marwedel

ABSTRACT. Information processing was initially associated with large mainframe computers. Later, emphasis shifted towards office applications. More recently, miniaturization also enabled the integration of information processing and the physical environment. This is reflected in the introduction of the term "Cyber-Physical Systems" (CPS). Such systems can be defined as integrations of computation and physical processes. The same term also denotes connecting information about physical objects to the Internet, a combination also called the "Internet of Things" (IoT). It is expected that most future applications of information technology will be CPS/IoT-related. For CPS/IoT, there will be a complex interaction between the physical environment, hardware platforms and the software running on them. Mechanical, energy, thermal and dependability issues have to be considered. Sensors may be providing huge amounts of data requiring sophisticated data analysis in order to generate reliable information. Knowledge beyond classical hardware or software design is required for understanding and for performing research on these topics. The talk will start with an introduction to the terms and emphasize the huge potential of CPS. This will be followed by a presentation of major challenges for designing such systems and exemplary results from cooperative work at the collaborative research center SFB 876 at TU Dortmund (see http://sfb876.tu-dortmund.de). Final remarks will concern future directions.

10:20-12:00 Session R1: Regular Poster Session I
Chairs:
Masashi Imai (Hirosaki University, Japan)
Koyo Nitta (NTT, Japan)
10:20
Hsin-I Wu (National Tsing Hua University, Taiwan)
Chi-Kang Chen (Industrial Technology Research Institute, Taiwan)
Da-Yi Guo (National Tsing Hua University, Taiwan)
Ren-Song Tsay (National Tsing Hua University, Taiwan)
[R1-1]
A Highly Efficient Virtualization-Assisted Approach for Full-System Virtual Prototypes
SPEAKER: Hsin-I Wu

ABSTRACT. In this paper we propose a unique virtualization-assisted approach that allows offloading of simulations onto host machine hardware devices. The purpose is to build a highly efficient virtual prototype that retains system debugging and performance estimation capabilities. The virtualization-assisted technique lets users specify stopping conditions on host hardware devices and enables a data-dependency-based synchronization process such that system interactions among hardware-assisted components (HACs) and software-modeled components (SMCs) can be executed efficiently and accurately in a deterministic chronological order, which is critical for debugging purposes. To achieve maximal efficiency, we also leverage an existing virtual machine framework for efficient timing synchronization and fast inter-component communication. Finally, we incorporate a quality of service (QoS)-aware bus contention timing model in order to accurately analyze system timing behaviors and performance. We implement the proposed approach on a hardware virtualization-enabled off-the-shelf System-on-Chip (SoC) board to demonstrate the effectiveness of our idea. The experimental results show that the inter-component interactions are deterministically simulated while running 12 to 77 times faster than a commercial functional simulator, and that the performance estimates differ from real systems by only 3 to 6%.

10:22
Saki Hatta (NTT, Japan)
Nobuyuki Tanaka (NTT, Japan)
Takeshi Sakamoto (NTT, Japan)
[R1-2]
Area-efficient Programmable Finite-state Machine Toward Next Generation Access Network SoC
SPEAKER: Saki Hatta

ABSTRACT. This paper presents an area-efficient programmable finite-state machine (P-FSM) for the next generation of optical access network systems on a chip (SoC). The P-FSM is designed to minimize both the logic and memory area using a specific architecture for state management of communications protocols. It consists of a state processor to handle various types of state-transition diagrams, instruction memory to change and implement additional state management, and a sequencer to decide an address for instruction memory. Cycle-level simulation shows that the P-FSM can meet the performance requirements with flexibility. The processing time for an example state management task is one-ninetieth of that on the ARM processor. The total designed area from RTL synthesis is only 1.5% of the area of a conventional communications SoC.

10:24
Kenshi Ito (Osaka University, Japan)
Yoshinori Takeuchi (Osaka University, Japan)
[R1-3]
Performance Analysis of Temporal Codings for Spiking Neural Network
SPEAKER: Kenshi Ito

ABSTRACT. Spiking neural networks (SNNs) are neuromorphic computing models mimicking brain activity, which are expected to achieve inference performance equivalent to that of artificial neural networks with low energy consumption. Since training SNNs with spike-timing-dependent plasticity incurs an intensive computational cost, recent studies have focused on how to reduce the computational cost of the SNN training phase. To address this issue, we analyzed the multivariate trade-off between time resolution, training time, and inference performance. The experimental results showed that setting an appropriate time resolution can reduce SNN training time by 50% with negligible accuracy degradation.

10:26
Pin-Xin Liao (National Tsing Hua University, Taiwan)
Ting-Chi Wang (National Tsing Hua University, Taiwan)
[R1-4]
A Bus-Aware Global Router
SPEAKER: Ting-Chi Wang

ABSTRACT. In this paper, we present a bus-aware global router that handles the length-matching issue of buses by modifying a well-known global router, NTHU-Route 2.0, with the following enhancements: (1) a new net ordering determination method for rip-up and reroute, and (2) a length-bounded hybrid unilateral monotonic routing method. The experimental results show that our router can successfully solve 9 of 11 test cases without causing any overflow. In particular, for one of the 9 test cases, NTHU-Route 2.0 cannot completely eliminate its total overflow. In addition, our global router can produce a high quality solution in terms of total bus wirelength deviation, while maintaining comparable total wirelength and runtime efficiency.

10:28
Chih-Ko Yang (National Chiao Tung University, Taiwan)
Hao-Chiao Hong (National Chiao Tung University, Taiwan)
[R1-5]
A Firmware for Improving the Writing Performance of Multi-Chip MLC NAND Flash Memory Systems
SPEAKER: Hao-Chiao Hong

ABSTRACT. The writing speed of the NAND flash memory limits the performance of high-capacity NAND-flash-based storage systems. Built-in functions of the NAND flash controller such as the cache programming functions, multi-plane functions, and the interleaved writing scheme are used to enhance the writing speed of the multi-chip NAND flash memory. Although these techniques help to improve the writing speed of the multi-level cell (MLC) NAND flash memory, the enhancement is limited because the programming time of the LSB and that of the MSB of the MLC NAND flash cell are significantly different. The conventional writing sequence leaves the MLC flash chips not fully active, and thus the system cannot achieve the optimal writing speed. This paper presents a firmware procedure that achieves the optimal writing speed of the multi-chip MLC NAND flash memory by reordering the interleaved writing sequence of the MLC chips. The proposed procedure does not conflict with any NAND flash translation layer (FTL) algorithm used by the flash controller. As a result, this method can be directly adopted by the system no matter what FTL algorithm is used. In addition, it improves the writing performance by simply changing the controller's firmware, not the hardware. According to our experimental results, the writing speed is improved by more than 12.5%. The improvement depends on the practical characteristics of the MLC flash memory chips.

10:30
Yuki Tanaka (Kyoto University, Japan)
Song Bian (Kyoto University, Japan)
Masayuki Hiromoto (Kyoto University, Japan)
Takashi Sato (Kyoto University, Japan)
[R1-6]
A PUF Based on the Instantaneous Response of Ring Oscillator Determined by the Convergence Time of Bistable Ring
SPEAKER: Yuki Tanaka

ABSTRACT. A novel PUF that significantly improves resistance against machine learning attacks is proposed. By utilizing the strong nonlinearity of the convergence time of bistable rings (BRs) with respect to threshold voltage variation, the proposed PUF generates its response as the instantaneous value of a ring oscillator at the convergence time of a BR running in parallel. SPICE simulations show that the accuracy with which a support vector machine predicts the responses of the proposed PUF is close to the ideal value of 50%.

10:32
Wakako Nakano (Kwansei Gakuin University, Japan)
Nagisa Ishiura (Kwansei Gakuin University, Japan)
[R1-7]
Extended Distributed Control for Dynamic Scheduling across Dataflow Graphs
SPEAKER: Wakako Nakano

ABSTRACT. This paper extends distributed control for run-time scheduling of variable latency operations so that it can execute operations in more than two DFGs in parallel. In contrast to the conventional high-level synthesis methods, which determine the scheduling of operations statically and control a datapath with a single finite state machine, the distributed control enables dynamic adjustment of the execution timing of variable latency operations by reining functional units with multiple finite state machines. Although Shimizu extended Del Barrio's distributed controller to handle control dataflow graphs (CDFGs) consisting of multiple dataflow graphs (DFGs), in which dynamic operation motion across DFGs was possible, its formulation allows parallel execution of operations in at most two DFGs. This paper proposes a new formulation of the distributed control in which operations in more than two DFGs may be executed in parallel. A preliminary experiment shows that our new method can reduce execution cycles when data dependency does not prohibit parallel execution of multiple DFGs.

10:34
Shinya Kuwamura (FUJITSU LABORATORIES LTD., Japan)
Satoshi Kazama (FUJITSU LABORATORIES LTD., Japan)
Eiji Yoshida (FUJITSU LABORATORIES LTD., Japan)
Junji Ogawa (FUJITSU LABORATORIES LTD., Japan)
Takashi Miyoshi (FUJITSU LABORATORIES LTD., Japan)
Yasuo Noguchi (FUJITSU LABORATORIES LTD., Japan)
[R1-8]
Near-Data Processing for Genome Analysis Using Software-Controlled SSD
SPEAKER: Shinya Kuwamura

ABSTRACT. A challenge for in-memory processing is to reduce the high cost of expensive DRAM. Furthermore, CPU performance becomes a bottleneck depending on the application. The proposed near-data processing method, which offloads computing to an SSD, solves these two problems. We developed a dedicated SSD equipped with FPGA-based processing units close to the NAND flash memories and applied it to genome analysis. As a result, our system was able to process data two times faster than an in-memory method on a 24-core CPU.

10:36
Nobutaka Kito (Chukyo University, Japan)
Yurie Koketsu (Chukyo University, Japan)
Kazuyoshi Takagi (Kyoto University, Japan)
[R1-9]
Designs of Component Circuits for Stochastic Computing Using Rapid Single Flux Quantum Circuits
SPEAKER: Nobutaka Kito

ABSTRACT. Designs of component circuits for stochastic computing using rapid single flux quantum (RSFQ) circuits are presented. A design of a stochastic number generator and a design of a stochastic-to-binary converter are proposed. In those designs, voltage pulses of RSFQ circuits are used for representing ones in a bit-stream of a signal in stochastic computing. By using special gates of RSFQ circuits, they are implemented as simple circuits. A layout for 4-bit multiplication using those proposed designs is shown. By comparison with circuits designed by various methods, it is suggested that low-precision arithmetic circuits can be implemented in smaller area by using stochastic computing.
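
As an illustration only, and not the authors' RSFQ design, the roles of the two proposed components can be sketched in software. The sketch assumes the common unipolar encoding, in which a value in [0, 1] is the fraction of ones in a bit-stream, and multiplication by a bitwise AND of two independent streams; neither assumption is stated in the abstract. All function names are hypothetical.

```python
import random


def stochastic_number_generator(value, length, rng):
    """Return a bit-stream whose fraction of ones approximates `value` (0..1)."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def stochastic_to_binary(stream):
    """Count the ones (pulses) in the stream and normalize by its length."""
    return sum(stream) / len(stream)


def stochastic_multiply(a, b, length=4096, seed=0):
    """Approximate a * b by ANDing two independently generated bit-streams."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    stream_a = stochastic_number_generator(a, length, rng)
    stream_b = stochastic_number_generator(b, length, rng)
    product_stream = [x & y for x, y in zip(stream_a, stream_b)]
    return stochastic_to_binary(product_stream)
```

Calling `stochastic_multiply(0.5, 0.25)` returns an estimate near 0.125; the error shrinks as `length` grows, which is the usual precision versus stream length trade-off in stochastic computing.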

10:38
Shogo Matsumoto (Kyoto University, Japan)
Hidenori Gyoten (Kyoto University, Japan)
Masayuki Hiromoto (Kyoto University, Japan)
Takashi Sato (Kyoto University, Japan)
[R1-10]
A Feasibility Study of Annealing Processor for Fully-Connected Ising Model Based on Memristor/CMOS Hybrid Architecture
SPEAKER: Shogo Matsumoto

ABSTRACT. The number of connections of a node in Ising model solvers is a determining factor of their computational efficiency. In this paper, a new architecture for the Ising model solver that runs on a memristor/CMOS hybrid circuit is proposed. By utilizing the similarity between memristor crossbar circuits and the Ising model in terms of the product-sum operation, fully-connected spins are realized. Through numerical experiments, it is demonstrated that the proposed annealing processor can realize a fully-connected network with only six times larger power consumption than the conventional CMOS annealing processor having four connections for each node.
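
The product-sum operation mentioned in this abstract can be made concrete with a small software sketch. It assumes the standard Ising energy without an external field, E = -1/2 * sum_i s_i * h_i with local field h_i = sum_j J_ij * s_j, and a symmetric coupling matrix J with a zero diagonal; the function names are hypothetical and nothing here models the memristor circuit itself.

```python
def local_fields(coupling, spins):
    """Product-sum h_i = sum_j J_ij * s_j, one row of the coupling matrix per spin."""
    return [sum(j * s for j, s in zip(row, spins)) for row in coupling]


def ising_energy(coupling, spins):
    """Energy E = -1/2 * sum_i s_i * h_i for symmetric J with zero diagonal."""
    fields = local_fields(coupling, spins)
    return -0.5 * sum(s * h for s, h in zip(spins, fields))
```

In a fully-connected model every row of `coupling` may be dense, so each local field is a full vector dot product; this is the kind of operation a crossbar array evaluates in parallel.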

10:40
Hiroyuki Baba (Fukuoka University, Japan)
Tongxin Yang (Fukuoka University, Japan)
Masahiro Inoue (Fukuoka University, Japan)
Kaori Tajima (Fukuoka University, Japan)
Tomoaki Ukezono (Fukuoka University, Japan)
Toshinori Sato (Fukuoka University, Japan)
[R1-11]
A Carry-Predicting Full Adder for Accuracy-Scalable Computing
SPEAKER: Hiroyuki Baba

ABSTRACT. Approximate computing is one of the promising paradigms to enable high speed, small area, and low power, which are essential properties for modern applications such as IoT devices. This paper proposes an approximate adder that is scalable in accuracy and in power consumption. It is named Carry-Predicting Adder (CPA). The CPA is based on a previously proposed adder and improves its accuracy by conducting carry prediction. From gate-level simulations, it is found that the CPA improves the accuracy over the existing approximate adder by around 50% with comparable energy efficiency.

10:42
Hongjie Xu (Kyoto University, Japan)
Jun Shiomi (Kyoto University, Japan)
Tohru Ishihara (Kyoto University, Japan)
Hidetoshi Onodera (Kyoto University, Japan)
[R1-12]
A Hybrid Caching System Using SRAM and Standard-Cell Memory for Energy-Efficient Near-Threshold Circuits
SPEAKER: Hongjie Xu

ABSTRACT. This paper proposes an energy-efficient heterogeneous caching system which combines SRAM and Standard-Cell Memory (SCM). The aggressively voltage-scaled SCM, with its fully digital structure, achieves better energy efficiency than the conventional SRAM, but it consumes several times more area. By inserting a small-capacity cache composed of the SCM as a lower-level cache than the cache composed of the conventional SRAM, the proposed hybrid system exploits this high energy efficiency while keeping the area overhead to a minimum. Simulation results using 65nm process technologies show that the proposed system reduces the dynamic energy consumption by 86% in the best case compared with the SRAM-only counterpart under the same area constraint.

10:44
Kazuyuki Sakata (Renesas Electronics Corporation, Japan)
Takashi Hasegawa (Sony LSI Design, Japan)
Kouji Ichikawa (DENSO CORPORATION, Japan)
Toshiki Kanamoto (Hirosaki University, Japan)
[R1-13]
Prediction of the Impact of Mutual Inductance on Timing Towards Nano-scale SoC
SPEAKER: Kazuyuki Sakata

ABSTRACT. This paper suggests a method to predict the impact of mutual inductance (M) on interconnect signal delay estimation according to resistance (R), self-inductance (L), and capacitance (C) in a nano-scale system on a chip (SoC). The proposed method first calculates the difference in delay between RLC and RLMC wire models for a set of parameter variations, then builds response surface functions (RSF) using physical parameters including wire width and spacing. The proposed method helps to formulate design rules that avoid mutual inductance effects.

10:46
Jon T. Butler (Naval Postgraduate School, United States)
Tsutomu Sasao (Meiji University, Japan)
[R1-14]
Analysis of Cyclic Row-Shift Decompositions for Index Generation Functions
SPEAKER: Jon T. Butler

ABSTRACT. We solve the mystery of why row-shifts excel in minimizing the variables needed to represent index generation functions. Given k, the number of indices, we compute two thresholds. The first is an upper threshold U such that, for U <= k, no function is realized by a row-shift. The second is a lower threshold L such that, for k <= L, all functions are realized. We show that an "avalanche" region exists between the two thresholds, where functions abruptly change from realizable to not realizable.

10:48
Takumi Okamoto (Hiroshima University, Japan)
Tetsushi Koide (Hiroshima University, Japan)
Toru Tamaki (Hiroshima University, Japan)
Bisser Raytchev (Hiroshima University, Japan)
Kazufumi Kaneda (Hiroshima University, Japan)
Shigeto Yoshida (Medical Corporation JR Hiroshima Hospital, Japan)
Hiroshi Mieno (Medical Corporation JR Hiroshima Hospital, Japan)
Shinji Tanaka (Hiroshima University Hospital, Japan)
[R1-15]
Investigation of Real-Time Computer-Aided Diagnosis system using CNN feature and SVM identifier with Colorectal Endoscopic Images
SPEAKER: Takumi Okamoto

ABSTRACT. With the increase in colorectal cancer patients in recent years, the need for quantitative evaluation of colorectal cancer has increased, and a computer-aided diagnosis (CAD) system which supports the doctor's diagnosis is essential. We propose a CAD system using a Convolutional Neural Network (CNN) feature and a Support Vector Machine (SVM) identifier. In the evaluation results, the Type 1 vs. 2A&3 identifier achieves about 95% in accuracy, true positive rate, precision, and F-measure. The 6th and 7th fully-connected layers achieve <90% accuracy, which is a better result than with other features in Type 2A vs. 3.

10:50
Akito Hoshide (The University of Kitakyushu, Japan)
Bo Liu (The University of Kitakyushu, Japan)
Shigetoshi Nakatake (The University of Kitakyushu, Japan)
[R1-16]
An Implementation of Low-cost Wireless Sensor Network for Wide-area Disaster

ABSTRACT. This work presents a new wireless sensor network for a wide-area disaster, where each sensor module is low-cost and has a binary function that judges "dangerous" or "safe" by comparison with a preset threshold. The binary value is transmitted to a monitoring server by LED lighting as well as by wireless communication over ZigBee. Focusing on sensing the oxygen concentration in case of a forest fire, in this work we introduce a reliable sensing mechanism combining error correction by a smoothing filter and majority voting.
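
A minimal software sketch of the sensing mechanism named in this abstract is given below. It is not the authors' implementation: the trailing moving average, the window size, the rule that a reading below the threshold means "dangerous", and voting across several module judgments are all assumptions made for illustration, and every name is hypothetical.

```python
def moving_average(samples, window=3):
    """Smooth a sequence of readings with a trailing moving average."""
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def judge(samples, threshold, window=3):
    """Return True ("dangerous") if the latest smoothed reading is below threshold."""
    return moving_average(samples, window)[-1] < threshold


def majority_vote(judgments):
    """Return True if more than half of the binary judgments are True."""
    return sum(judgments) * 2 > len(judgments)
```

With a threshold of 18.0, the readings `[21.0, 21.0, 15.0]` are judged safe because smoothing absorbs the single low sample, while `[17.0, 16.0, 15.0]` are judged dangerous. The vote then tolerates one disagreeing judgment out of three.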

10:52
Ayano Takezaki (Kobe University, Japan)
Shogo Ohmura (Kobe University, Japan)
Naoki Katayama (Kobe University, Japan)
Tetsuya Hirose (Kobe University, Japan)
Nobutaka Kuroki (Kobe University, Japan)
Masahiro Numa (Kobe University, Japan)
[R1-17]
An Error Diagnosis Technique Based on Unsatisfiable Cores to Extract Error Locations Sets
SPEAKER: Ayano Takezaki

ABSTRACT. This paper presents an error diagnosis technique based on unsatisfiable (UNSAT) cores to extract error location sets. By using a SAT solver, we can extract the cause of contradiction from the circuit as UNSAT cores without being influenced by the size and structure of the circuit. Experimental results have shown that the proposed technique is effective in reducing error location sets for rectifying large circuits with complicated structures.

10:54
Chi-Kang Chen (Industrial Technology Research Institute, Taiwan)
Hsin-I Wu (National Tsing Hua University, Taiwan)
Cheng-Lin Tsai (National Tsing Hua University, Taiwan)
Ren-Song Tsay (National Tsing Hua University, Taiwan)
[R1-18]
A Reuse-Distance Based Approach for Early-Stage Multi-level Cache Design Optimization
SPEAKER: Chi-Kang Chen

ABSTRACT. We propose a systematic multi-level cache analysis method, which generalizes the reuse distance concept for early-stage multi-level cache design optimization. The proposed approach demonstrates a speedup of 150 to 250 over the traditional simulation-based approach. Compared to real simulation results, the average error is 0.71% (L2) and 1.1% (L3). An analytical model is also introduced, which provides designers with insights into the relationship between design decisions and results. Therefore, designers are able to make proper decisions at an early design stage.
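
For readers unfamiliar with the reuse distance concept that this abstract builds on, a minimal single-level sketch follows. It uses the textbook definition (the number of distinct addresses touched between two consecutive accesses to the same address) and the known property that a fully associative LRU cache with C lines hits exactly when the reuse distance is below C. The multi-level generalization proposed by the authors is not reproduced here, and the function names are hypothetical.

```python
def reuse_distances(trace):
    """Return, per access, the number of distinct addresses touched since the
    previous access to the same address (None for a first access)."""
    stack = []  # LRU stack: most recently used address at the end
    distances = []
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)
        stack.append(addr)
    return distances


def lru_hits(trace, capacity):
    """Count hits of a fully associative LRU cache holding `capacity` lines."""
    return sum(1 for d in reuse_distances(trace)
               if d is not None and d < capacity)
```

One pass over the trace yields the hit count for every capacity at once, which is why analyses of this kind can be much faster than simulating each cache configuration separately.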

12:00-13:30 Lunch Break
13:30-14:20 Session I1: Invited Talk I
Chair:
Akihiko Miyazaki (NTT, Japan)
13:30
Sri Parameswaran (The University of New South Wales, Australia)
Heterogeneous Multi-Processor Pipelines: a Real-Time MPSoC Story

ABSTRACT. Several modern systems, from ubiquitous mobile phones to powerful gaming machines, contain multiple heterogeneous processing cores. In a modern phone, for example, a general purpose processor manages the human machine interface, while a DSP manages the baseband signal processing. Usually, these heterogeneous multi-processor systems isolate tasks, and execute the isolated tasks in separate processors.

In this talk a novel heterogeneous multiprocessor pipeline system is described, where a single real-time streaming application is executed by multiple processors. The processors in the system are connected in a pipeline via queues (FIFOs) which allow communication at a higher bandwidth, devoid of the contention exhibited by a typical shared bus architecture. Recent developments in Application Specific Instruction Set Processors (particularly from Tensilica Inc) have driven the creation of these multiprocessor pipelines with ASIPs as the building blocks. Each ASIP in the pipeline is customized with differing additional instructions, and instruction and data cache sizes, to improve the performance of the task mapped on that particular ASIP. As a result, the performance of the whole system is improved, while minimizing the increase in area. The permutations of configurations for each ASIP make up the design space of the pipelined multiprocessor system, which is rapidly explored. An automated design methodology to choose the best ASIP configurations as the final design in a reasonable amount of time is shown. The rapid exploration methodology used is able to explore design spaces of up to 10^16 design points, which is almost impossible to explore otherwise. Finally, the possibilities of merging pipeline systems are discussed.

14:20-16:00 Session R2: Regular Poster Session II
Chairs:
Masato Inagi (Hiroshima City University, Japan)
Tomoaki Ukezono (Fukuoka University, Japan)
14:20
Satoru Maruyama (Ritsumeikan University, Japan)
Ankur Gupta (IIT Roorkee, India)
Sudip Roy (IIT Roorkee, India)
Shigeru Yamashita (Ritsumeikan University, Japan)
[R2-1]
Placement of Reagents on Programmable Microfluidic Devices
SPEAKER: Satoru Maruyama

ABSTRACT. Recently, a new type of biochip called Programmable Microfluidic Device (PMD) has been proposed. It is indispensable to load sample reagents into PMD cells so that the reagents are placed as required to perform a subsequent bio-protocol. However, to the best of our knowledge, no efficient method has been reported for the task. Thus, this paper considers two methods for the task: one is a relatively naive method, and the other is our newly proposed method. Our proposed method can efficiently handle the property of flows that a flow overrides the effect of preceding flows. Our preliminary experiment indeed confirms the efficiency of our proposed method.

14:22
Yuuki Imai (Kyoto University, Japan)
Tohru Ishihara (Kyoto University, Japan)
Hidetoshi Onodera (Kyoto University, Japan)
Akihiko Shinya (NTT, Japan)
Shota Kita (NTT, Japan)
Kengo Nozaki (NTT, Japan)
Kenta Takata (NTT, Japan)
Masaya Notomi (NTT, Japan)
[R2-2]
An Integrated Optical Parallel Multiplier based on Nanophotonic Analog Adders and Optoelectronic AD Converters
SPEAKER: Yuuki Imai

ABSTRACT. Integrated optical circuits with nanophotonic devices have attracted attention in recent years. Optical circuits composed of optical wires and optical switches have the potential for low-power operation and light-speed computation. Owing to this potential, high-performance arithmetic units are expected to be realized using nanophotonic devices. This paper first proposes a method of optical multi-valued addition and an architecture of an optical parallel multiplier based on the addition. Next, with optoelectronic circuit simulation, this paper compares the performance of a CMOS parallel multiplier and the proposed optical parallel multiplier. The results show that the optical multiplier is more than three times faster than the CMOS multiplier.

14:24
Kodai Abe (Ritsumeikan University, Japan)
Kentaro Haneda (Ritsumeikan University, Japan)
Shigeru Yamashita (Ritsumeikan University, Japan)
[R2-3]
On Optimization Methods for Decision Diagrams to Represent Probabilities
SPEAKER: Kodai Abe

ABSTRACT. In this paper, we compare two compression methods for Binary Decision Diagrams for Probabilities (BDDPs). BDDPs are used to represent probabilities for the error analysis of logic circuits. In the previous work, divisions are used to make BDDPs canonical, which may cause approximation errors. Thus, in this paper, we propose another method to use subtractions to make BDDPs canonical. To do so, we carefully derive the definition of BDDP nodes, and then we can successfully have a recursive AND operation for newly proposed BDDPs as well. We show a preliminary experiment to confirm that our newly proposed BDDPs should be better than the previously proposed BDDPs.

14:26
Yuto Ishihara (Saitama University, Japan)
Shinichi Nishizawa (Saitama University, Japan)
Kazuhito Ito (Saitama University, Japan)
[R2-4]
Minimization of Equality Check for Soft Error Detection in DMR Design Implemented with Error Correction by Operation Re-execution
SPEAKER: Kazuhito Ito

ABSTRACT. Double modular redundancy (DMR) executes an operation twice and detects soft errors by comparing the operation results. An error is corrected by executing the necessary operations again. The re-execution of operations is controlled by a circuit such as a sequencer or a finite state machine, and minimizing the control circuit complexity is one of the objectives of design optimization. In this paper, a method to minimize the required number of comparison operations to detect a soft error in DMR is proposed, thereby minimizing the schedule of the operation re-execution and hence minimizing the control circuit complexity. The experimental results show that the proposed method efficiently obtains results very close to the optimum.
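
The basic DMR loop described in the first two sentences of this abstract (execute twice, compare, re-execute on mismatch) can be sketched as follows. This is only the unoptimized baseline with an equality check after every operation; the comparison-minimization method proposed by the authors is not shown. `FlakyAdder` and `dmr_execute` are hypothetical names, and the fault model (flipping the least significant bit on chosen calls) is an assumption for demonstration.

```python
class FlakyAdder:
    """Hypothetical adder that corrupts the result of selected calls."""

    def __init__(self, faulty_calls):
        self.faulty_calls = set(faulty_calls)
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        result = a + b
        if self.calls in self.faulty_calls:
            result ^= 1  # emulate a soft error: flip the least significant bit
        return result


def dmr_execute(operation, args, max_retries=3):
    """Run `operation` twice and re-execute both copies if the results differ.

    Returns the agreed result and the number of re-executions needed.
    """
    for attempt in range(max_retries + 1):
        first = operation(*args)
        second = operation(*args)
        if first == second:  # equality check: no soft error detected
            return first, attempt
    raise RuntimeError("results still disagree after re-execution")
```

For example, with `adder = FlakyAdder({2})`, the call `dmr_execute(adder, (3, 4))` detects the corrupted second execution, re-executes, and returns `(7, 1)` after four calls to the adder in total.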

14:28
Qiaochu Zhao (Osaka University, Japan)
Ittetsu Taniguchi (Osaka University, Japan)
Makoto Nakamura (Laboratory of Hi-Think Corporation, Japan)
Takao Onoye (Osaka University, Japan)
[R2-5]
An Efficient Parts Counting Method based on Intensity Distribution Analysis for Industrial Vision Systems
SPEAKER: Qiaochu Zhao

ABSTRACT. In this paper, we propose an efficient parts counting method based on intensity distribution analysis for industrial vision systems. Counting products, as a preliminary operation on an assembly line, is essential for calculating many industrial indices such as the deficiency rate. The conventional approach to the counting problem is based on template matching, which we consider both inflexible and time-consuming. In the proposed approach, the counting problem is converted into an equivalent classification problem, in which a trained classifier is used to classify whether a specific line segment region belongs to a part or not. While parts flow through this line segment, the number of parts that have passed can be effectively counted according to the alternation of the classification results. Experiments revealed that the proposed method outperforms the conventional template-matching method, counting with a significant improvement in speed as well as with higher accuracy and stronger robustness. We also consider that the proposed method can be readily extended to data with similar properties.

14:30
Juinn-Dar Huang (National Chiao Tung University, Taiwan)
Chia-Hung Liu (National Chiao Tung University, Taiwan)
Wei-Hao Yang (National Chiao Tung University, Taiwan)
[R2-6]
Versatile Ring-Based Architecture for General-Purpose Digital Microfluidic Biochips
SPEAKER: Juinn-Dar Huang

ABSTRACT. A digital microfluidic biochip (DMFB) is an emerging tiny device that can carry out a rich set of bioassays without the need for bulky equipment. However, it is still a big challenge to design a general-purpose DMFB architecture today. NP-hard synthesis problems make on-line synthesis virtually impossible on existing array-based architectures. In this paper, we first elaborate on the major concerns in a DMFB design flow, from the aspect of synthesis and physical design. We then propose a low-cost versatile ring-based architecture, VERBA, and its corresponding fast one-pass synthesis flow. Experimental results show that VERBA incorporated with the proposed synthesis flow is a better solution than existing architectures and synthesis algorithms, especially for real-time cyber-physical systems.

14:32
Takuya Yamauchi (ams Japan Co., Ltd., Japan)
Tetsuro Okura (ams Japan Co., Ltd., Japan)
[R2-7]
A Deep Neural Network Based Approach to Achieve Aesthetic Schematics
SPEAKER: Takuya Yamauchi

ABSTRACT. This paper proposes a novel method that applies a deep neural network to automatically make schematics aesthetic. A deep feedforward network composed of 2 convolutional layers and a fully connected layer is trained with stochastic gradient descent in supervised learning. Entangled schematics and the corresponding preferred schematic edit commands are encoded into vectors which the neural network can accept, and used as the training data set. Of all provided schematics, 33% are used for training, and the trained neural network successfully cleaned up 95% of the schematics separated from the training data.

14:34
Daiki Hara (National Institute of Technology Nagano College, Japan)
Takefumi Yoshikawa (National Institute of Technology Nagano College, Japan)
[R2-8]
A Low Power Data Bus Architecture by Charge Recycling Utilization on Single-Ended Transmission Line
SPEAKER: Daiki Hara

ABSTRACT. This paper proposes a novel data bus architecture for ultra-wide (>1K bits) transmission lines, which is applicable to vertical chip-to-chip data communication in 3D chip-stacked packaging. The transmission lines are configured as a single-ended, two-story (bottom and top floors) data bus with 1/2VDD potential. This configuration enables charge recycling from the top to the bottom floor for specific data patterns to suppress power consumption. Furthermore, simple data coding by polarity inversion is adopted to reduce the number of data transitions from logical 0 (zero) to 1 (one) and enhance the charge-recycling probability. A system simulation shows a possibility of more than 40% power reduction for data transmission at the given number of pseudo random data transmissions.
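
The polarity-inversion coding mentioned in this abstract is not specified in detail, so the following is only a sketch under stated assumptions: each word is sent either as-is or inverted, whichever causes fewer 0-to-1 transitions relative to the word previously on the bus, and one extra flag per word tells the receiver which polarity was used. The bus is assumed to start at all zeros, transitions on the flag line are ignored, and all names are hypothetical.

```python
def rising_transitions(prev, curr, width):
    """Count bit positions that go from 0 in `prev` to 1 in `curr`."""
    mask = (1 << width) - 1
    return bin(~prev & curr & mask).count("1")


def encode(words, width):
    """Return (sent_word, inverted_flag) pairs chosen to limit 0->1 transitions."""
    mask = (1 << width) - 1
    prev = 0  # assumed initial bus state
    encoded = []
    for word in words:
        inverted = ~word & mask
        if (rising_transitions(prev, inverted, width)
                < rising_transitions(prev, word, width)):
            sent, flag = inverted, True
        else:
            sent, flag = word, False
        encoded.append((sent, flag))
        prev = sent
    return encoded


def decode(encoded, width):
    """Recover the original words by undoing the flagged inversions."""
    mask = (1 << width) - 1
    return [(~sent & mask) if flag else sent for sent, flag in encoded]
```

For the 4-bit sequence `[0b1111, 0b0000, 0b1110]`, sending the words unmodified causes 7 rising transitions on the data lines, whereas the encoded sequence causes 1, and `decode` restores the original words.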

14:36
Takashi Imagawa (Ritsumeikan University, Japan)
Takahiro Ikeshita (Hokkaido University, Japan)
Hiroshi Tsutsui (Hokkaido University, Japan)
Yoshikazu Miyanaga (Hokkaido University, Japan)
[R2-9]
Hardware Design Exploration of Matrix Inversion for Signal Separation in MIMO-OFDM Wireless Communication
SPEAKER: Takashi Imagawa

ABSTRACT. With the increase in the number of MIMO streams and OFDM subcarriers for high-speed wireless communication, the amount of computation required for signal separation is rapidly increasing. In representative signal separation methods, such as MMSE, ZF and SIC, the implementation of matrix inversion has a great impact on processing time and bit error rate. This paper explores the hardware accelerator design of the inverse matrix operation based on Gauss-Jordan elimination and Strassen's algorithm with floating-point arithmetic. The results of logic synthesis and simulation show that the throughput can be improved efficiently against the increase of the circuit resources induced by the pipelined Strassen's algorithm. Other implementations are used in the above design as the sub-matrix inversion circuit.

14:38
Yuuta Satomi (Hirosaki University, Japan)
Koutaro Hachiya (Teikyo Heisei University, Japan)
Masashi Imai (Hirosaki University, Japan)
Toshiki Kanamoto (Hirosaki University, Japan)
Kaoru Furumi (Hirosaki University, Japan)
Atsushi Kurokawa (Hirosaki University, Japan)
[R2-10]
Power Delivery Network Optimization of 3D ICs Using Multi-Objective Genetic Algorithm
SPEAKER: Yuuta Satomi

ABSTRACT. In this paper, we propose a method for optimizing parameters related to power delivery networks (PDNs) of three-dimensional integrated circuits (3D ICs). We present a PDN model that speeds up the optimization process. We then describe a method to simultaneously optimize the decoupling capacitors, power/ground (P/G) TSVs, and on-chip P/G grids using a multi-objective genetic algorithm. Experimental results demonstrate that the proposed method can obtain optimal PDN parameters as a Pareto-optimal set.
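
As a minimal illustration of what a Pareto-optimal set is, the sketch below filters candidate solutions so that only non-dominated ones remain. It assumes every objective is to be minimized. The two-objective tuples in the usage note are hypothetical placeholders and do not represent the objectives or data of the paper.

```python
def dominates(a, b):
    """Return True if candidate a dominates b: a is no worse in every
    objective and strictly better in at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For hypothetical (objective 1, objective 2) pairs `[(1, 5), (2, 2), (3, 3), (5, 1), (2, 6)]`, `pareto_front` keeps `(1, 5)`, `(2, 2)`, and `(5, 1)`. A multi-objective genetic algorithm typically applies this kind of dominance test when ranking each generation.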

14:40
Fu-Lian Wong (Yuan Ze University, Taiwan)
Li-Cheng Zheng (Yuan Ze University, Taiwan)
Yung-Chih Chen (Yuan Ze University, Taiwan)
[R2-11]
Node Merging for Threshold Logic Network Optimization
SPEAKER: Yung-Chih Chen

ABSTRACT. In this paper, we extend an optimization method, which works for conventional Boolean logic networks, to threshold logic networks (TLNs). The method optimizes a Boolean logic network by merging nodes that are functionally equivalent or whose differences are never observed. To apply the method to TLNs, we propose an approach for computing the mandatory assignments of a stuck-at fault test on a threshold gate and an approach for conducting logic implication in a TLN. The experimental results show that the proposed method works for a set of IWLS 2005 benchmark circuits, saving an average of approximately 1.7% of threshold gates.

14:42
Yuki Arai (Chuo University, Japan)
Shuji Tsukiyama (Chuo University, Japan)
[R2-12]
A Heuristic Method for Delay Insertion to Improve Clock Period of General-Synchronous Circuit and Its Evaluation ( pdf )
SPEAKER: Yuki Arai

ABSTRACT. In the general-synchronous framework, the clock signal is distributed to each register at its own optimal timing, so that the clock period can be less than the critical delay of a combinatorial circuit. In order to achieve the minimum clock period, we must optimally increase the shortest delay of a combinatorial circuit. This technique is called delay insertion, and several papers on it have been published. However, due to process variability, delay values may vary chip by chip, and hence we must consider delay insertion in a statistical manner. In such a statistical design approach, if the delay insertion technique is complicated, it may be hard to devise a statistical delay insertion algorithm. Therefore, in this paper, we propose a simple heuristic method for delay insertion and evaluate its performance. This method repeats a graph reduction technique, and the operations used in the technique are addition and maximum, as in statistical static timing analysis.

14:44
Naoki Osako (Kwansei Gakuin University, Japan)
Sayuri Ota (Kwansei Gakuin University, Japan)
Suguru Yura (Kwansei Gakuin University, Japan)
Nagisa Ishiura (Kwansei Gakuin University, Japan)
[R2-13]
High-Level Synthesis of Side Channel Attack Resistant RSA Decryption Circuit ( pdf )
SPEAKER: Naoki Osako

ABSTRACT. This paper presents a side channel attack resistant design of an RSA decryption circuit using high-level synthesis. An RSA encryption/decryption circuit can be designed by high-level synthesizers utilizing the C program assets of the GNU multiple precision arithmetic library. With this methodology, an RSA decryption circuit is synthesized based on Fournaris's algorithm, which was designed against power analysis attacks and fault injection attacks. In order to reduce computation time and hardware size, Montgomery modular multiplication and parallelization of CRT-based modular exponentiation are applied. A preliminary synthesis result targeting a Xilinx Kintex-7 FPGA shows that the side channel attack resistance has been implemented at the cost of 1.94 times the LUT count and 5.17 times the execution cycles.
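
Montgomery modular multiplication, which the abstract applies to reduce computation time, can be sketched as follows. This is a textbook-style illustration only, assuming an odd modulus `n` smaller than `2**bits`. It is not the circuit described in the paper, it includes no side channel countermeasures, and all names are illustrative. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_setup(n, bits):
    """Precompute R = 2**bits and n' = -n^{-1} mod R for an odd modulus n < R."""
    r = 1 << bits
    n_prime = (-pow(n, -1, r)) % r  # modular inverse via pow (Python 3.8+)
    return r, n_prime


def redc(t, n, bits, n_prime):
    """Montgomery reduction: return t * R^{-1} mod n for 0 <= t < R * n,
    using only multiplications, a mask, and a shift instead of division by n."""
    mask = (1 << bits) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> bits  # t + m*n is divisible by R by construction
    return u - n if u >= n else u


def mont_mul(a, b, n, bits):
    """Compute (a * b) mod n by converting to Montgomery form and back."""
    r, n_prime = montgomery_setup(n, bits)
    a_bar = (a * r) % n  # Montgomery form of a
    b_bar = (b * r) % n  # Montgomery form of b
    prod_bar = redc(a_bar * b_bar, n, bits, n_prime)  # a*b*R mod n
    return redc(prod_bar, n, bits, n_prime)  # leave Montgomery form
```

In a modular exponentiation, the conversions into and out of Montgomery form are done once, and every intermediate multiplication uses only `redc`, which is where the savings come from.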

14:46
Hong-Yan Su (National Chiao Tung University, Taiwan)
Yan-Shiun Wu (National Chiao Tung University, Taiwan)
Yi-Hsiang Chang (National Tsing Hua University, Taiwan)
Rasit Onur Topaloglu (IBM, United States)
Yih-Lang Li (National Chiao Tung University, Taiwan)
[R2-14]
MapReduce-Based Pattern Classification for Design Space Analysis
SPEAKER: Hong-Yan Su

ABSTRACT. With the ongoing reduction of feature size, design for manufacturability is a critical concern in advanced technology nodes. Pattern classification is a promising and widely employed approach for design space analysis, design rule generation, and yield optimization. In this paper, we propose a hybrid algorithm that accounts for two variations of classification metrics: feature edge displacement and total feature area difference. A MapReduce-based framework is proposed to reduce the complexity of the pattern classification problem such that orders-of-magnitude performance improvement can be realized. Our experimental results indicate that, regarding accuracy and runtime, this work outperforms the winner of the CAD Contest at ICCAD 2016 in terms of the contest scoring measure.

14:48
Dave Y.-W. Lin (National Chiao Tung University, Taiwan)
Charles H.-P. Wen (National Chiao Tung University, Taiwan)
[R2-15]
Radiation-Hardened Design by Delay-Controllable Flip-Flops for Soft-Error-Rate Mitigation
SPEAKER: Dave Y.-W. Lin

ABSTRACT. To reduce the soft error rate (SER) in system-level failures, this paper proposes a radiation-hardened design using the Delay-Controllable Flip-Flop (DCFF), which can be generally applied to sequential circuits such as shift registers. DCFF, modified from the Built-In Soft-Error Resilience (BISER) latch, can be easily integrated into the CAD flow, and its delay can be adjusted to reject particle strikes with the maximum energy level. As a result, at the device level, DCFF eliminates 99.999997% of soft errors caused by heavy ions and shows a greater reduction in SER (e.g., by a factor of 1.3×10^10 in the best case) than the standard flip-flop (STD-FF) in TCAD and SPICE simulation.

14:50
Shunsuke Negoro (Ritsumeikan University, Japan)
Daichi Sukezane (Ritsumeikan University, Japan)
Atsuya Shibata (Ritsumeikan University, Japan)
Kotaro Maekawa (Ritsumeikan University, Japan)
Ittetsu Taniguchi (Osaka University, Japan)
Hiroyuki Tomiyama (Ritsumeikan University, Japan)
[R2-16]
Measurement and Modeling of Quadcopter Energy with ROS
SPEAKER: Shunsuke Negoro

ABSTRACT. Quadcopters are considered promising vehicles for home delivery services in the near future. Since the flight time of quadcopters is limited by battery capacity, it is important to establish technology that accurately estimates the energy consumption needed to complete a planned delivery flight. This paper presents energy models for delivery quadcopters. Our energy models are based on actual power measurements using ROS.

14:52
Infall Syafalni (Logic Research Co., Ltd., Japan)
Katsuhiko Wakasugi (Logic Research Co., Ltd., Japan)
Tongxin Yang (Logic Research Co., Ltd., Japan)
Tsutomu Sasao (Meiji University, Japan)
Xiaoqing Wen (Kyushu Institute of Technology, Japan)
[R2-17]
Netlist Conversion from Customer Logic Interface Format (CLIF) to Verilog for Legacy Circuits
SPEAKER: Infall Syafalni

ABSTRACT. The electronics industry has existed for more than five decades. Many inventions and contributions in circuits and designs have been made, and computer-aided design tools have made circuit design more efficient and its quality higher. However, some legacy circuits and tools are no longer supported. From the perspective of eco-friendliness, the regeneration of legacy circuits plays an important role in producing ICs. This paper proposes a design automation method for legacy circuits. In this work, legacy circuits are represented in a component-based netlist, e.g., the Customer Logic Interface Format (CLIF). First, we build a netlist converter, namely NetConv, to convert the legacy format into a Verilog HDL netlist and build the primitive libraries. Then, we perform design verification, synthesis, gate-level simulation on an FPGA and an ASIC library, as well as layout. Experimental results show that our proposed method regenerated circuits that perform the same as their legacy counterparts. The proposed method is eco-friendly and useful in enhancing the efficiency of designs based on legacy circuits.

14:54
Masaki Fujikawa (Kogakuin University, Japan)
Kouki Takayama (Kogakuin University, Japan)
Shingo Fuchi (Aoyama Gakuin University, Japan)
[R2-18]
Anti-counterfeiting and Authenticity Verification Technique for Molded Synthetic Resin Products ( pdf )
SPEAKER: Masaki Fujikawa

ABSTRACT. The authors propose an anti-counterfeiting and authenticity verification technique for molded resin products. Each product has unique characteristic information formed by "liquid turbulence," which is caused when liquid resin is poured into a mold. The appearance of the product is not affected by this information, as it can be covered by IR-transparent paint. We confirmed that (1) the feature information could not be observed by the naked eye, (2) this information could be extracted from samples without contact, and (3) the extracted information differed from sample to sample.

16:00-17:30 Session D: Panel Discussion

“What is the next place to go, in the era of IoT and AI?”

Moderator:
Prof. Robert Dutton (Stanford University)

Panelists:
Prof. Peter Marwedel (Technische Universität Dortmund)
Prof. Sri Parameswaran (University of New South Wales)
Prof. Elena Dubrova (Royal Institute of Technology)
Prof. Iris Hui-Ru Jiang (National Taiwan University)

Synopsis:
Wherever we go, we cannot avoid seeing these two terms: IoT and AI. Starting with short talks by experts in the domains related to synthesis and system integration, we will discuss what the challenges and pitfalls in those domains are in the era of IoT and AI, and which directions we should take in the long run.

Organizer:
Prof. Kiyoharu Hamaguchi (Shimane University)

( pdf )