Title | Memory Synthesis for Multi-Processor System-on-Chips with Reconfigurable 3D-stacked SRAMs |
Author | Meng-Ling Tsai, *Yi-Jung Chen, Yi-Ting Chen, Ru-Hua Chang (National Chi Nan University, Taiwan) |
Page | pp. 2 - 7 |
Keyword | Memory Synthesis, Reconfigurable 3D-stacked SRAMs |
Abstract | Integrating Multi-Processor System-on-Chips (MPSoCs) with 3D-stacked reconfigurable SRAM tiles has been proposed for embedded systems with high memory demands. At runtime, the SRAM tiles are configured into several memory areas, which can be reconfigured according to the dynamic behavior of the system. Targeting this architecture, we propose a data placement and memory area allocation algorithm. The goal of the proposed algorithm is to optimize the performance of the memory system by minimizing the on-chip memory access latency, the number of off-chip memory accesses, and the number of reconfigurations. Since the behavior of an embedded system can be described by a set of scenarios, where each scenario specifies a set of applications that execute concurrently, the proposed algorithm synthesizes data placements and the memory area allocation for each scenario. Data placement considers the data access patterns not only within each scenario but also across all scenarios. We evaluate the proposed algorithm on a set of synthetic and real-world applications. The experimental results show that, compared to the existing data placement method designed for MPSoCs with distributed memory modules, the proposed algorithm reduces data access latency by up to 11.72%. |
Title | Thermal-Pattern-Aware Voltage Assignment for Task Scheduler on 3D Multi-Core Processors |
Author | Chien-Hui Liao, *Cheng Suo, Charles Hung-Pin Wen (National Chiao Tung University, Taiwan) |
Page | pp. 8 - 9 |
Keyword | task scheduling, 3D MCPs, hotspots, DVFS, voltage assignment |
Abstract | In three-dimensional multi-core processors (3D-MCPs), hotspots occur more often and cause severe problems for system reliability and lifetime. Moreover, a higher frequency of hotspot occurrence triggers more dynamic voltage and frequency scaling (DVFS), which degrades throughput. Therefore, to reduce the frequency of hotspot occurrence effectively, a new thermal-constrained task-scheduling algorithm based on thermal-pattern-aware voltage assignment is proposed. Using the temperature profiles of different voltage assignments on 3D-MCPs, thermal-pattern-aware voltage assignment is applied to effectively reduce the rate of temperature increase in 3D-MCPs. Furthermore, the proposed scheduler includes on-line allocation for vertically grouped 3D cores and a new vertically grouped voltage scaling scheme that considers the thermal correlation among vertically adjacent cores in 3D-MCPs. Experimental results show that, compared to the previous thermal-constrained task-scheduling strategy, our task-scheduling algorithm reduces the frequency of hotspot occurrence by 38.84% and further improves throughput by 6.62%. |
Title | High-Level Synthesis from Programs with External Interrupt Handling |
Author | *Naoya Ito, Nagisa Ishiura (Kwansei Gakuin University, Japan), Hiroyuki Tomiyama (Ritsumeikan University, Japan), Hiroyuki Kanbara (Advanced Scientific Technology & Management Research Institute of KYOTO, Japan) |
Page | pp. 10 - 15 |
Keyword | High-level synthesis, Binary synthesis, External interrupt, ACAP |
Abstract | This paper presents a method of synthesizing a given binary program that contains external interrupt handling into hardware whose behavior is equivalent to that of the CPU running the program. In our method, the system control coprocessor that the CPU uses for interrupt handling is incorporated into the hardware as a functional unit. Instructions for accessing coprocessor registers, returning from interrupt handling, and making system calls are scheduled as operations and bound to the coprocessor. Jump register instructions for calling and returning from interrupt service routines are synthesized using operations that convert instruction addresses into the corresponding states of the hardware. Assuming the MIPS R3000 as the CPU, the proposed method has been implemented on top of the binary synthesizer ACAP. A program of about 40 lines with an external interrupt service routine was synthesized into hardware, and it was confirmed that interrupt handling works correctly. The execution cycles and the delay were reduced by 14% and 26%, respectively, at the cost of a 1.1 times increase in hardware size. |
Title | An SOC Estimation System for Lithium Ion Batteries Considering Thermal Characteristics |
Author | *Ryu Ishizaki, Lei Lin, Naoki Kawarabayashi, Masahiro Fukui (Ritsumeikan University, Japan) |
Page | pp. 16 - 21 |
Keyword | Extended Kalman Filter, SOC estimation, Arrhenius formula, Lithium ion Batteries |
Abstract | This paper discusses an SOC estimation system for lithium ion batteries based on the Extended Kalman Filter. The accuracy of the estimation depends strongly on the accuracy of the battery model. We have newly formulated an equivalent circuit model that considers temperature and SOC dependencies. As a result, the error rate of the estimation has been improved significantly. The evaluation shows that the new SOC estimation system can be used over a wide range of temperatures. |
Title | Dynamic Data Migration to Eliminate Bank-Level Interference for Stencil Applications in Multicore Systems |
Author | Wei-Hen Lo, *Yen-Hao Chen, TingTing Hwang (National Tsing Hua University, Taiwan) |
Page | pp. 22 - 27 |
Keyword | data migration, memory controller, page allocation, stencils, multi-threaded |
Abstract | A stencil computation repeatedly updates each point of a d-dimensional grid as a function of itself and its near neighbors. Modern automatic-transformation compiler frameworks can generate efficient tiled parallel stencil code. Dynamically scheduling parallel stencils significantly improves system performance. However, the memory contention problem worsens because fewer cores sit idle and more memory requests are sent to DRAM. The traditional OS page coloring method, which partitions the memory pages in advance, cannot alleviate the memory contention in dynamically scheduled parallel stencils. To address this issue, we provide a new software/hardware cooperative dynamic data migration method that exploits the update-and-reuse property of stencils. We observe that OS page allocation needs to be aware of the flexibility required for dynamic data migration in memory in order to eliminate memory interference. Experimental evaluation on an 8-core x86 system shows that our method improves system performance by 7% compared with dynamically scheduled stencils on a system with 8 cores and 4 memory banks. |
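To make the definition of a stencil computation in the abstract above concrete, the following is a minimal sketch in Python. It assumes a 2D grid and a 5-point averaging stencil, both chosen only for illustration; the function names `stencil_step` and `run_stencil` are hypothetical and are not taken from the paper, and the sketch does not model tiling, scheduling, or data migration.

```python
def stencil_step(grid):
    """Apply one sweep of a 5-point averaging stencil to a 2D grid.

    Each interior point becomes the average of itself and its four
    nearest neighbors; boundary points are left unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # copy so every read sees the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new_grid[i][j] = (
                grid[i][j]
                + grid[i - 1][j] + grid[i + 1][j]
                + grid[i][j - 1] + grid[i][j + 1]
            ) / 5.0
    return new_grid


def run_stencil(grid, steps):
    """Repeat the stencil update for the given number of time steps."""
    for _ in range(steps):
        grid = stencil_step(grid)
    return grid
```

Each sweep reads the values written by the previous sweep, which is the update-and-reuse behavior the abstract refers to.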
Title | A Battery Smart Sensor and Its SOC Estimation Function for Assembled Lithium-Ion Batteries |
Author | *Naoki Kawarabayashi, Lei Lin, Ryu Ishizaki, Masahiro Fukui (Ritsumeikan University, Japan), Isao Shirakawa (University of Hyogo, Japan) |
Page | pp. 28 - 33 |
Keyword | assembled Lithium-ion batteries, Battery Smart Sensor, SOC |
Abstract | This paper discusses a smart sensor, which is an important technology in a smart grid. We have developed a system that monitors the battery condition through an attached sensor and accumulates the measured data on the Web. The battery sensor is implemented with a microcomputer. We first developed a highly accurate and practical SOC sensor using the Extended Kalman filter as a function of the battery sensor. Based on the SOC estimation function for a single cell, an SOC estimation function for assembled Lithium-ion batteries has also been developed. |
Title | A Fast and Highly Accurate Statistical Based Model for Performance Estimation of MPSoC On-Chip Bus |
Author | *Farhan Shafiq, Tsuyoshi Isshiki, Dongju Li, Hiroaki Kunieda (Tokyo Institute of Technology, Japan) |
Page | pp. 34 - 39 |
Keyword | bus, statistical model, performance prediction, arbitration stall, bus stall |
Abstract | While Multiprocessor System-On-Chips (MPSoCs) are becoming widely adopted in embedded systems, communication architecture analysis for MPSoCs is becoming ever more complex. There is a growing need for fast and accurate performance estimation techniques for on-chip buses. In this paper, we present a novel statistical technique that uses accumulated "workload statistics" to accurately predict the "stall cycle counts" caused by bus contention. This eliminates the need to simulate arbitration on every bus access, resulting in substantial speed-up. It is assumed that each processor in the system has a distinct fixed priority and that arbitration is based on priority. We verify the accuracy of the proposed model against results obtained by cycle-accurate simulation. Two kinds of traffic are used in the experiments: synthetically generated traffic and traffic from a real-world application. For the synthetic traffic, we report an error range of 0.1% - 5% and a speedup of 7-10x. For the real traffic, we use a limited "single blocking" bus model and report results accordingly. |
Title | C-Based RTL Design Framework for Processor and Hardware-IP Synthesis |
Author | *Tsuyoshi Isshiki, Koshiro Date, Daisuke Kugimiya, Dongju Li, Hiroaki Kunieda (Tokyo Institute of Technology, Japan) |
Page | pp. 40 - 45 |
Keyword | C-based design, RTL synthesis, processor synthesis, verification, instruction-set simulator |
Abstract | In this paper, we propose a new C-based design framework in which the RTL structure is described directly in a dataflow C coding style, while the same C code serves as a fast simulation model. Design examples of an image signal processing pipeline show the effectiveness of the proposed C-based tool framework: the dataflow C code has 1/3 to 1/5 as many lines as the equivalent HDL and can generate high-performance circuits with enormously high parallelism of 4000 operations/cycle. For RISC processor designs, our dataflow C coding style also effectively captures the behavior of the instruction set simulator in less than 1000 lines of C code, which can be directly transformed into an RTL structure. |
Title | Profiler for Control System in System Level Design |
Author | *Miaw Torng-Der, Yuki Ando, Shinya Honda, Hiroaki Takada, Masato Edahiro (Nagoya University, Japan) |
Page | pp. 46 - 51 |
Keyword | profiler, system level design, FPGA, control system |
Abstract | This paper introduces a profiler architecture for control systems in system-level design.
When designing a control system, we need to consider two things.
The first is the asynchronous signal coming from sensors and actuators, called the interrupt request signal.
The second is the process that should have a higher priority and be activated by the interrupt request signal, called the interrupt handler.
However, existing profilers cannot obtain information on either the interrupt request signal or the interrupt handler. |
Title | Socket-Based Performance Monitoring Tool Suite for System-on-Chips |
Author | *Ting-Hsuan Wu, Tsun-Hsin Chang, Ing-Jer Huang (National Sun Yat-sen University, Taiwan) |
Page | pp. 52 - 55 |
Keyword | performance, monitoring, system, software, hardware |
Abstract | The SoC industry has shifted its development goal from increasing processor clock frequencies to distributing work among multiple IPs. To achieve better efficiency in SoC integration, socket interfaces are adopted to eliminate the overhead of migrating from one system to another. This paper therefore proposes a Socket-Based Performance Monitoring Tool Suite (SB PMTS), which provides a holistic view of system behavior and performance by monitoring two types of performance information: (1) the cycle-accurate execution time of a complete task and (2) the transaction events on the socket interfaces. SB PMTS synchronizes the performance information from different resources and enables average designers to quickly assess the quality of the SoC without any instrumentation. |
Title | Minimization of Register Area Cost for Soft-Error Correction in Low Energy DMR Design |
Author | *Kazuhito Ito, Takumi Negishi (Saitama University, Japan) |
Page | pp. 56 - 61 |
Keyword | DMR, Low energy, Synthesis, Register minimization |
Abstract | Double modular redundancy (DMR) executes an operation twice and detects soft errors by comparing the two results. A soft error is corrected by executing the necessary operations again to obtain correct results. Such re-execution requires the input data of those operations, and many registers are needed to store the necessary data. In this paper, we propose a method that minimizes the area cost of registers while also considering the minimization of operation energy consumption under the given constraints on time, resources, and the delay penalty for error correction. The experimental results show that the register cost is reduced by about 20% on average. |
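The detect-and-re-execute behavior of DMR described in the abstract above can be sketched in software as follows. This is only an illustration under stated assumptions, not the authors' synthesis method: `dmr_execute`, `make_flaky_add`, and the `max_retries` parameter are hypothetical names, and the fault injection exists only to exercise the retry path. The saved input tuple plays the role of the registers that must hold operands until the result is known to be correct.

```python
def dmr_execute(op, inputs, max_retries=3):
    """Run `op` twice on the same inputs and compare the results.

    On a mismatch (a detected soft error), both executions are repeated
    using the saved inputs. Returns (result, number_of_re_executions).
    """
    saved = tuple(inputs)  # operands must be kept for possible re-execution
    for attempt in range(max_retries + 1):
        first = op(*saved)
        second = op(*saved)
        if first == second:
            return first, attempt
    raise RuntimeError("results still disagree after all retries")


def make_flaky_add(faulty_calls):
    """Build an adder that flips the lowest result bit on the given call numbers.

    This fault model is an assumption used only to demonstrate detection.
    """
    state = {"calls": 0}

    def add(a, b):
        state["calls"] += 1
        result = a + b
        if state["calls"] in faulty_calls:
            result ^= 1  # inject a single-bit upset
        return result

    return add
```

For example, `dmr_execute(make_flaky_add({1}), (3, 4))` detects the corrupted first execution and needs one re-execution to return the correct sum.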
Title | Simultaneous Test Scheduling and TAM Bus Wire Assignment for Core-Based SoC Designs |
Author | Te-Jui Wang, *Ching-Chun Chiu, Shih-Hsu Huang (Chung Yuan Christian University, Taiwan) |
Page | pp. 62 - 67 |
Keyword | Core-Based Systems, Test Scheduling, Testing Time, Test Access Mechanism |
Abstract | The reduction of total testing time is crucial for the saving of IC testing cost. In the testing of a core-based System-on-Chip (SoC) design, external tests are applied to cores via a specialized test access mechanism (TAM). Previous test scheduling algorithms assume that two external tests cannot utilize the TAM at the same time. However, in fact, if the external tests of different cores do not use the same TAM bus wire, they can be executed concurrently, which reduces the total testing time. Based on this observation, in this paper, we propose an effective and efficient algorithm to perform the simultaneous application of test scheduling and TAM bus wire assignment for the testing of core-based SoC designs. Compared with previous works, experimental results consistently show that the proposed approach can greatly reduce the total testing time. |
Title | Automatic Analog Synthesis Platform with Low-Noise Consideration |
Author | Ying-Chi Lien, Ching-Mao Lee, Chih-Wei Li, *Yi-Syue Han, Chien-Nan Jimmy Liu (National Central University, Taiwan) |
Page | pp. 68 - 71 |
Keyword | analog synthesis, bio-signal, automatic sizing, layout automation |
Abstract | Because the bio-signals are often very weak, they can be influenced by noise easily and become hard to distinguish. In this paper, an automatic analog synthesis platform is presented for bio-acquisition systems to generate the required circuits from specification to layout with low-noise consideration. Process variations and layout effects are also simultaneously considered to generate the required circuits with high design yield. Furthermore, a user-friendly GUI is also provided to help users complete the design flow successfully and efficiently. As shown in the experimental results, this analog synthesis platform is able to generate the required circuits in seconds with low noise. The chip implementation result also verifies the capability of this tool to generate the required designs with fabricable quality. |
Title | Intra-Vehicle Network Routing Algorithm for Weight and Wireless Transmit Power Minimization |
Author | *Ta-Yang Huang, Chia-Jui Chang (National Cheng Kung University, Taiwan), Chung-Wei Lin (University of California at Berkeley, U.S.A.), Sudip Roy (National Cheng Kung University, Taiwan), Tsung-Yi Ho (National Chiao Tung University, Taiwan) |
Page | pp. 72 - 77 |
Keyword | In-Vehicle Network, Routing |
Abstract | As the complexity of vehicle distributed systems increases rapidly, several hundreds of devices (sensors, actuators, etc.) are being placed in a modern automotive system.
With the increase in wiring cables connecting these devices, the weight of a car increases significantly, which degrades the fuel efficiency in driving.
In order to reduce the weight of a car, wireless communication has been introduced to replace wiring cables between some devices.
However, the extra energy consumed by wireless devices for packet transmission requires frequent maintenance, e.g., recharging of batteries.
In this paper, we propose an intra-vehicle network routing algorithm to simultaneously minimize the wiring weight and the transmission power for wireless communication.
Experimental results show that the proposed method can effectively minimize the wiring weight and the transmit power for wireless communication. |
Title | An Automated Flow Integration to Help Analog Layout Design Migration |
Author | Jou-Chun Lin, *Po-Cheng Pan, Ching-Yu Chin, Hung-Ming Chen (National Chiao Tung University, Taiwan) |
Page | pp. 78 - 82 |
Keyword | analog layout, design migration |
Abstract | Computer-aided design (CAD) tools for digital circuits have matured over the years. However, CAD tools for analog circuits still face a great many challenges. As transistor sizes scale down with advancing process technology, design migration is needed to increase the degree of layout reuse. Building on previous work such as placement migration and routing preservation tools, a further performance boost is the next step. We target the width of wires, which affects their resistance and capacitance, to improve performance. We implement a flow that adjusts the wire width to further improve performance, automatically generates the modified layout, and passes the verification check, thereby speeding up the analysis process or design flow. We apply a greedy heuristic and a simulated annealing algorithm in our framework. Our flow can make the analog layout synthesis flow more efficient. |
Title | Analysis of the Distance Dependent Multiple Cell Upset Rates on 65-nm Redundant Latches by a PHITS-TCAD Simulation System |
Author | *Kuiyuan Zhang, Jun Furuta, Kazutoshi Kobayashi (Kyoto Institute of Technology, Japan) |
Page | pp. 89 - 93 |
Keyword | Soft error, PHITS, TCAD, MCU |
Abstract | Recently, the soft error rate of integrated circuits has increased with process scaling. Soft errors reduce the tolerance of VLSIs. Charge sharing and the bipolar effect become dominant when a particle hits latches and flip-flops. Soft errors make circuits more sensitive to Multiple Cell Upset (MCU). We analyze the MCU tolerance of redundant latches in a 65 nm process by device simulation and the Particle and Heavy Ion Transport code System (PHITS). The MCU rate of redundant latches decreases exponentially as the distance between the redundant latches increases. These results agree with the neutron experiments. |
Title | Feasible Shortest Path Frame Bounded Maze-Routing Algorithm for ML-OARST with Ripping up and Re-Building Steiner Points |
Author | *Kuen-Wey Lin, Yeh-Sheng Lin, Yih-Lang Li (Institute of Computer Science and Engineering, National Chiao Tung University, Taiwan), Rung-Bin Lin (Computer Science and Engineering, Yuan Ze University, Taiwan) |
Page | pp. 94 - 99 |
Keyword | Steiner tree, Routing, Obstacle-avoidance, Multilayer, Physical Design |
Abstract | Owing to its large solution space, maze routing has never been used to solve the multi-layer obstacle-avoiding rectilinear Steiner tree problem (ML-OARST). This paper proposes the first maze routing-based algorithm that efficiently identifies a high-quality ML-OARST. Our algorithm employs a three-dimensional Hanan grid graph for maze routing and applies a novel scheme to identify good Steiner points. This significantly reduces the search overhead of maze routing. To reduce the routing cost of ML-OARST, we also develop a novel rip-up and re-building strategy for altering Steiner points and tree topology. Experimental results reveal that the proposed algorithm outperforms the state-of-the-art ML-OARST methods in wire-length and via costs. The required CPU time is comparable to that needed by spanning graph-based approaches. |
Title | A TPL-Friendly Legalizer for Standard Cell Based Design |
Author | *Hsiu-Yu Lai, Ting-Chi Wang (National Tsing Hua University, Taiwan) |
Page | pp. 100 - 105 |
Keyword | Triple Patterning Lithography, Placement, Legalization, Standard Cell, Layout Decomposition |
Abstract | With the shrinking of feature sizes and the delay of next-generation lithography, double patterning lithography (DPL) is no longer sufficient for the 14/10nm technology nodes. Triple patterning lithography (TPL) is a natural extension of DPL, and it can not only triple the pitch but also reduce conflicts and stitches. Although TPL is more difficult and complicated than DPL, it is a promising alternative for the 14/10nm technology nodes. In this paper, we consider TPL during the standard-cell legalization stage so that the resulting placement is friendlier to TPL layout decomposition. We provide a novel idea of reducing TPL conflicts through cell reordering and white space insertion. The experimental results show that, compared to a conventional legalizer, our legalizer effectively reduces the numbers of conflicts and stitches. |
Title | On the Impact of Initial Placement to SA-Based Placement for Mixed-Grained Reconfigurable Architecture |
Author | *Takashi Kishimoto, Hiroyuki Ochi (Ritsumeikan University, Japan) |
Page | pp. 111 - 116 |
Keyword | Simulated Annealing, Partitioning-based, Reconfigurable Architecture, Placement |
Abstract | In this paper, we investigate a novel placement algorithm for mixed-grain reconfigurable architectures (MGRAs). The proposed algorithm applies a partitioning-based method to LUTs to obtain an initial placement, followed by a further optimization process for both LUTs and ALUs based on a low-temperature simulated annealing (SA) method. Compared with a conventional FPGA placement algorithm that uses SA with random initial placement, our method exhibits 9.3% smaller delay after running SA for half an hour. Our method is also superior in terms of the final solution after a run of several hours. |
Title | Through-Silicon-Via Inductor based DC-DC Converters: The Marriage of the Princess and the Dragon |
Author | *Yiyu Shi (Missouri University of Science and Technology, U.S.A.) |
Page | p. 117 |
Abstract | There has been a tremendous research effort in recent years to move DC-DC converters on chip for enhanced performance. However, a major limiting factor in implementing on-chip inductive DC-DC converters is the large area overhead induced by spiral inductors. To address this, we propose to use through-silicon-vias (TSVs), a critical enabling technique in three-dimensional (3D) integrated systems, to implement on-chip inductors for DC-DC converters. While the existing literature shows that TSV inductors are inferior to conventional spiral inductors for RF applications due to substrate loss, we demonstrate that this is not the case for DC-DC converters, which operate at relatively low frequencies. Experimental results show that by replacing conventional spiral inductors with TSV inductors, with almost the same efficiency and output voltage, up to 4.3x and 3.2x inductor area reduction can be achieved for the basic buck converter and the interleaved converter with magnetic coupling, respectively. To the best of our knowledge, this is the very first in-depth study on utilizing TSV inductors for on-chip DC-DC converters in 3D ICs. |
Title | Circuit Reliability: Major Roadblock in Future Technology? |
Author | Organizer: Tsung-Yi Ho (National Chiao Tung University, Taiwan), Moderator: Ing-Chao Lin (National Cheng Kung University, Taiwan), Panelists: Vijaykrishnan Narayanan (Pennsylvania State University, U.S.A.), Anthony Oates (Taiwan Semiconductor Manufacturing Company, Taiwan), Ulf Schlichtmann (Technische Universität München, Germany), Yiyu Shi (Missouri University of Science and Technology, U.S.A.), Tomohiro Yoneda (National Institute of Informatics, Japan) |
Page | p. 118 |
Abstract | As technology scales, circuit reliability has become a major issue. This panel focuses on circuit reliability in current and future technology. Topics for discussion include the following:
1. Major reliability issues in advanced CMOS. Which is the most critical?
2. Major reliability issues in beyond CMOS technology. Any difference?
3. Major reliability issues at 3D IC, automotive, and medical electronics. Reliable hardware platform for automotive applications.
4. The role of EDA in improving circuit reliability. |
Title | Fast Transient and High Current Efficiency Voltage Regulator with Hybrid Dynamic Biasing Technique |
Author | Chia-Min Chen, *Yen-Wei Liu, Chung-Chih Hung (National Chiao Tung University, Taiwan) |
Page | pp. 119 - 122 |
Keyword | Capacitive coupling, voltage spike, low-dropout regulator, hybrid dynamic biasing, transient response |
Abstract | This paper presents an output-capacitorless low-dropout (LDO) voltage regulator that achieves fast transient responses by hybrid dynamic biasing. The hybrid dynamic biasing in the proposed transient improvement circuit is activated through capacitive coupling. The proposed transient improvement circuit senses the LDO output change so as to increase the bias current instantly. The proposed circuit was applied to an output-capacitorless LDO implemented in standard 0.35-um CMOS technology. The device consumes only 25 uA of quiescent current with a dropout voltage of 180 mV. The proposed circuit reduces the output voltage spike of the LDO to 80 mV when the output current is changed from 0 mA to 100 mA. The output voltage spike is reduced to 20 mV when the supply voltage varies between 1.3 V and 2.3 V with a load current of 100 mA. |
Title | Scan Test of Latch-Based Asynchronous Pipeline Circuits under 2-Phase Handshaking Protocol |
Author | *Kyohei Terayama, Atsushi Kurokawa, Masashi Imai (Hirosaki University, Japan) |
Page | pp. 128 - 133 |
Keyword | test, asynchronous circuit, scan D-latch, 2-phase handshaking protocol |
Abstract | The asynchronous MOUSETRAP pipeline circuit is simple and fast thanks to the 2-phase handshaking protocol, which has no return-to-zero overhead. In this paper, we propose two scan D-latches to support its scan test, since MOUSETRAP uses D-latches instead of flip-flops. We design several MOUSETRAP pipeline circuits with the ISCAS89 benchmark combinational circuits using 130nm process technologies and present evaluation results for the overhead and the fault coverage under the single stuck-at fault model. |
Title | Data Reduction and Parallelization for Human Detection System |
Author | *Mao Hatto, Takaaki Miyajima, Hideharu Amano (Keio University, Japan) |
Page | pp. 134 - 139 |
Keyword | Human Detection, FPGA, parallelization, data reduction, HW/SW Co-design |
Abstract | HOG (Histogram of Oriented Gradients) is one of the effective ways of extracting feature values. The Real Adaboost algorithm has a high recognition ratio and is well suited to hardware implementation. Many studies on human detection systems have adopted these two algorithms and made progress. However, the data volume of HOG features is still a problem for the whole system. The data volume from only one frame can exceed 1 GB, which causes difficulties in terms of both sending data to a server and execution speed. In particular, because a large data volume also burdens internal data communication between modules in hardware execution, it can become a bottleneck for the operating speed of the whole system.
Here, a high-speed and low-memory implementation of a human detection system using Hardware-Software Co-design is proposed. To improve the execution speed of the system, the computation of HOG feature values is accelerated by an FPGA, and Real Adaboost detection is executed only by accessing ROM data in the FPGA. As a result, the HOG + Real Adaboost part ran about 23.1 times faster than the software execution. The whole system was implemented on a single board and achieved a 3.22 times speedup from camera input to VGA display output. We also reduced the feature data volume, achieving 93.75% data compression compared to double-precision calculation with only a 2.68% loss of recognition accuracy. |
Title | Evaluation of Approximate SAD Circuits with Error Compensation |
Author | *Toshihiro Goto, Yasunori Takagi, Shigeru Yamashita (Ritsumeikan University, Japan) |
Page | pp. 140 - 145 |
Keyword | SAD, approximate computing |
Abstract | This paper proposes and evaluates an "approximate" but fast Sum of Absolute Difference (SAD) circuit to provide a design experience for approximate computing, which is an emerging research area. Our idea for designing an "approximate" but fast circuit is similar to that of previous work in approximate computing research. Unlike the previous work, we also propose various error compensation methods so that the circuit can be used in real applications. Moreover, this paper reports the results of our hardware design and of our software evaluation of the various error compensation methods using video compression applications. Our results show that our SAD circuit (with some errors) can reduce the total processing time by 10.71% compared with the conventional SAD circuit (without errors), while still providing acceptable quality for the video compression applications. |
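For readers unfamiliar with the metric, the exact (non-approximate) Sum of Absolute Difference that such a circuit computes can be sketched as below, together with the block-matching use typical of video compression. The function names `sad` and `best_match` are hypothetical, blocks are assumed to be flat lists of pixel values, and the sketch does not reproduce the paper's approximate circuit or its error compensation methods.

```python
def sad(block_a, block_b):
    """Return the exact sum of absolute differences between two equal-sized blocks."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def best_match(target, candidates):
    """Return the index of the candidate block with the smallest SAD to `target`."""
    return min(range(len(candidates)), key=lambda i: sad(target, candidates[i]))
```

Because a block-matching search only needs the ranking of candidates, small errors in individual SAD values may leave the chosen block unchanged, which is one reason SAD is a common target for approximate computing.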
Title | A Circuit Implementable 5-Output nMOSFET Shearing Stress Sensor |
Author | *Tomochika Harada, Kousuke Takeuchi (Yamagata University, Japan) |
Page | pp. 146 - 148 |
Keyword | shearing stress sensor, MOSFET sensor, multi-output sensor |
Abstract | In this paper, we design, fabricate, and evaluate the stress detection operation of a 5-output MOSFET-type stress detection element. We verify its operation in strong inversion regions. In the saturation region, the stress detection sensitivity can be changed by VGS. If VGS is held constant, the stress detection sensitivity also remains constant. Furthermore, in the linear region, the stress sensitivity varies with VDS (not VGS). |
Title | Iddq Testing Against Process Variations and Measurement Noises |
Author | Chia-Ling Chang, *Jack Sheng-Yan Lin, Charles Hung-Pin Wen (National Chiao Tung University, Taiwan) |
Page | pp. 149 - 150 |
Keyword | Iddq, Data mining, process variation |
Abstract | Analyzing test data can have a significant impact on improving production test and parametric yield. This work investigates the analysis of Iddq test data to extract knowledge for estimating the process parameters and screening potentially defective chips. With a simulation framework, we demonstrate the dependency of this screening on various assumptions, such as the amount of process variation, the sensitivity to measurement noise, and the number of Iddq patterns. Experimental results on IWLS’05 designs show the strengths of the Iddq analysis in screening faulty samples under various variations and assumptions in a 45nm technology. |
Title | Pre-Bond Interposer Test Methodology for System in Package |
Author | Katherine Shu-Min Li (Department of Computer Science, National Sun Yat-sen University, Taiwan), Sying-Jyan Wang (Department of Computer Science, National Chung Hsing University, Taiwan), Cheng-You Ho (Department of Computer Science, National Sun Yat-sen University, Taiwan), Yingchieh Ho (Department of Electrical Engineering, National Dong Hwa University, Taiwan), Ruei-Ting Gu (National Sun Yat-sen University/Advanced Semiconductor Engineering (ASE) Group, Taiwan), Bo-Chuan Cheng (Advanced Semiconductor Engineering (ASE) Group, Taiwan) |
Page | pp. 151 - 156 |
Keyword | interposer, test, 2.5D, System in Package, Through-Silicon-Via |
Abstract | Pre-bond testing of silicon interposer is difficult due
to the large number of nets to be tested and small number of
test access ports. Recently, it was proposed to include a test
interposer that is contacted with the interposer under test in
the testing process. Combining these two interposers provides
access to nets that are not normally accessible. The previous
synthesis method for test interposers was based on constrained
breadth-first search, which can be time-consuming. Besides,
separate test interposers have to be provided for open and
short fault testing. In this paper, we present a theoretical study
on the topology of testable circuit structure for interconnect
faults in silicon interposer. Based on the theoretical framework,
a more efficient synthesis method is developed. Furthermore, a
single test interposer can be used for both open and short fault
detection, which leads to shorter test time and lower test cost. |
Title | Oxygen Sensor Module with Majority Sensing for Monitoring Wide Area at Disaster |
Author | *Ryuta Nishino, Tatsuya Yamada, Qing Dong, Shigetoshi Nakatake (The University of Kitakyushu, Japan) |
Page | pp. 157 - 158 |
Keyword | Sensor, Majority Sensing, Oxygen Concentration |
Abstract | This work presents a new sensor module with majority sensing, which improves accuracy by using multiple sensor devices. The sensor modules are distributed over a disaster region to monitor environmental information such as surface temperature and oxygen concentration. The sensor modules are connected by a wireless network and transmit the information to a monitoring server. In this work, we focus on sensing oxygen concentration in the case of a forest fire. To improve the accuracy of the sensed value, we introduce a new sensing mechanism called majority sensing, which uses multiple sensor devices. In experiments, we demonstrate an 8.4-14% improvement in oxygen concentration sensing. |
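The abstract does not spell out how majority sensing combines its readings, so the following is only an illustrative sketch under an assumed rule: readings that agree with the median within a tolerance form the majority, and their mean is reported. The function name `majority_sense` and the `tolerance` parameter are hypothetical and are not taken from the paper.

```python
from statistics import median


def majority_sense(readings, tolerance):
    """Fuse redundant sensor readings (assumed rule, not the paper's).

    Readings within `tolerance` of the median are treated as the
    agreeing majority; their mean is returned.
    """
    if not readings:
        raise ValueError("need at least one reading")
    center = median(readings)
    agreeing = [r for r in readings if abs(r - center) <= tolerance]
    if not agreeing:
        # Possible with an even number of widely spread readings:
        # fall back to the median itself.
        return center
    return sum(agreeing) / len(agreeing)


# Five hypothetical oxygen readings (percent); one device is faulty.
fused = majority_sense([20.9, 21.0, 20.8, 15.0, 21.1], tolerance=0.5)
```

In this sketch the outlying reading of 15.0 is ignored and the remaining four readings are averaged.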
Title | FPGA Implementation and Evaluation of Image Scaling Circuits Using Selector-Logic-Based Bi-Linear Interpolation |
Author | *Keita Igarashi, Masao Yanagisawa, Nozomu Togawa (Waseda University, Japan) |
Page | pp. 159 - 160 |
Keyword | selector logic, FPGA, bi-linear interpolation |
Abstract | Bi-linear interpolation is an interpolation technique that computes a pixel value linearly from its four surrounding pixels, and it is often used for image scaling. In this paper, we take a method that interpolates pixels using selector logic and implement and evaluate it on an FPGA board. Applying selector logic to the bi-linear interpolation operation decreases its total number of product terms, which improves circuit size and circuit delay. We realize a speed-up of approximately 15.7% using selector-logic-based bi-linear interpolation. |
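For reference, the arithmetic of bi-linear interpolation itself can be sketched in software as below. This shows only the standard formula and a simple scaling loop; it does not model the selector-logic circuit evaluated in the paper, and the function names and the corner-aligned sampling convention are assumptions made for the example.

```python
def bilinear(p00, p10, p01, p11, fx, fy):
    """Interpolate between four neighbouring pixels.

    p00/p10 are the top-left/top-right pixels, p01/p11 the
    bottom-left/bottom-right ones; fx and fy are fractional
    offsets in [0, 1] from the top-left pixel.
    """
    top = p00 + (p10 - p00) * fx
    bottom = p01 + (p11 - p01) * fx
    return top + (bottom - top) * fy


def scale(image, new_w, new_h):
    """Scale a grayscale image (list of rows) with corner-aligned sampling.

    Assumes the source and target are each at least 2x2.
    """
    h, w = len(image), len(image[0])
    out = []
    for j in range(new_h):
        y = j * (h - 1) / (new_h - 1)
        y0 = min(int(y), h - 2)  # keep y0 + 1 inside the image
        fy = y - y0
        row = []
        for i in range(new_w):
            x = i * (w - 1) / (new_w - 1)
            x0 = min(int(x), w - 2)  # keep x0 + 1 inside the image
            fx = x - x0
            row.append(bilinear(image[y0][x0], image[y0][x0 + 1],
                                image[y0 + 1][x0], image[y0 + 1][x0 + 1],
                                fx, fy))
        out.append(row)
    return out


scaled = scale([[0, 10], [20, 30]], 3, 3)
```

Here the centre pixel of the 3x3 result is the average of the four source pixels, and the corner pixels reproduce the source corners.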
Title | An Accelerator for Frequent Itemset Mining from Data Stream with Parallel Item Tree |
Author | *Kasho Yamamoto, Tsunaki Sadahisa, Dahoo Kim, Eric S. Fukuda, Tetsuya Asai, Masato Motomura (Hokkaido University, Japan) |
Page | pp. 161 - 162 |
Keyword | data mining, frequent itemsets, stream processing, hardware accelerator |
Abstract | Frequent itemset mining attempts to find frequent subsets in a transaction database. In this era of big data, demand for frequent itemset mining is increasing. Therefore, a combination of fast implementation and low memory consumption is needed, especially for stream data. In response, we optimize an online algorithm, called the Skip LC-SS algorithm, for hardware. In this paper, we present an efficient architecture based on this algorithm. |
Title | A Leakage Current Reduction Algorithm Using Input Vector Control and Cell Topology Modification |
Author | Tsung-Yi Wu (National Changhua University of Education, Taiwan), Hsin-Hui Li (Global Unichip Corp., Taiwan), *Zhi-Yao Ding, Guan-Cheng Guo (National Changhua University of Education, Taiwan) |
Page | pp. 163 - 164 |
Keyword | cell topology modification, input vector control, leakage current reduction, sleep mode |
Abstract | Since the leakage current of a digital circuit depends on the states of its logic gates, assigning a minimum leakage vector to its primary inputs in the sleep mode is a feasible technique for leakage current reduction. In this paper, we propose a heuristic algorithm that combines a cell topology modification and pin reordering technique with minimum leakage vector assignment for leakage current reduction. Experimental results show that the algorithm can reduce the leakage current by 11.8% on average. |
Title | Majority-Inverter Graph for FPGA Synthesis |
Author | *Luca Amaru (EPFL - LSI, Switzerland), Ana Petkovska (EPFL - LAP, Switzerland), Pierre-Emmanuel Gaillardon (EPFL - LSI, Switzerland), David Novo Bruna, Paolo Ienne (EPFL - LAP, Switzerland), Giovanni De Micheli (EPFL - LSI, Switzerland) |
Page | pp. 165 - 170 |
Keyword | Majority-Inverter Graph, Logic Synthesis, FPGA |
Abstract | In this paper, we present an FPGA synthesis flow based on the Majority-Inverter Graph (MIG). An MIG is a directed acyclic graph consisting of three-input majority nodes and regular/complemented edges. MIG manipulation is supported by a consistent algebraic framework leading to strong synthesis properties. We propose MIG optimization techniques targeting high-speed FPGA implementations. For this purpose, we reduce the depth of logic circuits via MIG algebraic transformations, enabling denser LUT mapping on FPGAs. Experimental results show that, compared to a commercial design flow, our MIG-based design flow reduces the delay of arithmetic circuits synthesized on a state-of-the-art 28nm commercial FPGA device by 21% on average. |
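The three-input majority node at the heart of an MIG can be illustrated with a few well-known Boolean identities. The sketch below is not the paper's synthesis flow. It only checks, by exhaustive enumeration, that fixing one input to 0 or 1 yields AND or OR, and that the majority function is self-dual, which is why majority nodes with complemented edges can express any logic function. The helper names are chosen for this example.

```python
from itertools import product


def maj(a: bool, b: bool, c: bool) -> bool:
    """Three-input majority: true when at least two inputs are true."""
    return (a and b) or (a and c) or (b and c)


def check_identities() -> bool:
    """Exhaustively verify basic majority identities over all inputs."""
    for a, b, c in product([False, True], repeat=3):
        if maj(a, b, False) != (a and b):  # M(a, b, 0) = a AND b
            return False
        if maj(a, b, True) != (a or b):    # M(a, b, 1) = a OR b
            return False
        # Self-duality: complementing all inputs complements the output.
        if maj(not a, not b, not c) != (not maj(a, b, c)):
            return False
        # The carry-out of a full adder is the majority of its inputs.
        if maj(a, b, c) != (int(a) + int(b) + int(c) >= 2):
            return False
    return True


all_hold = check_identities()
```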
Title | High Observability Scan Chains with Improving Output Compaction Efficiency |
Author | Sying-Jyan Wang, Che-Wei Kao (Department of Computer Science and Engineering, National Chung Hsing University, Taiwan), Katherine Shu-Min Li (Department of Computer Science and Engineering, National Sun Yat-sen University, Taiwan) |
Page | pp. 171 - 176 |
Keyword | scan test, scan chain, output compaction, X-tolerance, diagnosability |
Abstract | Output selection was recently proposed for test
response compaction. This scheme achieves zero aliasing, full
X-tolerance, and high diagnosability, at the cost of inflated test
set and non-trivial hardware overhead. The time/space penalty
in test output compaction is mainly attributed to the loss of
observability. In previous methods, it was in general assumed
that erroneous responses are uniformly distributed among all
scan chains, and the output compactors are designed
accordingly. In this paper, we present three techniques to
improve the performance of output selection based test
response compaction. (1) The uneven distribution of erroneous
test responses is exploited to optimize compactor design. (2) A
test dynamic compaction algorithm is provided to deal with the
test set inflation problem. (3) A low-cost test response
compactor is presented. Experimental results indicate that the
proposed techniques can achieve better compaction results
with lower hardware overhead. |
Title | Using Structural Relations for Checking Combinationality of Cyclic Circuits |
Author | Wan-Chen Weng (National Tsing Hua University, Taiwan), Yung-Chih Chen (Yuan Ze University, Taiwan), Jui-Hung Chen, *Ching-Yi Huang, Chun-Yao Wang (National Tsing Hua University, Taiwan) |
Page | pp. 177 - 182 |
Keyword | combinationality, cyclic circuit |
Abstract | Functionality and combinationality are two main issues that have to be dealt with in cyclic combinational circuits, which are combinational circuits containing loops. A cyclic circuit is combinational if the nodes within the circuit have definite values under all input assignments. For a cyclified circuit, we have to check whether it is combinational or not. Thus, this paper proposes an efficient two-stage algorithm to verify the combinationality of cyclic circuits. Experiments on a set of cyclified IWLS 2005 benchmarks demonstrate the efficiency of the proposed algorithm. Compared to the state-of-the-art algorithm, our approach achieves a speedup of about 4000 times on average. |
Title | YAPSIM: Yet Another Parallel Logic Simulation Using GP-GPU |
Author | *Takuya Hashiguchi, Yuichiro Mori, Masahiko Toyonaga, Michiaki Muraoka (Kochi University, Japan) |
Page | pp. 183 - 186 |
Keyword | GP-GPU, Logic Simulator, Parallel algorithm |
Abstract | In this paper, a new high-speed logic simulator, YAPSIM, based on a parallel logic simulation methodology using GP-GPU is presented. It combines three acceleration methods to improve simulation performance: a fan-out cone grouping method, a LUT method, and a GPU internal memory access method. The experimental comparison shows that YAPSIM ran 29 times faster than a high-speed commercial simulator for a combinational circuit of 75,000 gates, and 5.7 times faster for a sequential circuit of 84,000 gates. |
Title | Technology Mapping Method for Low Power Consumption and High Performance in General-Synchronous Framework |
Author | *Junki Kawaguchi, Yukihide Kohira (The University of Aizu, Japan) |
Page | pp. 187 - 192 |
Keyword | General-Synchronous Framework, Technology Mapping, Integer Linear Programming |
Abstract | In the general-synchronous framework, in which the clock is distributed periodically to each register but not necessarily simultaneously, circuit performance is expected to be improved compared to the complete-synchronous framework, in which the clock is distributed periodically and simultaneously to each register. To improve circuit performance further, logic circuit synthesis for the general-synchronous framework is required. In this paper, under the assumption that any clock schedule is realized by an ideal clock distribution circuit, we propose a technology mapping method that, when two or more cell libraries are available, uses integer linear programming to assign a cell to each gate in the given logic circuit. In experiments, we show the effectiveness of the proposed technology mapping method. |
Title | A Quaternary Master-Slave Flip-Flop with Multiple Functions for Multi-Valued Logics |
Author | *Renyuan Zhang, Mineo Kaneko (Japan Advanced Institute of Science and Technology, Japan) |
Page | pp. 193 - 198 |
Keyword | quaternary, flip-flop, Neuron-MOS |
Abstract | A prototype flip-flop circuit is proposed in this work for storing quaternary signals. Inspired by the Neuron-MOS mechanism, the capacitance-coupling technology is implemented to realize multi-threshold inverters. On the basis of this technology, a self-lock feedback scheme is proposed to process and store quaternary signals with standard CMOS technology and an ordinary dual-rail supply voltage. Thanks to the inherent property of quaternary processing and the proposed scheme, various behaviors can be easily achieved without additional combinational circuits. An example of a quaternary counter with sixteen states is given. Circuit simulation results show that the proposed quaternary multi-functional flip-flop achieves all the basic and extended functions correctly. |
Title | Quantitative Evaluations and Efficient Exploration for Optimal Partially-Programmable Circuits Generation |
Author | *Takumi Tsuzuki (Nara Institute of Science and Technology, Japan), Yuko Hara-Azumi (Tokyo Institute of Technology, Japan), Shigeru Yamashita (Ritsumeikan University, Japan), Yasuhiko Nakashima (Nara Institute of Science and Technology, Japan) |
Page | pp. 199 - 204 |
Keyword | fault tolerance, PPC (Partially-Programmable Circuits), LUT (Look Up Table) |
Abstract | In this paper, based on Partially-Programmable Circuits (PPCs), which have been recently proposed for improving the fault tolerance of circuits, we study further effective PPC generation by exploring wider design space. First, we quantitatively evaluated various aspects which may affect the fault tolerance of PPC. Exploiting the findings obtained, we then successfully generated PPCs which improve the area-efficiency of fault tolerance by 34% compared with an existing PPC generation method. Moreover, we developed an efficient exploration of PPCs, leading to exploration time reduction by 70% over exhaustive search, without affecting the optimality. |
Title | Reliability and Robustness - Design and EDA to the Rescue! |
Author | *Ulf Schlichtmann (Technische Universität München, Germany) |
Page | p. 217 |
Abstract | Traditionally, Integrated Circuits (ICs) have been designed with the primary goal of minimizing area and thus cost. Performance also was a key issue from quite early on. Later, power became an important design consideration. Of course, optimizing yield always has been an important goal as well.
Reliability of ICs, however, typically was (and still is) handled on technology level. Technology departments and manufacturing ensured the reliability of individual components, resulting in reliable circuits. But as we move to ever smaller geometries, individual devices (transistors and wires) become less reliable. At the same time, the complexity of ICs continues to grow exponentially. These two forces create a strong imperative to focus on design and especially Electronic Design Automation (EDA) in order to ensure reliability and robustness. Recently, cross-layer approaches have started to appear in order to achieve reliability in a cost-efficient manner.
This talk will give an overview of reliability and robustness challenges and discuss recent research activities and results that address them using EDA. |
Title | New nMOS Dynamic Shift Registers for Driver Circuit of Small LCDs and Their Evaluations |
Author | *Shinji Higa, Shuji Tsukiyama (Chuo University, Japan), Isao Shirakawa (University of Hyogo, Japan) |
Page | pp. 218 - 223 |
Keyword | shift register, nMOS dynamic logic, Liquid Crystal Display, System on Glass, source driver |
Abstract | Driver circuits for small LCDs (Liquid Crystal Displays) are formed on the same glass substrate as the LCD by means of thin film transistors, which is called system on glass technology. If such a driver circuit is implemented with nMOS transistors only, production cost can be reduced, because the pMOS process is eliminated. In this paper, we focus on shift registers, which are indispensable in LCD driver circuits, and consider a method to design an nMOS dynamic shift register. We then propose two new 2-phase clock shift registers and evaluate their performance by comparing them with conventional shift registers using a 2-phase or 4-phase clock. The results show that the new shift registers have acceptable areas and outperform the others in speed, power, and tolerance to variations of power supply voltage and transistor mobility. |
Title | A Floorplan-Driven High-Level Synthesis Algorithm Utilizing Interconnection Delay Characteristics in FPGA Designs |
Author | *Koichi Fujiwara, Masao Yanagisawa, Nozomu Togawa (Waseda University, Japan) |
Page | pp. 224 - 225 |
Keyword | high-level synthesis (HLS), FPGA, floorplan, interconnection delay |
Abstract | Recently, high-level synthesis (HLS) techniques for FPGA designs have been in demand in fields such as image processing and software-defined radio. With recent process scaling in FPGAs, interconnection delays have become dominant in total circuit delay, and each FPGA family has different interconnection delay characteristics. Multiplexer cost is another concern in FPGA designs. HLS therefore needs to account for the interconnection delay characteristics of the target FPGA while reducing multiplexer cost. In this paper, we propose a floorplan-driven HLS algorithm utilizing interconnection delay characteristics in FPGA designs. Experimental results show that our algorithm can realize FPGA designs which reduce the latency by up to 6% compared with our previous approach. |
Title | Introducing Loop Statements in Random Testing of C Compilers Based on Expected Value Calculation |
Author | *Kazuhiro Nakamura, Nagisa Ishiura (Kwansei Gakuin University, Japan) |
Page | pp. 226 - 227 |
Keyword | compiler, random testing, for loop |
Abstract | This paper presents a method of reinforcing random testing of C compilers by introducing loop statements.
While random testing based on precomputation of expected values is powerful in detecting bugs in C compilers, loop statements have not been handled, due to difficulties in avoiding undefined behavior.
In this paper, an extended method to eliminate undefined behavior in loop bodies is proposed, where arrays of precomputed constants are used to modify problematic operands during loop iterations.
A random test system based on the proposed method has uncovered a new bug in the latest version of LLVM which cannot be detected by the existing methods. |
Title | Product Term Minimization in ROBDDs with Application to Reconfigurable SET Array Synthesis |
Author | *Yi-Hang Chen, Yang Chen, Juinn-Dar Huang (National Chiao Tung University, Taiwan) |
Page | pp. 228 - 231 |
Keyword | single-electron transistor, automatic synthesis, reconfigurable, area minimization, binary decision diagram |
Abstract | The power dissipation has become a crucial issue for most electronic circuit and system designs nowadays when fabrication processes exploit even deeper submicron technology. In particular, leakage power is becoming a dominant source of power consumption. In recent years, the reconfigurable single-electron transistor (SET) array has been proposed as an emerging circuit design style for continuing Moore’s Law due to its ultra-low power consumption. Several automated synthesis techniques for area minimization have been developed for the reconfigurable SET array in the past few years. Nevertheless, most of those existing methods focus on variable and product term reordering during SET mapping. In fact, minimizing the number of product terms can greatly reduce the area as well, which has not been well addressed before. In this paper, we propose a dynamic shifting based variable ordering algorithm that can minimize the number of disjoint sum-of-product terms extracted from the given ROBDD. Experimental results show that the proposed method can achieve an area reduction of up to 49% as compared to current state-of-the-art techniques. |
Title | An Effective Timing-Coherent Transactor Generation Approach for Mixed-Level System Simulations |
Author | *Hsin-I Wu, Li-chun Chen, Ren-Song Tsay (National Tsing Hua University, Taiwan) |
Page | pp. 232 - 237 |
Keyword | Mixed-level simulations, system simulations, transactor, timing coherent, ESL |
Abstract | In this paper, we extend the concept of the traditional transactor, which focuses on correct content transfer, to a new timing-coherent transactor that also accurately aligns the timing of each transaction boundary. This lets designers perform precise concurrent system behavior analysis in mixed-abstraction-level system simulations, which are essential to increasingly complex system designs. To streamline the process, we also developed an automatic approach for timing-coherent transactor generation. We applied our approach in mixed-level simulations, and the results show that it achieves 100% timing accuracy while the conventional approach produces error rates of 25% to 44%. |
Title | An Accurate Processor Power Estimation Approach Based on Microcomponent Structure Analysis |
Author | *Chi-Kang Chen, Zih-Ci Huang, Ren-Song Tsay (National Tsing Hua University, Taiwan) |
Page | pp. 238 - 243 |
Keyword | ESL, power estimation, microcomponent, power analysis, processor |
Abstract | We propose a new embedded processor power analysis approach that maps instruction executions to microarchitecture components for highly efficient and accurate power evaluations, which are crucial for embedded system designs. We observe that in practice, the execution of each high-level instruction in a processor always triggers the same microcomponent activity sequence, while the difference in power consumption between instructions is mainly due to timing variations caused by hazards and cache misses. Hence, by incorporating accurately pre-characterized microcomponent power consumption values into an efficient instruction-microcomponent processor timing simulation tool, we construct a highly accurate embedded processor power analysis tool. Additionally, based on the proposed approach, we accurately and effortlessly capture the power waveform at any time point for power profiling, peak power, and dynamic thermal distribution analysis. The experimental results show that the proposed approach is nearly as accurate as gate-level simulators, with an error rate of less than 1.2%, while achieving simulation speeds of up to 20 MIPS, five orders of magnitude faster than a commercial gate-level simulator. |
Title | A Verilog Compiler Proposal for VerCPU Simulator |
Author | *Tze Sin Tan (Altera Corporation, Malaysia), Bakhtiar Affendi Rosdi (Universiti Sains Malaysia, Malaysia) |
Page | pp. 244 - 249 |
Keyword | Verilog, Simulator, Hardware Assisted |
Abstract | Verilog is a widely used Hardware Description Language (HDL) for VLSI design and modeling. As a language developed with hardware execution concurrency in mind, Verilog can be mapped onto a dedicated processor for higher simulation throughput. The processor requires a compiler to transform a Verilog netlist into compiled-code instructions. In this paper, we propose a data structure that adequately represents a Verilog model. Then, a Verilog compiler is developed to map a Verilog netlist into this data structure. We also demonstrate that it is possible to construct a hardware simulator (VerCPU) utilizing this data structure. |
Title | MorFPGA Duo: A Dual-Core FPGA-Based Embedded System Development Platform |
Author | Chih-Chyau Yang, *Chun-Yu Chen, Chun-Wen Cheng, Yi-Jun Liu, Chien-Ming Wu, Chun-Ming Huang (National Chip Implementation Center, Taiwan) |
Page | pp. 250 - 254 |
Keyword | MorFPGA, MorFPGA Duo, SoC FPGA, All-programmable SoC |
Abstract | To help academic researchers in Taiwan rapidly integrate their IP into a system for complete demonstration of hardware/software co-design, CIC presents a platform named MorFPGA Duo in this paper. MorFPGA Duo offers a dual core, versatile built-in peripherals, and high expandability to meet users’ needs in state-of-the-art research topics. MorFPGA Duo consists of two boards: the core modular board mainly includes a dual-core Zynq and versatile peripherals, while the multimedia modular board supports high-quality two-channel video sources with an SDI interface. With the PMOD and FMC connectors, various kinds of daughter boards can be integrated into the MorFPGA Duo system. The Media Wiki online forum is adopted as the platform to deliver the latest lab materials to users. The design flow and two system integration examples are provided to show that MorFPGA Duo is workable. One example introduces how software and hardware designs can be integrated and demonstrated on this platform. The other example shows a video demo system with dual SDI cameras to enable the future development of 3D video applications. |
Title | A 3G-Based Bridge Structural Health Monitoring System Using Cost-Effective 1-Axis Accelerometers |
Author | Chih-Hsing Lin, *Wen-Ching Chen, Chih-Ting Kuo, Gang-Neng Sung, Chih-Chyau Yang, Chien-Ming Wu, Chun-Ming Huang (National Chip Implementation Center, Taiwan) |
Page | pp. 255 - 259 |
Keyword | Bridge health monitoring, 1-axis accelerometer, Cellular communication |
Abstract | This paper proposes a 3G-based structural health
monitoring device (HMD) for short-term monitoring of long-span
bridges. The proposed HMD includes three 1-axis accelerometers,
a microcontroller unit (MCU), an analog-to-digital converter (ADC),
and a cellular gateway. The proposed monitoring
system achieves low cost by using three 1-axis
accelerometers, with the data synchronization problem
solved, and allows easy installation and removal. Furthermore, instead
of using data loggers, data is transmitted to the host through a 3G
gateway. Compared with a 3-axis accelerometer, our proposed
1-axis-accelerometer-based device achieves a 72.7% cost
saving. In addition, the HMD achieves a 34.1% cost saving
compared with an HMD that contains a data logger. To adapt
our HMD system to different monitoring environments, the
PCB boards of the proposed system can easily be exchanged
to support a variety of communication interfaces
and sensors. Therefore, using the proposed device, real-time
diagnosis for bridge damage monitoring can be
conducted effectively. |
Title | Protection Method for AES IP Core from Scan-Based Attack |
Author | *Yifan Wu, Shinji Kimura (Waseda University, Japan) |
Page | pp. 266 - 271 |
Keyword | Advanced Encryption Standard, scan chain, secure scan design, bit difference attack, JTAG |
Abstract | In this research, the scan-based two-bit difference attack method is studied and completed through further analysis and additional tables and test patterns. A protection method against such scan-based attacks is then proposed. The proposed method incurs less area overhead relative to the original AES IP core, and provides a higher security level and fault coverage, compared with previous methods. |
Title | Using Range-Equivalent Circuits for Facilitating Bounded Sequential Equivalence Checking |
Author | Yung-Chih Chen (Yuan Ze University, Taiwan), Wei-An Ji, Chih-Chung Wang, *Ching-Yi Huang, Chun-Yao Wang (National Tsing Hua University, Taiwan) |
Page | pp. 278 - 282 |
Keyword | design verification, bounded sequential equivalence checking |
Abstract | This paper presents a method based on a range-equivalent circuit technique for SAT-based bounded sequential equivalence checking. Given two sequential circuits to be verified, instead of straightforwardly unrolling the miter of the two sequential circuits, we iteratively minimize the miter with a range-equivalent circuit technique before adding a new timeframe. This is possible because the previous timeframes can be considered as a pattern generator that feeds input patterns to the next timeframe. Experimental results show that the proposed method saved up to 91% of the time needed to reach the same bounded depth compared with previous work on IWLS2005 benchmarks. |
Title | A Redundant Task Allocation Method for Reliable Network-on-Chips |
Author | *Hiroshi Saito (The University of Aizu, Japan), Tomohiro Yoneda (National Institute of Informatics, Japan), Yuichi Nakamura (NEC, Japan) |
Page | pp. 287 - 292 |
Keyword | network-on-chips, task allocation, task scheduling, fault tolerance |
Abstract | The possibility of failures in a network-on-chip (NoC) increases as its size increases. To realize reliable NoCs, we propose a redundant task allocation method which allocates several copies of tasks to different cores based on multiple task scheduling. In the experiments, we apply the proposed method to a real application. Then, the allocation time of the proposed method and the estimated execution time of the application are evaluated while changing parameters such as the multiplicities of scheduling and allocation. |
Title | A Cooling Effect Formulation and Implementation of a Cooling System for Li-Ion Battery Modules |
Author | *Yuki Kitagawa, Yusuke Yamamoto, Masahiro Fukui (Ritsumeikan University, Japan) |
Page | pp. 305 - 310 |
Keyword | Lithium-ion battery, Degradation, Air cooling |
Abstract | This paper discusses the theory and experiments of heating and air cooling of battery modules. The heating mechanism is shown first, and cooling of a single battery is examined. The optimum air flow speed is discussed. Then, a similar discussion is made for a battery module of six serial cells. Finally, reducing the temperature variation is discussed. |
Title | Global Transformation-Based Optimization of Threshold Logic Circuits |
Author | *Maiko Kabu, Takayuki Kasugai, Shigeru Yamashita (Ritsumeikan University, Japan), Chun-Yao Wang (National Tsing Hua University, Taiwan) |
Page | pp. 311 - 316 |
Keyword | optimization, threshold logic circuit, global functional flexibility, CSPF |
Abstract | Threshold logic circuit technology, which is considered to be one of the promising new technologies, has been successfully demonstrated recently thanks to the rapid progress of nanotechnology. Since the logic elements used in threshold logic circuits are very different from the ones used in conventional logic circuits, we may need a totally different design methodology for threshold logic circuits, and there have been intensive studies recently. In such previous works, local transformations have mainly been considered for the optimization of circuits. Instead, this paper, for the first time, considers global transformations. More specifically, we propose a method to calculate global functional flexibility based on compatible sets of permissible functions (CSPFs) and show how to use it to optimize threshold logic circuits. |
Title | An ECO-Friendly Design Style Based on Reconfigurable Cells |
Author | *Yudai Kabata, Tetsuya Hirose, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan) |
Page | pp. 319 - 324 |
Keyword | ECO, reconfigurable cell, error diagnosis, technology remapping |
Abstract | This paper presents an ECO-friendly design style based on reconfigurable (RECON) cells to reduce the increase in circuit delay caused by post-mask Engineering Change Orders (ECO’s). The key to providing higher flexibility in the ECO process is to employ RECON cells to implement not only the changes caused by ECO’s but also a part of the original circuit. Experimental results show that the proposed design style is effective in reducing the increase in circuit delay with post-mask ECO. |
Title | IC Design Challenges and Opportunities in Advanced Process Nodes |
Author | *Hsien-Hsin Sean Lee (Taiwan Semiconductor Manufacturing Company, Taiwan) |
Page | p. 326 |
Abstract | Moore’s Law has entered a new frontier as the incessant pace of device scaling continues to approach 10nm and beyond. As the physical dimensions of devices and interconnect shrink, the design rules and the design flow, for both ASIC and custom designs, face unprecedented complexity. Hence, common IC design practice can no longer treat design and process fabrication as separate concerns. Conventional design optimization techniques also need to take novel process technologies, such as multi-gate devices (e.g., FinFET), spacer technology, and self-aligned multiple patterning lithography, into account in order to achieve the best possible performance, power, and area. In this talk, I will touch upon the challenges and implications of these new process technologies for IC designers from the foundry’s perspective, show how and what to innovate in EDA tools to bridge the gap between physical design and foundry fabrication, and finally discuss how to improve overall design productivity. |
Title | Layout-Based Soft Error Rate Estimation Framework Considering Multiple Transient Faults - from Device to Circuit Level |
Author | Hsuan-Ming Huang, *Yi-Wu Liu, Charles H.-P. Wen (National Chiao Tung University, Taiwan) |
Page | pp. 327 - 332 |
Keyword | Soft error, Multiple transient fault, Reliability |
Abstract | Considering the structure of the layout and the resulting nuclear reactions, multiple transient faults tend to be induced more frequently than single transient faults, due to the effects of technology scaling. This study proposes a layout-based soft error estimation framework, which takes into account multiple transient faults from the device level to the circuit level. Experimental results demonstrate that the soft error rate can be underestimated by an average of 15.72% if only single (rather than multiple) transient faults are taken into account. Our results indicate that netlist-based analysis for the estimation of soft error rates is no longer sufficient, due to the overwhelming influence of the structural layout. For example, on benchmark c432, a tighter layout results in a soft error rate 34% higher than that of a looser layout. |
Title | Low-Power Gated Clock Tree Synthesis for 3D ICs |
Author | *Yu-Chuan Chen, Chih-Cheng Hsu, Mark Po-Hung Lin (National Chung Cheng University, Taiwan) |
Page | pp. 339 - 343 |
Keyword | clock tree, clock gating, 3D IC |
Abstract | Applying clock gating in three-dimensional integrated circuits (3D ICs) is essential for reducing power consumption and improving circuit reliability. However, previous works only present algorithms for 3D clock tree synthesis. None of them address gated clock trees in 3D ICs for dynamic power reduction. In this paper, we propose the first problem formulation in the literature for 3D gated clock network optimization. We apply a multilevel framework to effectively construct the topological gated clock tree while considering flip-flop switching activities and the timing constraint of enable signal paths at clock gating cells. Based on the constructed topological gated clock tree, a zero-skew 3D clock routing tree is then generated. Experimental results show that, compared with conventional 3D clock tree synthesis, the proposed 3D gated clock tree synthesis achieves much lower power consumption with a similar number of TSVs and similar clock tree wirelength. |
Title | Graph-Covering-Based Architectural Synthesis for Programmable Digital Microfluidic Biochips |
Author | *Daiki Kitagawa, Dieu Quang Nguyen, Trung Anh Dinh, Shigeru Yamashita (Ritsumeikan University, Japan) |
Page | pp. 344 - 349 |
Keyword | graph, binding, scheduling, programmable, biochip |
Abstract | Digital microfluidic technology has been extensively applied in various biomedical fields.
Different from application-specific biochips, a programmable design has several advantages such as
dynamic reconfigurability and general applicability.
Basically, a programmable biochip divides the chip into several virtual modules.
However, in the previous design, a virtual module can execute only one operation at a time.
In this paper, we propose a new multi-functional module for programmable digital microfluidic biochips,
which can execute two operations simultaneously.
We also propose a binding and scheduling algorithm for programmable biochips, which is motivated by a graph-covering problem.
Experiments demonstrate that our algorithm can reduce the completion time of the applications compared with the previous approaches. |
Title | Contamination-Aware Routing Flow for Both Functional and Washing Droplets in Digital Microfluidic Biochips |
Author | *Qin Wang, Yiren Shen, Hailong Yao (Tsinghua University, China), Tsung-Yi Ho (National Chiao Tung University, Taiwan), Yici Cai (Tsinghua University, China) |
Page | pp. 350 - 355 |
Keyword | Contamination-Aware Routing, Washing Droplets Routing, Digital Microfluidic Biochips |
Abstract | A major issue in digital microfluidic biochips is cross-contamination caused by different biomolecule droplets crossing the same sites, where washing operations are necessary to avoid wrong assay results. Existing works either assume an unrealistic infinite washing capacity, or ignore execution-time constraints and/or routing conflicts between functional and washing droplets. This paper presents the first practical droplet routing flow considering both a realistic washing capacity constraint and routing conflicts between washing and functional droplets. Experimental results are promising. |
Title | Obstacle-Avoiding Wind Turbine Placement for Power-Loss and Wake-Effect Optimization |
Author | *Yu-Wei Wu (National Cheng Kung University, Taiwan), Yi-Yu Shi (Missouri University of Science and Technology, U.S.A.), Sudip Roy (National Cheng Kung University, Taiwan), Tsung-Yi Ho (National Chiao Tung University, Taiwan) |
Page | pp. 356 - 361 |
Keyword | Placement, Wind Turbine |
Abstract | As finite energy resources are being consumed at a faster rate than they can be replaced, renewable energy resources have drawn extensive attention. Wind power development is one such example, and it is growing significantly throughout the world. The main difficulty in wind power development is that wind turbines interfere with each other, and the resulting turbulence directly affects the power produced; this is known as the wake effect. In addition, the wirelength among wind turbines is not merely an economic factor: it also determines the power loss that occurs in the wires. Moreover, in reality, unavoidable obstacles exist in the wind farm, e.g., private land and lakes. Nevertheless, to the best of our knowledge, none of the existing works consider wake effect, wirelength, and obstacle avoidance at the same time in the wind turbine placement problem. In this paper, we propose an analytical method to solve obstacle-avoiding placement of wind turbines for power-loss and wake-effect optimization. Experimental results show that the wind power produced by our tool is similar to that of the industrial tool AWS OpenWind. In addition, our algorithm can reduce the wirelength and successfully avoid obstacles while finding the locations of the wind turbines. |
Title | Accelerating Random-Walk-Based Power Grid Analysis through Error Smoothing |
Author | *Tsuyoshi Okazaki, Masayuki Hiromoto, Takashi Sato (Kyoto University, Japan) |
Page | pp. 362 - 367 |
Keyword | power grid analysis, random walk, Gauss-Seidel method |
Abstract | This paper proposes a hybrid solver that combines a random walk with a stationary iterative method. Our solver is based on quasi-zero-variance importance sampling (QZV-IS), in which the walk probability is updated by using coarsely estimated voltages for rapid convergence. Because the convergence speed depends on the smoothness of the estimated voltages, we additionally propose applying a smoothing operator to quickly improve the quality of the estimated voltages. The proposed solver achieved a 2.3-3.6x speedup compared to the conventional method that only utilizes QZV-IS. |
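As a rough illustration of the stationary iterative method named in the keywords (Gauss-Seidel), the sketch below runs plain Gauss-Seidel sweeps on a tiny nodal system G v = b. It is not the authors' hybrid solver and omits the random-walk and QZV-IS parts entirely. The 3-node chain, the unit conductances, the 1.0 V pads at both ends, and the 0.1 A load per node are all hypothetical values chosen for illustration.

```python
def gauss_seidel_sweep(G, b, v):
    """Perform one in-place Gauss-Seidel sweep for the system G v = b.

    G is a list of rows, b the right-hand side, v the current voltage estimate.
    """
    n = len(b)
    for i in range(n):
        # Use the newest available values of the other node voltages.
        off_diag = sum(G[i][j] * v[j] for j in range(n) if j != i)
        v[i] = (b[i] - off_diag) / G[i][i]
    return v


# Hypothetical 3-node chain: neighbours joined by unit conductances,
# both end nodes tied to a 1.0 V pad through a unit conductance,
# and every node drawing 0.1 A.
G = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
b = [0.9, -0.1, 0.9]   # pad injection (1.0) minus load (0.1) at the ends
v = [0.0, 0.0, 0.0]    # coarse initial estimate

for _ in range(100):
    gauss_seidel_sweep(G, b, v)
```

Each sweep damps the error in the current estimate, which is the sense in which a stationary iteration can act as a smoother on coarse voltage estimates.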
Title | Improvement of Simulated Annealing Search ---Based on Tree Representations--- |
Author | *Takaaki Banno, Kunihiro Fujiyoshi (Tokyo University of Agriculture and Technology, Japan) |
Page | pp. 368 - 373 |
Keyword | Simulated Annealing, tree representations, O-tree, DTS, packing |
Abstract | The placement problem for LSI layout is often referred to as the "rectangle packing problem." For this problem, several representations of rectangle packings have been proposed, and packings are searched by Simulated Annealing based on a representation.
To search efficiently based on a representation, it is necessary to define appropriate MOVE operations.
In this paper, we restrict MOVE operations so that a certain MOVE can restore any adjacent solution to the former solution, and we confirm the effectiveness by experiments. |
Title | A Hierarchical Type Segmentation Algorithm Based on Support Vector Machine for Colorectal Endoscopic Images with NBI Magnification |
Author | *Takumi Okamoto, Tetsushi Koide, Anh-Tuan Hoang, Koki Sugi, Tatsuya Shimizu, Toru Tamaki, Bisser Raytchev, Kazufumi Kaneda, Yoko Kominami, Shigeto Yoshida, Shinji Tanaka (Hiroshima University, Japan) |
Page | pp. 374 - 379 |
Keyword | Support Vector Machine (SVM), Colorectal Endoscopic Images, Computer-Aided Diagnosis (CAD), Hierarchical Type Segmentation, FPGA |
Abstract | With the increase in colorectal cancer patients in recent years, the need for quantitative evaluation of colorectal cancer has grown, and a computer-aided diagnosis (CAD) system that supports the doctor's diagnosis is essential. In this paper, a hardware design of the type identification module in a CAD system for colorectal endoscopic images with narrow band imaging (NBI) magnification is proposed for real-time processing of full high definition images (1920 x 1080 pixels). A pyramid-style identifier with SVMs for multi-size scan windows, which can be implemented with a small circuit area and achieves high accuracy, is verified on actual complex colorectal endoscopic images. |
Title | High Performance Feature Transformation Architecture Based on Bag-of-Features in CAD System for Colorectal Endoscopic Images |
Author | *Koki Sugi, Tetsushi Koide, Anh-Tuan Hoang, Takumi Okamoto, Tatsuya Shimizu, Toru Tamaki, Bisser Raytchev, Kazufumi Kaneda, Yoko Kominami, Shigeto Yoshida, Shinji Tanaka (Hiroshima University, Japan) |
Page | pp. 380 - 385 |
Keyword | Colorectal Endoscopic Images, Computer-Aided Diagnosis(CAD), Feature Transformation, Visual Word(VW), FPGA Hardware Implementation |
Abstract | Our research describes a computer-aided diagnosis (CAD) system for colorectal endoscopic images with narrow band imaging (NBI) magnification, which identifies a pathology type from local features in the NBI endoscopic image. We propose a high-speed feature transformation for the CAD system using Manhattan distance calculation and an on-the-fly normalization method. A high-performance, low-cost algorithm for multiple Scan Window (SW) processing on FPGA is also introduced. The proposed high-speed feature transformation can be completed within about 380 msec on a real-time Full HD NBI endoscopic image. |
Title | Hardware Implementation of Motion Estimation Technology Using High Level Synthesis and Investigations into Techniques for Improvements |
Author | *Shota Nagai (Graduate School of Science and Engineering, Kindai University, Japan), Takashi Kambe (Depart. of Electric and Electronic Engineering, Kindai University, Japan), Gen Fujita (Osaka Electro-Communication University, Japan) |
Page | pp. 386 - 390 |
Keyword | motion estimation, H.264/AVC, EPZS, high level synthesis, Bach C |
Abstract | We implemented the motion estimation technology that is a key part of the H.264/AVC (Advanced Video Coding) standard as hardware using high-level synthesis technology, and investigated improvements.
An EPZS algorithm was implemented instead of a Full Search algorithm, and the results were evaluated to understand the effectiveness of the high-level synthesis technology and of the speedup techniques that were adopted. |
Title | FPGA Oriented Intra Angular Prediction Image Generation Hardware for HEVC Video Coding |
Author | *Eita Kobayashi, Seiya Shibata (NEC Corporation, Japan), Noriaki Suzuki (NEC Corporation, Japan), Atsufumi Shibayama (NEC Corporation, Japan), Takeo Hosomi (NEC Corporation, Japan) |
Page | pp. 391 - 396 |
Keyword | HEVC, FPGA, Architecture, High Level Synthesis |
Abstract | This work proposes a novel FPGA-oriented architecture for intra prediction image generation in the High Efficiency Video Coding (HEVC) standard. HEVC intra prediction greatly extends that of H.264 in terms of modes and block sizes to achieve high flexibility. From a hardware point of view, however, this flexibility increases the required number of MUXs, which tend to be a bottleneck for area and frequency on FPGAs. In this paper we propose a Reshaping Buffered Architecture that drastically reduces the number of MUXs. Experimental results show that our proposed architecture can reduce the number of MUXs by up to 70% compared with a raster-scan-based architecture. This results in marked improvements of 43% in maximum frequency and 51% in LUT usage. |
Title | High Accuracy and Simple Real-Time Circle Detection on Low-Cost FPGA for Traffic-Sign Recognition on Advanced Driver Assistance System |
Author | *Anh-Tuan Hoang (Research Institute for Nanodevice and Bio Systems, Hiroshima University, Japan), Masaharu Yamamoto (Graduate School of Advanced Sciences of Matter, Hiroshima University, Japan), Tetsushi Koide (Research Institute for Nanodevice and Bio Systems, Hiroshima University, Japan) |
Page | pp. 397 - 402 |
Keyword | circle detection, traffic sign detection, pipeline scanning, ADAS, multi grain pipelining |
Abstract | This paper describes a hardware-oriented algorithm and its conceptual implementation for a real-time traffic-sign detection system on an automotive-oriented FPGA. The speed limit sign area in a grayscale video frame is detected through a simple two-stage computation process. Rectangle Pattern Matching roughly detects the global luminosity features shared between a rectangle and a circle to find a Region of Interest (ROI). Then, Circle Detection roughly votes on the local pixel directions of a circle inside the detected ROI in a binary image for circle confirmation. The proposed system achieves 83 fps at full HD and over 99% accuracy even in difficult situations such as rainy nights. It occupies around 50% of the hardware available on the proposed Xilinx Zynq automotive FPGA, which has 85 K logic cells, 53.2 K LUTs, 106.4 K registers and 506 KB BRAM, and so can be applied to Advanced Driver Assistance Systems on common vehicles. |
Title | An Improved Rate-Distortion Optimized Quantization Algorithm and Its Hardware Implementation |
Author | *Genki Moriguchi (Graduate School of Science and Engineering, Kindai University, Japan), Takashi Kambe (Depart. of Electric and Electronic Engineering, Kindai University, Japan), Gen Fujita (Osaka Electro-Communication University, Japan) |
Page | pp. 409 - 414 |
Keyword | H.264/AVC, RDOQ, function based pipelining, high-level synthesis, Bach C |
Abstract | Rate-distortion optimized quantization (RDOQ) is an important technology in H.264/AVC for improving video coding performance.
It is able to determine the optimal value among multiple quantization candidates based on rate-distortion (RD).
We propose improvements to the algorithm to reduce its complexity by changing the bit-rate estimation method and by excluding low scored candidates for the quantization.
We also implement the algorithm in hardware using the Bach C high-level synthesis tool.
Finally, the performances of the proposed algorithm and hardware design results are evaluated. |
Title | Implementation and Evaluation of AES/ADPCM on STP and FPGA with High-Level Synthesis |
Author | *Yuki Ando, Yukihito Ishida, Shinya Honda, Hiroaki Takada, Masato Edahiro (Nagoya University, Japan) |
Page | pp. 415 - 420 |
Keyword | FPGA, DRP, High-level synthesis |
Abstract | Reconfigurable techniques are attracting attention as an alternative to the dedicated hardware of SoCs. We evaluated an FPGA and an STP engine to determine whether their performance allows them to substitute for the dedicated hardware of an SoC. We selected AES and ADPCM applications to compare the performance of the FPGA and the STP engine. The applications were synthesized with the same high-level synthesis tools. Then, we implemented them on the FPGA and the STP engine using the integrated development environments. For the evaluation, we compared them in terms of resource usage, the number of states, the number of cycles, frequency, and execution time. |
Title | Speed Traffic-Sign Number Recognition on Low Cost FPGA for Robust Sign Distortion and Illumination Conditions |
Author | *Masaharu Yamamoto, Anh-Tuan Hoang, Tetsushi Koide (Hiroshima University, Japan) |
Page | pp. 421 - 426 |
Keyword | Advanced Driver Assistance System (ADAS), Real-Time Processing, Traffic-Sign Detection, Number Recognition, FPGA Implementation |
Abstract | In this paper, we propose a hardware-oriented, robust speed traffic-sign recognition algorithm that can run in real time for Advanced Driver Assistance Systems (ADAS). In difficult conditions, such as sign distortion at various angles or at night and in rain, the proposed algorithm is still able to recognize the traffic sign with high precision. The proposed hardware-oriented number recognition algorithm achieves a recognition rate of more than 99 % in daytime and 94.2 % when difficult conditions such as rainy nights are included. |
Title | Efficient Manipulation of Truth Tables on CUDA for Gate-Level Simulation |
Author | *Yuri Ardila, Tatsuyuki Kida, Shigeru Yamashita (Ritsumeikan University, Japan) |
Page | pp. 427 - 432 |
Keyword | logic circuit, verification, simulator, cuda, gpu |
Abstract | Efficient logic circuit simulations are indispensable for manufacturing LSI products. Since the computation of such simulations is usually very time consuming, there have been many efforts to optimize it, and over the past decade many studies have succeeded by using GPGPU (General-Purpose computing on Graphics Processing Units) technology. This paper also studies how to utilize GPGPU to optimize logic circuit simulation. Our method is mainly based on efficient parallel manipulations of truth tables. Our idea differs from most previous works, which rely on the fact that the outputs of many gates can be evaluated in parallel. We achieved as much as a 65.5 times speedup compared to simulation using only a CPU. |
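To make the idea of manipulating whole truth tables concrete, here is a minimal sketch on a CPU, not the authors' CUDA implementation. It packs each signal's truth table into one Python integer (bit r holds the value on input row r), so a single bitwise operation evaluates a gate for every input combination at once. The function names, the netlist format, and the full-adder example are assumptions made for illustration.

```python
def var_table(i, n):
    """Truth table of primary input x_i over n inputs, packed into an int."""
    table = 0
    for row in range(1 << n):
        if (row >> i) & 1:      # x_i is 1 on this input row
            table |= 1 << row
    return table


def simulate(netlist, n):
    """Evaluate a gate-level netlist over all 2**n input rows at once.

    netlist is a topologically ordered list of (output, op, inputs) tuples.
    Returns a dict mapping every signal name to its packed truth table.
    """
    mask = (1 << (1 << n)) - 1  # keep only the 2**n valid rows
    tables = {f"x{i}": var_table(i, n) for i in range(n)}
    for out, op, ins in netlist:
        a = tables[ins[0]]
        if op == "NOT":
            tables[out] = ~a & mask
            continue
        b = tables[ins[1]]
        if op == "AND":
            tables[out] = a & b
        elif op == "OR":
            tables[out] = a | b
        elif op == "XOR":
            tables[out] = a ^ b
        else:
            raise ValueError(f"unknown gate type: {op}")
    return tables


# Hypothetical example: a full adder with inputs x0, x1 and carry-in x2.
full_adder = [
    ("t", "XOR", ("x0", "x1")),
    ("sum", "XOR", ("t", "x2")),
    ("c1", "AND", ("x0", "x1")),
    ("c2", "AND", ("t", "x2")),
    ("carry", "OR", ("c1", "c2")),
]
result = simulate(full_adder, 3)
```

Bit r of `result["sum"]` is then the sum output for the input assignment whose binary encoding is r. A GPU version would presumably split much wider tables across threads, but that mapping is not described in the abstract.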
Title | Scan-Based Side-Channel Attack Implementation Evaluation on the LED Cipher Using SASEBO-GII |
Author | *Huiqian Jiang, Mika Fujishiro, Masao Yanagisawa, Nozomu Togawa (Waseda University, Japan) |
Page | pp. 433 - 434 |
Keyword | Side-Channel Attack, Scan-Based Attack, LED Cipher, SASEBO-GII, Implementation Evaluation |
Abstract | LED is a lightweight block cipher which is suitable for both hardware and software. Design-for-test is essential for LSI designers in order to check whether devices work correctly. One of the design-for-test techniques using scan chains is called scan-path test, in which testers can observe and control registers inside an LSI chip directly. Recently, a scan-based side-channel attack has been reported which retrieves secret information from a cryptosystem using scan chains. In this paper, we demonstrate that the secret key in the LED cipher can be retrieved successfully from the SASEBO-GII, a side-channel attack standard evaluation board. Experiments show that the scan-based attack is practical enough. |
Title | A Study on Visualization of Auscultation-Based Blood Pressure Measurement |
Author | *Yusuke Katsuki, Mingyu Li, Qing Dong, Shigetoshi Nakatake (The University of Kitakyushu, Japan) |
Page | pp. 435 - 436 |
Keyword | Sensor, Medical, digital filtering |
Abstract | Blood pressure measurement by auscultation of Korotkov sounds is an essential skill for health care workers, but mastering the skill is not easy because it requires complicated tasks such as simultaneous auscultation, manipulation of pressure, and checking of the scale. This work provides a system that visualizes the Korotkov sounds and the pressure in the cuff by sensing them at the same time. In addition, we evaluate the system as an educational aid for mastering blood pressure measurement. |