SASIMI 2013
The 18th Workshop on Synthesis And System Integration of Mixed Information Technologies
Technical Program

Remark: The presenter of each paper is marked with "*".

Session Schedule


Monday, October 21, 2013

9:00 - 10:00    Opening (Tanchō-Hakuchō 1)
10:00 - 11:00   K1: Keynote Speech I (Tanchō-Hakuchō 1)
11:00 - 12:10   R1: Poster I (Tanchō-Hakuchō 1 & Kujyaku)
12:10 - 13:40   Lunch Break
13:40 - 14:40   I1: Invited Talk I (Tanchō-Hakuchō 1)
14:40 - 15:50   R2: Poster II (Tanchō-Hakuchō 1 & Kujyaku)
15:50 - 16:50   K2: Keynote Speech II (Tanchō-Hakuchō 1)
16:50 - 18:00   R3: Poster III (Tanchō-Hakuchō 1 & Kujyaku)
18:30 - 20:30   Banquet (Hakuchō)

Tuesday, October 22, 2013

9:00 - 10:00    I2: Invited Talk II (Tanchō-Hakuchō 1)
10:00 - 11:30   R4: Poster IV (Tanchō-Hakuchō 1 & Kujyaku)
11:30 - 13:00   Lunch Break
13:00 - 14:00   I3: Invited Talk III (Tanchō-Hakuchō 1)
14:00 - 15:30   D: Panel Discussion (Tanchō-Hakuchō 1)
15:30 - 17:00   R5: Poster V (Tanchō-Hakuchō 1 & Kujyaku)
17:00 - 17:10   Closing (Tanchō-Hakuchō 1)


List of Papers

Monday, October 21, 2013

Keynote Speech I
Time: 10:00 - 11:00 Monday, October 21, 2013
Location: Tanchō-Hakuchō 1
Chair: Nagisa Ishiura (Kwansei Gakuin University, Japan)

K1 (Time: 10:00 - 11:00)
Title: Scaling the Many-Memory Wall for Many-Core Architectures
Author: *Nikil Dutt (University of California, Irvine, U.S.A.)
Page: p. 1
Abstract: The move towards many-core architectures creates an inherent demand for high memory bandwidth, which in turn results in the need for vast amounts of on-chip memory space. On the other hand, many-core architectures have many (distributed) on-chip memories with limited capacities, resulting in a “many-memory wall”. While efforts such as 3D stacking and smarter memory controllers try to alleviate the off-chip memory access problem, there is still a pressing need to carefully provision the limited on-chip memory budget to meet application needs. For on-chip memories, embedded systems often use both software controlled memories (e.g., scratchpad memories) and hardware-controlled memories (e.g., caches), with each having their pros and cons. Efficient on-chip memory management is extremely critical as it has a great impact on the system’s power consumption and throughput. Traditional memory hierarchies primarily consist of SRAM-based on-chip caches. However, with the emergence of non-volatile memories (NVMs) and mixed-criticality systems, we expect to see heterogeneous on-chip memory hierarchies, not only in type (cache vs. scratchpad) but also in technology (e.g., SRAM vs. NVM). This talk will survey the state of the art in memory subsystems for many-core platforms, and present strategies for efficiently managing software-controlled memories in the many-core domain, while addressing emerging challenges faced by designers. I will also propose a holistic software/hardware solution to the problem of scaling the memory wall for many-core architectures.
PDF file


Poster I
Time: 11:00 - 12:10 Monday, October 21, 2013
Location: Tanchō-Hakuchō 1 & Kujyaku
Chairs: Kouichirou Yamashita (Fujitsu Laboratories Ltd., Japan), SeungJu Lee (Waseda University, Japan)

R1-1 (Time: 11:00 - 11:02)
Title: A Novel Fast and Accurate Hot Spot Detection Method with Prüfer Code Layout Encoding
Author: *Hong-Yan Su, Chieh-Chu Chen, Yih-Lang Li (National Chiao Tung University, Taiwan), An-Chun Tu, Chuh-Jen Wu, Chen-Ming Huang (Taiwan Semiconductor Manufacturing Company, Taiwan)
Page: pp. 2 - 7
Keyword: Design for manufacturability, process hotspot, pattern matching, centerline, Prüfer Encoding
Abstract: As design-for-manufacturability techniques have become widely used to improve the yield of nano-scale semiconductor technology in recent years, hot-spot detection methods have been investigated with a view to calibrating layout patterns that tend to reduce yield. In this work, we propose two graph models, i.e., skeleton graph and space graph, to formulate polygon topology and spatial relationship among polygons. In addition, a Prüfer Encoding based method is presented to encode each skeleton graph. The single-polygon matching problem is then equivalent to the verification of graph isomorphism, which is realized by checking the identity of two corresponding enhanced Prüfer codes. A branch-and-bound based pattern anchoring algorithm is presented to resolve the vertex ordering problem for isomorphism checking. Finally, the general exact pattern matching problem can be accomplished by adopting the space graph to identify the similarity of spatial relationship among polygons. Experimental results show that we can achieve a 5.6x runtime speedup over the design-rule-based methodology on average.

R1-2 (Time: 11:02 - 11:04)
Title: Implementation of Protocol Independent Control-Intensive Design in High-Level Synthesis
Author: Tung-Hua Yeh, Jen-Chieh Yeh (Industrial Technology Research Institute, Taiwan), *Qiang Zhu (Cadence Design Systems, Japan)
Page: pp. 8 - 12
Keyword: design experiences, high-level synthesis
Abstract: High-level synthesis (HLS) has previously been applied to a variety of datapath-dominated and algorithmic designs, achieving a quality of result (QoR) comparable to hand-edited RTL designs. However, applying HLS to control-intensive designs has always been challenging. In this paper, we present an efficient strategy to abstract control-intensive designs so that HLS technologies can be applied to them efficiently. A design using this strategy not only achieves good QoR, but also improves design reusability and productivity. We demonstrate two control-intensive designs, a Direct Memory Access (DMA) controller and a NAND flash controller, which resulted in a 3X design productivity improvement compared to traditional RTL design methodology, while maintaining design quality comparable to hand-edited RTL designs.

R1-3 (Time: 11:04 - 11:06)
Title: Evaluation of On-Chip Decoupling Capacitor's Effect on AES Cryptographic Circuit
Author: *Tsunato Nakai, Mitsuru Shiozaki, Takaya Kubota, Takeshi Fujino (Ritsumeikan University, Japan)
Page: pp. 13 - 18
Keyword: Side-channel attack, Electromagnetic analysis, ASIC semi-custom design, Cryptographic circuit, On-chip capacitor
Abstract: Power Analysis (PA) attacks and Electromagnetic Analysis (EMA) attacks reveal a secret key on cryptographic circuits by measuring power variation and electromagnetic radiation during cryptographic operations, respectively. Inserting decoupling capacitors reduces PA leakage; however, their effect on resistance against EMA attacks is not well known. We fabricated Advanced Encryption Standard (AES) cryptographic chips with and without on-chip decoupling capacitors, and evaluated their resistance against PA and EMA attacks. This paper shows that on-chip decoupling capacitors make the circuit vulnerable to an EMA attack using the Hamming-weight model.
PDF file

R1-4 (Time: 11:06 - 11:08)
Title: A Real-Time Peak Load Shaving with Error Compensation of Residential Load/PV Power Generation Forecasting
Author: *Hide Nishihara, Ittetsu Taniguchi, Masahiro Fukui (Ritsumeikan University, Japan)
Page: pp. 19 - 24
Keyword: Peak-Shaving, Smart Grid
Abstract: This paper proposes a real-time peak load shaving method with error compensation of residential load/PV power generation forecasting. Various load/generation forecasting techniques have been proposed, but it is impossible to avoid forecasting error completely. This paper assumes a house with a photovoltaic (PV) panel and energy storage, and proposes a power distribution method at the household level to minimize the peak value of electric power demand. Experimental results show that the proposed method drastically reduces the sum of purchased energy and wasted energy at the same peak-shaving ratio, and that the load/generation forecasting error is effectively compensated.

R1-5 (Time: 11:08 - 11:10)
Title: A Design of CMOS On-Chip Photovoltaic Device and Regulated DC-DC Converter for Micro System
Author: *Haruki Ono, Kazuki Nomura, Nobuhiko Nakano (Keio University, Japan)
Page: pp. 25 - 27
Keyword: stand-alone micro system, photovoltaic device, bootstrap charge pump
Abstract: In this paper, we propose an electric power system for a stand-alone micro system. The micro system consists of a photovoltaic device, voltage boost, ring oscillator, and regulator on a single silicon chip. We designed and measured several types of photovoltaic devices. The maximum output voltage of the photovoltaic device is 550 mV. The bootstrap charge pump circuit and regulator are designed for this power supply. The power supply outputs more than 1 V, which is sufficient for a standard CMOS circuit.
PDF file

R1-6 (Time: 11:10 - 11:12)
Title: An Error Diagnosis Technique Using QBF Solver to Fix LUT Functions
Author: *Naoki Katayama, Hiroyuki Sakamoto, Tetsuya Hirose, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan)
Page: pp. 28 - 33
Keyword: error diagnosis, ECO, QBF
Abstract: This paper presents an error diagnosis technique using a QBF (Quantified Boolean Formula) solver to fix LUT functions. Whereas the conventional SAT-based error diagnosis technique checks equivalence between the given specification and the rectified circuit for every assignment to truth variables with each LUT function, the proposed QBF-based technique obtains all assignments to truth variables that satisfy equivalence at once. Experimental results show that the proposed technique rectifies circuits that could not be corrected by the conventional SAT-based technique.

R1-7 (Time: 11:12 - 11:14)
Title: Energy-Efficient Dynamic Voltage and Frequency Scaling by P/N-Performance Self-Adjustment Using Adaptive Body Bias
Author: *A.K.M. Mahfuzul Islam, Norihiro Kamae, Tohru Ishihara, Hidetoshi Onodera (Kyoto University, Japan)
Page: pp. 34 - 39
Keyword: DVFS, Energy Efficiency, Process Variation, Adaptive Body Bias
Abstract: Dynamic voltage and frequency scaling (DVFS) is a promising technique to improve energy efficiency for heterogeneous systems where the workload varies with time. This paper addresses the effects of process variation on energy efficiency for wide-voltage-range DVFS and proposes the use of a P/N-performance self-adjustment scheme to enable typical-case design. Simulation results show that energy efficiency can be improved by more than 100% with the proposed technique compared to the conventional worst-case design methodology for a 65-nm commercial process.

R1-8 (Time: 11:14 - 11:16)
Title: A Nested Loop Pipelining in C Descriptions for System LSI Design
Author: *Masahiro Nambu, Takashi Kambe (Kinki University, Japan), Shuji Tsukiyama (Chuo University, Japan)
Page: pp. 40 - 43
Keyword: nested loop, pipelining, C based design, high level synthesis, BACH system
Abstract: Behavioral synthesis from the C language is now a key technology of system LSI design. Since large streaming data are usually processed by nested loops in the behavioral description of a system LSI, it is important to synthesize a circuit which can process such data efficiently. Nested loop pipelining is a useful implementation technique for synthesizing a circuit such that both computational throughput and hardware utilization are maximized. In this paper, we propose an algorithm for nested loop pipelining, which can produce pipeline stages with different processing times. We show two practical experimental results in order to demonstrate the performance of the proposed algorithm.
PDF file

R1-9 (Time: 11:16 - 11:18)
Title: General Position-Based Weighted Round-Robin Arbitration for Arbitrary Traffic Patterns
Author: *Hanmin Park, Kiyoung Choi (Seoul National University, Republic of Korea)
Page: pp. 44 - 49
Keyword: Network-on-Chip, Weighted Round-Robin, Fair Arbitration, Equality of Service
Abstract: This paper presents position-based weighted round-robin arbitration for equality of service in many-core networks-on-chip employing a deterministic routing algorithm. We concentrate on the network saturation induced by arbitrary traffic patterns. The arbitration exploits the deterministic properties of the network to achieve global fairness of service provided to each node. The weights for input arbitration can be adjusted to make the network better adapted to arbitrary traffic patterns. With this adjustment, better equality of service can be achieved with no degradation of the network saturation throughput.

R1-10 (Time: 11:18 - 11:20)
Title: Memory Management for Dual-Addressing Memory Architecture
Author: *Ting-Wei Hong, Yen-Hao Chen, Yi-Yu Liu (Yuan Ze University, Taiwan)
Page: pp. 50 - 55
Keyword: Dual-addressing memory, 2D virtual memory management, Data granularity and indexing
Abstract: Dual-addressing memory architecture is designed for two-dimensional memory access with both row-major and column-major localities. In this paper, we highlight two memory management issues in dual-addressing memory. First, to avoid external fragmentation, we propose a virtual dual-addressing memory design to enable memory management via the operating system. Second, to deal with the size mismatch between user-defined data and dual-addressing memory, we discuss data arrangement policies for different data granularities. With the proposed memory management techniques, we are capable of maximizing the memory utilization of dual-addressing memory.
PDF file

R1-11 (Time: 11:20 - 11:22)
Title: Alpha-Gamma Data Compression Method for Artificial Vision Systems Using Visual Cortex Stimulation
Author: *Tomoki Sugiura, Arif Ullah Khan, Yoshinori Takeuchi, Masaharu Imai (Osaka University, Japan)
Page: pp. 56 - 61
Keyword: data compression, artificial vision, hybrid organ
Abstract: In this paper, a data compression method for visual cortex stimulation based artificial vision is proposed and evaluated. The proposed method uses run-length encoding to express visual cortex stimulus data in numerical form, in which the numerical data representing ’1’ data and ’0’ data are encoded into binary by alpha encoding and gamma encoding, respectively. In experiments, the proposed method reduced data size by approximately 83%, while its execution cycle count is practically equal to that of gamma encoding.
PDF file

R1-12 (Time: 11:22 - 11:24)
Title: An Efficient Test Pattern Generator -Mersenne Twister-
Author: Hiroshi Iwata, *Sayaka Satonaka, Ken'ichi Yamaguchi (Nara National College of Technology, Japan)
Page: pp. 62 - 67
Keyword: Mersenne Twister, manufacturing test, pseudo random pattern, built-in self test, fault coverage
Abstract: Built-in self test (BIST) is an answer for a highly reliable manufacturing test at a reasonable cost. In this paper, we consider using the Mersenne Twister as the test pattern generator instead of the LFSR to implement BIST in VLSIs. Experimental results show that the test patterns generated by the Mersenne Twister are efficient with respect to fault coverage and that it can be implemented at a cost comparable to the LFSR.
PDF file

R1-13 (Time: 11:24 - 11:26)
Title: Power Optimization of a Micro-Controller with Silicon on Thin Buried Oxide
Author: *Kuniaki Kitamori, Hongliang Su, Hideharu Amano (Keio University, Japan)
Page: pp. 68 - 73
Keyword: SOTB, V850E-Star, low power consumption
Abstract: Nowadays, from battery-supplied mobile devices to supercomputers, reducing power consumption has become a serious design issue. Although using a low supply voltage is the most efficient way to reduce power, it also increases the leakage power and delay variance. The Low-power Electronics Association & Project (LEAP) developed Silicon On Thin Buried Oxide (SOTB) technology to solve those problems. In order to verify the SOTB technology, we have applied it to an automotive microcontroller, the V850E-Star. In this report, we investigate the operational speed and leakage power with 40 kinds of reverse bias and forward bias voltages for each purpose: standby, energy maximum, and performance maximum. In the standby mode, the leakage power of the energy maximum mode is reduced by 92%, while the microcontroller operates with a 33 MHz clock in the energy maximum mode.
PDF file


Invited Talk I
Time: 13:40 - 14:40 Monday, October 21, 2013
Location: Tanchō-Hakuchō 1
Chair: Nagisa Ishiura (Kwansei Gakuin University, Japan)

I1 (Time: 13:40 - 14:40)
Title: Computer-Aided Design of Electric Vehicle Hybrid Energy Storage System
Author: Sangyoung Park, Younghyun Kim, *Naehyuck Chang (Seoul National University, Republic of Korea)
Page: pp. 74 - 75
Abstract: Electric vehicles (EVs) are considered a strong alternative to internal combustion engine (ICE) vehicles, with low environmental impact and operating costs. However, studies show that electric vehicles are not a cure-all and need careful optimization. Many problems persist, including the driving range, high initial cost, and battery degradation of EVs. These shortcomings are mainly due to the cycle efficiency and cycle life constraints of the energy storage system (ESS), which is usually a homogeneous lithium-ion battery bank in commercial vehicles. Optimization has been performed in many ways, mostly as layered optimization. In contrast, a more systematic approach based on computer-aided design offers holistic optimization opportunities. We propose to overcome the hurdles by computer-aided design optimization of a hybrid energy storage system (HESS) for EVs.


Poster II
Time: 14:40 - 15:50 Monday, October 21, 2013
Location: Tanchō-Hakuchō 1 & Kujyaku
Chairs: Kenshu Seto (Tokyo City University, Japan), Hiroshi Saito (University of Aizu, Japan)

R2-1 (Time: 14:40 - 14:42)
Title: Place-and-Route Algorithms for a Reliability-Oriented Coarse-Grained Reconfigurable Architecture Using Time Redundancy
Author: *Takashi Imagawa, Masayuki Hiromoto (Kyoto University, Japan), Hiroshi Tsutsui (Hokkaido University, Japan), Hiroyuki Ochi (Ritsumeikan University, Japan), Takashi Sato (Kyoto University, Japan)
Page: pp. 76 - 81
Keyword: coarse-grained reconfigurable architecture, reliability, time redundancy, dynamic reconfiguration, place-and-route algorithm
Abstract: Coarse-grained reconfigurable architectures (CGRAs) are expected to enhance the reliability of LSI systems. The time-redundancy technique can enhance fault tolerance even under severe circuit area constraints. This paper proposes two place-and-route algorithms for a CGRA that utilizes time redundancy. The application circuits implemented on the CGRA with these algorithms differ in performance degradation and hard-error tolerance. One algorithm improves hard-error tolerance with small performance degradation; the other improves the tolerance greatly at the cost of large degradation.

R2-2 (Time: 14:42 - 14:44)
Title: Power Analysis Resistant IP Core Using IO-Masked Dual-Rail ROM for Easy Implementation into Low-Power Area-Efficient Cryptographic LSIs
Author: *Megumi Shibatani, Mitsuru Shiozaki, Yuki Hashimoto, Takaya Kubota, Takeshi Fujino (Ritsumeikan University, Japan)
Page: pp. 82 - 87
Keyword: cryptographic module, side-channel attack, power analysis, countermeasure circuit, IO-masked dual-rail ROM
Abstract: Recently, it has been pointed out that power analysis (PA) attacks are a threat to cryptographic circuits which handle confidential information. The goal of this study is to provide an easily implementable, PA-resistant cryptographic IP core with small area and low power consumption. We have proposed an IO-masked dual-rail ROM scheme and prototyped an Advanced Encryption Standard (AES) circuit using the proposed scheme. This paper presents the evaluation results for chip area, power consumption, and PA resistance.
PDF file

R2-3 (Time: 14:44 - 14:46)
Title: Scaling up Size and Number of Expressions in Random Testing of Arithmetic Optimization of C Compilers
Author: Eriko Nagai (Fujitsu Systems West Limited, Japan), *Atsushi Hashimoto, Nagisa Ishiura (Kwansei Gakuin University, Japan)
Page: pp. 88 - 93
Keyword: compiler, random testing, programming language C
Abstract: This paper presents an enhanced method of testing the validity of arithmetic optimization of C compilers using randomly generated programs. Its bug detection capability is improved over an existing method by 1) generating longer arithmetic expressions and 2) accommodating multiple expressions in test programs. Undefined behavior in long expressions is successfully avoided by modifying problematic subexpressions during computation of expected values for the expressions. An efficient method for minimizing error-inducing test programs is also presented, which utilizes binary search. Experimental results show that a random test system based on our method has higher bug detection capability than existing methods; it has detected more bugs than the previous method in earlier versions of GCC and has revealed new bugs in the latest versions of GCC and LLVM.
PDF file

R2-4 (Time: 14:46 - 14:48)
Title: A Routing Method Using Minimum Cost Flow Algorithm for Routes with Target Wire Lengths
Author: *Kunihiro Fujiyoshi, Kazuo Yamane (Tokyo University of Agriculture and Technology, Japan)
Page: pp. 94 - 99
Keyword: routing, PCB, error, minimum cost flow algorithm
Abstract: As operating frequencies increase, the influence of routing delays grows, so it is important to obtain routes with a small difference between target wire length and actual wire length. For this purpose, the CAFE router, which obtains a river routing with small length error using maximum flow, was proposed. However, in many cases the obtained routes still have a small length error. In this paper, we propose a method using minimum cost flow, which obtains routes with smaller differences.

R2-5 (Time: 14:48 - 14:50)
Title: Compact Pipeline Hardware Architecture for Pattern Matching on Real-Time Traffic Signs Detection
Author: *Anh-Tuan Hoang, Mutsumi Omori, Masaharu Yamamoto, Tetsushi Koide (Hiroshima University, Japan)
Page: pp. 100 - 105
Keyword: Traffic signs detection, Pipeline Architecture, Compact Hardware
Abstract: This paper describes a novel compact hardware-oriented algorithm and its conceptual implementation for a real-time traffic sign detection system. The speed limit sign area on a grayscale video frame is detected by novel, simple, and compact rectangle pattern matching and circle detection modules. The speed limit recognition system is divided into two pipeline stages. The frame is scanned with multiple scan windows in parallel for each position, and each scan window is also processed in a pipeline to increase throughput. It achieves a 100% detection rate.
PDF file

R2-6 (Time: 14:50 - 14:52)
Title: A Parallel Simulated Annealing Algorithm with Look-Ahead Neighbor Solution Generation
Author: *Yusuke Ota, Kazuhito Ito (Saitama University, Japan)
Page: pp. 106 - 111
Keyword: simulated annealing, parallel SA, lookahead
Abstract: Simulated annealing (SA) is a general method to solve combinatorial optimization problems. SA randomly generates a neighbor solution from the current solution and evaluates it with a cost function. If the neighbor solution is better than the current solution, or otherwise stochastically, the neighbor solution is accepted as the new current solution. This process is iterated many times, and therefore SA needs a long execution time. We propose a fast SA method where several neighbor solutions are generated at once in a look-ahead manner and evaluated in parallel. To increase the efficiency of the parallelized SA, a method to adaptively generate neighbor solutions is proposed to reduce void solutions not used in an SA chain.
PDF file

R2-7s (Time: 14:52 - 14:54)
Title: A 10-Bit Low-Glitch Binary-Weighted Current-Steering DAC
Author: *Fang-Ting Chou, Chung-Chih Hung (National Chiao Tung University, Taiwan)
Page: pp. 112 - 113
Keyword: DAC, binary-weighted, low glitch
Abstract: A low-glitch and low-power design for a 10-bit binary-weighted current-steering digital-to-analogue converter (DAC) is presented. Instead of large input buffers, the proposed design uses variable-delay buffers with a compact layout to compensate for delay difference and to reduce high glitch energy significantly, from 7 pVsec to less than 1.5 pVsec. The proposed DAC is capable of high-speed, low-glitch operation without compromising power consumption and chip area.

R2-8 (Time: 14:54 - 14:56)
Title: Rover II: A Router for Via Configurable Structured ASIC with Standard Cells and IPs
Author: Chiung-Chih Ho, Hsin-Pei Tsai, *Rung-Bin Lin (Yuan Ze University, Taiwan)
Page: pp. 114 - 117
Keyword: Structured ASIC, Router, Regular fabric, IP
Abstract: This article presents a router, called Rover II, for via-configurable structured ASICs. Rover II extends the work of Rover to handle IPs and incorporates ports of the NTHU-route 2.0 and NCTU-GR global routers. Experimental results show that Rover II can successfully route a via-configurable structured ASIC with standard cells and IPs under different routing fabrics. The results also show that the global router in Rover is as good as state-of-the-art global routers such as NTHU2.0 and NCTU-GR.

R2-9 (Time: 14:56 - 14:58)
Title: A Compact and Energy-Efficient Muller C-Element for Low-Voltage Asynchronous CMOS Digital Circuits
Author: *Yuzuru Shizuku, Tetsuya Hirose, Yuya Danno, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan)
Page: pp. 118 - 122
Keyword: low supply voltage, asynchronous circuit, Muller-C-element, energy-efficient
Abstract: Asynchronous circuits have attracted much attention as a promising low-power and robust digital design technique. The Muller C-element is one of the fundamental building blocks of asynchronous circuits and is used in the timing control of each circuit block and in pipeline processing. However, conventional Muller C-elements are difficult to operate at lower supply voltages. In this paper, we propose a new Muller C-element capable of low-supply-voltage operation. The circuit is based on the conventional C-element and uses a MOS resistor to ensure robust operation. Simulation results have demonstrated that the proposed circuit can operate at a low supply voltage of 0.38 V and that the power-delay product (PDP) was 4.32 aJ at VDD = 1.08 V, which was lower by 9.3% compared with a conventional Muller C-element.

R2-10 (Time: 14:58 - 15:00)
Title: Analytical Thermal Modeling and Calibration Method for Lithium-Ion Batteries
Author: *Keiji Kato, Yusuke Yamamoto, Naoki Kawarabayashi, Lei Lin, Masahiro Fukui (Ritsumeikan University, Japan)
Page: pp. 123 - 128
Keyword: Lithium-ion battery, Thermal analysis, Calibration
Abstract: The lithium-ion battery is an important component in constructing the circuits of mobile systems. However, the behavior of the battery varies depending on its thermal condition. Thus, to optimize a mobile system in terms of long life and low power, it is important to estimate the inner temperature of the battery. This paper proposes an analytical method to estimate the inner temperature considering Joule heat and entropy heat. An evaluation using a real battery sample is also shown.

R2-11 (Time: 15:00 - 15:02)
Title: A Sensor Modeling Technique Using SystemC-AMS For Fast Simulation of System-in-Package Based Bio-Medical Systems
Author: *Arif Ullah Khan, Yoshinori Takeuchi, Masaharu Imai (Osaka University, Japan)
Page: pp. 129 - 133
Keyword: Sensor, Bio-Medical, Simulation Model, SystemC-AMS, SiP
Abstract: Use of biomedical systems, which include healthcare systems and bio-medical implants, is increasing rapidly. These systems consist of analog and digital blocks. In order to develop these systems in a short time while meeting strict size, energy, and cost constraints, there is a need for a new design methodology. This research is focused on developing a common hardware platform based on an application specific instruction-set processor (ASIP) and system in package (SiP), which could be used by different health monitoring and bio-medical systems. For fast design space exploration, a fast simulation model of the complete system is also needed. Different blocks of the SiP have been modeled, at different abstraction levels, using SystemC and SystemC-AMS. In this paper, a sensor modeling technique for modeling available analog sensors is presented using SystemC-AMS, which will be used in the SiP simulation model.
PDF file

R2-12 (Time: 15:02 - 15:04)
Title: A Cool Charger for Lithium-Ion Battery
Author: *Yusuke Yamamoto, Keiji Kato, Lei Lin, Masahiro Fukui (Ritsumeikan University, Japan)
Page: pp. 134 - 139
Keyword: Lithium-ion battery, Charger, Thermal Management
Abstract: Mobile systems, electric vehicles, and smart houses use lithium-ion batteries, which have many advantages such as high capacity. On the other hand, battery degradation is caused by various factors. We focus on thermal degradation and have developed a temperature management system for lithium-ion batteries. This system includes an inner temperature estimation system to grasp battery characteristics. We construct a charging system that controls the temperature of the battery while charging. We examine the characteristics of battery cooling during charging and discharging.

R2-13s (Time: 15:04 - 15:06)
Title: A Hardware Generator for Aesthetic Nonlinear Filter Banks
Author: *Tomoki Komuro, Hirotaka Nishikawa, Yukihiro Iguchi, Kaoru Arakawa (Meiji University, Japan)
Page: pp. 140 - 141
Keyword: Signal Processing, nonlinear filter bank, aesthetic filter bank, hardware generator, FPGA
Abstract: This paper considers hardware realizations of nonlinear filter banks for facial beautification. First, users describe the filters' characteristics and how to connect them using filter bank description languages (FDLs); then the proposed system generates Verilog HDL code to realize them. Preliminary experimental results show that the generated HDL code has the same performance as hand-coded designs.
PDF file


Keynote Speech II
Time: 15:50 - 16:50 Monday, October 21, 2013
Location: Tanchō-Hakuchō 1
Chair: Nagisa Ishiura (Kwansei Gakuin University, Japan)

K2 (Time: 15:50 - 16:50)
Title: Replacing Optical Lenses by Silicon Structures
Author: *Rudy Lauwereins (IMEC, Belgium)
Page: p. 142
Abstract: Since the start of photography, image optics, recording and display have been analog. In the forties, analog image display was partially replaced by digital displays, through the line-based CRT screen. In the early eighties, image recording was also digitized thanks to solid-state silicon imagers. We are now at the edge of digitizing the last analog part in the image handling flow: the optics. By making opto-mechanical structures in silicon matching the size of the wavelength of light, the large, heavy, fragile and costly optical lenses can be replaced by cheap mass-produced silicon, often integrated together with the imager and/or processing logic in a heterogeneous stack. This presentation first describes the opportunities silicon processing offers to replace lenses, as well as the requirements that have to be met to make such a replacement successful. Examples are given of a few application domains that benefit from silicon lenses. Next, one application area is presented in more detail, namely fast and cheap hyperspectral imaging. The adoption of hyperspectral cameras by industry has so far been limited due to their limited speed, limited compactness and high cost, all caused by the need for many high-quality optical lenses. Silicon structures make it possible to counter all these drawbacks. A novel hyperspectral sensor is detailed with the following key innovations: a Fabry-Pérot wedge filter monolithically integrated on top of a standard CMOS sensor; processed with minimal cavity sizes and integrated with the needed software to improve image quality. The result is a compact 2 megapixel sensor with a spectral range between 550 and 1000 nm and a spectral resolution lower than 10 nm. The speed is 340 fps at illumination levels as used in machine vision.
Developing such a breakthrough solution required intense cross-disciplinary collaboration between process technologists, circuit designers, optical experts, camera and system designers, image enhancement specialists, image classification specialists and application experts. The design approach followed will be explained, as well as the need to develop special software tools to emulate the complete system, including lighting, lens aberrations, process technology variability, mechanical variability, optical distortions and software correction algorithms.


Poster III
Time: 16:50 - 18:00 Monday, October 21, 2013
Location: Tanchō-Hakuchō 1 & Kujyaku
Chairs: Shinobu Nagayama (Hiroshima City University, Japan), Zhu Qiang (Cadence Design Systems, Japan)

R3-1 (Time: 16:50 - 16:52)
TitleA Heuristic Method to Find Linear Decompositions for Incompletely Specified Index Generation Functions
AuthorTsutomu Sasao, *Yuta Urano, Yukihiro Iguchi (Meiji University, Japan)
Pagepp. 143 - 148
Keywordlogic minimization, linear transform, functional decomposition, minimal covering
AbstractThis paper shows a method to find a linear transformation that reduces the number of variables to represent a given incompletely specified index generation function. It first generates the difference matrix, and then finds the minimal set of variables using a covering table. Linear transformations are used to modify the covering table to produce a smaller solution.
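
The difference-matrix and covering steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' heuristic: it uses an exhaustive search that is only practical for tiny inputs, and the function names (`difference_matrix`, `minimum_variable_set`, `add_pairwise_xors`) and the one-hot example are assumptions made for illustration. Restricting the linear transformation to XORs of two variables is likewise an assumption.

```python
from itertools import combinations

def difference_matrix(vectors):
    """One row per pair of registered vectors; a 1 marks a variable that tells the pair apart."""
    return [tuple(a ^ b for a, b in zip(u, v)) for u, v in combinations(vectors, 2)]

def minimum_variable_set(vectors):
    """Smallest set of columns covering every row of the difference matrix (exhaustive search)."""
    rows = difference_matrix(vectors)
    n = len(vectors[0])
    for size in range(n + 1):
        for cols in combinations(range(n), size):
            # A column set is valid if every pair differs in at least one chosen column.
            if all(any(row[c] for c in cols) for row in rows):
                return cols
    return None  # only reached when two registered vectors are identical

def add_pairwise_xors(vectors):
    """Append x_i XOR x_j for every pair of variables as extra candidate columns."""
    n = len(vectors[0])
    return [tuple(v) + tuple(v[i] ^ v[j] for i, j in combinations(range(n), 2))
            for v in vectors]

# Hypothetical example: four one-hot registered vectors over four variables.
one_hot = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
print(len(minimum_variable_set(one_hot)))                     # 3
print(len(minimum_variable_set(add_pairwise_xors(one_hot))))  # 2
```

In this toy case, three of the original variables are needed to distinguish the four vectors, while two compound variables of the form x_i XOR x_j suffice, which is the kind of reduction a linear transformation aims for.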

R3-2 (Time: 16:52 - 16:54)
TitleA New Design Methodology for Rounding and Hardware Minimization in Look-Up-Table-Based Arithmetic Function Evaluation
Author*Shen-Fu Hsiao (National Sun Yat-sen University, Taiwan), Hou-Jen Ko (Purdue University, U.S.A.), Yu-Ling Tseng (SpringSoft, Taiwan), Chia-Sheng Wen (National Sun Yat-sen University, Taiwan)
Pagepp. 149 - 152
Keywordfunction evaluation, error analysis, truncated multiplier, digital arithmetic
AbstractThis paper presents a new approach for determining the bit-widths of hardware components in arithmetic function evaluators based on Look-Up Tables (LUTs). The rounding of floating-point constant values to be stored in LUTs and the hardware minimization in the subsequent arithmetic computation are considered jointly in order to optimize the entire design. The piecewise polynomial approximation with truncated multiplication is used to demonstrate the proposed method. Previous similar designs usually determine the bit widths of the quantized polynomial coefficients and the corresponding multipliers by pre-assigning allowable errors for the individual hardware components, including ROM and arithmetic units. The proposed design considers all the error sources, including the approximation errors, quantization errors, truncation errors, and final rounding errors simultaneously. Thus, the total error budget can be utilized more efficiently and the bit widths of the hardware components (ROM, multipliers, adders) can be optimized, leading to significant improvements in both area and delay. Experimental results show that the proposed method can reduce the total area by up to 48% and the delay by up to 25% compared to conventional design approaches.

R3-3 (Time: 16:54 - 16:56)
TitleA Global Router Considering Scenic Controls
AuthorHsueh-Ju Chou (Faraday Technology Corporation, Taiwan), Hsi-An Chien, *Ting-Chi Wang (National Tsing Hua University, Taiwan)
Pagepp. 153 - 158
KeywordGlobal Routing, Scenic Controls
AbstractIn this paper, we study a global routing problem that considers not only overflow and wirelength but also scenic controls. A scenic control is often given to a timing-critical net for coping with timing closure in a modern physical synthesis flow. We enhance an academic global router to handle scenic controls. The enhancements are (1) a new net ordering method for rip-up and reroute, (2) two length bound allocation methods, and (3) a length-bounded adaptive multi-source and multi-sink maze routing method. The experimental results show that our global router is able to produce a high-quality solution in terms of overflow and wirelength without any violation of scenic controls for each test case.

R3-4 (Time: 16:56 - 16:58)
TitleA Tuning Method of Programmable Delay Element with Two Values for Yield Improvement
Author*Hayato Mashiko, Yukihide Kohira (The University of Aizu, Japan)
Pagepp. 159 - 164
KeywordDelay variation, Timing violation, Yield, Programmable delay element
AbstractTo recover from timing violations, which significantly reduce the yield of LSI chips, programmable delay elements (PDEs) are inserted into the clock tree before fabrication and their delays are tuned after fabrication. In this paper, we use PDEs with two delay values and propose a delay tuning method for the PDEs to improve the yield and to reduce the number of tests. Moreover, we evaluate the circuits obtained by the proposed method by using commercial CAD tools.

R3-5 (Time: 16:58 - 17:00)
TitleImpact of Drive Strength and Well-Contact Density on Heavy-Ion-Induced Single Event Transient
Author*Jun Furuta (Kyoto University, Japan), Masaki Masuda, Katsuyuki Takeuchi, Kazutoshi Kobayashi (Kyoto Institute of Technology, Japan), Hidetoshi Onodera (Kyoto University, Japan)
Pagepp. 165 - 169
Keywordsoft error, SET, pulse width
AbstractWe measure distributions of heavy-ion-induced Single Event Transient (SET) pulse widths from four kinds of inverter chains to measure their characteristics and estimate SET-induced soft error rates on a Flip-Flop (FF) and a delayed TMR FF. Measurement results show that the maximum SET-induced soft error rate on a FF is equivalent to 20% of the Single Event Upset (SEU) rate. On the delayed TMR with a 400ps delay element, the SET-induced soft error rate can be reduced by using 4x inverters with 2μm well-contact distance.

R3-6 (Time: 17:00 - 17:02)
TitleA Technique for Accelerating Adaptive Super Resolution Technique Based on Local Features of Images Using GPU
Author*Kento Kugai, Yuzuru Shizuku, Tetsuya Hirose, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan)
Pagepp. 170 - 175
KeywordCUDA, GPGPU, Super Resolution
AbstractIn this paper, we propose a technique to accelerate an adaptive super resolution technique based on local features of images using a Graphics Processing Unit (GPU). We have applied the acceleration technique to both the super resolution process and the learning process. Experimental results have shown that the proposed technique achieves speedups of 2.36 times on average, and 3.35 times at the maximum, compared with the conventional technique using a Central Processing Unit (CPU).

R3-7 (Time: 17:02 - 17:04)
TitleParallel Layer-Aware Partitioning for 3D Designs
Author*Yi-Hang Chen, Yi-Ting Chen, Juinn-Dar Huang (National Chiao Tung University, Taiwan)
Pagepp. 176 - 179
Keywordthrough-silicon via, 3D integration technology, layering, partitioning, multicore architecture
AbstractAs compared with two-dimensional (2D) ICs, 3D integration is a breakthrough technology of growing importance that has the potential to offer significant performance and functional benefits. This emerging technology allows stacking multiple layers of dies and resolves the vertical connection issue by through-silicon vias (TSVs). However, though a TSV is considered a good solution for vertical connection, it also occupies significant silicon real estate and incurs reliability problems. Because of these challenges, minimizing the number of TSVs becomes important in the design process. Therefore, in this paper, we propose a two-phase parallel layer-aware partitioning algorithm for TSV minimization in 3D structures. In the first phase, we employ OpenMP to parallelize the 2-way min-cut partitioning steps and get the initial solution. In the second phase, we further improve the result by using a parallel simulated annealing algorithm on GPU. The experimental results show that the proposed method can reduce the number of TSVs by about 39% as compared to several existing methods.

R3-8 (Time: 17:04 - 17:06)
TitleLithium-Ion Battery Degradation Model and Its Application to Power Management of Smart House
Author*Ryosuke Miyahara, Ami Watanabe, Masahiro Fukui (Ritsumeikan University, Japan)
Pagepp. 180 - 185
KeywordSmart house, battery modeling, power management
AbstractTo realize a low-carbon society, it is important for smart houses to spread so that natural energy sources, e.g., photovoltaic batteries, are used efficiently. However, the cost of battery degradation is a non-negligible portion of the total cost of power management in smart houses. This paper discusses how to formulate the degradation of the batteries, and clarifies the key factors for controlling the cost of battery degradation. Finally, we propose an efficient power management scheme for real smart houses.

R3-9 (Time: 17:06 - 17:08)
TitleParameter Estimation and Model Reduction for Digital IIR Filters Using a Modified PSO Algorithm
Author*Wei-Der Chang, Ching-Lung Chi (Shu-Te University, Taiwan)
Pagepp. 186 - 189
Keywordmodified PSO, filter design
AbstractThis paper develops a new design method for digital IIR filters for parameter estimation and model reduction. The employed method is the modified particle swarm optimization (MPSO), whose velocity formula is slightly changed to enhance the algorithm performance. MPSO-based design steps are presented for the IIR filters. Finally, two kinds of experiments, filter parameter estimation and model reduction, are provided as well. Simulation results show the applicability of the proposed method.

R3-10 (Time: 17:08 - 17:10)
TitleInvestigating Performance Advantages of Random Topologies on Network-on-Chip
AuthorSarat Yoowattana (Asian Institute of Technology, Thailand), *Ikki Fujiwara, Michihiro Koibuchi (National Institute of Informatics, Japan)
Pagepp. 190 - 194
KeywordNetwork-on-Chip, topology, interconnection networks
AbstractAs technology continues to scale down, the number of cores significantly increases, e.g., to 64 cores. The communication latencies increasingly have a negative impact on the performance of parallel applications on Chip MultiProcessors (CMPs). A random topology, which provides the lowest diameter and average shortest path length, has been recently considered for low-latency Network-on-Chip (NoC). In this work we investigate its advantage in throughput-and-latency properties for various traffic patterns and we compare the random topology with traditional non-random topologies, such as two-dimensional mesh, in various network sizes. Through our cycle-accurate network simulation, we found that the random topology significantly outperforms 2-D mesh and 2-D torus in terms of network latency.

R3-11 (Time: 17:10 - 17:12)
TitleSpeed Traffic-Sign Recognition Algorithm for Real-Time Driving Assistant System
Author*Masaharu Yamamoto, Anh-Tuan Hoang, Mutsumi Omori, Tetsushi Koide (Hiroshima University, Japan)
Pagepp. 195 - 200
KeywordNumber recognition, Hardware implementation, Driving safety support system, Speed traffic-sign, Real-time
AbstractThe purpose of this research is to develop a number recognition algorithm suitable for hardware implementation, for use in a speed traffic-sign recognition system for driving assistance. We recognize the speed limit on a speed traffic-sign using a hardware-oriented extraction algorithm. The numbers are recognized by comparing their feature values with the recognized features. The proposed hardware-oriented number recognition algorithm achieves a recognition rate of almost 100% in 31 highway scenes and 23 local-road scenes.

R3-12s (Time: 17:12 - 17:14)
TitleA Development and Evaluation of Variable Speed Charger System for Lithium-Ion Battery
Author*Akihiro Segawa, Yusuke Yamamoto, Lei Lin, Masahiro Fukui (Ritsumeikan University, Japan)
Pagepp. 201 - 202
KeywordLithium-ion Battery, Charger, variable speed charger
AbstractThis paper describes the development and evaluation of a variable speed charger system for lithium-ion batteries. High-speed charging causes degradation and safety problems because over-current and over-voltage are applied to the battery during charging and the temperature of the battery rises. Thus, we propose a variable speed charge system that controls the charge current depending on the temperature rise. The variable speed charger supplies variable current during the CC (Constant Current) charge operation. That is, it controls the current to restrain the temperature rise of the battery. This reduces damage to the battery and charges it optimally.



Tuesday, October 22, 2013

Invited Talk II
Time: 9:00 - 10:00 Tuesday, October 22, 2013
Location: Tanchō-Hakuchō 1
Chair: Nagisa Ishiura (Kwansei Gakuin University, Japan)

I2 (Time: 9:00 - 10:00)
TitlePower of Enumeration --- State-of-the-art Algorithms for Tackling Combinatorial Explosion
Author*Shin-ichi Minato (Hokkaido University/JST, Japan)
Pagepp. 203 - 207
AbstractDiscrete structure manipulation is a fundamental technique for many problems solved by computers. BDDs/ZDDs have attracted a great deal of attention for twenty years, because they efficiently manipulate basic discrete structures such as logic functions and sets of combinations. Recently, one of the most interesting research topics related to BDDs/ZDDs is the “frontier-based method,” a very efficient algorithm for enumerating and indexing the subsets of a graph that satisfy a given constraint. This work is important because many kinds of practical problems can be efficiently solved by some variations of this algorithm. In this article, we present an overview of the frontier-based method and recent topics on the state-of-the-art algorithms to show the power of enumeration.


Poster IV
Time: 10:00 - 11:30 Tuesday, October 22, 2013
Location: Tanchō-Hakuchō 1 & Kujyaku
Chairs: Masato Inagi (Hiroshima City University, Japan), Yukihiro Iguchi (Meiji University, Japan)

R4-1 (Time: 10:00 - 10:02)
TitleHigh Speed Approximation Feature Extraction in CAD System for Colorectal Endoscopic Images with NBI Magnification
Author*Tsubasa Mishima, Satoshi Shigemi, Anh-Tuan Hoang, Tetsushi Koide, Toru Tamaki, Bisser Raytchev, Kazufumi Kaneda, Yoko Kominami, Rie Miyaki, Taiji Matsuo, Shigeto Yoshida, Shinji Tanaka (Hiroshima University, Japan)
Pagepp. 208 - 213
KeywordDense Scale-Invariant Feature Transform (D-SIFT), Colorectal Endoscopic Images, Computer-Aided Diagnosis (CAD), Bag-of-Features (BoF), FPGA
AbstractIn this study, we have proposed an improvement for feature extraction in a computer-aided diagnosis system for colorectal endoscopic images with narrow-band imaging (NBI) magnification. Dense Scale-Invariant Feature Transform (D-SIFT) is used in the feature extraction. It is necessary to consider a trade-off between the precision of the feature extraction and the speedup by the FPGA implementation for real-time processing of full high definition images. In this paper, we reduced the number of dimensions for feature representation for the purpose of hardware implementation.

R4-2 (Time: 10:02 - 10:04)
TitleA Fixed-Length Routing Method Based on the Color-Coding Algorithm
Author*Tieyuan Pan, Yasuhiro Takashima (University of Kitakyushu, Japan)
Pagepp. 214 - 219
KeywordFixed-Length Routing, Color-Coding, PCB
AbstractThis paper proposes a fixed-length routing method based on the Color-Coding algorithm. In recent LSI system design, exact signal propagation delay is required because of increasing operating frequencies. As one of the techniques to control the delay, wire-length matching is widely used. We analyze the complexity of the proposed approach and confirm its efficiency empirically.
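
As background for the R4-2 abstract, the generic color-coding technique for deciding whether a simple path with exactly k vertices exists between two vertices can be sketched as below. This is not the authors' routing method: the graph, the function names and the trial count are assumptions chosen for illustration.

```python
import random

def colorful_path_exists(adj, s, t, k, colors):
    """Is there an s-t path on exactly k vertices whose colors are all distinct?"""
    # reach[v] holds the color sets of colorful paths that start at s and end at v.
    reach = {s: {frozenset([colors[s]])}}
    for _ in range(k - 1):
        nxt = {}
        for v, color_sets in reach.items():
            for w in adj[v]:
                for cs in color_sets:
                    if colors[w] not in cs:  # extend only with an unused color
                        nxt.setdefault(w, set()).add(cs | {colors[w]})
        reach = nxt
    return t in reach

def has_path_with_k_vertices(adj, s, t, k, trials=200, seed=0):
    """Randomized test: repeat random k-colorings and look for a colorful path."""
    rng = random.Random(seed)
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        if colorful_path_exists(adj, s, t, k, colors):
            return True
    return False

# Hypothetical 4-cycle 0-1-2-3-0; vertices 0 and 2 are opposite corners.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(has_path_with_k_vertices(adj, 0, 2, 3))  # True: e.g. 0-1-2
print(has_path_with_k_vertices(adj, 0, 2, 4))  # False: no such simple path
```

A path with distinct colors cannot repeat a vertex, so a `True` answer is always correct. A `False` answer can be wrong with a probability that shrinks as the number of trials grows.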

R4-3 (Time: 10:04 - 10:06)
TitleRetiming of Single Flux Quantum Logic Circuits for Flip-Flop Reduction
Author*Nobutaka Kito (Chukyo University, Japan), Kazuyoshi Takagi, Naofumi Takagi (Kyoto University, Japan)
Pagepp. 220 - 225
KeywordSFQ circuits, retiming, flip-flop
AbstractWe propose a retiming method of superconductive Single Flux Quantum (SFQ) logic circuits for flip-flop reduction. Because SFQ logic circuits use pulse logic, each input of logic gates has a latching function. The number of flip-flops in SFQ circuits can be reduced by utilizing the latching function. We formulate retiming for flip-flop reduction as an instance of an integer linear program considering the latching function. Experimental results show that most of the flip-flops in SFQ circuit realizations of ISCAS'89 benchmark circuits can be eliminated by the proposed method.

R4-4 (Time: 10:06 - 10:08)
TitleForwarding Unit Generation with Runtime Dependency Analysis in High-Level Synthesis
Author*Shingo Kusakabe, Kenshu Seto (Tokyo City University, Japan)
Pagepp. 226 - 230
Keywordhigh-level synthesis, loop pipelining, forwarding
AbstractWe propose a technique to reduce the initiation intervals of loops which contain RAW dependences whose occurrences change during runtime. In the proposed technique, the data written to arrays in such RAW dependences are also written to temporary variables, and the temporary variables are read when the RAW dependences occur, thereby minimizing the initiation intervals. Experimental results show that the proposed technique successfully achieves significant speedups with a moderate increase in gate counts.

R4-5 (Time: 10:08 - 10:10)
TitleAn NFA-Based Programmable Regular Expression Matching Engine Highly Suitable for FPGA Implementation
Author*Hiroki Takaguchi, Yoichi Wakaba, Shin'ichi Wakabayashi, Shinobu Nagayama, Masato Inagi (Hiroshima City University, Japan)
Pagepp. 231 - 236
KeywordRegular Expression Matching, FPGA, NFA
AbstractIn this paper, we propose a new programmable regular expression matching engine based on a string-transition NFA. The proposed engine can perform matching at high speed, and any regular expression can be set as a pattern in a very short time. The proposed hardware engine has a two-dimensional circuit structure, and thus it is highly suitable for FPGA implementation. The effectiveness of the proposed hardware was evaluated by comparison with an existing hardware matching engine.

R4-6 (Time: 10:10 - 10:12)
TitleGraphillion: ZDD-Based Software Library for Very Large Sets of Graphs
AuthorTakeru Inoue, *Hiroaki Iwashita (Japan Science and Technology Agency, Japan), Jun Kawahara (Nara Institute of Science and Technology, Japan), Shin-ichi Minato (Hokkaido University, Japan)
Pagepp. 237 - 242
Keywordgraph, binary decision diagram, frontier-based search, software library, Python
AbstractGraphillion is a library for manipulating very large sets of graphs, based on zero-suppressed binary decision diagrams (ZDDs) with advanced graph enumeration algorithms. Graphillion is implemented as a Python extension in C++, to encourage easy development of its applications without introducing significant performance overhead. Experimental results show that Graphillion allows us to manage an astronomical number of graphs with very low development effort.

R4-7 (Time: 10:12 - 10:14)
TitleClock Jitter Compensation for Continuous-Time Sigma-Delta Modulator Through Divided-by-N Feedback DAC
Author*Zong-Yi Chen, Chung-Chih Hung (Department of Electrical Engineering, National Chiao Tung University, Taiwan)
Pagepp. 243 - 247
Keywordclock jitter, sigma-delta modulator, ADC, divided-by-n feedback DAC
AbstractThis paper proposes a new compensation method to overcome the high sensitivity of the continuous-time (CT) sigma-delta modulator to clock jitter by using a divided-by-n (D-N) feedback DAC waveform. There are two types of clock jitter: independent clock jitter and accumulated clock jitter. This method provides a useful approach to solve one of the critical non-idealities, independent clock jitter, in the CT sigma-delta modulator without increasing the speed requirement of the modulator or the complexity of system and circuit design. Results prove the effectiveness of this new compensation method for independent clock jitter.

R4-8 (Time: 10:14 - 10:16)
TitleSimultaneous Escape Routing Considering Length Matching of Differential Pairs
AuthorYen-Jung Lee, *Hung-Ming Chen, Ching-Yu Chin (National Chiao Tung University, Taiwan)
Pagepp. 248 - 252
KeywordEscape routing, differential pairs, length matching
AbstractIn PCB design, the escape routing problem is considered an essential part and has been widely studied in the literature. There are industrial tools and some studies that work on simultaneous escape routing and escape routing of differential pairs on dense circuit boards. However, routing differential pairs simultaneously while considering length matching is still an important and ongoing research problem. In this work, inspired by prior state-of-the-art work, we have implemented an integrated approach that achieves simultaneous escape routing considering length matching of differential pairs. Our method avoids time-consuming ILP solutions in finding length-matching differential signal paths. Experimental results show that our approach can efficiently and effectively obtain length matching of differential pairs in simultaneous escape routing to reduce differential-pair skews, compared with the B-escape router that we reimplemented.

R4-9 (Time: 10:16 - 10:18)
TitleTechnology Remapping Based on Multiple Solutions for Post-Mask Functional ECO
Author*Yudai Kabata, Tetsuya Hirose, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan)
Pagepp. 253 - 258
KeywordIncremental synthesis, Engineering change order (ECO)
AbstractThis paper presents a technology remapping technique using reconfigurable (RECON) cells in order to reduce the increase in delay time induced by Engineering Change Orders (ECO's). Based on the estimated maximum delay time for the paths related to ECO's using each of multiple solutions obtained by error diagnosis, we can select a solution which minimizes the increase in delay along the critical path. Experimental results have shown that the proposed technique is effective in reducing the critical path delay of the rectified circuit for post-mask ECO's.

R4-10s (Time: 10:18 - 10:20)
TitleA Fast Trace-Driven Heterogeneous L1 Cache Configuration Simulator for Dual-Core Processors
Author*Masashi Tawada, Masao Yanagisawa, Nozomu Togawa (Waseda University, Japan)
Pagepp. 259 - 260
Keywordcache, simulation, multi-core
AbstractMulti-core processors are used in embedded systems very often. Since application programs running on embedded systems are quite limited, there must exist an optimal cache memory configuration in terms of speed, power and area. Simulating application programs on various cache configurations is one of the best options to determine the optimal one. In this paper, we propose a very fast heterogeneous dual-core L1 cache configuration simulation method. Experimental results show that our method runs up to 14x faster than a naive simulation algorithm.

R4-11 (Time: 10:20 - 10:22)
TitleA Dynamic Offload Scheduler for Spatial Multitasking on Intel Xeon Phi Coprocessor
Author*Takamichi Miyamoto, Kazuhisa Ishizaka, Takeo Hosomi (NEC, Japan)
Pagepp. 261 - 266
KeywordIntel Xeon Phi, Multi-tasking, Offload, Scheduling
AbstractThe Intel Xeon Phi coprocessor fully supports multitasking, but this does not automatically ensure high performance. A conventional task-level resource allocation scheduler could be used, but the processor utilization of the Xeon Phi is then low because of idle time on the Xeon Phi. In this paper, we propose a dynamic offload scheduler which assigns processor resources of the Xeon Phi to tasks at the offload level. We demonstrate the effectiveness of the proposed method through evaluations.

R4-12s (Time: 10:22 - 10:24)
TitleA Restricted Dynamically Reconfigurable Architecture for Low Power Processors
Author*Takeshi Hirao, Dahoo Kim, Itaru Hida, Tetsuya Asai, Masato Motomura (Hokkaido University, Japan)
Pagepp. 267 - 268
KeywordReconfigurable system, Processor architecture, Embedded system
AbstractIn this paper, we propose a Control-flow Driven Data-flow Switching variable datapath architecture for embedded applications that demand extremely low power consumption and a wide range of usage. The proposed architecture aims to achieve both flexibility and low power consumption by limiting the scope of dynamic reconfiguration. As a preliminary evaluation, we have mapped a small program to understand the fundamental characteristics of the proposed architecture.

R4-13 (Time: 10:24 - 10:26)
TitleA Processor Architecture for Motion Sensing Systems Using Accelerometer
Author*Takashi Matsuo, Arif Ullah Khan (Graduate School of Information Science and Technology, Osaka University, Japan), Takashi Hamabe (MICRONIX Inc., Japan), Yoshinori Takeuchi, Masaharu Imai (Graduate School of Information Science and Technology, Osaka University, Japan)
Pagepp. 269 - 274
KeywordASIP, Coordinate transform, Power consumption, Accelerometer, Motion sensing
AbstractMicroelectromechanical-systems-based accelerometers are widely used in motion sensing, in which coordinate system transform is the major process. In this study, a method for coordinate system transform is introduced, and a low-power processor architecture specialized for coordinate system transform is proposed and evaluated. Through experimental results, the proposed processor reduced energy consumption by 48.3% compared with a conventional RISC processor implementation.

R4-14 (Time: 10:26 - 10:28)
TitleRadiation-Hard Layout Structures on Bulk and SOI Process by Device-Level Simulations
Author*Kuiyuan Zhang, Kazutoshi Kobayashi (Kyoto Institute of Technology, Japan)
Pagepp. 275 - 279
KeywordSoft error, Layout, Bulk, SOI, TCAD
AbstractThis paper analyzes the soft error tolerance related to layout structures on 65-nm bulk and SOI processes. The layout structure in which well contacts are placed between redundant latches suppresses MCU effectively. The tolerance of the SOI structure transistor is also estimated by TCAD simulations. The charge collection mechanism is suppressed by the BOX (Buried Oxide) in the SOI transistor. Charge sharing and bipolar effects between SOI redundant latches are suppressed. There is no MCU occurrence in SOI redundant latches.

R4-15s (Time: 10:28 - 10:30)
TitleEvent Modeling Method for Verification of Power Analysis Attacks
Author*Kyota Sugioka, Toshiya Asai, Masaya Yoshikawa (Meijo University, Japan)
Pagepp. 280 - 281
KeywordTamper resistance, power analysis attacks, cryptographic circuit
AbstractTo evaluate the resistance of a cryptographic circuit to power analysis attacks during the design stage, this paper proposes a method to improve the efficiency of the acquisition process of information about power consumption, which requires a large number of encryption simulations. The proposed method newly introduces an event modeling method, maintains accuracy equivalent to that of a simulation program with an integrated circuit emphasis (SPICE), and acquires information about power consumption within a realistic processing timeframe that is similar to that of a logic simulation.

R4-16 (Time: 10:30 - 10:32)
TitleA Variable-Length String Matching Circuit Based On SeqBDDs
Author*Atsushi Matsuo, Yasunori Takagi (Ritsumeikan University, Japan), Hiroki Nakahara (Kagoshima University, Japan), Shigeru Yamashita (Ritsumeikan University, Japan)
Pagepp. 282 - 287
KeywordString Matching, Hardware, Programmable Sequence Logic, Sequence Binary Decision Diagram
AbstractThis paper proposes a new hardware architecture for fast string matching. The proposed method utilizes the concept of Sequence Binary Decision Diagrams (SeqBDDs), an efficient data structure to store a set of string sequences. Our proposed architecture is a natural extension of a Programmable Sequence Logic Circuit to SeqBDDs. The naive implementation of the Programmable Sequence Logic takes one character in one clock cycle, and thus we seek a way to evaluate multiple characters in one clock cycle with some content-addressable memories (CAMs). We also report preliminary evaluations for the proposed architectures.

R4-17s (Time: 10:32 - 10:34)
TitleAn Image Compression Method for Frame Memory Size Reduction Using Local Feature of Images
Author*Yuki Fukuhara (Osaka University, Japan), Akihisa Yamada (Sharp Corporation/Osaka University, Japan), Takao Onoye (Osaka University, Japan)
Pagepp. 288 - 289
Keywordframe memory, image compression, system architecture
AbstractThis paper proposes a method of image compression aiming at reduction of the frame memory size of digital appliances. In spite of adopting variable length coding, this method can guarantee the minimum compression ratio by adaptively controlling the pixel data reduction rate. Utilizing local features of images, the reduction rate is more finely controlled so as to maintain the visual quality of images. Experimental results show that it attains a compression ratio of 1/3, while keeping the visual quality of images by means of adequate bit allocation.


Invited Talk III
Time: 13:00 - 14:00 Tuesday, October 22, 2013
Location: Tanchō-Hakuchō 1
Chair: Nagisa Ishiura (Kwansei Gakuin University, Japan)

I3 (Time: 13:00 - 14:00)
TitlePhysical Design of Microfluidic Biochips
Author*Tsung-Yi Ho (National Cheng Kung University, Taiwan)
Pagep. 290
AbstractMicrofluidic-based biochips will soon revolutionize clinical diagnostics and many biochemical laboratory procedures due to their advantages of automation, cost reduction, portability, and efficiency. The basic idea of microfluidic biochips is to integrate all necessary functions for biochemical analysis onto one chip using microfluidics technology. These micro-total-analysis-systems (μTAS) are more versatile and complex than microarrays. Integrated functions include microfluidic assay operations and detection, as well as sample pre-treatment and preparation. The first generation of microfluidic biochips contained permanently etched structures such as pumps, valves and channels, and relied on continuous liquid flow to carry out specific tasks. This type of biochip is hereafter referred to as continuous-flow microfluidics or channel-based biochips. In contrast, digital microfluidics, the second-generation biochip architecture, relies on discrete liquid particles to carry out general-purpose analysis. The continuing growth of various applications has dramatically complicated chip/system integration and increased design complexity, making traditional manual design unsuitable, especially under time-to-market pressure. It is necessary to develop high-quality physical design tools to relieve the design burden of manual optimization of bioassays and time-consuming chip layout designs. In this talk, technology platforms for accomplishing “biochemistry on a chip” will be described, introducing the audience to both the droplet-based “digital” microfluidics based on electrowetting actuation and the flow-based “continuous” microfluidics based on microvalve technology. Next, a holistic perspective on physical design tools for microfluidic biochips and several associated combinatorial and geometric optimization problems in placement and routing will be discussed.
In this way, the audience will see how a “biochip compiler” can translate protocol descriptions provided by an end user (e.g., a chemist or a nurse at a doctor’s clinic) to a set of optimized and executable fluidic instructions that will run on the underlying microfluidic platform. Having these physical design tools, users and designers will be able to generate an optimized chip layout for good fluidic performance, high reliability, and low manufacturing cost. Therefore, biochip users and designers can concentrate on the development at application level, leaving layout details to physical design tools.


Panel Discussion
Time: 14:00 - 15:30 Tuesday, October 22, 2013
Location: Tanchō-Hakuchō 1
Moderator: Masahiro Fujita (University of Tokyo, Japan)

D (Time: 14:00 - 15:30)
TitleApplication of EDA Technologies to Non-EDA Areas
AuthorOrganizer/Moderator: Masahiro Fujita (University of Tokyo, Japan), Panelists: Rudy Lauwereins (IMEC, Belgium), Shin-ichi Minato (Hokkaido University, Japan), Tsung-Yi Ho (National Cheng Kung University, Taiwan), Giovanni De Micheli (EPFL, Switzerland), Kazutoshi Wakabayashi (NEC, Japan)
Pagep. 291
AbstractEDA (Electronic Design Automation) technologies have been developed for over 40 years. Lots of research and development efforts have been put into EDA technology development. As a result, a number of sophisticated algorithms have come out and been proven to work in practical situations. Now with the state-of-the-art EDA tools, high level design descriptions, such as the ones in C programming language, can automatically be converted into silicon. Also, as programmable hardware, such as FPGA (Field Programmable Gate Arrays), is commonly used, EDA technologies can be directly applied to make them more efficient. Also, as hardware design flows include various stages, such as high-level synthesis, logic synthesis, layout synthesis, test synthesis, pre- and post-silicon verification, and others, EDA techniques are dealing with varieties of problems in computer science and electrical engineering. Some of them can be applied to totally different areas from the traditional pure hardware design problems. In this panel discussion, various aspects of non-EDA applications of EDA technologies are discussed, and new possible directions are explored. Applications to be discussed include not only custom and highly efficient FPGA based hardware for specific computations, but also synthesis and verification of biomedical optical systems as well as efficient data structures and their associated algorithms for general discrete problems in computer science.


Poster V
Time: 15:30 - 17:00 Tuesday, October 22, 2013
Location: Tanchō-Hakuchō 1 & Kujyaku
Chairs: Yuko Hara-Azumi (Nara Institute of Science and Technology, Japan), Masashi Imai (Hirosaki University, Japan)

R5-1 (Time: 15:30 - 15:32)
TitleA Study of ESD Clamp Placement Impact on Peripheral- and Area-I/O Designs
Author*Yi-Cheng Liang, Hung-Ming Chen, Ming-Fang Lai (National Chiao Tung University, Taiwan)
Pagepp. 292 - 297
KeywordESD, I/O Placement
AbstractArea-I/O style flip-chip designs are now used in mainstream high-end electronics products due to their higher performance and better noise control in high-density microsystem designs. Among the design requirements in such microsystems and packaging, electrostatic discharge (ESD) is still one of the most important reliability concerns. The conventional I/O ring has been used for a long time; however, it increases the connection distance in flip-chip designs. In this study, we analyze rule-of-thumb principles and develop a new I/O distribution structure. In our analysis, the new area-I/O structure greatly improves ESD clamp protection over peripheral I/O, and novel cell assignment strategies on this structure yield fewer ESD violations than a general assignment method. Our method can be easily applied in the usual design flow, especially with state-of-the-art area-I/O style cases.

R5-2 (Time: 15:32 - 15:34)
TitleCustomizable Hardware Architecture of Support Vector Machine in CAD System for Colorectal Endoscopic Images with NBI Magnification
Author*Satoshi Shigemi, Tsubasa Mishima, Anh-Tuan Hoang, Tetsushi Koide, Toru Tamaki, Bisser Raytchev, Kazufumi Kaneda, Yoko Kominami, Rie Miyaki, Taiji Matsuo, Shigeto Yoshida, Shinji Tanaka (Hiroshima University, Japan)
Pagepp. 298 - 303
KeywordColorectal Endoscopic Images with NBI Magnification, Support Vector Machine (SVM), Computer-Aided Diagnosis (CAD), FPGA
AbstractWith the increase in colorectal cancer patients in recent years, the need for quantitative evaluation of colorectal cancer has increased, and a computer-aided diagnosis (CAD) system that supports the doctor's diagnosis is essential. In this paper, a hardware design of the type identification module in a CAD system for colorectal endoscopic images with narrow band imaging (NBI) magnification [1] is proposed for real-time processing of full high definition (Full HD) images (1920 x 1080 pixels). As a result, real-time processing in our system becomes possible. In addition, in order to improve the identification accuracy for type B (TA: tubular adenoma) and type C3 (SM-m cancer), an algorithm realizing 3-class identification with high efficiency and high accuracy is proposed.

R5-3 (Time: 15:34 - 15:36)
TitleAnalysis of Corner Conditions in PVT Variations and Reliability Degradations
AuthorAtsushi Kurokawa, *Masayuki Watanabe, Makoto Hoshi, Tetsuya Kobayashi, Masa-aki Fukase (Hirosaki University, Japan)
Pagepp. 304 - 309
Keywordvariability, reliability, timing analysis, corner model, on-chip variation
AbstractOpposite conditions exist between the best/worst cases for PVT variations and reliability degradations. There are also gaps between general PVT variation and reliability degradation and those of the product specifications that must be guaranteed by timing verification during the design process. We clarify these issues through analysis and then present an approach for design guarantee with realistic best-case/worst-case (BC/WC) corner conditions. Finally, the results of analyzing the max conditions of WC corners are shown.

R5-4 (Time: 15:36 - 15:38)
TitleHigh Level Synthesis with Stream Query to C Parser: Eliminating Hardware Development Difficulties for Software Developers
Author*Eric Shun Fukuda (Hokkaido University, Japan), Takashi Takenaka, Hiroaki Inoue (NEC Corporation, Japan), Hideyuki Kawashima (University of Tsukuba, Japan), Tetsuya Asai, Masato Motomura (Hokkaido University, Japan)
Pagepp. 310 - 315
KeywordDynamically Reconfigurable Hardware, Stream Processing, SQL, HLS, C
AbstractRecently, reconfigurable hardware has been attracting wide attention as a stream processing platform for its high performance and power efficiency. To allow many software engineers to benefit from reconfigurable hardware, high level synthesis tools have been actively developed. Although these tools have enormously reduced the amount of work and difficulties, the users still need hardware development knowledge. In this paper, we introduce a method that parses SQL queries into C code intended for high level synthesis. Our experiments using dynamically reconfigurable hardware that features a high level synthesis tool showed that the hardware's potential was fully extracted and that the developer writing the SQL queries does not need hardware development knowledge.

R5-5 (Time: 15:38 - 15:40)
TitleFaster Multiple Pattern Matching System on GPU Based on Bit-Parallelism
Author*Hirohito Sasakawa, Hiroki Arimura (Hokkaido University, Japan)
Pagepp. 316 - 321
KeywordGPGPU, extended pattern matching, large-scale pattern matching, bit-parallel method
AbstractIn this paper, we propose a fast string matching system using a GPU for large-scale string matching. The key to our proposed system is the use of a bit-parallel pattern matching approach for compact and fast parallel simulation of NFA transitions on the GPU. In the experiments, we show the usefulness of our proposed pattern matching system.
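The bit-parallel NFA simulation mentioned in the abstract above can be illustrated with the classic Shift-And technique. The following is a minimal CPU-side sketch in Python, not the authors' GPU implementation; the function names `build_masks` and `shift_and_search` are hypothetical, and packing all patterns into one integer is an assumption made for illustration.

```python
def build_masks(patterns):
    """Pack all patterns into one bit vector; each bit is one NFA state."""
    char_mask = {}   # character -> bits of the positions holding that character
    init = 0         # bits marking the first position of every pattern
    final = {}       # bit index of a pattern's last position -> that pattern
    pos = 0
    for pattern in patterns:
        if not pattern:
            raise ValueError("empty patterns are not supported")
        init |= 1 << pos
        for ch in pattern:
            char_mask[ch] = char_mask.get(ch, 0) | (1 << pos)
            pos += 1
        final[pos - 1] = pattern
    return char_mask, init, final


def shift_and_search(text, patterns):
    """Return (start_index, pattern) for every occurrence, in text order."""
    char_mask, init, final = build_masks(patterns)
    state = 0
    hits = []
    for i, ch in enumerate(text):
        # Advance all active states by one position, start a fresh match
        # attempt for every pattern, then keep only states that accept ch.
        state = ((state << 1) | init) & char_mask.get(ch, 0)
        for bit, pattern in final.items():
            if state & (1 << bit):
                hits.append((i - len(pattern) + 1, pattern))
    return hits
```

One word-level shift, OR, and AND updates every NFA state at once, which is what makes the approach compact. For example, `shift_and_search("abracadabra", ["abra", "cad"])` reports both occurrences of "abra" and the single occurrence of "cad".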

R5-6 (Time: 15:40 - 15:42)
TitleHigh-Level Synthesis for Nested Loop Kernels with Non-Uniform Dependencies
Author*Akihiro Suda, Hideki Takase, Kazuyoshi Takagi, Naofumi Takagi (Kyoto University, Japan)
Pagepp. 322 - 327
KeywordHigh-Level Synthesis, Polyhedral Optimization, Buffering, OpenMP
AbstractIn high-level synthesis, parallelization for nested loop kernels has been hard due to their complex data dependencies, especially non-uniform dependencies. In this paper, we propose a new method to synthesize a parallelized circuit from such kernels using polyhedral optimization, which has been vigorously studied in the software field. The key point of our contribution is a buffering method for parallel RAM accesses. The experimental result shows that the parallelized circuit with 8 PEs is 5.73 times faster than the sequential one.

R5-7 (Time: 15:42 - 15:44)
TitleA Fast Simplification Algorithm for Packet Classification
Author*Infall Syafalni (Kyushu Institute of Technology, Japan), Tsutomu Sasao (Meiji University, Japan)
Pagepp. 328 - 333
KeywordPartitioning, Elimination of rules, TCAM, Packet classification
AbstractPacket classification is used in various network applications such as firewalls, access control lists, and network address translators. This technology uses ternary content addressable memories (TCAMs) to perform high speed packet forwarding. However, TCAMs dissipate high power and their cost is high. Thus, reduction of TCAMs is crucial. This paper shows a method to simplify rules in TCAMs for packet classification. We partition the rules into groups so that each group has the same source address, destination address and protocol. After that, we simplify rules in each group by removing redundant rules. We developed a computer program to simplify rules among groups. Experimental results show that this method reduces the size of rules up to 57% of the original specification for ACL5 filter, 73% for ACL3 filter, and 87% for overall filters. This algorithm is useful for reducing TCAMs for packet classification.
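The partition-then-simplify idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the `Rule` fields, the first-match priority order, and the redundancy test (a rule is dropped when an earlier rule in the same group already covers its whole port range) are all hypothetical choices.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    # Hypothetical rule layout: exact-match header fields plus a port range.
    src: str
    dst: str
    proto: str
    port_lo: int
    port_hi: int
    action: str


def simplify(rules):
    """Drop rules shadowed by an earlier rule of the same group.

    Rules are assumed to be listed in priority order (first match wins).
    """
    covered = defaultdict(list)  # (src, dst, proto) -> kept port ranges
    kept = []
    for rule in rules:
        key = (rule.src, rule.dst, rule.proto)
        shadowed = any(lo <= rule.port_lo and rule.port_hi <= hi
                       for lo, hi in covered[key])
        if shadowed:
            continue  # an earlier rule always matches first, so this one never fires
        covered[key].append((rule.port_lo, rule.port_hi))
        kept.append(rule)
    return kept
```

Grouping by (source, destination, protocol) keeps the redundancy check local: only rules that agree on those three fields are compared against each other.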

R5-8 (Time: 15:44 - 15:46)
TitleA Low Energy Full TMR Design Method with Optimized Selection of Time/Space TMR Mode and Supply Voltage
Author*Kazuhito Ito, Yuki Hayashi (Saitama University, Japan)
Pagepp. 334 - 339
KeywordTMR, Low energy, MIP, Schedule exploration
AbstractTriple modular redundancy (TMR) executes an operation three times and obtains the correct result by taking the majority of the three outputs. While TMR is effective in eliminating soft errors in LSIs, its area overhead and energy consumption are a problem. In addition to the space TMR mode, where the three copies of an operation are actually executed, the time TMR mode is available, where only two copies of an operation are executed and the results are compared; if the results differ, the third copy is executed to get the correct result. With the time TMR mode, the penalty of energy consumption can be reduced. The drawback of time TMR is that it requires a longer time duration. Appropriately selecting the power supply voltage is also an effective technique to reduce the energy consumption. In this paper, a method to derive a TMR design is proposed which selects the TMR mode and supply voltage for each operation to minimize the energy consumption within the time and area constraints.
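The difference between the two modes described in the abstract above can be shown with a small software model. This is a hedged sketch only: the function names and the `FaultyAdder` fault injector are hypothetical, and the execution count is used here merely as a rough stand-in for energy.

```python
from collections import Counter


def space_tmr(op, *args):
    """Space TMR: always run three copies and take the majority."""
    results = [op(*args) for _ in range(3)]
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority among the three results")
    return value, 3  # (result, number of executions)


def time_tmr(op, *args):
    """Time TMR: run two copies; run the third only if they disagree."""
    first, second = op(*args), op(*args)
    if first == second:
        return first, 2
    third = op(*args)
    if third == first or third == second:
        return third, 3
    raise RuntimeError("all three results differ")


class FaultyAdder:
    """Hypothetical adder that flips the lowest result bit on chosen calls."""

    def __init__(self, faulty_calls):
        self.faulty_calls = set(faulty_calls)
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        result = a + b
        if self.calls in self.faulty_calls:
            result ^= 1  # model a single soft error
        return result
```

In the fault-free case `time_tmr` executes the operation only twice, which is where the energy saving comes from; the third execution, and the extra time it costs, is paid only when the first two results differ.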

R5-9s (Time: 15:46 - 15:48)
TitleVia-Configurable Structured ASIC Using Dual Supply Voltages
AuthorTa-Kai Lin (Yuan Ze University, Taiwan), Kuen-Wey Lin (National Chiao Tung University, Taiwan), Chang-Hao Chiu, *Rung-Bin Lin (Yuan Ze University, Taiwan)
Pagepp. 340 - 341
KeywordDual supply voltages, Structured ASIC, Level converter, Low power
AbstractThis paper presents a via-configurable logic block and a design methodology for realizing fine-grained, dual-supply-voltage structured ASICs. Our results show that, given various timing budgets, our approach achieves a reduction of up to 44% in energy per switching of our dual-supply-voltage structured ASIC at the expense of a 1.6% overhead for level converters.

R5-10 (Time: 15:48 - 15:50)
TitleAutomatic On-Chip Interface Synthesis Between Incompatible Protocols with Advanced Features
Author*Jiayi Zhang, Masahiro Fujita (University of Tokyo, Japan)
Pagepp. 342 - 347
KeywordProtocol, Conversion
AbstractA system-on-chip contains individual processing and peripheral components connected together. Hardware module reuse is a standard solution to the problem of increasing complexity of chip architectures and growing pressure to reduce time to market. In the absence of a single module interface standard, integration of pre-designed modules often requires the use of protocol converters to solve the mismatches. Mismatches occur when the exchange of control signals and/or data between components is not consistent with the intended behavior of their interaction. Complete automation of the converter synthesis process can save time and effort in both the design and verification phases and reduce the risk of human error. The ability of the converter to deal with data mismatches and clock mismatches is essential for industrial usage. In this paper, we propose a method to automatically synthesize the protocol converter between incompatible protocols. Our method is applicable to complex protocols used by industry and handles advanced features such as data width mismatch and multiple clock domains.

R5-11 (Time: 15:50 - 15:52)
TitleLow-Power Op-Amp with Capacitor-Base On-Chip Power Supply
Author*Kazuhiro Hanada, Shigetoshi Nakatake (The University of Kitakyushu, Japan)
Pagepp. 348 - 353
Keywordon-chip power supply, energy harvesting system, sensor IC, low-power analog circuit
AbstractThis paper presents a low-power analog system with a mechanism which provides a power supply via a rechargeable capacitor. The system is promising for sensor systems with an energy harvesting mechanism. We implement a capacitor-base power supply using MIM structure, and provide a case study in which a nano-watt op-amp operates in the proposed system. The simulation results show that the op-amp works for an hour by 1 µF charge to the capacitor.

R5-12 (Time: 15:52 - 15:54)
TitleA Basic-Block Level Optimistic Energy Estimation for Power-Gated VLIW Data-Path Model
Author*Shunsuke Nakamura, Ittetsu Taniguchi, Hiroyuki Tomiyama, Masahiro Fukui (Ritsumeikan University, Japan)
Pagepp. 354 - 359
KeywordEnergy estimation, VLIW data-path, Power-gating
AbstractThis paper proposes a basic-block level optimistic energy estimation for a power-gated very long instruction word (VLIW) data-path model. Power-gating (PG) brings a big benefit for leakage power reduction, but it makes instruction scheduling difficult because applying PG usually takes dozens or hundreds of consecutive NOP cycles. To estimate the energy consumption of such a power-gated VLIW data-path, an optimization of instruction scheduling is necessary. The proposed method enables fast and accurate energy estimation without time-consuming instruction scheduling. Experimental results demonstrate the effectiveness of the proposed method.

R5-13 (Time: 15:54 - 15:56)
TitleA Memory-Saving Technique for 4K Super-Resolution Circuit with Binary Tree Dictionary
Author*Ayumi Kiriyama, Ryo Matsuzuka, Kohei Michibata, Takahiro Kitayama, Yuzuru Shizuku, Tetsuya Hirose, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan)
Pagepp. 360 - 365
Keywordlearning-based super-resolution, memory-saving, hardware architecture
AbstractIn this paper, we propose a memory-saving technique for a 4K super-resolution circuit with a binary tree dictionary. In the conventional architecture, 8 super-resolution circuits work in parallel to output a 4K video signal. Each circuit needs a large dictionary. We propose a memory-saving technique by sharing the dictionary. In the proposed architecture, a binary search tree circuit consists of a ROM-read stage and a calculation stage, which enables 8 super-resolution circuits to access a single ROM in parallel. Moreover, we propose a memory compaction technique for the binary tree dictionary. All nodes of the tree are stored on the ROM without gaps. Since each node has the addresses of its child nodes on the ROM, we can trace the tree easily. Experimental results have shown that our architecture can reduce memory area by 87%.
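The memory compaction idea in the abstract above, storing every tree node contiguously with explicit child addresses, can be sketched in software as follows. This is an illustrative model only, not the authors' circuit; the node layout (a scalar threshold per internal node, a payload per leaf) and the names `pack_tree` and `lookup` are assumptions.

```python
def pack_tree(tree):
    """Flatten a nested tree into a gap-free list standing in for the ROM.

    An internal node is a tuple (threshold, left_subtree, right_subtree);
    anything else is treated as a leaf payload.
    """
    rom = []

    def emit(node):
        addr = len(rom)
        rom.append(None)  # reserve this address before emitting children
        if isinstance(node, tuple):
            threshold, left, right = node
            left_addr = emit(left)
            right_addr = emit(right)
            rom[addr] = ("node", threshold, left_addr, right_addr)
        else:
            rom[addr] = ("leaf", node)
        return addr

    emit(tree)
    return rom


def lookup(rom, x):
    """Trace the tree from address 0 by following the stored child addresses."""
    addr = 0
    while rom[addr][0] == "node":
        _, threshold, left_addr, right_addr = rom[addr]
        addr = left_addr if x < threshold else right_addr
    return rom[addr][1]
```

Because each entry carries the addresses of its children, an unbalanced tree needs no padding: the list holds exactly one entry per node, and a lookup is a chain of reads from a single memory.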

R5-14 (Time: 15:56 - 15:58)
TitleHLS Utilizing Area Optimizing Method for High-Definition MRA-TV Denoise Circuit
Author*Eita Kobayashi (NEC Corporation, Japan), Kenta Senzaki, Atsufumi Shibayama (NEC Corporation, Japan), Yuichi Nakamura (NEC Corporation, Japan)
Pagepp. 366 - 371
KeywordCircuits, Optimization, Design Methodology, High-Level Synthesis, Denoise
AbstractThis work proposes an area optimization method for high-definition image denoising at full HD image resolution. Conventional denoising techniques share a common defect: the outlines of objects are blurred as the strength of the noise reduction increases. Meanwhile, we have developed an MRA-TV algorithm that combines the wavelet transform with TV norm optimization to keep outlines clear. This method enables high-quality image denoising while maintaining clear outlines. However, there is a fundamental problem: an MRA-TV circuit with iterative TVs requires a large implementation area due to the size of the TV module. In this work, we achieve a significant improvement in that area by combining calculation reduction with resource sharing utilizing high-level synthesis. Evaluation results show a 52% area reduction while maintaining throughput or latency.

R5-15 (Time: 15:58 - 16:00)
TitleA Circuit Design Method for Dynamic Reconfigurable Circuits
Author*Hajime Sawano, Takashi Kambe (Kinki University, Japan)
Pagepp. 372 - 376
KeywordReconfigurable Computing, Design Method, DAPDNA-2, JPEG encoder
AbstractReconfigurable Computing (RC) is a new paradigm that addresses the conflicting design requirements of high performance and high area density. In Coarse Grained Architecture (CGA) RC systems, it is important to achieve acceleration using pipelining and also to achieve a high PE utilization ratio. This paper proposes an interactive circuit design methodology for Dynamically Reconfigurable Processors to accelerate their performance and achieve compact, low power circuits. The method is applied to a JPEG encoder design and its performance is evaluated.

R5-16 (Time: 16:00 - 16:02)
TitleConcurrent Verification Experience of Cache Protocol in Real Development of Large SMP Server Product by Using Model Checking
Author*Toru Shonai (Hitachi, Ltd., Japan), Shoichi Hanaki (OKANO Electric Co., Ltd, Japan), Yoshiaki Kinoshita (Hitachi, Ltd., Japan)
Pagepp. 377 - 382
Keywordmodel checking, formal verification, cache protocol, product development, high-end server
AbstractWe have verified the cache protocol by using model checking in the real development of a highly multiple-CPU server product. A formal verification engineer abstracted the models for model checking several times through the design process from the protocol specifications written in natural language by the architect team. We discovered nine actual complicated protocol bugs, acknowledged by the architects, in advance of logic simulation. Some bugs we found were too complicated to be replicated in logic simulation. This effort surely shortened the total design duration. We proved the effectiveness of formal verification of cache protocols in the early design phase of real server product development.

R5-17 (Time: 16:02 - 16:04)
TitleImplementation of Strictly Convex QP Solver with Multiple Precision Arithmetic
Author*Masahiro Kimura, Hiroshige Dan (Kansai University, Japan)
Pagepp. 383 - 386
KeywordStrictly convex QP, Multiple precision arithmetic, Solver
AbstractOptimization solvers are usually implemented with so-called double precision arithmetic because it has been defined rigorously in the IEEE 754-1985 standard and can perform high-speed floating point arithmetic. Double precision arithmetic for optimization basically works well, but it sometimes fails to solve some ill-posed problems. On the other hand, multiple precision arithmetic has attracted much attention recently. In this research, we implemented a solver for strictly convex QPs by using multiple precision arithmetic.
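The motivation in the abstract above can be illustrated with a tiny experiment. The sketch below is not the authors' solver: it assumes an unconstrained strictly convex QP, minimize (1/2) x^T Q x + c^T x, whose minimizer solves Q x = -c, picks the ill-conditioned Hilbert matrix as Q, and uses Python's standard `decimal` module at 50 digits as a stand-in for a multiple precision library. All function names are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # assumed working precision for the Decimal runs


def solve(A, b):
    """Gaussian elimination with partial pivoting for float or Decimal."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= factor * M[k][col]
    x = [None] * n
    for k in reversed(range(n)):
        acc = M[k][n]
        for col in range(k + 1, n):
            acc -= M[k][col] * x[col]
        x[k] = acc / M[k][k]
    return x


def qp_minimizer_error(num, n):
    """Largest deviation of the computed minimizer from the true one.

    Q is the n x n Hilbert matrix and c = -Q * ones, so the exact
    minimizer of (1/2) x^T Q x + c^T x is the all-ones vector.
    """
    one = num(1)
    Q = [[one / num(i + j + 1) for j in range(n)] for i in range(n)]
    c = [-sum(row, num(0)) for row in Q]
    x = solve(Q, [-ci for ci in c])  # stationarity condition: Q x = -c
    return max(abs(xi - one) for xi in x)
```

Calling `qp_minimizer_error(float, 10)` and `qp_minimizer_error(Decimal, 10)` runs the same elimination in double precision and in 50-digit arithmetic; on this ill-conditioned problem the higher precision run stays far closer to the exact minimizer.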