INESC-ID   Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa


An Ethical Crisis in Computing?

Moshe Y. Vardi


Abstract—IST Distinguished Lecture. Computer scientists often think of "Ender's Game" these days. In this award-winning 1985 science-fiction novel by Orson Scott Card, Ender is trained at Battle School, an institution designed to turn young children into military commanders against an unspecified enemy. Ender's team engages in a series of computer-simulated battles, eventually destroying the enemy's planet, only to learn afterwards that the battles were very real and a real planet has been destroyed. Many of us got involved in computing because programming was fun. The benefits of computing seemed intuitive to us. We truly believe that computing yields tremendous societal benefits; for example, the life-saving potential of driverless cars is enormous! Like Ender, however, we have recently realized that computing is not a game: it is real, and it brings with it not only societal benefits but also significant societal costs, such as labor polarization, disinformation, and smartphone addiction. The common reaction to this crisis is to label it an "ethical crisis", and the proposed response is to add courses in ethics to the academic computing curriculum. I will argue that the ethical lens is too narrow. The real issue is how to deal with technology's impact on society. Technology is driving the future, but who is doing the steering?
Date: 03-Dec-2020    Time: 17:00:00    Location:


POSTPONED - Ethics of, and Trust in, Artificial Intelligence

Mariarosaria Taddeo

University of Oxford

Abstract—I will analyse the ethical opportunities and risks that artificial intelligence (AI) brings about and how ethical analyses can help to harness the potential for good of AI and mitigate its risks.
Date: 08-May-2020    Time: 14:30:00    Location: Anfiteatro Abreu Faro, IST


CANCELED - Introduction to Compact Data Structures: A Strategy for Big Data Processing

Nieves Brisaboa

Universidad de la Coruña

Abstract—The need to deal with huge amounts of data (the big data world) has motivated the development of new data structures able to represent data, and indexes to access them, in compact space. The key idea in this field is to store and process the data in its compressed form, avoiding decompression except to show the data to final users. The advantage of this strategy is that more data can be placed in memory, avoiding time spent on disk access or data transmission. That is, this strategy takes advantage of the different speeds of the different levels of the memory hierarchy. The talk motivates this research field and presents some basic tools. Some examples of basic compressed data structures for the representation of trees and graphs will be used to explain, at a conceptual level, how those data structures work. To follow the talk, only basic knowledge of data structures and algorithm programming is necessary.
Date: 25-Mar-2020    Time: 15:00:00    Location: Anfiteatro PA2, Pavilhão de Matemática do IST
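The strategy the abstract describes, answering queries directly on the stored representation without decompressing, can be illustrated with the most basic building block of compact data structures: a bitvector with constant-time rank. This is a generic textbook sketch, not material from the talk, and all names in it are illustrative.

```python
# Illustrative sketch: a bitvector supporting rank queries via
# precomputed block counts. Rank is the primitive underlying many
# compact structures (wavelet trees, LOUDS tree encodings), and it is
# answered on the stored bits themselves -- no decompression step.

class RankBitvector:
    BLOCK = 64  # bits covered by each precomputed counter

    def __init__(self, bits):
        self.bits = bits  # sequence of 0/1
        # cumulative count of 1-bits at every block boundary
        self.block_ranks = [0]
        for i in range(0, len(bits), self.BLOCK):
            self.block_ranks.append(
                self.block_ranks[-1] + sum(bits[i:i + self.BLOCK]))

    def rank1(self, i):
        """Number of 1-bits in bits[0:i]: one table lookup plus a scan
        bounded by the block size."""
        b = i // self.BLOCK
        return self.block_ranks[b] + sum(self.bits[b * self.BLOCK:i])

bv = RankBitvector([1, 0, 1, 1, 0, 1])
print(bv.rank1(4))  # -> 3
```

A production implementation would pack the bits into machine words and use popcount instructions, but the space/time trade-off is the same: a small counter table on top of the raw bits.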


Introduction to IOTA -- a feeless cryptocurrency

IOTA Foundation

Abstract—In this talk we discuss the basics of the IOTA cryptosystem: its main principles and approach to the consensus (a.k.a. Coordicide).
Date: 14-Feb-2020    Time: 14:00:00    Location: 020


Talks on Model Driven Engineering & Artificial Intelligence Approaches

António Menezes Leitão, João Penha-Lopes

Abstract—3rd Talk/2020: IST, 24/January/2020, with Prof. António Menezes Leitão (IST) and Dr. João Penha-Lopes (Quidgest). The Department of Computer Science and Engineering of Instituto Superior Técnico and Quidgest have joined to promote a series of lunchtime meetings with guest speakers. APDSI, COTEC, CS03 from IPQ and Link to Leaders also support the initiative. A light lunch is offered to participants! Registration is free and required, limited to the seats of the auditorium. Title: The Algorithmization of Architecture. Like many other areas, architecture is going through a deep and irreversible change: the new generation of practitioners is starting to adopt algorithmic approaches in their design processes, which allows them to conceive shapes that were almost unthinkable before. Moreover, with algorithms, design and analysis processes can be automated and projects can be optimized. In this presentation we will discuss the algorithmization of architecture and present some recent developments in this area. Guest Speaker: António Menezes Leitão. At this session the article "Model Driven Automatic Code Generation: An Evolutionary Approach to Disruptive Innovation Benefits", written by João Penha-Lopes, Manuel Au-Yong-Oliveira and Ramiro Gonçalves, will also be presented. Guest Speaker: João Penha-Lopes.
Date: 24-Jan-2020    Time: 13:00:00    Location: Room 0.17 Pavilhão de Informática II, IST


Focusing the Macroscope: How We Can Use Data to Understand Behavior

Joana Gonçalves de Sá

Universidade Nova de Lisboa

Abstract—Individual decisions can have a large impact on society as a whole. This is obvious for political decisions, but it is still true for the small, daily decisions made by common citizens. Individuals decide how to vote, whether to stay at home when they feel sick, whether to drive or to take the bus. In isolation, these individual decisions have a negligible social outcome, but collectively they determine the results of an election and the start of an epidemic. For many years, studying these processes was limited to observing the outcomes or to analyzing small samples. New data sources and data analysis tools have created a "macroscope" and made it possible to start studying the behavior of large numbers of individuals, enabling the emergence of large-scale quantitative social research. At the Data Science and Policy (DS&P) research group we are interested in understanding these decision-making events, expecting that this deeper knowledge will lead to a better understanding of human nature and to improved public decisions. We have been focusing mainly on three types of problems, strongly dependent on both the behavior of individuals (in what we call bottom-up collective processes) and of decision-makers (top-down decisions). The first relates to what we usually identify as political debate and deliberation, and we have computationally analyzed the past 40 years of debates in the Portuguese Parliament. The second is disease dynamics, of both infectious and non-infectious diseases, where we try to improve nowcasting and forecasting of several diseases and to reduce antibiotic over-prescription. The third is much more fundamental: it comes from the realization that the Digital Era is offering us a giant mirror, a macroscope, that will allow us to understand human behavior at a completely new scale.
By using both social networks and the spread of fake news as case studies, we are trying to identify underlying principles, both mathematical and behavioral, that can be generalized to different contexts. In parallel, and recognizing that these tools might also have a very negative impact on society, we try to raise public awareness of these risks and involve citizens in the definition of appropriate ethical guidelines and legislation. During the talk I will briefly describe some of these past projects and offer examples of how we can use data science to study psychology and human behavior. At the end, I will present new ideas in distributed computing and how it can help us in privacy protection.
Date: 20-Dec-2019    Time: 16:00:00    Location: Room 0.19 Pavilhão informática II, IST Alameda


Hardware Engineering in ARM

Francisco Gaspar


Abstract—In this talk, Francisco Gaspar will explain what a Hardware Engineer at ARM can accomplish in this role, and will try to convey both the importance of and the expectations from its Design, Verification and Implementation sides. He will start with a brief presentation of ARM and then break down a hypothetical small CPU example, explaining how the different roles of a hardware engineer contribute to getting the design working together. Francisco will also address a few aspects of research, how it contributes to ARM products, and how universities can participate in this process.
Date: 18-Dec-2019    Time: 11:00:00    Location: 336


Cloud computing overview and Running code on Google Cloud

Wesley Chun


Abstract—Cloud computing has taken industry by storm, yet there aren't enough new college graduates who know enough about it. This session begins with a vendor-agnostic, high-level overview of cloud computing, including its three primary service levels. This is followed by an introduction to Google Cloud, its developer platforms, and which products serve which service levels. Attendees will learn how to run applications on Google Cloud serverless platforms (in Python & JavaScript; other languages are supported) as well as hear about the teaching & research grants available to faculty for use in the classroom or the lab. Google also provides a career-readiness program allowing students & faculty to qualify for an Associate Cloud Engineer certification, which, coupled with a college degree, represents a skillset that can be used on the job immediately. So whether you're a professor, researcher, edtech consultant, IT staff member, graduate TA, or lecturer, you'll know how to run code on Google's cloud and help enable the next-generation cloud-ready workforce.
Date: 09-Dec-2019    Time: 11:00:00    Location: Room EA2 - IST Alameda


Scaling Distributed Machine Learning with In-Network Aggregation

Marco Canini

KAUST: King Abdullah University of Science and Technology

Abstract—Training complex machine learning models in parallel is an increasingly important workload. We accelerate distributed parallel training by designing a communication primitive that uses a programmable switch dataplane to execute a key step of the training process. Our approach reduces the volume of exchanged data by aggregating the model updates from multiple workers in the network. We co-design the switch processing with the end-host protocols and ML frameworks to provide a robust, efficient solution that speeds up training by up to 310%, and at least by 20% in most cases for a number of real-world benchmark models.
Date: 15-Nov-2019    Time: 15:00:00    Location: 020
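The key step the abstract describes, aggregating model updates inside the network, can be sketched in a few lines. This is an illustrative model of the idea only; the real system executes it in a programmable switch dataplane on fixed-point register slots, and every name below is my own, not the paper's API.

```python
# Illustrative sketch of in-network aggregation: the switch combines
# the update vectors from N workers element-wise, so each worker
# receives a single aggregated vector instead of N-1 peer vectors,
# reducing the volume of data exchanged during training.

def switch_aggregate(worker_updates):
    """Element-wise sum of equally sized update vectors, computed slot
    by slot as a switch dataplane would."""
    n = len(worker_updates[0])
    return [sum(u[i] for u in worker_updates) for i in range(n)]

# 3 workers, each contributing a 2-parameter gradient update
updates = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
print(switch_aggregate(updates))  # one aggregated update per parameter
```

Each worker then applies the single aggregated vector to its model replica, which is why the co-design with end-host protocols matters: loss or reordering of an aggregated packet must be handled without double-counting any worker's contribution.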


Data processing methodologies in the area of E-Health for categorizing therapeutic responses in patients with migraine

Abstract—Migraine is a chronic disease that affects the daily activities of people around the world. To alleviate the symptoms, OnabotulinumtoxinA (BoNT-A) has solid evidence supporting its use, according to various works and clinical trials. Nowadays, it is known that 70-80% of patients with chronic migraine show an improvement with this treatment (improvement defined as a reduction in migraine attack frequency, or days with attacks, by at least 50% within 3 months, leading to significantly improved functioning of the patients and their overall quality of life). As mentioned in [1], it is very important to predict whether BoNT-A treatment will be effective in a patient. Knowing the phenotype-response relationship may help in the development of new treatments for the 20-30% of patients who do not respond to the treatment. This talk will describe two approaches for addressing the prediction of the therapeutic response to BoNT-A: panoramic and feedback prediction [2]. Panoramic prediction makes it possible to decide whether the treatment will be beneficial without using previous knowledge and without involving unnecessary treatments. Feedback prediction can be more accurate, since it considers the results of previous stages of the treatment. With the purpose of unveiling the medical attributes that make treatment effective for patients, consensus models are applied to the prediction models found through the proposed approaches. The following attributes have been found to be relevant when predicting the treatment response to BoNT-A: migraine time evolution, unilateral pain, analgesic abuse, headache days and the retroocular component. According to doctors, these factors are also medically relevant and in alignment with the medical literature. When training the prediction models, an attribute weighting task is considered.
It is performed with the purpose of finding weights that improve the representation of the numeric labels encoded by doctors for each stage of BoNT-A treatment. In panoramic prediction, the attribute weighting is multiobjective, because we need to find the optimal weights that improve the prediction accuracy for all stages simultaneously. In this sense, multiobjective evolutionary algorithms (MOEAs) that support parallelization have been considered for improving the training time of the predictive models [3]. The obtained results show accuracies close to 85% and 90% for the panoramic and feedback prediction approaches, respectively. Moreover, the training time of the panoramic prediction models decreases from 8 hours to less than 2 when using 8 threads.
Date: 08-Nov-2019    Time: 11:00:00    Location: 336


Reactive Boolean Networks

Daniel Figueiredo

Universidade de Aveiro

Abstract—In the context of biological regulatory networks, the connection between piecewise-linear (PWL) models and Boolean networks is well known, since Boolean networks can be seen as simplifications of PWL models. However, due to the abstraction of BN models, some asymptotic dynamics can be lost: some steady states of a PWL model can be forgotten by the corresponding BN, not being signaled by terminals. Reactive Boolean networks (RBN) are an intermediate kind of model based on the notion of switch graph, as introduced by Marcelino and Gabbay. A switch graph is a generalized graph that includes special edges granting a notion of "memory" to a usual graph. An adapted notion of terminal is obtained for RBN, and it is shown that this class of model can recover more asymptotic dynamics than Boolean networks. Finally, the concept of switch graph is generalized to include weights on edges. This produces a more general class of model, and its application to modeling biological regulatory networks is explored.
Date: 07-Nov-2019    Time: 10:00:00    Location: 336


Biomechanical Analyses of Human Movement Aimed at Improving Rehabilitation Outcomes

Richard Neptune

University of Texas at Austin

Abstract—The human neuromusculoskeletal system is exceedingly complex due to highly nonlinear multi-body dynamics and musculotendon actuators, and redundant muscle control that isn’t well understood. As a result, gaining insight into normal and pathological movement remains a challenge due to the extremely difficult task of identifying causal relationships between muscle force development and resulting movement dynamics. This talk will discuss how experimental and modeling and simulation techniques are being used to gain insight into the biomechanics and neuromotor control of human movement with the goal to improve rehabilitation outcomes for those with movement disabilities. Specifically, we will look at how biomechanical analyses of specific movement tasks can give insight into how individual muscles contribute to specific biomechanical functions such as providing body support, forward propulsion and balance control and how clinical interventions can help or hinder the performance of these functions.
Date: 28-Oct-2019    Time: 16:00:00    Location: Sala AM (Anfiteatro de Mecânica) no Ed. de Mecânica II


Comprehending Energy Behaviors of Java I/O APIs

Gustavo Pinto

Federal University of Pará, Brazil.

Abstract—APIs that implement I/O operations are the building blocks of many well-known, non-trivial software systems. These APIs are used for a great variety of programming tasks, from simple file management operations to database communications and implementations of network protocols. Aims: Despite their ubiquity, there are few studies that focus on comprehending their energy behaviors in order to aid developers interested in building energy-conscious software systems. The goal of this work is two-fold. We first aim to characterize the landscape of the Java I/O programming APIs. After better comprehending their energy variations, our second goal is to refactor software systems that use energy-inefficient I/O APIs to their efficient counterparts. Method: To achieve the first goal, we instrumented 22 Java micro-benchmarks that perform I/O operations. To achieve our second goal, we extensively experimented with three benchmarks already optimized for performance and five macro-benchmarks widely used in both software development practice and in software engineering optimization research. Results: Among the results, we found that the energy behavior of Java I/O APIs is diverse. In particular, popular I/O APIs are not always the most energy-efficient ones. Moreover, we were able to create 22 refactored versions of the studied benchmarks, eight of which were more energy efficient than the original version. The (statistically significant) energy savings of the refactored versions varied from 0.87% up to 17.19%. More importantly, these energy savings stem from very simple refactorings (often touching fewer than five lines of code). Conclusions: Our work indicates that there is ample room for studies targeting energy optimization of Java I/O APIs. Preprint: (to appear at ESEM'2019).
Date: 19-Aug-2019    Time: 11:00:00    Location: 336


Seminar - Personhood Online: Identification-Free Identities and Digital Citizenship via Proof-of-Presence

Bryan Ford

EPFL - École Polytechnique Fédérale de Lausanne

Abstract—Numerous fundamental weaknesses of our online ecosystem derive from its inability to distinguish securely between real people and fake identities. Social botnets, fake news, anonymous trolling, astroturfing, deep fakes, sock puppetry, online poll and ballot stuffing, annoying CAPTCHAs, and environmentally disastrous proof-of-work mining are all symptoms of the Internet’s lack of Sybil attack protection. Conventional “strong identity” solutions such as KYC, biometrics, and trust networks each have severe security, privacy, and usability weaknesses. The AI-powered arms race between better detection and better fakery leads us only towards real people becoming less “convincingly real” online than fakes, rendering real people increasingly-powerless bystanders in a bot-versus-bot world. These lessons point towards a single conclusion: to keep technology accountable to real people, we must stop seeking a magic-bullet pure technology solution to Sybil attacks, as there probably isn’t one. Because securely recognizing real people must actually involve real people, we propose a “back to basics” approach to Sybil protection founded on physical security. We demand only that each real person have a real, physical body with which to attend occasional offline events in person. Via physical security and transparency processes, each real person obtains one and only one cryptographic proof-of-presence or “attendance badge” per event. Proof-of-presence ceremonies may be run at minimal cost by groups of people anywhere, coincident with other in-person events organized anyway such as meetings, conferences, town halls, concerts, political protests, etc. 
The convenience cost of these physical events is amortized by numerous applications and potential rewards from proof-of-presence tokens, including: trolling-resistant social networks and newsfeeds; reputation systems and “likes” that count only real people; accountably-anonymous “verified” online identities for browsing and website login; secure single-use promotional coupons from local or online businesses; privacy-preserving credentials for abuse-resistant online forums, polls, and democratic deliberation; smart contract systems that understand the notion of “person” and can implement “one-per-person” accounts, airdrops, and other benefits; and cryptocurrencies that provide a permissionless form of universal basic income.
Date: 12-Jul-2019    Time: 11:00:00    Location: 336


Gielis Transformations in design and engineering

Johan Gielis

University of Antwerp, Bio-Engineering Sciences

Abstract—Gielis Transformations (a.k.a. the Superformula) are a generalization of the Pythagorean Theorem. They define not only the circle, as Pythagoras does, but also polygons, starfish, shells, flowers, superquadrics and more; an infinite number of 2D and 3D shapes can be described with only a few variables and numbers. GT have been used in over 100 widely different applications in science, education and technology. In the field of design and engineering they have been used, among others, for the optimization of wind turbine blades, antennas, nanoparticles and lasers. In this presentation the following advantages and methods will be highlighted: easy modeling with the computer as a creative sparring partner; extremely compact (and uniform) representations, also of complex shapes; procedural modeling; multi-objective and topology optimization; and the solution of boundary value problems and mesh-free modeling.
Date: 04-Jul-2019    Time: 14:00:00    Location: INESC-ID Lisboa Room 0.20
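For orientation, the transformation the abstract refers to is the superformula published by Gielis (2003), a deformation of the circle in polar coordinates (stated here for context, not taken from the talk):

```latex
r(\theta) = \left( \left| \frac{\cos(m\theta/4)}{a} \right|^{n_2}
          + \left| \frac{\sin(m\theta/4)}{b} \right|^{n_3} \right)^{-1/n_1}
```

With $a = b$, $m = 4$ and $n_1 = n_2 = n_3 = 2$ this reduces to the circle (the Pythagorean case); $m$ sets the rotational symmetry, while the exponents $n_1, n_2, n_3$ control the pinching and pointedness that yield polygons, starfish and flower-like outlines.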


"Photovoltaic Optiverter – A Novel Hybrid MLPE Technology for Residential and Small Commercial Photovoltaic Applications" by Dmitri Vinnikov

Dmitri Vinnikov

Abstract—In this presentation a novel approach to photovoltaic (PV) module-level power electronics called the OPTIVERTER (PVOPT) will be introduced and discussed. Functionally, it is a hybrid of a PV power optimizer and a PV microinverter, with such key features as shade-tolerant (or global) maximum power point tracking (MPPT), galvanic isolation, direct AC connectivity, PV module-level monitoring and safety cut-off, and flexibility of installation and PV power system sizing. Explanations will be provided from both the hardware and software points of view, with the main focus on the realization of the shade-tolerant MPPT, which is a new feature of PV microinverters. The experimental results and main challenges of the proposed approach will be analyzed and discussed.
Date: 19-Jun-2019    Time: 10:30:00    Location: IST, anf. EA2, Torre Norte


Algorithm/Architecture Co-design for Smart Signals and Systems in Cognitive Cloud/Edge

Gwo Giun Chris Lee

National Cheng Kung University (NCKU)

Abstract—Niklaus Emil Wirth introduced the innovative idea that Programming = Algorithm + Data Structure. Inspired by this, we advance the concept to the next level by stating that Design = Algorithm + Architecture. With concurrent exploration of algorithm and architecture, entitled Algorithm/Architecture Co-exploration (AAC), this methodology introduces a leading paradigm shift in advanced system design from System-on-a-Chip to Cloud and Edge. As algorithms with high accuracy become exceedingly more complex and Edge/IoT-generated data becomes increasingly bigger, flexible parallel/reconfigurable processing is crucial in the design of efficient signal processing systems with low power. Hence the analysis of algorithms and data for potential parallel computing, efficient data storage and data transfer is essential. With the extension of AAC from SoC system designs to even more versatile platforms based on analytics architecture, the system scope is readily extensible to cognitive cloud and reconfigurable edge computing for multimedia, a cross-level-of-abstraction topic which will be introduced in this tutorial together with case studies.
Date: 20-May-2019    Time: 15:30:00    Location: IST Anfiteatro Abreu Faro


Conflict-free Replicated Data Types: An Overview

Nuno Manuel Ribeiro Preguiça

Universidade Nova de Lisboa

Abstract—Internet-scale distributed systems often replicate data at multiple geographic locations to provide low latency and high availability, despite node and network failures. Geo-replicated systems that adopt a weak consistency model allow replicas to temporarily diverge, requiring a mechanism for merging concurrent updates into a common state. Conflict-free Replicated Data Types (CRDTs) provide a principled approach to address this problem. This talk will provide an overview of CRDT research and practice, addressing separately the aspects relevant to the application developer, the system developer and the CRDT developer.
Date: 16-May-2019    Time: 14:30:00    Location: 336
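The merge mechanism the abstract refers to can be illustrated with the simplest CRDT, a grow-only counter. This is the standard textbook construction, not code from the talk: each replica updates only its own slot, and merging takes the element-wise maximum, so replicas that diverged under concurrent updates always converge to the same state.

```python
# Illustrative G-Counter CRDT sketch: state is one counter slot per
# replica; the merge (element-wise max) is commutative, associative
# and idempotent, so replicas can exchange states in any order and
# still converge.

class GCounter:
    def __init__(self, replica_id, n_replicas):
        self.id = replica_id
        self.counts = [0] * n_replicas

    def increment(self):
        self.counts[self.id] += 1  # a replica only grows its own slot

    def value(self):
        return sum(self.counts)   # total across all replicas

    def merge(self, other):
        """Join two states: element-wise maximum."""
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()   # two updates at replica 0
b.increment()                  # concurrent update at replica 1
a.merge(b); b.merge(a)         # anti-entropy exchange
print(a.value(), b.value())    # both converge to 3
```

Richer types (sets, maps, sequences) follow the same recipe: pick per-replica state whose merge is a join in a lattice, so convergence holds regardless of message order or duplication.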


BADA - Big Automotive Data Analytics

Ahmad Al-Shishtawy

RISE Research Institutes of Sweden

Abstract—The BADA project fuses big data analytics with the automotive industry in Sweden. It is a collaboration between RISE SICS, Volvo Cars, Volvo Trucks, Scania, and the Swedish Transport Agency. The goal of the project is to help the Swedish automotive industry adopt Big Data technologies (Big Data platforms and tools) for data-driven analytics and machine learning. We investigate how big data analytics platforms and machine learning algorithms can impact the transport and automotive industries. The approach is centred on a number of industry-driven use cases and the development of prototype systems. The project ended in 2018. In this talk, I give an overview of the project, highlighting some of the results and lessons learned.
Date: 28-Feb-2019    Time: 15:00:00    Location: 336


InfraComposer: Policy-driven Adaptive and Reflective Middleware for the cloudification of Simulation and Optimization Workflows

Bert Lagaisse

Imec-DistriNet Research Group, KU Leuven

Abstract—We present motivating scenarios as well as an architecture for adaptive and reflective middleware that supports smart cloud-based deployment and execution of engineering workflows. This middleware supports deep inspection of the workflow task structure and execution, as well as of the very specific mathematical tools, their executions and the parameters used. The reflective capabilities are based on multiple meta-models that reflect workflow structure, deployment, execution and resources. Adaptive deployment is driven both by human input, in the form of meta-data annotations, and by the actual execution history of the workflows.
Date: 28-Feb-2019    Time: 14:30:00    Location: 336


Sketching as an engine of research and development thinking

Anna Lobovikov-Katz


Abstract—Sketching is a natural way to communicate ideas quickly: with only a few pencil strokes, complex shapes can be evoked in viewers. Freehand sketching is well known for its contribution to the development of spatial-visual ability, which is a must for success in STEM (Science, Technology, Engineering and Mathematics). Furthermore, sketching is a natural tool for formulating ideas in diverse areas, from engineering to abstract theoretical matters. The inter- and multidisciplinarity of the modern research and development reality turns sketching into a universal tool for the exchange of ideas between experts from different areas. However, the traditional methods for developing practical sketching ability are time-consuming, and the application of sketching is limited mostly to design-related areas. This puts sketching out of reach of researchers, developers, and engineers. I will cover its applications to ideation, spatial reasoning, and design thinking. The lecture addresses the subject by shifting the main focus from the result to the process of sketching [2]. It introduces a methodology for the rapid development of practical sketching ability, and exemplifies its "thinking" uses through live sketching [3].
[1]
[2]
[3] With the help of the Rapid Learning Methodology in Freehand Sketching (RaLeMeFS), developed by the author
[4] Olsen, L., Samavati, F., Sousa, M. C., Jorge, J., 2009. Sketch-based modeling: A survey. Computers & Graphics. 33, 85-103
Date: 04-Feb-2019    Time: 11:30:00    Location: 020


Learning with Sociable Robots and Artifacts

Sandra Y. Okita

Teachers College, Columbia University

Abstract—People often turn to others to improve their own learning. Technological artifacts (e.g. humanoid robots, pedagogical agents/avatars) often have human-like qualities in appearance, behavior, and intelligence. These features often elicit a social response from humans, providing distinctive ways to examine human-artifact interactions. Virtual humans and humanoid robots create unique situations that have interesting implications for peer learning and social behavior. The talk explores possible ways to capitalize on the strong social components of technology that enable students to develop peer-learning relationships (e.g., recursive feedback during learning-by-teaching, self-other monitoring). I will introduce some ongoing research that uses technological artifacts (robots, pedagogical agents/avatars) as a threshold to learning, instruction, and assessment in formal (e.g., classrooms) and informal learning environments (e.g., online learning environments). Technological artifacts also present an array of interesting design choices (e.g., customization, creating look-alikes, adopting personas) when modeling interactions with human learners, and identifying cause-and-effect relationships enables us to design interventions more effectively. I will introduce work in this area that explores how facial similarity with peer avatars may influence human learning, and how robotic features combined with specific scripts and scenarios support engagement and behavior.
Date: 31-Jan-2019    Time: 15:00:00    Location: IST Taguspark - Room 1.38


Atomic Transaction Commit for Modern Data Stores

Alexey Gotsman

IMDEA Software Institute in Madrid

Abstract—Modern data stores often need to provide both high scalability and strong transactional semantics. They achieve scalability by partitioning data into shards and fault-tolerance by replicating each shard across several servers. A key component of such systems is the protocol for atomically committing a transaction spanning multiple shards, which is usually integrated with concurrency control. Unfortunately, the classical theory of atomic commit is too restrictive to capture the complexities of such protocols. I will present a new problem statement for atomic commit that more faithfully reflects modern requirements and will describe solutions to this problem in different classes of data stores, including those for geo-replication and those that exploit Remote Direct Memory Access (RDMA).
Date: 15-Jan-2019    Time: 10:00:00    Location: 336


Tezos, a blockchain by scientists: principles and applications

Diego Pons

Abstract—Tezos is a blockchain built by computer scientists, most of them affiliated with INRIA. As a result, Tezos is written in OCaml and advocates solid algorithms, domain-specific languages for smart contracts, and formal verification. We will expose some of the principles that guided the creation of Tezos, explain the challenges ahead and discuss potential applications.
Date: 14-Jan-2019    Time: 15:00:00    Location: 020


TerraHidro: Distributed Hydrological Modelling

Sérgio Rosim

Instituto Nacional de Pesquisas Espaciais, Brasil

Abstract—A presentation of the geotechnologies, research, and systems development aimed at applications with spatial data at the Image Processing Division of the Instituto Nacional de Pesquisas Espaciais (INPE), Brazil. The TerraHidro system, dedicated to distributed hydrological modelling, will be presented in detail. Initial ideas on the use of 3D visualization and editing, and on human-machine interfaces for geographic information systems, will also be shown.
Date: 14-Dec-2018    Time: 11:00:00    Location: 020



Francisco Miranda


Abstract—Today's children and young people are increasingly disengaging from their school paths. However, one approach has been dramatically effective in motivating all young people to pursue, in a dedicated and persistent way, personal (though seemingly virtual) self-improvement goals: games. What is the mystery behind games? Our answer is that games are nothing more than ideal spaces for learning. Games allow young people to control their own course and their own pace, and they allow collaboration with other players regardless of their capacity or provenance. Moreover, players are not afraid to fail and feel safe. And when they fail, they try again until they have overcome all obstacles. They are always challenged in the right way, according to the level they have reached. To summarize in one sentence: players remain optimistic about the final result, always believing that they are capable. Our mission is to transform schools through games: SPOT GAMES. SPOT GAMES are tools that change the mechanics of learning, making it a fun, magnetic, collaborative and inclusive process. With the implementation of our games, schools can bring parents and teachers together and engage them as co-players around students' progress goals, while also bringing together the school's network of local partners (municipalities and local projects), where everyone has an active role in the games. We also want teachers to have a system of results and impact analysis that is integrated into the games' back office and acts as a bureaucracy facilitator. Finally, our goal is also to expand the learning spaces beyond the time and space of the classroom, integrating the natural and cultural heritage of the school's surroundings into the physical course and the contents of the games.
Date: 26-Nov-2018    Time: 11:00:00    Location: IST Taguspark - Room 2.10


Scalability and Efficiency in Graph Mining

Wagner Meira Jr

Universidade Federal de Minas Gerais

Abstract—Despite significant research, graph mining remains a challenging task, due to characteristics such as its computational complexity and the large spectrum of models that may be mined. In this talk we discuss some of these challenges and focus on strategies targeted at two significant issues in relevant scenarios, such as social networks and bioinformatics. The first issue is scalability and we present some strategies not only for creating computationally scalable solutions, but also for developing them more easily. The second issue is the efficiency of the mining process, and we present new graph mining models as well as robust sampling strategies for them. We conclude by summarizing the lessons learned and presenting current trends.
Date: 20-Nov-2018    Time: 10:00:00    Location: 336


Firewall Configuration: Rule Sets and Usability

Leonardo A. Martucci

Abstract—In this presentation we report on our work in measuring the usability of rule sets in terms of how easy it is for system administrators to understand and manage them. We begin with a description of the problem and a brief introduction to our past and current work on measuring the usability of access control and firewall rules sets.
Date: 14-Nov-2018    Time: 10:00:00    Location: 336


Socially Competent Robot Navigation

Chris Mavrogiannis

Cornell University

Abstract—Crowded human environments such as pedestrian scenes constitute challenging domains for mobile robots, for a variety of reasons including the heterogeneity of pedestrians’ decision-making mechanisms, the lack of universal formal rules regulating traffic, the lack of channels of explicit communication with them, robot hardware limitations, etc. State-of-the-art approaches for planning socially compliant robot motion tend to hard-code social norms or directly imitate observed human behaviors, thus exhibiting poor generalization to different environments and contexts, and often lack a thorough, in-depth validation, raising questions about reproducibility. To address these gaps, we have proposed a family of planning algorithms that employ mathematical abstractions from topology and physics to model multi-agent dynamics and follow design principles extracted from studies on human behavior. Our algorithms leverage the collaborative nature of human navigation as well as the human mechanisms of nonverbal communication and teleological inference to generate consistently intent-expressive and socially compliant robot motion in the context of a dynamic, multi-agent environment. Evidence extracted from an online user study suggests that humans perceive the motion generated by our framework as more legible compared to the motion generated by two widely employed baselines. This is in parallel to our findings from extensive lab experiments suggesting that human motion is smoother when navigating around a robot running our algorithm. In this talk, I will present the mathematical and computational foundations of our approach, summarize our key findings and discuss potential directions for future work.
Date: 13-Nov-2018    Time: 11:00:00    Location: Room 1.38 - IST Taguspark


From Runtime Failures to Patches: Study of Patch Generation in Production

Thomas Durieux


Abstract—Patch creation is one of the most important actions in the life cycle of an application. Creating patches is a time-consuming task, not only because it is difficult to create a sound and valid patch, but also because it requires human intervention. Indeed, a user must report the bug, and a developer must reproduce it and fix it, which takes considerable time. To address this problem, techniques that automate this task have been created. However, those techniques still require a developer to reproduce the bug and encode it as a failing test case. This requirement drastically reduces the applicability of the approaches, since they still rely on humans. During this talk, we will explore two techniques that remove human intervention from automatic patch generation by moving patch generation as close as possible to the production environment, where the data and human interactions that lead to the bug are present. The presented techniques exploit the production data state to detect bugs, and to generate and validate patches.
Date: 06-Nov-2018    Time: 14:00:00    Location: 336


Personalizing Personal Robots

Dan Grollman

Misty Robotics

Abstract—In this talk I'll present some ideas on how to make your robot truly your own. Beyond sensing, thinking, and acting, robots need to feel and express their own unique interpretation of the world around them, and adapt themselves to you and your environment. Their coherent personality must persist across an array of diverse, possibly 3rd-party developed capabilities as well.
Date: 06-Nov-2018    Time: 11:00:00    Location: Room 1.38 (Taguspark).


Timely, Reliable, and Cost-Effective Internet Transport Service using Structured Overlay Networks

Yair Amir

Johns Hopkins University

Abstract—Emerging applications such as remote manipulation and remote robotic surgery require communication that is both timely and reliable, but the Internet natively supports only communication that is either completely reliable with no timeliness guarantees (e.g. TCP) or timely with only best-effort reliability (e.g. UDP). We present an overlay transport service that can provide highly reliable communication while meeting stringent timeliness guarantees (e.g. 130ms round-trip latency across the US) over the Internet. To enable routing schemes that can support the necessary timeliness and reliability, we introduce dissemination graphs, providing a unified framework for specifying routing schemes ranging from a single path, to multiple disjoint paths, to arbitrary graphs. Based on an extensive analysis of real-world network data, we develop a timely dissemination-graph-based routing method that can add targeted redundancy in problematic areas of the network. We show that this approach can cover close to 99% of the performance gap between a traditional single-path approach and an optimal (but prohibitively expensive) scheme.
Date: 17-Oct-2018    Time: 17:00:00    Location: 336


Artificial sociality- modelling the social mind

Gert Jan Hofstede

Wageningen Universiteit

Abstract—Gert Jan will discuss ‘artificial sociality’, the subject for which he was recently appointed professor. It is about foundational conceptual models of human sociality based on social science, for use in agent-based models of complex systems in the life sciences. Artificial sociality involves two main components: designing individual minds, including emotions and underlying social motives, and modelling self-organization of behaviour at the level of the social system. A special focus rests on safety and resilience in socio-‘something’ systems in which human components are replaced by technical ones.
Date: 15-Oct-2018    Time: 11:30:00    Location: IST Taguspark - Room 1.38


HOOVER: Distributed, Flexible, and Scalable Streaming Graph Processing on OpenSHMEM

Max Grossman


Abstract—Many problems can benefit from being phrased as a graph processing or graph analytics problem: infectious disease modeling, insider threat detection, fraud prevention, social network analysis, and more. These problems all share a common property: the relationships between entities in these systems are crucial to understanding the overall behavior of the systems themselves. However, relations are rarely if ever static. As our ability to collect information on those relations improves (e.g. on financial transactions in fraud prevention), the value added by large-scale, high-performance, dynamic/streaming (rather than static) graph analysis becomes significant.

This talk introduces HOOVER, a distributed software framework for large-scale, dynamic graph modeling and analysis. HOOVER sits on top of OpenSHMEM, a PGAS programming system, and enables users to plug in application-specific logic while handling all runtime coordination of computation and communication. HOOVER has demonstrated scaling out to 24,576 cores, and is flexible enough to support a wide range of graph-based applications, including infectious disease modeling and anomaly detection.
Date: 15-Oct-2018    Time: 11:00:00    Location: 336


On the Self in Selfie

Christoph Kirsch

University of Salzburg

Abstract—Selfie is a self-contained 64-bit, 10-KLOC implementation of (1) a self-compiling compiler written in a tiny subset of C called C* targeting a tiny subset of 64-bit RISC-V called RISC-U, (2) a self-executing RISC-U emulator, (3) a self-hosting hypervisor that virtualizes the emulated RISC-U machine, and (4) a prototypical symbolic execution engine that executes RISC-U code symbolically. Selfie can compile, execute, and virtualize itself any number of times in a single invocation of the system, given adequate resources. There is also a simple linker, disassembler, debugger, and profiler. C* supports only two data types, uint64_t and uint64_t*, and RISC-U features just 14 instructions, in particular for unsigned arithmetic only, which significantly simplifies reasoning about correctness. Selfie was originally developed just for educational purposes but has by now become a research platform as well. In this talk, we show how selfie leverages the synergy of integrating compiler, target machine, and hypervisor in one self-referential package while orthogonalizing bootstrapping, virtual and heap memory management, emulated and virtualized concurrency, and even replay debugging and symbolic execution. This is joint work with A. Abyaneh, M. Aigner, S. Arming, C. Barthel, S. Bauer, T. Hütter, A. Kollert, M. Lippautz, C. Mayer, P. Mayer, C. Moesl, S. Oblasser, C. Poncelet, S. Seidl, A. Sokolova, and M. Widmoser.
Date: 12-Oct-2018    Time: 10:30:00    Location: 336


The Future of Cyber-autonomy

David Brumley

Carnegie Mellon University

Abstract—My vision is to automatically check and defend the world's software from exploitable bugs. In order to achieve this vision, I am building technology, called Mayhem, that shifts the attack/defend game away from the current manual approaches for finding and fixing software security vulnerabilities towards fully autonomous cyber reasoning systems.
Date: 10-Oct-2018    Time: 13:30:00    Location: Anfiteatro FA1 (Pav. Informática)


Improved Maximum Likelihood Decoding using sparse Parity-Check Matrices

Tobias Dietz

Technische Universität Kaiserslautern

Abstract—Maximum-likelihood decoding is an important and powerful tool in communications for obtaining the optimal performance of a channel code. Unfortunately, simulating the maximum-likelihood performance of a code is a hard problem whose complexity grows exponentially with the blocklength of the code. To reduce this cost, we minimize the number of ones in the underlying parity-check matrix, formulating the minimization as an integer program and giving a heuristic algorithm to solve it. Using these minimized matrices, we significantly reduce the runtime of several ML decoders for several codes, resulting in speedups of up to 81% compared to the original matrices.
Date: 10-Oct-2018    Time: 11:30:00    Location: 336
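As a concrete illustration of why simulating maximum-likelihood performance is exponential in the code size, here is a minimal brute-force ML decoder for a small (7,4) Hamming code (a sketch for intuition only; the generator matrix and the decoder are my assumptions, not material from the talk, which concerns ILP-minimized parity-check matrices):

```python
from itertools import product

# Generator matrix of a systematic (7,4) Hamming code.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(msg):
    """Multiply a 4-bit message by G over GF(2)."""
    return [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]

# The full codebook has 2^k entries, so ML decoding below is O(2^k).
CODEBOOK = [encode(m) for m in product([0, 1], repeat=4)]

def ml_decode(received):
    """Exhaustive ML decoding on a binary symmetric channel:
    return the codeword at minimum Hamming distance."""
    return min(CODEBOOK, key=lambda c: sum(a != b for a, b in zip(c, received)))
```

With minimum distance 3, any single bit flip is corrected; each extra information bit doubles the codebook, which is the exponential blow-up the abstract refers to.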


Efficient paths in ordinal weighted graphs

Luca Schafer

Technische Universität Kaiserslautern

Abstract—We investigate the single-source-single-destination "shortest" paths problem in acyclic graphs with ordinal weighted arc costs. We define the concepts of ordinal dominance and efficiency for paths and their associated ordinal levels, respectively. Further, we show that the number of ordinally non-dominated path vectors from the source node to every other node in the graph is polynomially bounded, and we propose a polynomial-time labeling algorithm for finding the set of ordinally non-dominated path vectors from source to sink.
Date: 10-Oct-2018    Time: 10:00:00    Location: 336
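One way to make ordinal dominance concrete (the comparison rule below is my own assumption for illustration; the talk's exact definition may differ) is to encode ordinal arc costs as integer levels, sort each path's cost vector worst-first, and compare componentwise:

```python
def dominates(costs_a, costs_b):
    """Hypothetical ordinal dominance check: arc costs are ordinal
    levels (0 = best). Path a dominates path b of equal length if,
    after sorting both cost vectors worst-first, a is no worse in
    every position and strictly better in at least one."""
    a = sorted(costs_a, reverse=True)
    b = sorted(costs_b, reverse=True)
    if len(a) != len(b):
        return False  # this sketch only compares equal-length paths
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

A labeling algorithm would keep, at each node, only those cost vectors not dominated by any other, a set the abstract states is polynomially bounded.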


Crypto-hardware design for secure applications

Erica Tena-Sánchez, F. E. Potestad-Ordóñez

University of Seville

Abstract—Any electronic device considered 'secure', and in fact any electronic device handling relevant information, makes use of cryptographic services to ensure the confidentiality, authentication and integrity of the processed data. These cryptographic engines implement mathematically secure algorithms; however, due to leakages from their physical implementations, they can reveal sensitive information during computation. This talk will present a brief overview of the design and evaluation of hardware countermeasures against side-channel attacks and fault-injection attacks that can be deployed on these devices.
Date: 09-Oct-2018    Time: 11:00:00    Location: 336


Interactive Systems based on Electrical Muscle Stimulation

Pedro Lopes

University of Chicago

Abstract—How can interactive devices connect with users in the most immediate and intimate way? This question has driven interactive computing for decades. If we think back to the early days of computing, user and device were quite distant, often located in separate rooms. Then, in the ’70s, personal computers “moved in” with users. In the ’90s, mobile devices moved computing into users’ pockets. More recently, wearables brought computing into constant physical contact with the user’s skin. These transitions proved to be useful: moving closer to users and spending more time with them allowed devices to perceive more of the user, allowing devices to feel more personal. The main question that drives my research is: what is the next logical step? How can computing devices become even more personal? Some researchers argue that the next generation of interactive devices will move past the user’s skin, and be directly implanted inside the user’s body. This has already happened in that we have pacemakers, insulin pumps, etc. However, I argue that what we see is not devices moving towards the inside of the user’s body, but towards the “interface” of the user’s body that they need to address in order to perform their function. This idea holds the key to more immediate and personal communication between device and user. The question is: how can we increase this immediacy? My approach is to create devices that intentionally borrow parts of the user’s body for input and output, rather than adding more technology to the body. I call this concept “devices that overlap with the user’s body”. I’ll demonstrate my work in which I explored one specific flavor of such devices, i.e., devices that borrow the user’s muscles. In my research, I create computing devices that interact with the user by reading and controlling muscle activity.
My devices are based on medical-grade signal generators and electrodes attached to the user’s skin that send electrical impulses to the user’s muscles; these impulses then cause the user’s muscles to contract. While electrical muscle stimulation (EMS) devices have been used to regenerate lost motor functions in rehabilitation medicine since the ’60s, during my PhD I explored EMS as a means for creating interactive systems. My devices form two main categories: (1) Devices that allow users eyes-free access to information by means of their proprioceptive sense, such as a variable, a tool, or a plot. (2) Devices that increase immersion in virtual reality by simulating large forces, such as wind, physical impact, or walls and heavy objects.
Date: 26-Sep-2018    Time: 13:30:00    Location: Tagus Park (room TBD), and Alameda (room 0.19 by VC)


State-of-the-Art FinFET Technology: An Industry Designer’s Perspective

Gonçalo Nogueira

Socionext, Inc

Abstract—Size scaling of CMOS transistors has been happening for the past 30 years, with technologies like FinFET or FD-SOI being used recently to make up for limitations found in Bulk technology. With TSMC releasing 5nm FinFET in 2019 (with gate lengths of the order of dozens of atoms wide), design and layout are changing significantly from what is seen in older technologies. This seminar addresses the topic of FinFET from an industry designer’s perspective, with the following content: an introduction to FinFET, design and layout with FinFETs, advantages and challenges, and lastly, the expected future of solid state circuits.
Date: 19-Sep-2018    Time: 17:00:00    Location: Room EA4 North Tower, Alameda


How Acting Through Autonomous Machines Changes People’s Decision Making

Celso Melo

US Army Research Lab

Abstract—Recent times have seen the emergence of a new breed of intelligent machines that act autonomously on our behalf, such as autonomous vehicles. Despite promises of increased efficiency, it is not clear whether this paradigm shift will change how we decide when our self-interest (e.g., comfort) is pitted against the collective interest (e.g., environment). In this talk, I show that acting through machines changes the way people solve these social dilemmas, and I will present experimental evidence showing that participants program their autonomous vehicles to act more cooperatively than if they were driving themselves. We further show this happens because programming vehicles to act autonomously causes short-term rewards to become less salient, which leads participants to consider broader societal interests and behave more cooperatively. Our findings also indicate this effect generalizes beyond the domain of autonomous vehicles. We discuss implications for designing autonomous machines that contribute to a more cooperative society.
Date: 05-Sep-2018    Time: 14:00:00    Location: Room 1.38 - IST Taguspark


IT Governance in the Board Room

Steven de Haes

University of Antwerp

Abstract—Disruptive new technologies are increasing and have an important influence on the business we are doing. Previously, the board could delegate, ignore or avoid it, but that is no longer the case. Yet, it seems that 80% of boards of directors are still looking away. Digital transformation seems to be ‘the elephant in the boardroom’. In this session, we will address that boards need to extend their governance accountability, from often a mono-focus on finance and legal as proxy to corporate governance, to include technology and provide digital leadership and organizational capabilities to ensure that the enterprise’s IT sustains and extends the enterprise’s strategies and objectives.
Date: 16-Jul-2018    Time: 18:00:00    Location: AM Amphitheater, Alameda Campus, IST


A Compiler-based Approach to Mitigate Fault Attacks Using SIMD Instructions

Alexander V. Veidenbaum

University of California at Irvine

Abstract—Today's general-purpose microprocessors support vector (SIMD) instructions. This creates opportunities for developing a new compilation approach to mitigate the impact of faults on cryptographic implementations, which is the subject of this work. A compiler-based approach is proposed to automatically and selectively apply vectorization in a cryptographic library. This transforms a standard software library into a library with vectorized code that is resistant to glitches. Unlike traditional vectorization for performance, the proposed compilation flow uses the multiple vector lanes to introduce data redundancy in cryptographic computations. The approach has a low overhead in both code size and execution time. Experimental results show that the proposed approach generates an average of only 26% more dynamic instructions over a series of asymmetric cryptographic algorithms in the Libgcrypt library, and that only 0.36% of injected faults go undetected.
Date: 28-Jun-2018    Time: 11:00:00    Location: 336
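The lane-redundancy idea can be emulated in plain Python (a conceptual sketch under my own assumptions of two lanes and a compare-before-release check; it is not the authors' compiler transformation, which operates on real SIMD registers):

```python
def redundant_op(op, x, fault=None):
    """Run op on two copies of x ("lanes") and compare the results.
    A transient fault that corrupts one lane makes the lanes disagree,
    so the error is detected before the result is released."""
    lanes = [x, x]                      # duplicate the operand across lanes
    if fault is not None:               # optional fault injection for testing
        lane, delta = fault
        lanes[lane] += delta
    results = [op(v) for v in lanes]    # same operation, once per lane
    if results[0] != results[1]:
        raise RuntimeError("fault detected: lanes disagree")
    return results[0]
```

In the real compiler flow the duplication is free parallelism inside a vector register, which is why the dynamic-instruction overhead stays low.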


Bridging the design and implementation of distributed systems with program analysis.

Ivan Beschastnikh

University of British Columbia

Abstract—Much of today's software runs in a distributed context: mobile apps communicate with the cloud, web apps interface with complex distributed backends, and cloud-based systems use geo-distribution and replication for performance, scalability, and fault tolerance. However, distributed systems that power most of today's infrastructure pose unique challenges for software developers. For example, reasoning about concurrent activities of system nodes and even understanding the system’s communication topology can be difficult. In this talk I will overview three program analysis techniques developed in my group that address these challenges. First, I will present Dinv, a dynamic analysis technique for inferring likely distributed state properties of distributed systems. By relating state across nodes in the system Dinv infers properties that help reason about system correctness. Second, I will review Dara, a model checker for distributed systems that introduces new techniques to cope with state explosion by combining traditional abstract model checking with dynamic model inference techniques. Finally, I will discuss PGo, a compiler that compiles formal specifications written in PlusCal/TLA+ into runnable distributed system implementations in the Go language. All three projects employ program analysis in the context of distributed systems and aim to bridge the gap between the design and implementations of such systems.
Date: 21-Jun-2018    Time: 14:00:00    Location: 336


Bridging Informatics and Biology: case studies from the "field"

Daniel Sobral

Gulbenkian Science Institute

Abstract—In Biology, like in many disciplines, technological advances are generating a flurry of new data. Many researchers in Biology are having a hard time processing and integrating these new massive datasets to obtain biological insights. The Bioinformatics Unit at the IGC is a service facility that provides support to researchers in handling and processing large datasets of biological data, particularly of sequencing data. In this talk, I will give a few examples of user support we've been providing, as well as some of our attempts at empowering the user and increasing autonomy.
Date: 14-Jun-2018    Time: 10:00:00    Location: IST Taguspark - Room 1.38


Distributed Search and Recommendation with Profile Diversity

Esther Pacitti

INRIA & CNRS, University Montpellier

Abstract—With the advent of Web 3.0, the Internet of Things, and citizen science applications, users are producing ever bigger amounts of diverse data, which are stored in a large variety of systems. Since the users’ data spaces are scattered among those independent systems, data sharing becomes a challenging problem. Distributed search and recommendation provides a general solution for data sharing and, among its various alternatives, gossip-based approaches are particularly interesting as they provide scalability, dynamicity, autonomy and decentralized control. Generally, in these approaches each participant maintains a cluster of “relevant” users, which are later employed in query processing. However, considering only relevance in the construction of the cluster introduces a significant amount of redundancy among users, which in turn leads to reduced recall. Indeed, when a query is submitted, due to the high similarity among the users in a cluster, the probability of retrieving the same set of relevant items increases, thus limiting the number of distinct results that can be obtained. In this talk I will present the resulting new gossip-based clustering algorithms and validate them through experimental evaluation over four real datasets, showing that taking a diversity-based clustering score into account enables major gains in terms of recall. In addition, I will also present some ongoing work on scientific data management carried out by the Zenith Inria team.
Date: 11-Jun-2018    Time: 14:30:00    Location: 336
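The relevance-versus-redundancy trade-off described above can be sketched with a generic score (my own illustrative formulation, in the spirit of the talk rather than the authors' algorithm; `jaccard`, `diversity_score` and the weight `lam` are assumptions):

```python
def jaccard(a, b):
    """Similarity between two user profiles, seen as item sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def diversity_score(candidate, peer, cluster, lam=0.5):
    """Score a candidate user for a peer's cluster: reward relevance
    to the peer, penalize redundancy with users already clustered."""
    relevance = jaccard(candidate, peer)
    redundancy = max((jaccard(candidate, m) for m in cluster), default=0.0)
    return lam * relevance - (1 - lam) * redundancy
```

A candidate similar to the peer but unlike existing cluster members scores highest, which is what raises the number of distinct results a query can retrieve.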


INVITED TALK - Dr. Amit Kumar Pandey


Abstract—Title: End User Expectations From the Social Robotics Revolution: an Industrial Perspective. Never before in the history of robotics have robots been so close to us, in our society. We are ‘evolving’, and so are our society, lifestyle and needs. AI has been with us for decades, and is now embodied in robots, penetrating further into our day-to-day life. All of this is converging towards a smarter ecosystem of living, where social robots will coexist with us in harmony, for a smarter, healthier, safer and happier life. Such robots are supposed to be socially intelligent and to behave in socially expected and accepted manners. The talk will reinforce that social robots have a range of potential societal applications, hence impacting education needs and job opportunities as well. The talk will begin by illustrating some of these social robots and highlighting what it means to develop a socially intelligent robot, along with the associated R&D challenges. This will be followed by some use cases, end-user feedback and market analysis. The talk will conclude with some open challenges ahead, including social and ethical issues, and emphasize the need for a bigger, multi-disciplinary effort and an ecosystem of different stakeholders, including policy makers.
Date: 08-Jun-2018    Time: 11:00:00    Location: Room 1.38 - IST Taguspark


Robot learning from few demonstrations by exploiting the structure and geometry of data

Sylvain Calinon

EPFL - École Polytechnique Fédérale de Lausanne

Abstract—Many human-centered robot applications would benefit from the development of robots that could acquire new movements and skills from human demonstration, and that could reproduce these movements in new situations. From a machine learning perspective, the challenge is to acquire skills from only few interactions with strong generalization demands. It requires the development of intuitive active learning interfaces to acquire meaningful demonstrations, the development of models that can exploit the structure and geometry of the acquired data in an efficient way, and the development of adaptive controllers that can exploit the learned task variations and coordination patterns. The developed models need to serve several purposes (recognition, prediction, generation), and be compatible with different learning strategies (imitation, emulation, exploration). I will present an approach combining model predictive control, statistical learning and differential geometry to pursue such goal. I will illustrate the proposed approach with various applications, including robots that are close to us (human-robot collaboration, robot for dressing assistance), part of us (prosthetic hand control from tactile array data), or far from us (teleoperation of bimanual robot in deep water).
Date: 06-Jun-2018    Time: 11:00:00    Location: IST Alameda - DEI Informática II, room 0.19


Urban Data Management, Analysis and Visualization

Claudio Silva

New York University

Abstract—The large volumes of urban data, along with vastly increased computing power, open up new opportunities to better understand cities. Encouraging success stories show that data can be leveraged to make operations more efficient, inform policies and planning, and improve the quality of life for residents. However, analyzing urban data often requires a staggering amount of work, from identifying relevant datasets, cleaning and integrating them, to performing exploratory analyses and creating predictive models that take into account spatio-temporal processes. Our long-term goal is to enable domain experts to crack the code of cities by freely exploring the vast amounts of urban data. In this talk, we will present methods and systems which combine data management, analytics, and visualization to increase the level of interactivity, scalability, and usability for urban data exploration. We will show practical applications of the novel technology in real applications. This work was supported in part by the National Science Foundation, the Moore-Sloan Data Science Environment at NYU, IBM Faculty Awards, AT&T, NYU Tandon School of Engineering and the NYU Center for Urban Science and Progress.
Date: 05-Jun-2018    Time: 15:30:00    Location: DEI Meeting Room


Computer Graphics in the Age of AI and Big Data

Richard (Hao) Zhang

Simon Fraser University

Abstract—Computer graphics is traditionally defined as a field which covers all aspects of computer-assisted image synthesis. An introductory class on graphics mainly teaches how to turn an explicit model description, including geometric and photometric attributes, into one or more images. Under this classical and arguably narrow definition, computer graphics corresponds to a “forward” (synthesis) problem, in contrast to computer vision, which traditionally battles with the inverse (analysis) problem. In this talk, I will offer my view of what the NEW computer graphics is, especially in the current age of machine learning and data-driven computing. I will first remind ourselves of several well-known data challenges that are unique to graphics problems. Then, by altering the above classical definition of computer graphics, perhaps only slightly, I show that to do the synthesis right, one has to first “understand” the task and solve various inverse problems. In this sense, graphics and vision are converging, with data and learning playing key roles in both fields. A recurring challenge, however, is a general lack of “Big 3D Data”, which graphics research is expected to address. I will show you a quick sampler of our recent works on data-driven and learning-based syntheses of 3D shapes and virtual scenes. Finally, I want to explore a new perspective on the synthesis problem: to mimic a higher-level human capability than pattern recognition and understanding.
Date: 05-Jun-2018    Time: 14:30:00    Location: DEI Meeting Room 0.19


Talk by Kirk Bresniker, Chief Architect and HPE Fellow/VP

Abstract—Title: Exaflops, Zettabytes and microseconds – Preparing to capitalize on simultaneous regime change in computing. Exascale supercomputers are transforming science and industry, and an intelligent social infrastructure comprised of tens of billions of autonomous agents hosts artificial intelligences devouring zettabytes of information in real time. These are unprecedented opportunities for information technology to precipitate societal transformation, but as we reach the twilight of Moore’s Law, equally unprecedented is the uncertainty about how long conventional approaches will continue to allow sustainable computational growth to match changing demand. I’ll review these motivating factors and then cover how The Machine Advanced Development Program I’ve led at Hewlett Packard Labs is preparing to meet these complex opportunities.
Date: 30-May-2018    Time: 12:00:00    Location: 336


Towards Knowledge-Based Decision Support System using Propositional Analysis and Rhetorical Structure Theory

Cláudio Duque

Universidade de Brasília

Abstract—The project's main objective is to develop a natural language interface for a knowledge-based decision support system (KBDSS) using rhetorical structure theory (RST) and propositional analysis. A KBDSS is a system that provides specialized expertise (problem-solving) stored as facts, rules, procedures, or in similar structures that can be directly accessed by the user. The idea is to develop an independent module that, based on the texts in IRS collections, generates questions in natural language to help users find the relevant information in the system. It is a research project primarily, but not exclusively, in the fields of linguistics, computational linguistics, artificial intelligence, information retrieval, and information science.
Date: 18-May-2018    Time: 14:00:00    Location: 020


Using Fuzzy Fingerprints For Cyberbullying Detection in Social Networks

Hugo Rosa


Abstract—As cyberbullying becomes more and more frequent in social networks, automatically detecting it and pro-actively acting upon it becomes of the utmost importance. In this work, we study how a recent technique with proven success in similar tasks, Fuzzy Fingerprints, performs when detecting textual cyberbullying in social networks. Despite the task being commonly treated as binary classification, we argue that it is in fact a retrieval problem, where the only relevant performance measure is that of retrieving cyberbullying interactions. Experiments show that the Fuzzy Fingerprints technique slightly outperforms baseline classifiers when tested in a close-to-real-life scenario, where cyberbullying instances are rarer than those without cyberbullying.
Date: 18-May-2018    Time: 14:00:00    Location: 336


INVITED TALK - Prof. Pawel Kulakowski

Abstract—Nanoscale Communications. The talk will discuss possible means for nanocommunications, i.e., communication between future nanomachines. An overview of possible approaches will be given, including miniaturization of existing communication devices, building nanomachines from basic blocks, and molecular communication motivated by communication mechanisms already existing in biology. The talk will then focus on the phenomenon of FRET (Foerster Resonance Energy Transfer). FRET can provide viable communication means over nano-distances with propagation delays of only a few nanoseconds. The theory of FRET will be introduced, followed by a report on experiments on its performance carried out over the last few years. The last part of the talk will present further simulation studies showing possible applications of FRET-based nanocommunication. Free entrance. Send registration to Vera Almeida:
Date: 07-May-2018    Time: 17:00:00    Location: Anfiteatro Abreu Faro, Complexo Interdisciplinar, IST, Lisboa


Basic Measurements in IP-Based Networks

Jan Jerabek

Brno University of Technology, Czech Republic

Abstract—Access speed has become the most commonly published metric for characterizing the quality of broadband offerings by Internet Service Providers. This applies to both wired and wireless access technologies. However, measurements of speed for the same service can vary significantly, not to mention the frequent difference between the download and upload directions imposed by the design of the technology. Moreover, we should care about many other parameters, starting with round-trip time, the type of interconnection with other networks, unaltered Domain Name System communication, the possibility to use any kind of application, and not omitting sophisticated Quality of Service tests. Currently, not only in Europe, there is significant attention to so-called network neutrality issues and to regulations that, in a wider sense, also address these topics. This talk introduces ideas, standards and tools covering the measurement of basic parameters of IP networks in the European context and presents possible approaches to measuring these parameters in IP-based networks.
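As a minimal illustration of one such basic parameter, round-trip time can be roughly estimated by timing a TCP connection handshake (a simplification; dedicated measurement tools send repeated probes, often via ICMP, and report statistics such as min/avg/max and loss):

```python
# Rough RTT estimate via the TCP three-way handshake (illustrative only).
import socket
import time

def tcp_connect_time_ms(host, port=80, timeout=3.0):
    """Return the time taken to establish a TCP connection, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; handshake time approximates one RTT
    return (time.perf_counter() - start) * 1000.0
```

A single handshake conflates RTT with connection-setup overhead, which is exactly why standardized measurement methodologies matter.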
Date: 09-Apr-2018    Time: 17:00:00    Location: Anfiteatro Abreu Faro, Complexo Interdisciplinar, IST, Lisboa


Fast-SG: An alignment-free algorithm for hybrid assembly

Alex Di Genova


Abstract—Long read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long read de novo assembly pipelines are computationally intensive and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods, which combine short and long read sequencing technologies, can reduce the time and cost required to produce de novo assemblies of large genomes. In this paper, we propose a new method, called FAST-SG, which uses a new ultra-fast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. FAST-SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how FAST-SG outperforms the state-of-the-art short read aligners when building the scaffolding graph, and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using FAST-SG with shallow long read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878).
Date: 08-Mar-2018    Time: 10:00:00    Location: 336


Helping out the middle class (in the cache) with the MITHRIL prefetching algorithm

Ymir Vigfusson

Emory University

Abstract—The growing pressure on cloud application scalability has accentuated storage performance as a critical bottleneck. Although cache replacement algorithms have been extensively studied, cache prefetching, reducing latency by retrieving items before they are actually requested, remains an underexplored area. Existing approaches to history-based prefetching, in particular, provide too few benefits for real systems relative to the resources they cost. My talk will detail MITHRIL: a prefetching layer that efficiently exploits historical patterns in cache request associations. MITHRIL is inspired by sporadic association rule mining and relies only on the timestamps of requests. Through evaluation of 135 block-storage traces, we show that MITHRIL is effective, giving an average 55% hit-ratio increase over LRU and PROBABILITY-GRAPH, and a 36% hit-ratio gain over AMP, at reasonable cost. Finally, I'll demonstrate that the improvement comes from MITHRIL being able to capture mid-frequency blocks.
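A toy sketch of timestamp-based association mining in the spirit described above (illustrative only; MITHRIL's actual algorithm, window semantics, and parameters differ): blocks repeatedly requested close together in time yield a prefetch rule.

```python
# Mine prefetch rules from request timestamps alone: if block b often
# follows block a within a short time window, prefetch b when a is hit.
from collections import defaultdict

def mine_associations(trace, window=2, min_support=2):
    """trace: time-ordered list of (timestamp, block_id) pairs.
    Returns {block: block to prefetch when that block is requested}."""
    pair_counts = defaultdict(int)
    for i, (t_i, a) in enumerate(trace):
        for t_j, b in trace[i + 1:]:
            if t_j - t_i > window:
                break  # trace is time-ordered, so later requests are even farther
            if a != b:
                pair_counts[(a, b)] += 1
    # keep, for each block, its most frequent close successor (if supported)
    best = defaultdict(lambda: (0, None))
    for (a, b), count in pair_counts.items():
        if count >= min_support and count > best[a][0]:
            best[a] = (count, b)
    return {a: b for a, (count, b) in best.items()}

# "y" is requested shortly after "x" three times -> prefetch "y" on "x":
trace = [(0, "x"), (1, "y"), (5, "x"), (6, "y"), (10, "x"), (11, "y"), (20, "z")]
rules = mine_associations(trace)
```

Note that nothing but timestamps is used, which is the property the abstract highlights.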
Date: 27-Feb-2018    Time: 15:00:00    Location: 020


Crosslinguistic perception and production of filled and silent pauses and raising L2 learners’ awareness of them

Ralph L. ROSE

Waseda University

Abstract—Filled pauses (e.g., uh/um) in speech are thought to be signs of underlying cognitive processes and hence listeners may see them as signals of such things as syntactic structure. Less is known about the signalling nature of silent pauses however. In this talk, I will share evidence from a series of psycholinguistic experiments to show that both silent and filled pauses may serve a signalling function which listeners are sensitive to across languages. Furthermore, the data show some differences between silent and filled pauses: Silent pauses seem to be more closely related to syntactic structure than filled pauses. Given the importance of and difference between silent and filled pauses in perception, it behooves second language (L2) learners to become more aware of their use of these phenomena and how they influence their fluency. In the latter part of the talk, I will describe and demonstrate an application I’m developing called Fluidity, which is designed to help L2 learners raise their awareness of different aspects of their speech including pauses and thereby improve their speed fluency.
Date: 26-Feb-2018    Time: 10:00:00    Location: 336


Development of Radiation Hard 150nm Standard Cell Library

João Baptista Martins

Federal University of Santa Maria

Abstract—The effects produced by radiation on integrated circuits can be classified into Single Event Effects (SEE), related to transient problems, and Total Ionization Dose (TID) effects, which arise from long exposure to ionizing radiation. The mitigation of these effects on integrated circuits can be done in three ways: at the manufacturing process level, at the architectural level (redundancy), and at the layout level. The work presented here deals with the third way of mitigation, that is, the design of a cell library for radiation-tolerant integrated circuits. Designed and manufactured in silicon on 150nm technology, the SMDH-RH library is based on the use of guard rings and the application of closed-geometry techniques (ELT, Enclosed Layout Transistor). The library includes simple and complex digital logic gates. It was tested in space as the payload of a nanosatellite (NanosatC-Br1), launched in 2014 and still active, with its operation and functionality validated.
Date: 07-Feb-2018    Time: 14:30:00    Location: 336


Static versus Dynamic Deferred Acceptance in School Choice: Theory and Experiment

Joana Pais


Abstract—In the context of school choice, we experimentally study how behavior and outcomes are affected when, instead of submitting rankings in the student-proposing or student-receiving deferred acceptance (DA) mechanism, participants make decisions dynamically, going through the steps of the underlying algorithms. Our main results show that, contrary to theory, (a) in the dynamic student-proposing DA mechanism, participants propose to schools respecting the order of their true preferences slightly more often than in its static version, while (b) in the dynamic student-receiving DA mechanism, participants react to proposals by always respecting that order, and they avoid accepting schools in the tail of their true preferences more often than in the corresponding static version. As a consequence, for most problems we test, no significant differences exist between the two versions of the student-proposing DA mechanism as far as stability and average payoffs are concerned, but the dynamic version of the student-receiving DA mechanism delivers a clear improvement over its static counterpart in both dimensions. In fact, in the aggregate, the dynamic school-proposing DA mechanism is the best-performing mechanism.
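To make the mechanism concrete, here is a minimal sketch of the static student-proposing DA (Gale-Shapley) algorithm; the preference lists and capacities below are illustrative, not taken from the experiment.

```python
# Student-proposing deferred acceptance: students propose down their
# lists; schools tentatively hold their most-preferred proposers up to
# capacity and reject the rest, who then propose to their next choice.
def student_proposing_da(student_prefs, school_prefs, capacities):
    next_choice = {s: 0 for s in student_prefs}  # next school to propose to
    tentative = {school: [] for school in school_prefs}
    rank = {school: {st: i for i, st in enumerate(prefs)}
            for school, prefs in school_prefs.items()}
    free = list(student_prefs)
    while free:
        student = free.pop()
        if next_choice[student] >= len(student_prefs[student]):
            continue  # list exhausted; student stays unmatched
        school = student_prefs[student][next_choice[student]]
        next_choice[student] += 1
        tentative[school].append(student)
        tentative[school].sort(key=lambda st: rank[school][st])
        while len(tentative[school]) > capacities[school]:
            free.append(tentative[school].pop())  # reject least preferred
    return {st: sch for sch, held in tentative.items() for st in held}

# Three students, two schools with one seat each:
students = {"s1": ["A", "B"], "s2": ["A", "B"], "s3": ["B", "A"]}
schools = {"A": ["s2", "s1", "s3"], "B": ["s1", "s3", "s2"]}
match = student_proposing_da(students, schools, {"A": 1, "B": 1})
```

The dynamic versions studied in the talk walk participants through exactly these propose/hold/reject steps instead of collecting the ranking lists up front.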
Date: 11-Jan-2018    Time: 15:00:00    Location: room 0.17 at DEI - Instituto Superior Técnico


Toward Autonomous Social Robots in the Wild

Iolanda Leite

School of Computer Science and Communication at the Royal Institute of Technology, Sweden

Abstract—As social robots move out of controlled laboratory environments to be deployed in the real world, a long-standing barrier is the need to respond and adapt to dynamic environments without operator intervention. In this talk, I will present my past and current research on artificial intelligence mechanisms that enable robots to interact with people in dynamic, real-world social environments. I will also discuss limitations of the current state of the art in robotic technology for realistic social environments, arguing that an improved understanding of how robots perceive, reason and act depending on their surrounding social context can lead to more natural, enjoyable and useful human-robot interactions in the long term.
Date: 21-Dec-2017    Time: 10:00:00    Location: Room 1.38 - IST Taguspark


Computational Geometry Challenges and Results in Multiobjective Optimization

Michael Emmerich

Leiden University

Abstract—In multiobjective optimization and decision analysis, it is common to compute sets of points or polytopes that cover trade-off (hyper)surfaces. In this talk, we will look at computational geometry problems related to this and their computational complexity. Many of these results have been discovered very recently and show that the boundary between the computational problems that are tractable and intractable depends on various parameters and is very sensitive to the number of objectives.
Date: 18-Dec-2017    Time: 14:30:00    Location: 020


Rationality, Contextuality, and Extended Probabilities

José Acacio de Barros

San Francisco State University

Abstract—Rationality is often associated with the classical theory of probability, which is built on top of a Boolean structure of events. However, some experimental conditions, both in physics and in the social sciences, do not permit a classical description, as such a description would lead to logical contradictions. The reason for such contradictions is that experimental outcomes depend non-trivially on the context, i.e., they are contextual. In this talk we will discuss such examples and then show how classical probability theory can be modified to include contextual systems.
Date: 14-Dec-2017    Time: 14:00:00    Location: 336


Managing Application Resilience: A Programming Language Approach

Pedro Diniz

USC Information Sciences Institute

Abstract—System resilience is an important challenge that needs to be addressed in the era of extreme-scale computing. High-performance computing systems will be architected using millions of processor cores and memory modules. As process technology scales, the reliability of such systems will be challenged by the inherent unreliability of individual components due to extremely small transistor geometries, variability in silicon manufacturing processes, device aging, etc. Therefore, errors and failures in extreme-scale systems will increasingly be the norm rather than the exception. Not all detected errors warrant catastrophic system failure, but there are presently no mechanisms for programmers to communicate their knowledge of algorithmic fault tolerance to the system. In this talk we present a programming model approach to system resilience that allows programmers to explicitly express their fault tolerance knowledge. We propose novel resilience-oriented programming model extensions and programming directives, and illustrate their effectiveness. An inference engine leverages this information and combines it with context gathered at runtime to increase the dependability of HPC systems. The preliminary experimental results presented here, for a limited set of kernel codes from both scientific and graph-based computing domains, reveal that with a very modest programming effort the described approach incurs fairly low execution time overhead while allowing computations to survive a large number of faults that would otherwise always terminate the computation. As transient faults become the norm rather than the exception, it will become increasingly important to provide users with high-level programming mechanisms with which they can convey important application acceptability criteria. For best performance (whether in terms of time, power, or energy), the underlying systems need to leverage this information to better navigate very complex system-level trade-offs while still delivering a reliable and productive computing environment. The work presented here is a simple first step towards this vision.
Date: 11-Dec-2017    Time: 15:00:00    Location: 336


Towards providing digital immunity to humanitarian organizations

Stevens Le Blond

EPFL - École Polytechnique Fédérale de Lausanne

Abstract—Humanitarian action, the process of aiding individuals in situations of crisis, poses unique information-security challenges due to natural or man-made disasters, the adverse environments in which it takes place, and the scale and multi-disciplinary nature of the problems. Despite these challenges, humanitarian organizations are transitioning towards a strong reliance on the digitalization of collected data and on digital tools, which improves their effectiveness but also exposes them to computer-security threats. This talk presents the first academic effort seeking to understand and address the computer-security challenges associated with digitalizing humanitarian action. First, I will describe a qualitative analysis of the computer-security challenges of the International Committee of the Red Cross (ICRC), a large humanitarian organization with over sixteen thousand employees, legal privileges and immunities, and over 150 years of experience with armed conflicts and other situations of violence worldwide. Second, I will present a research agenda to design and implement anonymity networks, blockchains, and secure-processing systems addressing these challenges, and to deploy them in collaboration with the ICRC. I will close with a discussion of how to generalize our approach to provide digital immunity to humanitarian and other at-risk organizations.
Date: 07-Dec-2017    Time: 12:30:00    Location: DEI Meeting room 0.19 (informática II building)


Adapting the Studio Based Learning Methodology to Computer Science Education

Paula Alexandra Silva

Maynooth University

Abstract—Recent research projects have depicted Studio-Based Learning as a successful approach to teaching computer science students. This talk will describe using Studio-Based Learning as a pedagogical approach in an online introductory Computer Science 1 (CS1) course. The studio-based instructional model emphasizes learning activities in which students (a) construct personalized solutions to assigned computing problems, and (b) present solutions to their instructors and peers for feedback and discussion within the context of design critiques. For SBL to be effective, assignments to be critiqued must be solvable by a variety of thinking paths. Building upon the identification of students’ most frequent programming errors, we implemented SBL sessions and analyzed the impacts compared to sessions that did not employ the SBL methodology. The online nature of this class allowed for a rich collection of data and the integral recording of the sessions. In addition to the students’ performance, motivation and perception of their learning process, the analysis of this data provided insight into students’ thought processes.
Date: 22-Nov-2017    Time: 12:00:00    Location: Room 2.10 - IST Taguspark


Privacy-preserving solutions for the Security-as-a-Service (SecaaS) model

Marcin Niemiec

Abstract—Nowadays, the Security-as-a-Service (SecaaS) model is used by organizations of different sizes and profiles. Unfortunately, such services carry a risk of confidential information leakage: the customer's security policies are exposed to cloud service providers. A security policy contains confidential information regarding the organization's infrastructure, vulnerabilities, and threats, which raises a serious privacy concern. To resolve this problem, several privacy-preserving solutions for SecaaS services have been proposed over the past years. During the seminar, mechanisms that operate using the principles of a hybrid cloud architecture will be presented. They are dedicated to security services whose security policies can be represented in the form of a decision tree (firewall, IDS, IPS and others).
Date: 20-Nov-2017    Time: 18:30:00    Location: Instituto Superior Técnico, Pavilhão Central, Salão Nobre


Cases on optimization for scheduling and timetabling

Inês Marques

Instituto Superior Técnico

Abstract—Scheduling and timetabling are planning activities of operations management in any organization. Central to an operation’s ability to deliver is the way its activities are planned. Planning is concerned with activities that attempt to reconcile the demand and the ability of the operation’s resources to deliver, thus bringing supply and demand together. This talk presents demand-driven research on this area, where research questions have been gathered from practice. Operating room scheduling, staff scheduling at emergency medical services and university course timetabling are cases to be shared. Focus is given to the operations research methods used in these cases. Open questions and opportunities for future work are also suggested.
Date: 17-Oct-2017    Time: 11:00:00    Location: 336


Bicriteria Fixed-Charge Network Flow - Separating Fixed Costs and Flow Costs

Michael Stiglmayr

University of Wuppertal

Abstract—The fixed-charge network design problem is one of the classical network design problems. From a multiobjective perspective, its objective function is the weighted sum of a network flow objective and the sum of the fixed (design) costs. Flow can only be routed through edges whose binary design variable equals one. The single-criterion flow problem is well studied in the literature; many different solution approaches have been developed and applied, including branch and cut, Lagrangian relaxation, and heuristic methods such as dynamic cost scaling. From an application point of view, however, design costs and flow costs are not directly comparable. Usually the design costs are due a priori, whereas the flow costs correspond to maintenance or operation costs incurred on a regular basis. In this talk we present heuristic and exact solution approaches based on the two-phase method and ranking algorithms.
Date: 12-Oct-2017    Time: 11:00:00    Location: 336


Phonology of Dyslexia: Phonological and neurobiological explanations of cross-linguistic variations

Fusa Katada

Waseda University

Abstract—The neurobiological disability called dyslexia (< Greek dys- ‘impaired’ + lexis ‘word’) is a specific learning disability (LD) that affects literacy skills: both decoding (pronouncing written words) and encoding (spelling words). It has generally been assumed that the congenital form of dyslexia, termed developmental dyslexia, stems from a particular problem in language acquisition affecting phonological awareness. However, the exact nature of phonological awareness has not yet been made clear. The majority of studies on dyslexia have been carried out on Roman-alphabet languages, most especially English, and it may be that certain truths about dyslexia remain unrevealed under such research conditions.

This talk first establishes the relevance of the mora-basic hypothesis that moras (CVs) are the units underlying all human natural languages. It then spotlights a seemingly mysterious discrepancy in the prevalence of phonological dyslexia between the English-speaking world and the Japanese-speaking world: as high as 17% for the former and as low as 1% for the latter. On the basis of English dyslexic reading marked by an overproduction of moraic (CV) units in the absence of rhyme (VC) units, the talk will show that the discrepancy is due to differences in prosodic structure between the two languages. For rhyme(VC)-oriented English, readers must derive the rhyme unit through prosodic restructuring from the underlying CV-C (do-g) to the rhyme-oriented C-VC (d-og). A failure to do so manifests as phonological dyslexia. For mora(CV)-oriented, rhymeless Japanese, such prosodic restructuring is irrelevant, and phonological dyslexia is largely undetected.

The talk furthermore explores possible explanations for a failure in such prosodic restructuring. From the articulatory-phonological point of view, onset consonants are coarticulated with the following vowels. Moras (CVs) are thus formed automatically and essentially for free. In contrast, coda consonants are not coarticulated with the preceding vowels. Forming rhymes (VCs) instead requires a temporal-spatial decision load, which a dyslexic mind is unable to bear. The inclination toward moras is explained accordingly.

The talk will deepen this view and argue that mora-forming coarticulation is easy because it is a synchronized articulatory behavior, akin to synchronized human locomotive behavior. This view conforms to a human neurobiological inclination toward synchronized behavior, which is claimed to have been acquired in the process of human evolution.
Date: 15-Sep-2017    Time: 14:00:00    Location: 336


Efficient computation of the search region in multi-objective optimization

Kerstin Dächert

University of Wuppertal

Abstract—Multi-objective optimization methods often proceed by iteratively producing new solutions. For this purpose it is important to determine and update the search region efficiently. It corresponds to the part of the objective space where new nondominated points could lie and can be described by a set of so-called local upper bounds whose components are defined by already known nondominated points. In the bi-objective case the update of the search region is easy, since a new point can dominate only one local upper bound. Moreover, the local upper bounds as well as the nondominated points can be kept sorted increasingly with respect to one objective and decreasingly with respect to the other. For more than two objectives these properties no longer hold. In particular, several local upper bounds might have to be updated at once when a new nondominated point is inserted into the search region. In this talk we concentrate on how to design this update efficiently. To this end, we study a specific neighborhood structure among local upper bounds. Thanks to this structure we can quickly identify all local upper bounds that are affected by a new nondominated point, i.e., that have to be updated. We propose a new scheme to update the search region with respect to a new point more efficiently than existing approaches. Besides, the neighborhood structure provides new theoretical insight into the search region and the location of nondominated points for more than two objectives (cf. Dächert, K., Klamroth, K., Lacour, R., Vanderpooten, D.: Efficient computation of the search region in multi-objective optimization, European Journal of Operational Research 260(3):841–855, 2017).
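The basic (unoptimized) update step can be sketched as follows for a minimization problem; this naive version rescans all bounds, which is exactly the cost the neighborhood structure described above avoids.

```python
# Update the search region (set of local upper bounds) after inserting a
# new nondominated point z: every bound u strictly above z in all
# components is split into candidates obtained by lowering one component
# of u to the corresponding component of z; redundant candidates are
# discarded.  Naive O(|bounds| * dim) sketch for illustration.
def update_search_region(bounds, z):
    affected = [u for u in bounds if all(zi < ui for zi, ui in zip(z, u))]
    kept = [u for u in bounds if u not in affected]
    candidates = []
    for u in affected:
        for j in range(len(z)):
            v = list(u)
            v[j] = z[j]  # lower the j-th component to z_j
            candidates.append(tuple(v))
    # drop candidates componentwise-dominated by another bound
    pool = kept + candidates
    return kept + [c for c in candidates
                   if not any(c != w and all(ci <= wi for ci, wi in zip(c, w))
                              for w in pool)]

# Bi-objective example: start from the single bound (10, 10).
b1 = update_search_region([(10, 10)], (3, 4))   # two bounds remain
b2 = update_search_region(b1, (1, 6))           # only (3, 10) is affected
```

In the bi-objective case exactly one bound is split per insertion, matching the "easy" case mentioned in the abstract; with three or more objectives several bounds can be affected at once.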
Date: 14-Sep-2017    Time: 11:00:00    Location: 336


Exploring light-weight tangibles

Beryl Plimmer

University of Auckland

Abstract—As we know, tangibles offer an interaction experience that is richer and more helpful for learning than traditional mouse + keyboard interaction. However, much of the early tangibles research required complex hardware setups including cameras, specialized tables or displays and specialized objects that incorporated electronics. This talk will focus on the work we have done with light-weight tangibles. The tangibles can easily be made with everyday objects and be used on any multi-touch device. The basic interaction is supported by a toolkit we have developed to enable easy deployment of context specific apps. The long-term goal of this work is to have tangible applications that can easily be used by non-experts in everyday settings.
Date: 07-Sep-2017    Time: 11:30:00    Location: 336


Body Language Without Body: Social Signals in Technology Mediated Communication

University of Glasgow

Abstract—Nonverbal communication is a natural phenomenon that takes place in face-to-face interactions. However, an increasingly significant fraction of our social exchanges takes place in technology-mediated settings where natural nonverbal cues (facial expressions, vocalisations, gestures, posture, etc.) are partially or totally impossible to display and access. For example, phones make it possible to use vocal cues (fillers, laughter, pauses, etc.), but not facial expressions or gestures, and in the case of online textual chats, the use of nonverbal cues is simply not possible. The question at the core of this talk is whether nonverbal communication still plays a role in these settings and, if so, what the nonverbal cues and their functions are.
Date: 05-Sep-2017    Time: 11:00:00    Location: Room 1.38 - IST Taguspark


The Rise of Potentially Unwanted Programs: Measuring its Prevalence, Distribution through Pay-Per-Install Services, and Economics

Juan Caballero

IMDEA Software Institute in Madrid

Abstract—Potentially unwanted programs (PUP) such as adware and rogueware, while not outright malicious, exhibit intrusive behavior that generates user complaints and makes security vendors flag them as undesirable. PUP has been little studied in the research literature despite recent indications that its prevalence may have surpassed that of malware. We have performed a systematic study of Windows PUP over a period of 4 years using a variety of datasets including malware repositories, AV telemetry from 3.9 million real Windows hosts, dynamic executions, and financial statements. This presentation summarizes what we have learned from our measurements on PUP prevalence, its distribution through pay-per-install (PPI) services, which link advertisers that want to promote their programs with affiliate publishers willing to bundle those programs with offers for other software, and the economics of the PPI services that distribute PUP.
Date: 12-Jun-2017    Time: 11:00:00    Location: 336


Challenges in Natural Language Processing: Question Answering and Dialog System

Yoshinobu Kano

Shizuoka University

Abstract—I will introduce a couple of ongoing projects we organize. These include automatic examination solvers: the Todai Robot project, which aims to solve Japanese university entrance examinations (both multiple-choice and descriptive), and the challenge of the Legal Bar Exam for lawyers. Other projects include medical document processing for automatic diagnosis, AI Werewolf, which creates an AI player for a conversation game, and text mining for scientific literature.
Date: 05-Jun-2017    Time: 11:00:00    Location: 336


Deploying Incompatible Unmodified Dynamic Analyses in Production via Multi-version Execution

Luís Pina

Imperial College London

Abstract—Popular dynamic analysis tools such as Valgrind and compiler sanitizers are effective at finding and diagnosing challenging bugs and security vulnerabilities. However, they cannot be combined on the same program execution, and they incur a high overhead, which typically prevents them from being used in production. In this talk I will present FreeDA, a system that enables deploying multiple existing incompatible dynamic analysis tools without requiring any modification and while masking their overhead. FreeDA leverages multi-version execution, in which the dynamic analyses run alongside the production system. FreeDA is applicable in several common scenarios involving network servers and interactive applications. In particular, FreeDA is able to deploy Valgrind and Clang's sanitizers to high-performance servers, such as Nginx and Redis, and to interactive applications, such as Git and htop.
Date: 26-May-2017    Time: 10:00:00    Location: 336


Language Learning for Verification of Configuration Files

Mark Santolucito

Yale University

Abstract—Software failures resulting from configuration errors have become commonplace as modern software systems grow increasingly large and complex. The lack of language constructs in configuration files, such as types and specifications, has directed the focus of configuration file verification towards building post-failure error diagnosis tools. In addition, existing tools are generally language-specific, requiring the user to define the language model and explicit rules to check. In this talk, we propose a framework which analyzes datasets of configuration files and derives rules for building a language model from the given dataset. The resulting language model can be used to verify new configuration files and detect errors in them. We will discuss ConfigC, an implementation of this framework, as well as the underlying model and how it might be extended in the future.
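A toy sketch of the learning-based idea (with hypothetical rules, not ConfigC's actual model): infer the value type of each key from a dataset of correct configuration files, then flag entries in a new file that do not fit the learned types.

```python
# Learn per-key value types from a dataset of known-good configs, then
# use the learned "language model" to flag suspicious entries.
def learn_types(configs):
    types = {}
    for cfg in configs:
        for key, value in cfg.items():
            types.setdefault(key, set()).add(type(value).__name__)
    return types

def check(cfg, types):
    """Return the keys whose value type was never seen in the dataset."""
    return [key for key, value in cfg.items()
            if key in types and type(value).__name__ not in types[key]]

# Hypothetical dataset of correct configuration files:
dataset = [{"port": 8080, "host": "a.example"},
           {"port": 9090, "host": "b.example"}]
types = learn_types(dataset)
errors = check({"port": "eighty", "host": "c.example"}, types)
```

Real learned rules would of course go beyond types (value ranges, key co-occurrence, ordering), but the pre-deployment verification workflow is the same: learn from a corpus, then check new files against the learned model.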
Date: 24-May-2017    Time: 11:00:00    Location: 336


Biomedical Image Informatics

David Breen

Drexel University

Abstract—The goal of biomedical image informatics is to develop techniques and systems that extract quantitative information from biomedical images and construct robust models of the structures and processes captured in the images. The word "images" is used in its broadest sense, meaning data that can be 2D, 3D or 4D in nature and changing over time. This talk will describe a number of image informatics projects conducted by the Drexel University Geometric Biomedical Computing Group. In the first project, histology images of breast carcinomas are analyzed to determine if the tumor has metastasized. The second project developed techniques for automatically categorizing the memory and learning capabilities of a fruit fly model of Alzheimer's Disease. This is accomplished via analysis of videos of the flies' courtship behavior. The final project generates geometric models of the individual cells of the imaginal wing disc of larval fruit flies, based on 3D reconstructions produced from confocal microscopy image stacks. Detailed geometric quantities about the cells are then computed in order to provide insight into the developmental processes that formed the wing disc.
Date: 12-May-2017    Time: 11:30:00    Location: 020


The Future of Multidimensional Video Capture

Brian K Cabral


Abstract—The advent of small monoscopic 360 cameras and stereo capture cameras like Surround360 has ushered in a new era of multidimensional video image capture. It represents a transition in media nearly as profound as the transition from film to digital video. Digital video capture changed the film industry by simultaneously lowering cost and increasing creative latitude for directors and producers. It also democratized the medium, because movie magic that only big studios could once afford can now be achieved with sophisticated digital toolchains; so too with 360 and 360 stereo capture. We will discuss the technical, artistic, and toolchain challenges confronting this medium, focusing on the technical and marketplace elements that both drive and hinder progress.
Date: 04-May-2017    Time: 17:00:00    Location: TagusPark A2


Preservation of cultural heritage: “Human” technology in the digital era

Dr. Anna Lobovikov-Katz

Israel Institute of Technology

Abstract—The revolutionary development of digital theory and technology calls for non-trivial decisions in bridging between the virtual and the real, and between STEM (Science, Technology, Engineering and Mathematics) and the humanities and arts, in education, research and application. The field of conservation of cultural heritage poses a variety of challenges, especially with regard to the learning, study, investigation and documentation of tangible heritage through the application of intangible ICT technologies. Linking advanced methods and techniques with areas traditionally associated with the humanities and arts can be helpful in modern research and application across diverse fields. The interaction between e-learning and actual on-site learning and study of historic buildings and sites, with an emphasis on their visual characteristics, has been enabled and implemented in the EU project ELAICH and in other selected activities. The integration of two interconnected areas, the visual disciplines across their wide spectrum (incl. descriptive geometry, perspective, freehand drawing, painting) and the conservation of cultural heritage (especially its visual and material-technological aspects), has been applied by the author to "research by education", contributing to the preservation of built cultural heritage and enriching it through an interdisciplinary visual insight into the heritage conservation equilibrium. Visualisation is widely used in STEM, and its contribution to learning has been shown in recent research. Drawing is considered by many researchers to be indispensable for visual thinking, while at the same time it develops the visual skills that are key to success in STEM. The RALEMEFS methodology, developed by the author, makes traditional human technologies of visual analysis accessible to any researcher, scientist, engineer or architect.
Date: 20-Apr-2017    Time: 11:30:00    Location: 336


Human error is not the problem

Harold Thimbleby

University of Wales Swansea

Abstract—If error were a disease, it would be classified as the third biggest killer after cancer and heart disease. Why is it neglected, and what can be done? When something bad happens to a patient, surely somebody must have done something bad? Although it's a simple story, it's usually quite wrong. This talk argues, with many surprising examples, that the correct view is that you do not want to avoid error: you want to avoid patient harm. Drawing on human factors and computer science, this talk shows the astonishing ways in which systems conspire to cause, and then hide the causes of, error. We will then show that better design can reduce harm significantly. We explain why industry is reluctant to improve, and how new policies could help improve technology.
Date: 07-Apr-2017    Time: 10:00:00    Location: INESC-ID Tagus Park, sala 2.10


Arts and Design Based Research for the Creation of Transmedia Experiences

Patrícia Gouveia

Faculdade de Belas Artes da Universidade de Lisboa

Abstract—Arts and Design Based Research for the Creation of Transmedia Experiences
Date: 24-Mar-2017    Time: 14:00:00    Location: room 1.38 @ INESC-ID TagusPark


High Performance Dense Linear Algebra on Low-Power Asymmetric Multicore Architectures

José-Ramón Herrero

Universitat Politècnica de Catalunya

Abstract—In this talk we will present recent research focused on producing high performance Dense Linear Algebra (DLA) codes on Asymmetric Multicore Processors (AMP). We tailor the study on representative asymmetric multicore processors present in an ARM big.LITTLE architecture specifically designed for energy efficiency. We will discuss the challenges posed by these architectures and the way we have exploited parallelism in order to produce high performance implementations of the most significant DLA kernels within BLAS and LAPACK. We will stress the need for malleability and dynamically adapting to the available resources.
Date: 03-Mar-2017    Time: 11:00:00    Location: 336


Symbolic Execution for Evolving Software

Cristian Cadar

Imperial College London

Abstract—One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or affect its security. In this talk, I describe our work on devising novel symbolic execution techniques for increasing the confidence in evolving software: a technique for reasoning about the correctness of optimisations, in particular those that take advantage of SIMD and GPGPU capabilities; a technique for high-coverage patch testing, and a technique for revealing regression bugs and behavioural divergences across versions.
Date: 24-Feb-2017    Time: 11:00:00    Location: 020


Haptic Interfaces & Virtual Reality

Makoto Sato

Tokyo Institute of Technology

Abstract—This talk presents the history and evolution of the haptic interface system SPIDAR. Since its first version in 1989, the SPIDAR system has been adapted and customized to various kinds of virtual environments to meet user and task requirements. Ranging from simple pick-and-place tasks to more complicated physics-based interactions, SPIDAR has emerged as a distinguished haptic interface capable of displaying various aspects of force feedback, with the following advantages: 1) Scalable: with simple modifications it can fit most of the required working space in a virtual environment; desktop, workbench, human-scale, and networked versions have been developed to accommodate the different demands of VR applications. 2) String-based: strings are used both to track the position of the user's hands or fingers and to display force feedback. 3) Transparent: it keeps the working space transparent and does not obscure the visual display system.
Date: 23-Feb-2017    Time: 11:30:00    Location: room 1.38 @ INESC-ID TagusPark


EXPRESS - Expression and Recognition of Irony in Multicultural Social Media

Sílvio Moreira, Filipe Nuno Marques Pires da Silva, Paula Carvalho


Abstract—The main goal of this project is to systematically analyze the expression of irony and sarcasm in social media, in a cross-lingual and multicultural perspective, aiming at its automatic detection. Automating the detection of irony and sarcasm remains an unsolved problem. Previous efforts to achieve this aim have been limited in their approach, tending to focus on shallow textual cues indicative of ironic intent. Research has not systematically explored specific linguistic patterns and rhetorical strategies typically used to express verbal and situational irony in text. Moreover, previous studies usually consider these phenomena as a whole, instead of independently analyzing the mechanisms involved in expressing them. This project aims to answer the following open research questions that are critical to improving irony and sarcasm detection in text mining activities: i. In social media content, are the linguistic and extra-linguistic mechanisms used to express irony and sarcasm language-dependent? ii. To what extent do irony expression and representativeness differ across domains, geographical regions, and the targets involved? iii. Which are the most representative linguistic and extra-linguistic devices used to express irony in different types of domains and topics? iv. How reliable are individuals with respect to identifying and processing ironic messages? v. Which rhetorical devices are easier to recognize, and which ones are particularly hard to detect, especially by humans not sharing the same cultural and pragmatic context? vi. Which types of approaches best suit the automatic detection of irony and sarcasm in text? Are we capable of training shallow models to learn these phenomena? Or should we explicitly explore contextual and linguistic information, using advanced NLP strategies?
The main results achieved in the scope of this project will be presented by: - Paula Carvalho: Identifying Situational Irony in News Headlines - Sílvio Moreira: Modelling context with user embeddings for sarcasm detection in social media - Filipe Silva: Computational Detection of Irony in Textual Messages
Date: 19-Dec-2016    Time: 11:00:00    Location: 336


Algorithmic Mechanisms for Reliable Internet-based Master-Worker Computing: An Evolutionary Approach

Chryssis Georgiou

University of Cyprus

Abstract—The need for high-performance computing, the growing use of personal computers and their capabilities, and wide access to the Internet have established Internet-based computing as an inexpensive alternative to supercomputers. The most popular form of Internet-based computing is volunteer computing, where computing resources are volunteered by the public to help solve (mainly) scientific problems. BOINC is a popular platform on which volunteer computing projects such as SETI@home run. Profit-seeking computation platforms, such as Amazon's Mechanical Turk, have also become popular. One of the main challenges to further exploiting the promise of such platforms is the untrustworthiness of the participating entities. In this talk I will focus on Internet-based Master-Worker task computations, where a master process sends tasks, across the Internet, to worker processes that compute and return a result. Workers, however, are not trustworthy, and it might be in their best interest (or due to malice or malfunction) to report incorrect results. Across different studies, workers have been categorized as malicious (always report an incorrect result), altruistic (always report a correct result), or rational (report whatever result maximizes their benefit). I will explain how such computations can be modeled using evolutionary dynamics and identify the conditions under which the master can reliably obtain the task results. The talk is based on work performed jointly with Evgenia Christoforou (IMDEA Networks), Antonio Fernandez Anta (IMDEA Networks), Miguel Mosteiro (Pace Univ.) and Angel Sanchez (Univ. Carlos III de Madrid).
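To give a flavour of the evolutionary-dynamics modelling mentioned above, here is a minimal Python sketch. The payoff values, audit probability, and replicator update are hypothetical illustrations, not the model from the talk: cheating workers save the computation cost, but lose a punishment when the master audits.

```python
def replicator_step(x, pa, wb=1.0, wc=2.0, wp=3.0, cost=0.5, dt=0.1):
    """One replicator-dynamics step for the fraction x of cheating workers.
    Hypothetical payoffs: cheaters gain wc when not audited, but pay a
    punishment wp when the master audits with probability pa; honest
    workers earn wb minus the computation cost."""
    payoff_cheat = (1 - pa) * wc - pa * wp
    payoff_honest = wb - cost
    avg = x * payoff_cheat + (1 - x) * payoff_honest
    # Replicator update: strategies above the average payoff grow.
    return min(1.0, max(0.0, x + dt * x * (payoff_cheat - avg)))

# With a 40% audit rate (toy numbers), cheating is driven out over time.
x = 0.5
for _ in range(200):
    x = replicator_step(x, pa=0.4)
print(round(x, 3))
```

Under these toy payoffs, auditing often enough makes honesty the evolutionarily stable behaviour, which is the kind of condition the talk identifies for the master to reliably obtain correct results.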
Date: 14-Dec-2016    Time: 15:00:00    Location: 336


Robotics for Children: From Exploratory Studies to Practical Applications

University of Tsukuba

Abstract—The talk will start from the early studies of Sony's entertainment robots in the 2000s and a three-year exploratory field study conducted in a nursery school at the University of California (2004-2007), and then cover some recent topics in educational robotics, including a commercial application released by SoftBank (2014).
Date: 07-Nov-2016    Time: 11:30:00    Location: IST TAGUSPARK room 0.65 Ground floor


From Birthing the Apocalypse to BIG FAT FAIL: (Net) Art as Software Research

Benjamin Grosser

Abstract—How are numbers on Facebook changing what we "like" and who we "friend?" Why does a bit of nonsense sent via email scare both your mom and the NSA? What makes someone mad when they learn Google can't see where they stand? From net art to robotics to supercuts to e-lit, Ben Grosser constructs interactive experiences, machines, and systems that investigate the cultural, social, and political effects of software.
Date: 28-Sep-2016    Time: 13:30:00    Location: TagusPark Anfiteatro A3


A Secure Data Sharing and Query Processing Framework via Federation of Cloud Computing

Prof. Sanjay K Madria

Missouri University of Science and Technology

Abstract—Due to cost-efficiency and less hands-on management, data owners are outsourcing their data to the cloud, which can provide access to the data as a service. However, by outsourcing their data, the data owners lose control and privacy, as the cloud provider becomes a third-party service provider. At first glance, encrypting the data at the owner and then exporting it to the cloud seems a good approach to preserving privacy. However, there is a potential efficiency problem with the outsourced encrypted data when the data owner revokes some of the users' access privileges. An existing solution to this problem is based on a symmetric key encryption scheme, but it is not secure when a revoked user rejoins the system with different access privileges for the same data record. In this talk, I will discuss an efficient and Secure Data Sharing (SDS) framework using a homomorphic encryption and proxy re-encryption scheme that prevents unauthorized data leakage when a revoked user rejoins the system. I will also discuss a modification to the underlying SDS framework and present a new solution based on a data distribution technique to prevent information leakage in the case of collusion between a revoked user and the cloud service provider. A detailed comparison of the proposed solution with existing methods is provided. Furthermore, I will demonstrate how the existing work can be utilized in the proposed framework to support secure query processing. I will provide a detailed experimental analysis of the proposed framework on Amazon EC2 and discuss its practical relevance.
Date: 05-Sep-2016    Time: 15:00:00    Location: 336


Teams, Swarms, Crowds, and Collectives: Special Cases?

Gal Kaminka

Bar Ilan University

Abstract—Teams of agents and robots, swarms of robots or animals, crowds of people, and collectives (of everything) permeate our technological, biological, and sociological worlds. They have inspired generations of researchers in multi-agent and multi-robot systems. However, much of the research has split along technological and philosophical fault-lines: emergent or planned? communicating or just sensing? globally coordinated, or locally-reactive? rational or procedural? This talk will briefly explore three fronts, which bridge over such fault lines: cognitive psychology connecting with human crowds, reinforcement learning in swarms connecting with game theory, and Asimov's laws implemented in molecular robots (nanobots). These fronts hint at a broader and deeper science of social intelligence that is still waiting to be discovered.
Date: 05-Sep-2016    Time: 15:00:00    Location: IST Tagusparque, room 1.38


Dear router, let me make that decision for myself: extending DTN messages with routing and delivery software codes

Carlos Borrego Iglesias

Universitat Autònoma de Barcelona

Abstract—Delay and Disruption Tolerant Networking (DTN) is extremely useful when source and destination nodes are only intermittently connected. DTN implementations use application-specific routing algorithms to overcome these limitations; however, current implementations do not support the concurrent execution of several routing algorithms. Active DTN addresses this problem by introducing software code to improve DTN performance: the messages being communicated are extended with software code for forwarding, delivery, lifetime control and prioritisation purposes. Our proposal stems from the idea of moving the routing/delivery algorithms from the host to the message. This solution is compatible with the Bundle Protocol (BP) and facilitates the deployment of applications with new routing and delivery needs. Additionally, a secure geographic routing protocol that learns the mobility habits of the nodes of the network will be presented. This routing protocol uses information about the usual whereabouts of the nodes to make optimal routing decisions, and employs homomorphic cryptographic techniques from secure multi-party computation to preserve the nodes' privacy while making those decisions.
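The idea of moving the routing decision from the host into the message can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: each bundle carries a callable that picks the next hop from the current node's context.

```python
class Bundle:
    """A message that carries its own forwarding logic (active-DTN style)."""
    def __init__(self, payload, forward_code):
        self.payload = payload
        self.forward_code = forward_code  # decides next hop from node context

def geo_forward(ctx):
    """Hypothetical forwarding code carried by the bundle: pick the
    neighbour whose (1-D toy) position is closest to the destination."""
    return min(ctx["neighbors"],
               key=lambda n: abs(n["pos"] - ctx["dest_pos"]))["id"]

b = Bundle("hello", geo_forward)
ctx = {"dest_pos": 10,
       "neighbors": [{"id": "A", "pos": 3}, {"id": "B", "pos": 8}]}
# The message, not the router, chooses the next hop.
print(b.forward_code(ctx))
```

Because each bundle names its own policy, bundles using different routing algorithms can coexist on the same node, which is the concurrency limitation the talk addresses.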
Date: 02-Sep-2016    Time: 16:00:00    Location: 336


Cooperation with the National Institute of Informatics

Henri Angelino

National Institute of Informatics, Tokyo

Abstract—This visit is intended to present the National Institute of Informatics, Tokyo (NII) to the research and development institutions with which NII cooperates, through a general presentation of NII and its internship program. All researchers and students are welcome to attend this presentation.
Date: 06-Jul-2016    Time: 09:30:00    Location: 336


Stochastic Computing

Rui Duarte

Abstract—Stochastic computing has emerged as a computational paradigm that offers arithmetic operators with high-performance, compact implementations that are robust to errors, at the cost of producing approximate results. In stochastic computations, data is encoded as pseudo-random bitstreams of 0s and 1s, representing a number between 0 and 1 defined by the ratio of the number of 1s to the total number of bits. In this talk, Prof. Rui Duarte will introduce the basics of stochastic computing, along with its benefits, costs, and state of the art. Stochastic computing has so far been applied to a relatively limited set of applications, including DSP (Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters); neuromorphic and bio-inspired systems (binary synapses for low-power neuromorphic systems, and digital neurosynaptic networks for building brain-like computational structures); decoding of Low-Density Parity-Check (LDPC) codes; and Bayesian computing machines. Most of these applications are based on additions and multiplications, and are tolerant to some errors in their computations.
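As a minimal illustration of the encoding described above (a sketch, not material from the talk): in the unipolar representation, multiplying two numbers reduces to a bitwise AND of their bitstreams, since P(a AND b) = P(a)P(b) for independent streams.

```python
import random

def to_bitstream(p, n, rng):
    """Encode probability p as a pseudo-random bitstream of length n."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def to_value(bits):
    """Decode a bitstream: ratio of 1s over the total number of bits."""
    return sum(bits) / len(bits)

rng = random.Random(42)
n = 100_000
a = to_bitstream(0.5, n, rng)
b = to_bitstream(0.4, n, rng)
# Unipolar stochastic multiplication: a single AND gate per bit.
product = [x & y for x, y in zip(a, b)]
print(round(to_value(product), 2))
```

The result approximates 0.5 × 0.4, with accuracy improving as the streams get longer; this trade of precision for hardware simplicity is exactly the cost/benefit the talk discusses.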
Date: 05-Jul-2016    Time: 14:30:00    Location: 336


The Unicage method for (big) data processing and MONO-gramming for IoT - from Japan to the World.

Prof. Hiroyuki Ohno

Kanazawa University, Japan

Abstract—Unicage is an application development method born 23 years ago in Japan. Unicage builds on UNIX fundamentals such as shell scripts, commands, and text files, and on the combination of commands; it is easy for the staff of general companies to master, and it supports cost reduction, rapid deployment, and sustainable systems. The "MONO-gramming" approach ("thing"-programming in English) was conceived for developing IoT devices easily, with strong support from the UNIX operating system and shell scripts. With this approach, people who want to control IoT devices need not write code in programming languages like C, C++, Java, Ruby, or Python. The presentation will include a demonstration in which small devices connected to a small computer are controlled by the shell and shell scripts.
Date: 01-Jul-2016    Time: 11:00:00    Location: 336


Haplotype Assembly

Nadia Pisanti

Universitá di Pisa

Abstract—The human genome is diploid, which requires assigning heterozygous single nucleotide polymorphisms (SNPs) to the two copies of the genome. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for downstream analyses in population genetics. Currently, statistical approaches, which are oblivious to direct read information, constitute the state of the art. Haplotype assembly, which addresses phasing directly from sequencing reads, suffers from the fact that sequencing reads of the current generation are too short to serve the purposes of genome-wide phasing. While future-technology sequencing reads will contain sufficient amounts of SNPs per read for phasing, they are also likely to suffer from higher sequencing error rates. I will describe WhatsHap, the first approach that yields provably optimal solutions to the weighted minimum error correction problem in runtime linear in the number of SNPs. WhatsHap is a fixed-parameter tractable (FPT) approach with coverage as the parameter. We demonstrate that WhatsHap can handle datasets of coverage up to 15x, and that 15x is generally enough for reliably phasing long reads, even at significantly elevated sequencing error rates. I will then show some theoretical results on the optimization problem that lead to HapCol, a fixed-parameter algorithm and tool with the number of errors as the parameter. HapCol can handle higher coverage than WhatsHap while being more sensitive to the error rate.
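To make the minimum error correction (MEC) objective concrete, here is a naive brute-force sketch, an illustration only: it is exponential in the number of reads, whereas WhatsHap solves the weighted problem in time linear in the number of SNPs. Reads over {0,1,-} are bipartitioned into two haplotypes so as to minimize the number of corrected entries.

```python
from itertools import product

def mec_cost(reads, assignment):
    """Corrections needed when each read goes to the haplotype given by
    `assignment`. For each haplotype and SNP column, the consensus allele
    is the majority; minority entries must be corrected ('-' = no call)."""
    n_snps = len(reads[0])
    cost = 0
    for hap in (0, 1):
        group = [r for r, a in zip(reads, assignment) if a == hap]
        for j in range(n_snps):
            col = [r[j] for r in group if r[j] != '-']
            if col:
                cost += min(col.count('0'), col.count('1'))
    return cost

def min_mec(reads):
    """Brute force over all 2^(#reads) bipartitions (tiny inputs only)."""
    return min(mec_cost(reads, a) for a in product((0, 1), repeat=len(reads)))

# Four toy reads over 4 SNP positions; one entry must be corrected.
reads = ["0011", "0010", "1100", "110-"]
print(min_mec(reads))
```

The minimum here is achieved by grouping the first two reads against the last two; one base in the second read disagrees with its haplotype's consensus and must be corrected.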
Date: 29-Jun-2016    Time: 14:00:00    Location: 408


Popping Superbubbles and identifying clumps

Ritu Kundu

King's College London

Abstract—The information that can be inferred or predicted from knowing the genomic sequence of an organism is astonishing. String algorithms are critical to this process. This talk provides an overview of two particular problems - Superbubbles (an important subgraph class for analysing assembly graphs) and Clustered-clumps (a maximal overlapping set of occurrences of a given set of patterns) - that arise during computational molecular biology research, and recent algorithmic developments in solving them.
Date: 28-Jun-2016    Time: 14:30:00    Location: 408


Optimal Computation of Avoided Words

Solon Pissis

King's College London

Abstract—The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of w, denoted by std(w), effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length k > 2 is a ρ-avoided word in x if std(w) ≤ ρ, for a given threshold ρ < 0. Notice that such a word may be completely absent from x. Hence computing all such words naïvely can be a very time-consuming procedure, in particular for large k. In this talk, we propose an O(n)-time and O(n)-space algorithm to compute all ρ-avoided words of length k in a given sequence x of length n over a fixed-sized alphabet. We also present a time-optimal O(σn)-time algorithm to compute all ρ-avoided words (of any length) in a sequence of length n over an integer alphabet of size σ.
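A naive Python sketch of the deviation computation may help fix ideas. The expectation model below, E(w) = f(prefix) * f(suffix) / f(infix), is the maximal-order Markov estimate commonly used for this problem and is an assumption here; the talk's exact definition may differ, and this brute force is far from the O(n) algorithm presented.

```python
def count(x, w):
    """Number of (possibly overlapping) occurrences of w in x."""
    return sum(1 for i in range(len(x) - len(w) + 1) if x[i:i + len(w)] == w)

def std_dev(x, w):
    """Deviation of the observed count of w from its expected count,
    using the Markov estimate E(w) = f(prefix) * f(suffix) / f(infix),
    where prefix/suffix/infix drop the last/first/both end characters."""
    assert len(w) > 2
    f = count(x, w)
    prefix, suffix, infix = w[:-1], w[1:], w[1:-1]
    e = count(x, prefix) * count(x, suffix) / count(x, infix)
    return (f - e) / max(e ** 0.5, 1.0)

x = "AGCGCGCGT"
print(round(std_dev(x, "CGT"), 3))
```

A word with a strongly negative value occurs far less often than its sub-words predict; such ρ-avoided words may even be entirely absent from x while still being detectable through this statistic.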
Date: 28-Jun-2016    Time: 14:00:00    Location: 408


Efficient Power and Wideband Data Transmission in Near Field

Maysam Ghovanloo

Georgia Institute of Technology

Abstract—Wireless power transmission is on the rise for a variety of applications, from electric vehicles to smartphones and implantable microelectronic devices (IMDs). Unlike pacemakers, extreme size constraints and high power consumption prevent many IMDs, such as cochlear and retinal implants, from using primary batteries as their energy source. Moreover, such devices need to deliver a sizable volume of information from external artificial sensors to the nervous system while interfacing with large neural populations at high stimulus rates. Nonetheless, the skin barrier should remain intact and the temperature should be maintained well within safe limits. In this talk I will cover the fundamentals of efficient short-range power and wideband data transmission across inductive links. I will discuss the optimization procedure to achieve the highest possible power transmission efficiency using two-, three-, and four-coil systems. I will review some of the latest techniques to establish wideband bidirectional communication links across the skin, and will also touch on efficient methods to convert the received AC power on the IMD to DC and stabilize it at a desired level despite coupling variations due to coil misalignments.
Date: 27-Jun-2016    Time: 09:30:00    Location: IST, room EA3


In search of the role's footprints in client-therapist dialogues: Analyzing speech prosody from a sociolinguistic perspective

Dr. Silber-Varod

The Open University of Israel

Abstract—As human listeners, we bring many sources of information to the interpretation of a message, including syntax, semantics, our knowledge of the world, the conversational context, and prosody, the information gleaned from the timing and melody of speech. In the present lecture, I will focus on recent findings from a study of prosodic phenomena related to the fact that speakers interact in a conversation, thus prosody in interaction. In the current study, we looked for evidence of prosodic-acoustic discriminative role cues in therapeutic sessions. The goal of this research is to identify a speaker's role via machine learning of broad acoustic parameters, in order to understand how an occupation, or a role, affects voice characteristics. Results based on the acoustic properties show high classification rates, suggesting that there are discriminative acoustic features of a speaker's role as either a therapist or a client. Within the realm of this study, we compiled the first Hebrew Map-Task corpus, which consists of spoken task-oriented dialogues and which will be available for varied linguistic studies of speech in interaction, as has already been done in many other languages.
Date: 22-Jun-2016    Time: 16:00:00    Location: 020


Some recent work on Convolutional Neural Networks (CNNs) for text classification

Byron Wallace

University of Texas at Austin

Abstract—Text classification is a fundamental natural language processing (NLP) task. Modern neural models that exploit (usually pre-trained) word embeddings have recently achieved impressive results on such tasks. Feed-forward Convolutional Neural Networks (CNNs), in particular, have emerged as a relatively simple yet powerful class of models for text classification, often outperforming more complex recurrent neural models such as Long Short-Term Memory (LSTM) networks. In this talk, I will review CNN architectures appropriate for text and discuss model design and hyper-parameter trade-offs. I will then introduce new variants of CNNs, including an architecture that jointly exploits multiple sets of embeddings and a model that capitalizes on "rationale-level" supervision, i.e., labels on sentences concerning their relevance to the classification task at hand. Finally, I will present recent work on "active learning" approaches for CNNs that aim to rapidly induce discriminative embeddings with as few labels as possible. I will present results with respect to diverse text classification tasks, ranging from verbal irony detection to biomedical text classification.
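A dependency-free sketch of the core CNN-for-text operation (toy numbers and hypothetical filter weights, not from the talk): a filter of width 2 slides over the word-embedding sequence, a ReLU is applied, and max-over-time pooling reduces each filter to a single feature.

```python
def conv1d_maxpool(embeddings, filt, bias=0.0):
    """Slide a filter of width len(filt) over a sequence of word
    embeddings, apply a ReLU to each window activation, then take the
    max over all positions (max-over-time pooling)."""
    width = len(filt)
    activations = []
    for i in range(len(embeddings) - width + 1):
        s = bias
        for j in range(width):
            # Dot product of the j-th filter row with the (i+j)-th word.
            s += sum(e * w for e, w in zip(embeddings[i + j], filt[j]))
        activations.append(max(s, 0.0))  # ReLU
    return max(activations)

# Toy 2-d embeddings for a 4-word sentence (hypothetical values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter of width 2 over 2-d embeddings.
filt = [[0.5, -0.5], [0.5, 0.5]]
print(conv1d_maxpool(sentence, filt))
```

A real text CNN applies many such filters of several widths in parallel and feeds the pooled features to a classifier; this single-filter version shows why the model is both simple and position-invariant.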
Date: 22-Jun-2016    Time: 14:30:00    Location: 336


Design and Development of Microcontroller ZR16 - A Successful Case

João Baptista Martins

Universidade Federal de Santa Maria

Abstract—ZR16 is the first microcontroller with a 100% Brazilian design. This lecture will present the development of the microcontroller, its main characteristics, and its applications. Santa Maria Design House (SMDH), an integrated circuit design center located in Santa Maria, RS, Brazil, will also be presented.
Date: 22-Jun-2016    Time: 11:00:00    Location: 020


Estocada: Flexible Hybrid Stores

Ioana Manolescu-Goujot

INRIA, Saclay

Abstract—Data management goes through interesting times, as the number of currently available data management systems (DMSs in short) is probably higher than ever before. This leads to unique opportunities for data-intensive applications, as some systems provide excellent performance on certain data processing operations. Yet, it also raises great challenges, as a system efficient on some tasks may perform poorly on, or not support, other tasks, making it impossible to use a single DMS for a given application. It is thus desirable to use different DMSs side by side in order to take advantage of their best performance, as advocated under terms such as hybrid or poly-stores. This talk describes Estocada, a novel system capable of exploiting side by side a practically unbounded variety of DMSs, all the while guaranteeing the soundness and completeness of the store, and striving to extract the best performance out of the various DMSs. Our system leverages recent advances in the area of query rewriting under constraints, which we use to capture the various data models and describe the fragments each DMS stores.
Date: 17-Jun-2016    Time: 10:00:00    Location: 336


Taking AR to Task: Explaining Where and How in the Real World

Steve Feiner

Columbia University

Abstract—Researchers have been actively exploring Augmented Reality (AR) for half a century, first in the lab and later in the streets. What can AR make possible by interactively integrating virtual media with our experience of the physical world? I will try to answer this question in part by presenting some of the research being done by Columbia's Computer Graphics and User Interfaces Lab to explore how we can support users in performing skilled tasks. The examples I will discuss range from providing standalone assistance to enabling collaboration between a remote expert and a local user. I will address infrastructure spanning the gamut from lightweight, monoscopic eyewear to hybrid user interfaces that synergistically combine tracked, stereoscopic, see-through head-worn displays with tabletop and handheld displays.
Date: 27-May-2016    Time: 11:00:00    Location: 336


Maximizing Parallelism without Exploding Deadlines in a Mixed Criticality Embedded System

Prof. Gilles Muller


Abstract—Complex embedded systems today commonly involve a mix of real-time and best-effort applications. The recent emergence of small low-cost commodity UMA multicore processors raises the possibility of running both kinds of applications on a single machine, with virtualization ensuring that the best-effort applications cannot steal CPU cycles from the real-time applications. Nevertheless, memory contention can introduce other sources of delay that can lead to missed deadlines. In this work, we present a combined offline/online memory bandwidth monitoring approach. Our approach estimates and limits the impact of the memory contention incurred by the best-effort applications on the execution time of the real-time application. We show that our approach is compatible with the hardware counters provided by current small commodity multicore processors. Using our approach, the system designer can limit the overhead on the real-time application to under 5% of its expected execution time, while still enabling progress of the best-effort applications.
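The regulation idea can be sketched with a toy simulation (hypothetical numbers and policy, not the authors' mechanism): best-effort applications are granted a per-period memory-access budget, and demand beyond the budget is deferred to later periods rather than allowed to contend with the real-time application.

```python
def run_period(accesses, budget):
    """Hypothetical per-period regulation: best-effort cores may issue at
    most `budget` memory accesses; the surplus is deferred (returned as
    backlog) instead of contending with the real-time application."""
    used = min(accesses, budget)
    return used, accesses - used

# Toy per-period demand from the best-effort applications.
backlog = 0
served = []
for demand in [120, 30, 200, 10]:
    total = demand + backlog
    used, backlog = run_period(total, budget=100)
    served.append(used)
print(served, backlog)
```

Capping each period's accesses bounds the contention-induced delay seen by the real-time application (the under-5% overhead target), while the backlog mechanism lets the best-effort workload still make progress over time.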
Date: 18-May-2016    Time: 16:00:00    Location: 336


Audio research for movies

Nuno Fonseca

Abstract—Prof. Nuno Fonseca will present two research projects for the Hollywood audio industry: a text-to-sing solution for soundtrack composers, and an audio particle system for sound designers. WordBuilder is a text-to-sing solution integrated into the "EASTWEST Symphonic Choirs" sound library: the user writes the text, plays the melody, and hears that text and melody being sung with a symphonic choir sound; it is used by many composers around the world. Particle systems are a widely used tool in computer graphics to create elements like fire, smoke or rain, but they can also be used for audio, with each particle representing a sound source, allowing the simulation of epic battles with thousands of simultaneous sounds. His software, Sound Particles, is currently being tested in all major Hollywood studios and has been used on several movies, including "Batman v Superman".
Date: 03-May-2016    Time: 15:30:00    Location: IST, Anfiteatro Ea4


Middleboxes as a cloud service

Justine Sherry

University of California at Berkeley

Abstract—Today's networks do much more than merely deliver packets. Through the deployment of middleboxes, enterprise networks today provide improved security -- e.g., filtering malicious content -- and performance capabilities -- e.g., caching frequently accessed content. Although middleboxes are deployed widely in enterprises, they bring with them many challenges: they are complicated to manage, expensive, prone to failures, and challenge privacy expectations. In this talk, we aim to bring the benefits of cloud computing to networking. We argue that middlebox services can be outsourced to cloud providers in a similar fashion to how mail, compute, and storage are today outsourced. We begin by presenting APLOMB, a system that allows enterprises to outsource middlebox processing to a third party cloud or ISP. For enterprise networks, APLOMB can reduce costs, ease management, and provide resources for scalability and failover. For service providers, APLOMB offers new customers and business opportunities, but also presents new challenges. Middleboxes have tighter performance demands than existing cloud services, and hence supporting APLOMB requires redesigning software at the cloud. We re-consider classical cloud challenges including fault-tolerance and privacy, showing how to implement middlebox software solutions with throughput and latency 2-4 orders of magnitude more efficient than general-purpose cloud approaches. Some of the technologies discussed in this talk are presently being adopted by industrial systems used by cloud providers and ISPs.
Date: 28-Apr-2016    Time: 14:00:00    Location: 336


Generating Textual Summaries at Different Target Reading Levels: Summarizing Line Graphs for Visually Impaired Users

Kathy McCoy

University of Delaware

Abstract—People with visual impairments who are reading web documents in popular media often do not have access to the content of graphics contained in the documents. In this work, I describe a system that conveys the high-level information about a line graph (given a representation of the graphic assumed to be produced by a visual extraction module). The aim is to produce a summary that includes the information that a visual reader would get from a casual perusal of the graphic – its high-level message and salient visual features. The work involves solving natural language generation problems – choosing the content of the summary, structuring the text, and making aggregation and lexicalization decisions in order to produce English text. In doing this work, it was noticed that on-line articles are written to target readers at different reading levels – some of the text is rather simple (geared toward 4th grade readers, for example), while other text is quite sophisticated (geared toward college-level readers, for example). We also found that text that was geared toward a reading level that did not match the reader was difficult for that reader to understand. Thus, we attempt to produce a summary whose writing sophistication matches that of the article in which the graphic appears. This talk will introduce the overall project, with emphasis on how we tailor the generated text to different reading levels through the use of aggregation and grade-appropriate lexical items. Evaluation of our methodology will be included.
Date: 27-Apr-2016    Time: 11:00:00    Location: 020


Distributed Estimation and Prediction of Radiation for Solar Energy Plants (CODISOL)

Francisco R. Rubio

Universidad de Sevilla

Abstract—The purpose of this research project is the study and development of a pilot system for cloud detection and short-term prediction of solar radiation in solar fields. The system is intended to allow the detection of cloud distributions and predict the fraction of the solar field that will be affected by reduced radiation as a consequence of passing clouds. The cloud detector would eventually allow the control system to anticipate the effect of these disturbances, which have an impact on the optimization and efficiency of the operation. The prediction could additionally be used to manipulate the plant's fundamental variables, and allow proper foresight and planning of the use of auxiliary power sources to maintain the output power as close as possible to the intended daily generation reference. Predictive models will be developed based on meteorological measurements and images collected by a number of fisheye cameras deployed at the plant. Once the model is developed, the camera system will be used as a network of distributed sensors for predicting the solar radiation. The results and algorithms developed will be assessed in their application to solar installations, using a small-scale experimental setup available at the laboratories of the Department of Systems and Automation at the School of Engineering of Seville.
Date: 31-Mar-2016    Time: 15:00:00    Location: DEEC Meeting room, Campus Alameda, Torre Norte, room 5.9


Reconfigurable Computing: Architectures, Compilers, Applications

Markus Weinhardt

University of Karlsruhe

Abstract—In this talk, Prof. Markus Weinhardt will present the results of several research projects conducted at different institutions. In the first project (at KIT and Imperial College), the "Pipeline Vectorization" compilation method for FPGAs was developed. The second project (at PACT XPP Technologies) extended this method for Coarse-Grain Reconfigurable Architectures. The PACT XPP-III processor is also presented. The third project (at Osnabrueck UAS) deals with FPGA Acceleration for Streaming Applications.
Date: 31-Mar-2016    Time: 14:00:00    Location: 336


Harnessing Virtualization Technology for Intrusion Detection and Analysis

Hans P. Reiser

University of Passau

Abstract—Virtualization technology has been known for several decades and has become one of the core technologies of cloud infrastructures. Its main benefits include the ability to securely and efficiently share resources among multiple tenants running multiple operating systems, and to rapidly allocate, migrate and de-allocate virtual machines. Virtualization has also proven useful for building highly available, replicated systems. In this talk, we explore a different dimension of virtualization technology: its ability to support the detection and analysis of intrusions. In the Bavarian FORSEC project, we investigate new approaches for enhancing security in large-scale distributed systems. The CloudIDEA architecture (Cloud Intrusion DEtection and Analysis) extends a cloud management platform with the ability to continuously monitor virtual machines using low-impact introspection techniques, automatically react to suspicious behaviour with system reconfigurations, and analyze (potentially) malicious actions in detail with more heavy-weight introspection approaches. Core building blocks of this architecture are LibVMTrace, a virtual machine tracing library that builds upon LibVMI, and CloudPhylactor, a secure architecture that enables running introspection applications in isolated domains in cloud environments. In future work, we plan to extend our work regarding forensic data acquisition and processing, visualization, and reporting of IT-security incidents.
Date: 17-Mar-2016    Time: 12:00:00    Location: 020


System and Toolchain Support for Reliable Intermittent Computing

Brandon Lucia

Carnegie Mellon University

Abstract—Emerging energy-harvesting devices (EHDs) are computer systems that operate using energy extracted from their environment, even from low-power sources like ambient radio-frequency energy. Future EHDs will be a key enabler of emerging IoT applications, but today's EHDs operate intermittently, only as environmental energy is available. Unfortunately, intermittence makes today's EHDs unreliable and extremely difficult to program. In this talk I will summarize the main challenges of intermittent execution. I will then discuss our recent efforts to develop future architecture, system, and toolchain support for EHDs to address the challenges of intermittence, focusing especially on programmability, debugging, and reliability. I will close by discussing our recent work on building a reliable, EHD-based, hardware/software application platform.
Date: 10-Mar-2016    Time: 11:00:00    Location: 336


Chain-of-trust packet marking

Otávio Carpinteiro

Universidade Federal de Itajubá

Abstract—The talk will focus on a new deterministic IPv6 traceback method - chain-of-trust packet marking (CTPM) - which has been developed by the Research Group on Systems and Computer Engineering (GPESC) at the Federal University of Itajuba (UNIFEI), Brazil. CTPM establishes a chain of trust composed of the border routers of autonomous systems (ASes). It makes use of a new IPv6 extension header - the traceback extension header (TEH) - which extends the datagram size by at most 168 bytes. The TEH contains encrypted marks which trace the path taken by each IPv6 datagram from its origin to its destination. A few preliminary results will be presented during the talk.
Date: 26-Feb-2016    Time: 11:00:00    Location: 336


NetKAT: A Formal System For The Verification Of Networks

Alexandra Silva

University College London

Abstract—NetKAT is a relatively new programming language and logic for reasoning about packet switching networks that fits well with the popular software defined networking (SDN) paradigm. NetKAT was introduced quite recently by Anderson et al. (POPL 2014) and further developed by Foster et al. (POPL 2015). The system provides general-purpose programming constructs such as parallel and sequential composition, conditional tests, and iteration, as well as special-purpose primitives for querying and modifying packet headers and encoding network topologies. The language allows the desired behavior of a network to be specified equationally. It has a formal categorical semantics and a deductive system that is sound and complete over that semantics, as well as an efficient decision procedure for the automatic verification of equationally-defined properties of networks.
Date: 29-Jan-2016    Time: 11:00:00    Location: 336


Gamification in your Classroom: Avoiding stale water in fancy juice boxes

Lennart Nacke

University of Waterloo

Abstract—Gamification has recently become strongly hyped as a method for teaching and learning content and materials. Leaderboards, badges and point systems have seen a strong renaissance thanks to educational books like "The Multiplayer Classroom". However, as gamification reaches its zenith of popularity, it is time to think about the limited applicability of some concepts that have gained fame, and to really look at what makes students want to learn content with games. In a critical tour de force of his own endeavours in gamifying education, Dr. Lennart Nacke will reflect on things that worked and that did not work when he tried to turn his classroom into a game. The audience will leave this talk with a good understanding of some of the pitfalls of classroom gamification and why some of it is dangerous and exciting at the same time.
Date: 11-Jan-2016    Time: 10:30:00    Location: 336


The promise and peril of anthropomorphizing agents

ICT- University of Southern California (USC)

Abstract—Advances in autonomy raise the potential for rich partnerships between humans and machines. Recent scholarship has explored the potential of incorporating human-like traits into robot and computer teammates as a means to enhance team effectiveness. However, whereas a growing body of research illustrates that machines can be made more human-like, less research has considered how this benefits or harms human-machine team performance. Indeed, I will illustrate several examples where human-like qualities actually undermine team performance. More fundamentally, attempts to merely replicate human characteristics overlook an opportunity to improve on human-human interaction: perhaps machines can be designed to interact in different but complementary ways that draw on those social mechanisms that benefit team outcomes while avoiding those that detract from this goal. In this talk, I will present several projects examining the difference between behavior towards human-like and non-human machines and discuss a preliminary theoretical framework for guiding the design of effective, rather than anthropomorphic, human-agent interactions.
Date: 22-Dec-2015    Time: 14:30:00    Location: IST Taguspark - 0.65


Advanced Brain Imaging Techniques Symposium

several speakers

Abstract—Special workshop under the research project Project HiFI - MRI on Advanced Brain Imaging Techniques, taking place at IST. Among other well known researchers, Professor Lawrence Wald will be with us and will give a talk on "New Directions for MRI Hardware and Acquisition". <br/> <table> <tr><td></td><td><h2 style='margin: 30px 0px 10px'>Program</h2></td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>9h30</b></td><td><b>New Directions for MRI Hardware and Acquisition, Lawrence Wald</b></td></tr> <tr><td></td><td>Massachusetts General Hospital A. Martinos Center, Harvard Medical School, Harvard-MIT HST</td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>10h30</b></td><td><b>Brain Microstructure at Ultra-High Fields, Noam Shemesh</b></td></tr> <tr><td></td><td>Champalimaud Neuroscience Program, Champalimaud Foundation</td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>10h55</b></td><td><b>Enhancing Simultaneous EEG-fMRI in humans: from 3T to 7T, Patrícia Figueiredo</b></td></tr> <tr><td></td><td>Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa</td></tr> <tr><td></td><td></td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>11h20</b></td><td><b>Coffee Break</b></td></tr> <tr><td></td><td></td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>11h35</b></td><td><b>Echo-planar Imaging of Human Brain Function, Physiology and Structure at 7 Tesla: Challenges and Opportunities, Marta Bianciardi</b></td></tr> <tr><td></td><td>Massachusetts General Hospital, A. 
Martinos Center for Biomedical Imaging</td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>12h00</b></td><td><b>Reinforcement learning, habits, and tics, Tiago Maia</b></td></tr> <tr><td></td><td>School of Medicine, Universidade de Lisboa</td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>12h25</b></td><td><b>Modern Optimization in Imaging: Some Recent Highlights, Mário Figueiredo</b></td></tr> <tr><td></td><td>Telecommunications Institute, Instituto Superior Técnico, Universidade de Lisboa</td></tr> <tr><td style='text-align: right; padding-right: 10px'><b>12h50</b></td><td><b>Closing Remarks</b></td></tr> </table>
Date: 03-Dec-2015    Time: 09:00:00    Location: IST, anfiteatro Abreu Faro


Operation and Design of VHF Self-Oscillating DC-DC Converter with Integrated Transformer

Igor M. Filanovsky

University of Alberta

Abstract—Increasing the operating frequency of a DC-DC converter is a direct way of reducing the size of energy storage elements such as bulky capacitors and inductors, which usually dominate the overall converter size. Challenges arise when the converter operating frequency is increased into the Very High Frequency (VHF) range from 30 MHz to 300 MHz, where the conventional topologies become impractical. In the literature one can find converter circuits suitable for operation in this frequency range. Some of them, for the lower end of the band, were breadboard prototyped; yet their fully integrated realization is, to the best of our knowledge, not known. <p>VHF converters may be loosely characterized as circuits with self-oscillating resonant gate drivers. This driver is realized using an oscillating circuit separate from the load. In our case the load and feedback circuits are combined using one integrated transformer. Usually an integrated coil is realized using the top metal layer, which has the lowest resistivity. <p>The layers under this coil can be used to create a transformer secondary without any additional silicon area. The transformer parameters and layout were carefully investigated, and it turns out that the high resistance of the secondary is only beneficial in our case: the secondary operates as an open circuit, it does not load the primary, the description of the circuit operation is simplified, and the oscillation frequency can be evaluated. <p>In the proposed converter the primary represents a “necessary” passive connection providing the path from the power supply to the capacitive load. The core of the system is the feedback loop, which, besides the secondary, includes a duty cycle detector and a pulse-shaping circuit. The output signals of the pulse-shaping circuit are two in-phase rectangular pulses driving the gates of the power transistors. As usual in feedback systems, it is the feedback that determines parameters such as system stability. For this reason we provide a full calculation of the signals in the duty cycle detector. We describe the full circuit of the proposed converter and its operation principle. Then a detailed analysis of the signals in the duty cycle detector is given. Recommendations for a smooth start-up of the converter power transient are provided. We then describe the circuit layout. Finally, we discuss the obtained results and outline directions for further investigation.
Date: 23-Nov-2015    Time: 14:00:00    Location: 336


Ramon Llull: From the Ars Magna to Agreement Computing

Carles Sierra


Abstract—Philosopher Ramon Llull (1232-1316) proposed the Ars Generalis, an argumentative method to persuade non-Christians of the truth of the Christian faith. Although this effort was obviously futile, Ramon Llull made a seminal contribution to one of the most interesting research topics in multiagent systems: Agreement Computing. He proposed a basic alphabet (later extended by Leibniz, who used numbers) that, by means of combinations, would construct a coherent vision of the world with which everybody would need to agree. In this talk, I will describe Llull’s contributions to Logic, Argumentation and Social Choice, and, time permitting, some of my current work in the area of Agreement Computing.
Date: 05-Nov-2015    Time: 15:00:00    Location: IST Taguspark, Porto Salvo, room 0.65


Predictive and Scalable Macromolecular Modeling

Chandrajit Bajaj

University of Texas at Austin

Abstract—Most biomolecular complexes involve three or more molecules, forming macromolecules consisting of thousands to a million atoms. We consider fast molecular modeling algorithms and data structures to support automated prediction of biomolecular structure assemblies, formulating it as the approximate solution of a non-convex geometric optimization problem. The conformations of the macromolecules with respect to each other are optimized with respect to a hierarchical interface matching score based on molecular energetic potentials (Lennard-Jones, Coulombic, generalized Born, Poisson-Boltzmann). The assembly prediction decision procedure involves both search and scoring over very high dimensional spaces (O(6^n) for n rigid molecules), and moreover is provably NP-hard. To make things even more complicated, predicting biomolecular complexes requires the search optimization to include molecular flexibility and induced conformational changes as the assembly interfaces complementarily align. I shall also briefly present fast computation methods which run on commodity multicore CPUs and manycore GPUs. The key idea is to trade off accuracy of pairwise, long-range atomistic energetics for a higher speed of execution. Our CUDA kernel for GPU acceleration uses a cache-friendly, recursive and linear-space octree data structure to handle very large molecular structures with up to several million atoms. Based on this CUDA kernel, we utilize a hybrid method which simultaneously exploits both CPU and GPU cores to provide the best performance based on selected parameters of the approximation scheme.
Date: 11-Sep-2015    Time: 11:00:00    Location: 02.2 Centro de Congressos IST


Lattice-based crypto: parallelization of sieving algorithms on multicore CPUs

Artur Mariano

Universität Darmstadt

Abstract—Quantum computers pose a serious threat to cryptoschemes, since classic schemes like RSA or Diffie-Hellman can be broken in the presence of quantum computers. Lattice-based cryptography stands out as one of the most prominent types of quantum-immune cryptography. The main task taken on by cryptographers at this point in time is the assessment of potential attacks against lattice-based schemes, and the development of schemes which manage to thwart the attacks that are known up until now. In this talk, I will present lattice-based cryptography from a cryptanalysis (aka attack) standpoint. To this end, I will explain what lattices are, which lattice problems are interesting for cryptography, and which algorithms are usually used to address these problems. I will then select specific algorithms for the SVP, a particularly relevant problem, and explain in detail how they work and how they can be implemented and parallelized efficiently on shared-memory CPU systems. This is achieved with lock-free data structures that scale linearly with the number of used cores, and HPC techniques such as data prefetching and memory pools.
Date: 23-Jul-2015    Time: 17:30:00    Location: 020


Temporal Information Retrieval – Understanding Time Sensitive Queries

Ricardo Campos

Instituto Politécnico de Tomar

Abstract—The amount of information that is produced every day is growing exponentially. New pages are added, deleted or simply updated at an incredible pace. With this steadily increasing growth of the web, a huge amount of temporal information has become widely available. This information can be very useful in helping to meet users’ information needs whenever they include temporal intents. However, retrieving the information that meets the user’s temporal query demands is still an open challenge. The ambiguity of the query is traditionally one of the causes impeding the retrieval of relevant information. This is particularly evident in the case of temporal queries, where users tend to be subjective when expressing their intents (e.g., “avatar movie” instead of “avatar movie 2009”). Determining the possible times of the query is therefore of the utmost importance when attempting to achieve more effective results. In this talk, we will describe how to use the information extracted from web page contents to reach this goal. The fact that temporal expressions relevant to the query can be automatically determined paves the way for a number of temporal IR applications, which we will survey in this talk. We will specifically focus on GTE-Cluster and GTE-Rank, two temporal applications that stem from this research.
Date: 20-Jul-2015    Time: 14:30:00    Location: 336


Distributed Route Aggregation on the Global Network (DRAGON)

João Luís Sobrinho

Instituto de Telecomunicações (IT)

Abstract—The Internet routing system faces serious scalability challenges due to the growing number of IP prefixes that need to be propagated throughout the network. Although IP prefixes are assigned hierarchically and roughly align with geographic regions, today's Border Gateway Protocol (BGP) and operational practices do not exploit opportunities to aggregate routing information. In the talk, I will present DRAGON, a distributed route-aggregation technique whereby nodes analyze BGP routes across different prefixes to determine which of them can be filtered while respecting the routing policies for forwarding data-packets. DRAGON works with BGP, can be deployed incrementally, and offers incentives for Autonomous Systems (ASs) to upgrade their router software. I will illustrate the design of DRAGON through a number of examples and I will present results of its performance. Experiments with realistic AS-level topologies, assignments of IP prefixes, and routing policies show that DRAGON reduces the number of prefixes in the forwarding-tables of each AS by close to 80% with minimal stretch in the lengths of AS-paths traversed by data-packets.
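The core filtering intuition can be caricatured in a few lines of Python: a more-specific prefix whose route leads to the same place as a covering (less specific) prefix carries no extra forwarding information and can be dropped. This toy rule uses next hops as a stand-in; DRAGON's actual correctness condition is stated over routing policies and is considerably more subtle.

```python
import ipaddress

def filterable_prefixes(routes):
    """Toy aggregation rule: a prefix is filterable when some covering
    prefix exists with the same next hop, so forwarding is unchanged.
    `routes` maps ip_network objects to next-hop labels. Illustration only,
    not DRAGON's algorithm.
    """
    filterable = set()
    for p, nh in routes.items():
        for q, qnh in routes.items():
            if p != q and p.subnet_of(q) and nh == qnh:
                filterable.add(p)
    return filterable

routes = {
    ipaddress.ip_network("10.0.0.0/8"): "A",
    ipaddress.ip_network("10.1.0.0/16"): "A",  # covered by 10/8, same next hop
    ipaddress.ip_network("10.2.0.0/16"): "B",  # next hop differs: must keep
}
assert filterable_prefixes(routes) == {ipaddress.ip_network("10.1.0.0/16")}
```

The contribution of the talk is doing this kind of analysis distributedly, under real BGP policies, while provably preserving where packets end up.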
Date: 30-Jun-2015    Time: 10:00:00    Location: 336


Optimized Algorithms and Dedicated Hardware for Video Coding

Marcelo Porto@Universidade Federal de Pelotas (UFPel), Brasil, Luciano Agostini@Universidade Federal de Pelotas (UFPel), Brasil

Abstract—This talk presents the main research topics in video coding under development at the Group of Architectures and Integrated Circuits (GACI) of the Universidade Federal de Pelotas (UFPel), Brazil. Two works are presented and discussed in greater detail. The first addresses the reduction of the memory bandwidth required for video coding through reference frame compression. The developed solution, called Double Differential Reference Frame Compressor (DRFC), reduces the bandwidth required for communication with external memory by about 69%, yielding a reduction of about 65% in the energy consumed by memory accesses. The second work presents a scheme for complexity reduction in 3D video coding under the 3D-HEVC standard. The scheme is based on two algorithms, Simplified Edge Detection (SED) and Gradient Based Mode One Filter (GMOF), which target complexity reduction in the intra prediction of depth maps; the proposed scheme reduces processing time by 7% to 35%, with minimal losses in coding efficiency.
Date: 04-Jun-2015    Time: 16:00:00    Location: 336


Trends in Software Automatic Tests


Abstract—Automated functional tests allow a significant reduction in the validation effort every time the application is updated, and are thus suitable for the implementation of regression testing. Additionally, the accuracy and the coverage rate of this type of testing are much higher, because it allows for the execution, over a short period of time, of hundreds or thousands of test cases. Finally, another important aspect is that it makes it easier to obtain metrics and management information. Besides automated functional testing, Wintrust also specializes in executing performance tests (volume, stress and load testing) in pre-production environments, as well as in measuring response times in production environments.
Date: 03-Jun-2015    Time: 10:00:00    Location: IST, anfiteatro GA1


Hardware Security: Challenges, Solutions and Opportunities

Chip Hong Chang

Nanyang Technological University

Abstract—The geographical dispersion of chip design activities, coupled with the heavy reliance on third-party hardware intellectual properties (IPs), has led to the infiltration of counterfeit and malicious chips into the integrated circuit (IC) design and fabrication flow. Counterfeit chips (such as unauthorized copies, remarked/recycled dice, overproduced and subverted chips, or cloned designs) pose a major threat to all stakeholders in the IC supply chain, from designers, manufacturers and system integrators to end users. The consequences they cause to electronic equipment and critical infrastructure can be disastrous, yet identifying compromised ICs is extremely difficult. New attack scenarios could put the integrated electronics ecosystem in dire peril if nothing is done to avert these hardware security threats. This talk provides an overview of our research effort in hardware security. Constraint-based watermarking and fingerprinting are first introduced as a detection approach to hardware IP copyright protection, which can be augmented by an identity-based signature scheme to enable multiple IP cores marked by different authors in a single chip to be publicly authenticable in the field by the end users. As reusable IPs sold in the form of FPGA configuration bitstreams are vulnerable to cloning, misappropriation, reverse engineering and hardware Trojan (HT) attacks, a pay-per-use licensing scheme is proposed to assure the secure installation of FPGA IP cores onto contracted devices agreed upon by the IP provider and IP buyer. A side-channel analysis method for HT detection and an active current sensing circuit for fast screening of HT-infected chips will also be presented. The last part of this talk will introduce disorder-based methods to avoid the long-term presence of keys in vulnerable hardware. 
These methods enable random, unique and physically unclonable device fingerprints to be generated on demand for authentication and other cryptographic applications. The high-quality physical unclonable functions (PUFs) we proposed include the robust RO-PUF for resource-constrained platforms, CMOS image sensor based PUF for sensor-level authentication and the PUFs based on emerging non-volatile memory technologies. Finally, some on-going and future research topics addressing the challenges and opportunities in hardware security will be outlined.
Date: 27-May-2015    Time: 11:00:00    Location: Electrical and Computer Engineering Depart. Meeting Room, IST


Computing in Space with OpenSPL

Georgi Gaydadjiev

Chalmers University

Abstract—For a long time, all atomic arithmetic and storage structures of computing systems were designed as two-dimensional (2D) structures on silicon. Processor vendors now offer chips with steadily growing numbers of cores, and recent circuits have started to grow in the third dimension by integrating silicon dies on top of each other. All of this results in a severe increase in programming complexity. To date, a predominantly one-dimensional view of computing systems' organization and behavior has been used, forming a severe obstacle to exploiting all the associated advantages. To exploit them, a more natural, at least 2D, view of computer systems is required, one that represents the physical reality more closely in both space and time. This calls for radically novel approaches to programming and system design. Computing in space allows designers to express complex mathematical operations in a more natural, space- and area-aware way and map them onto the underlying hardware resources. OpenSPL is one such approach; it can be used to partition, lay out and optimize programs at all levels, from high-level algorithmic transformations down to individual custom bit manipulations. In addition, the OpenSPL execution model enables highly efficient scheduling (or, better, choreography) of all basic computational actions with the guarantee of no side effects. It is clear that this approach requires a new generation of design tools and methods and a novel way to measure (or rate) performance as compared to traditional practices. In this talk we will address the topics relevant to spatial computing and show its enormous capabilities for designing power-efficient computing systems. Examples and results based on real systems deployed by Maxeler Technologies will emphasize the advantages of this approach, but will also stress the difficulties along the road ahead.
Date: 01-Apr-2015    Time: 14:00:00    Location: 336


The University of the Future: Taking Knowledge Out of Campus, A Personal Tale

Nivio Ziviani

Universidade Federal de Minas Gerais

Abstract—An important way of wealth generation is the creation of knowledge intensive startups from research results. The objective of this talk is to present the experience of startup creation in the Department of Computer Science of the Universidade Federal de Minas Gerais in Brazil. We will discuss three examples: (i) Miner Technology Group, sold to the group Folha de São Paulo/UOL in June 1999, which is one of the first experiences in spinning off a Web startup company from research conducted at a Brazilian university; (ii) Akwan Information Technologies, which became a successful startup and a reference for web search in Brazil and was acquired by Google Inc. in July 2005—an acquisition from which Google bootstrapped its R&D Center for Latin America, located in Belo Horizonte; and (iii) Zunnit Technologies — a new startup company focused on the convergence of Deep Learning and Big Data aiming to improve knowledge of user behavior, management of multimedia assets, sales leads, or recommendation of items of interest for web users.
Date: 27-Mar-2015    Time: 11:30:00    Location: IST Room QA 1.1 (South Tower)


Lasp: a language for eventually consistent distributed programming with CRDTs

Peter Van Roy@Université catholique de Louvain, Christopher Meiklejohn@Basho Technologies

Abstract—We propose Lasp, a new programming language designed to simplify large-scale fault-tolerant distributed programming. Lasp is being developed in the SyncFree European project. It leverages ideas from distributed dataflow extended with convergent replicated data types (CRDTs). This supports computations where not all participants are online together at a given moment. The initial design supports synchronization-free programming by combining CRDTs with primitives for composing them inspired by functional programming. This lets us write long-lived fault-tolerant distributed applications, including ones with nonmonotonic behavior, in a functional paradigm. The initial prototype is implemented as an Erlang library built on top of the riak-core distributed systems infrastructure, which is based on a ring with consistent hashing. We show how to implement one nontrivial large-scale application, the ad counter scenario from SyncFree. Future extensions of Lasp will focus on efficiency, practicality, and extensions to add synchronization where needed, such as explicit causality and mergeable transactions.
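A minimal example of a CRDT of the kind Lasp builds on is the grow-only counter (G-Counter): each replica increments only its own slot, and merging is an element-wise maximum, so replicas converge regardless of message order. The Python sketch below is for illustration only; Lasp itself is an Erlang library.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge = element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> that replica's local count

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative and idempotent, so replicas converge
        # no matter in which order (or how often) states are exchanged.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

a, b = GCounter("a"), GCounter("b")
a.increment(3)   # replica a counts 3 events while partitioned
b.increment(2)   # replica b counts 2 events concurrently
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5   # both replicas converge
```

The SyncFree ad-counter scenario mentioned in the abstract is essentially this structure at scale: impressions counted locally, merged lazily, with no coordination on the write path.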
Date: 13-Mar-2015    Time: 10:00:00    Location: 020
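To make the convergence idea behind CRDTs concrete, the following is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. It is illustrative only and does not use Lasp's actual API: each replica increments only its own slot, and merging takes the per-slot maximum, so concurrent updates converge regardless of delivery order.

```python
class GCounter:
    """Grow-only counter CRDT: one increment slot per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> local increment count

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Join of two states: pointwise maximum.
        # Commutative, associative, and idempotent, hence order-insensitive.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

# Two replicas increment concurrently, then exchange state in both directions.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5
```

This is essentially the structure underlying the ad counter scenario mentioned in the abstract: ad impressions can be counted at disconnected replicas and reconciled later without coordination.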


Rethinking reverse converter design: From algorithms to hardware components

Amir Sabbagh Molahosseini

Islamic Azad University

Abstract—In this talk, a practical methodology for achieving RNS reverse converters with the desired characteristics, based on the target application's constraints, is introduced. This procedure can also be applied to other particularly difficult RNS operations, such as scaling, sign detection, and magnitude comparison, to enhance them, resulting in practical and efficient RNS. The presented area-delay-power-aware adder placement procedure is broken down into four phases, namely: i) moduli set and reverse converter architecture selection, ii) placement using theoretical analysis, iii) implementation, and iv) placement using experimental results. Furthermore, key notes about the hardware design of reverse converters during these phases are presented. The effectiveness of the proposed placement procedure is shown using distinct converters and implementation technologies.
Date: 04-Mar-2015    Time: 16:00:00    Location: 336
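For readers unfamiliar with the term, a sketch of what a reverse converter computes may help: it maps an RNS representation (residues modulo a set of pairwise-coprime moduli) back to binary. Hardware designs avoid this direct Chinese Remainder Theorem computation, but CRT defines the function they implement; the moduli set below is the popular {2^n - 1, 2^n, 2^n + 1} set for n = 3 (this example is the editor's, not from the talk).

```python
from math import prod  # Python 3.8+

def binary_to_rns(x, moduli):
    """Forward conversion: a number becomes its residues."""
    return [x % m for m in moduli]

def rns_to_binary(residues, moduli):
    """Reverse conversion via the Chinese Remainder Theorem."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi mod m (Python 3.8+).
        x += r * Mi * pow(Mi, -1, m)
    return x % M

moduli = [7, 8, 9]  # {2^3 - 1, 2^3, 2^3 + 1}, pairwise coprime
assert binary_to_rns(100, moduli) == [2, 4, 1]
assert rns_to_binary([2, 4, 1], moduli) == 100
```

The moduli must be pairwise coprime for the reconstruction to be unique over the dynamic range M = 7 * 8 * 9 = 504.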


Non-cooperative and Deceptive Dialogue

David Traum

ICT, University of Southern California (USC)

Abstract—Cooperation is usually seen as a central concept in the pragmatics of dialogue. There are a number of accounts of dialogue performance and interpretation that require some notion of cooperation or collaboration as part of the explanatory mechanism of communication (e.g., Grice's maxims, the interpretation of indirect speech acts). Most advanced computational work on dialogue systems has also generally assumed cooperativity, with recognizing and conforming to the user's intentions as central to the success of the dialogue system. In this talk I will review some recent work on modeling non-cooperative dialogue, and the creation of virtual humans who engage in non-cooperative and deceptive dialogue. These include tactical questioning role-playing agents, who have conditions under which they will reveal truthful or misleading information, and negotiating agents, whose goals may be at odds with those of a human dialogue participant, who calculate utilities for different dialogue strategies, and who can keep secrets, using plan-based inference to avoid giving clues that would reveal the secret.
Date: 20-Feb-2015    Time: 14:00:00    Location: room 0.65 @INESC-ID Taguspark


Collaborative Mobile Charging and Coverage in Wireless Sensor Networks

Jie Wu

Temple University

Abstract—The limited battery capacity of sensor nodes has become the biggest impediment to wireless sensor network (WSN) applications over the years. Recent breakthroughs in wireless energy transfer, based on rechargeable lithium batteries, provide a promising application of mobile vehicles. These mobile vehicles act as mobile chargers to transfer energy wirelessly to static sensors in an efficient way. In this talk, we discuss some of our recent results on several charging and coverage problems involving multiple mobile chargers. In collaborative mobile charging, a fixed charging location, called a base station (BS), provides a source of energy to mobile chargers, which in turn are allowed to recharge each other while collaboratively charging static sensors. The objective is to ensure sensor coverage while maximizing the ratio of the amount of payload energy (used to charge sensors) to overhead energy (used to move mobile chargers from one location to another). This is done such that none of the sensors will run out of batteries. Here, sensor coverage spans both dimensions of time and space. We first consider the uniform case, where all sensors consume energy at the same rate, and propose an optimal scheduling scheme that can cover a one-dimensional (1-D) WSN with infinite length. Then, we present several greedy scheduling solutions to 1-D WSNs with non-uniform sensors and 2-D WSNs, both of which are NP-hard. Finally, we study another variation, in which all mobile chargers have batteries of unlimited capacity without resorting to a BS for recharging. The objective is then to deploy and schedule a minimum number of mobile chargers that can cover all sensors. Again, we provide an optimal solution to this problem in a 1-D WSN with uniform sensors and several greedy solutions with competitive approximation ratios to the problem setting of 1-D WSNs with non-uniform sensors and 2-D WSNs, respectively.
Date: 12-Feb-2015    Time: 10:00:00    Location: 336


Reading News with Maps by Exploiting Spatial Synonyms


University of Maryland

Abstract—NewsStand is an example application of a general framework to enable people to search for information using a map query interface, where the information results from monitoring the output of over 10,000 RSS news sources and is available for retrieval within minutes of publication. The issues that arise in the design of a system like NewsStand, including the identification of words that correspond to geographic locations, are discussed, and examples are provided of its utility. More details can be found in the video accompanying the cover article about NewsStand in the October 2014 issue of the Communications of the ACM.
Date: 08-Jan-2015    Time: 11:00:00    Location: DEI / Alameda / Informática II


Frequent Sequence Mining in MapReduce*

Klaus Berberich

Max Planck Institute for Informatics

Abstract—Frequent sequence mining is a fundamental building block in data mining. While the problem has been intensively studied, existing methods cannot handle datasets consisting of billions of sequences. Datasets of that scale are common in applications such as natural language processing, when computing n-gram statistics over large-scale document collections, and business intelligence, when analyzing sessions of millions of users. In this talk, I will present two methods that we developed recently to mine frequent sequences using MapReduce as a platform for distributed data processing. Suffix-Sigma, the first method, targets the special case of contiguous sequences such as n-grams. It relies on sorting and aggregating sequence suffixes, leveraging ideas from string processing. MG-FSM, the second method, also identifies non-contiguous frequent sequences. To this end, it partitions and prepares the input in such a way that frequent sequences can be efficiently mined in isolation on each of the resulting partitions using any existing method. Experiments on two large-scale document collections demonstrate that Suffix-Sigma and MG-FSM are substantially more efficient and scalable than alternative approaches. Furthermore, I will discuss extensions of Suffix-Sigma and MG-FSM, for instance, to report only closed or maximal sequences and thus drastically reduce their output. (* Joint INESC-ID/LASIGE Seminar) Klaus Berberich is a Senior Researcher at the Max Planck Institute for Informatics, where he coordinates the research area Text + Time Search & Analytics. His research is rooted in Information Retrieval and touches the related areas of Data Management and Data Mining. Klaus has built a time machine -- to search in web archives. More recently, he has worked on frequent sequence mining algorithms for modern platforms such as MapReduce.
His ongoing research focuses on (i) novelty & diversity in web archive search; (ii) temporal linking of document collections; (iii) mining document collections for insights about the past, present, and future. Klaus holds a doctoral degree (2010, summa cum laude) and a diploma (2004) in Computer Science from Saarland University. He has served on numerous program committees in his research communities of interest (IR, DB, DM).
Date: 04-Dec-2014    Time: 15:30:00    Location: 020
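As context for the contiguous-sequence special case, here is a toy sketch (the editor's, not Suffix-Sigma itself) of counting n-grams in the map/shuffle/reduce style on which both presented methods build: the mapper emits (n-gram, 1) pairs, and the reducer aggregates counts and applies the minimum-support threshold.

```python
from collections import defaultdict

def mapper(doc, n):
    """Emit (n-gram, 1) for every contiguous n-word window in the document."""
    words = doc.split()
    for i in range(len(words) - n + 1):
        yield tuple(words[i:i + n]), 1

def reducer(pairs, min_support):
    """Aggregate counts per n-gram and keep only the frequent ones."""
    counts = defaultdict(int)
    for ngram, c in pairs:
        counts[ngram] += c
    return {g: c for g, c in counts.items() if c >= min_support}

docs = ["a b a b c", "a b c a b"]
pairs = [p for d in docs for p in mapper(d, 2)]
frequent = reducer(pairs, min_support=2)
assert frequent == {("a", "b"): 4, ("b", "c"): 2}
```

In a real MapReduce job the shuffle phase groups the pairs by key across machines; the single-process list comprehension above stands in for that step.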


Inside Information - From Martian Meteorites to Mummies

Anders Ynnerman

Eurographics - The European Association for Computer Graphics

Abstract—In recent decades imaging modalities have advanced beyond recognition, and data of rapidly increasing size and quality can be captured at high speed. This talk will show how data visualization can be used to provide public visitor venues, such as museums, science centers and zoos, with unique interactive learning experiences. By combining data visualization techniques with technologies such as interactive multi-touch tables and intuitive user interfaces, visitors can conduct guided browsing of large volumetric image data. The visitors then themselves become the explorers of the normally invisible interior of unique artifacts and subjects. The talk will take its starting point in the current state of the art in CT and MRI scanning technology. It will then discuss the latest high-quality interactive volume rendering and multi-resolution techniques for large-scale data and how they are tailored for use in public spaces. Examples will then be shown of how the inside workings of the human body, exotic animals, and natural history subjects, such as a Martian meteorite or even mummies, can be explored interactively. The recent mummy installation at the British Museum will be shown and discussed from both a curator and a visitor perspective, and results from a three-month trial period in the galleries will be presented.
Date: 01-Dec-2014    Time: 14:00:00    Location: 336


The Effect of Streaming Time-shifted TV on TV Consumption

Pedro Ferreira

Carnegie Mellon University

Abstract—How does the introduction of time-shifted TV (TSTV) change the total value of advertising captured by networks? How does the introduction of TSTV change viewership patterns across viewers and across programs? We analyze the effects of the introduction of TSTV on a large cable operator serving more than 1.5 million users that introduced TSTV in 2012. We use click-stream data from TV remote controls in 2012 and 2013 to analyze short- and long-term effects of TSTV. We use fixed effects and differences-in-differences with propensity score matching to obtain our results. We find that the introduction of TSTV does not increase TV viewership. In the short term, TSTV viewership accounts for 6% of total TV viewership. In the long run, TSTV viewership accounts for 9% of total viewership. We also find that the concentration of TV consumption across programs increases: the most popular programs are watched disproportionately more in time-shift. Interestingly, the most popular programs are also the ones that lose the most in live viewership. This decrease in total live viewership for the most popular programs decreases the total value networks can appropriate from advertising. Pedro Ferreira is an assistant professor of Information Systems and Management at the Heinz College and at the Department of Engineering and Public Policy, Carnegie Mellon University (CMU). He received a Ph.D. in Telecommunications Policy from CMU and an M.Sc. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. Pedro's research interests lie in two major domains: identifying causal effects in dense network settings, with direct application to understanding the future of the digital media industry, and the evolving role of technology in the economics of education. Currently, he is working on a series of large-scale randomized experiments in network settings aimed at identifying the role of peer influence in the consumption of media.
Pedro has published in top journals and top peer-reviewed research conferences such as Management Science, Management Information Systems Quarterly, and the IEEE Conference on Social Computing.
Date: 27-Nov-2014    Time: 14:30:00    Location: 020


Using GPU coprocessor to accelerate computations in 3D MPDATA algorithm

Krzysztof Rojek

Czestochowa University of Technology

Abstract—This talk will address an efficient and portable adaptation of the stencil-based 3D MPDATA algorithm to a GPU cluster. We propose a performance model which allows for the efficient distribution of computation across GPU resources. Since MPDATA is strongly memory-bound, the main challenge in providing a high-performance implementation is to reduce GPU global memory transactions. To this end, our performance model ensures a comprehensive analysis of transactions based on local memory utilization, the sizes of halo areas (ghost zones), and data dependencies between and within stencils. The results of the analysis performed using the proposed model are the number of GPU kernels, the distribution of stencils across kernels, and the sizes of CUDA blocks for each kernel.
Date: 27-Nov-2014    Time: 10:00:00    Location: 336
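To illustrate why halo (ghost zone) sizes matter for memory traffic, here is a back-of-the-envelope model (the editor's hedged sketch, not the model from the talk): a stencil of radius r computed on a tile of bx * by * bz cells must also load a surrounding halo, so the ratio of loaded to useful cells quantifies the redundant global-memory traffic, which larger tiles amortize.

```python
def halo_overhead(bx, by, bz, r):
    """Ratio of cells loaded (tile + halo) to cells actually computed."""
    useful = bx * by * bz
    loaded = (bx + 2 * r) * (by + 2 * r) * (bz + 2 * r)
    return loaded / useful

# Larger tiles amortize the halo: less redundant traffic per useful cell.
assert halo_overhead(32, 32, 32, 1) < halo_overhead(8, 8, 8, 1)
```

For an 8-cubed tile with radius 1, the overhead is 10^3 / 8^3, i.e. almost twice as many cells loaded as computed, which is one reason kernel fusion and block-size selection are treated as optimization variables.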


Signature-Free Asynchronous Byzantine Consensus with t < n/3 and O(n²) Messages*

Michel Raynal

Institut Universitaire de France

Abstract—This talk presents a new round-based asynchronous consensus algorithm that copes with up to t < n/3 Byzantine processes, where n is the total number of processes. In addition to being signature-free and optimal with respect to the value of t, this algorithm has several noteworthy properties: the expected number of rounds to decide is four, each round is composed of two or three communication steps and involves O(n²) messages, and a message is composed of a round number plus a single bit. To attain this goal, the consensus algorithm relies on a common coin as defined by Rabin, and a new, extremely simple and powerful broadcast abstraction suited to binary values. The main target when designing this algorithm was to obtain a cheap and simple algorithm. This was motivated by the fact that, among first-class properties, simplicity --albeit sometimes under-estimated or even ignored-- is a major one. *(This is joint work with Achour Mostéfaoui and Hamouma Moumen.)
Date: 14-Nov-2014    Time: 16:00:00    Location: 020
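The following is a heavily simplified sketch of the common-coin idea attributed to Rabin, written by the editor to give intuition only: it ignores Byzantine faults and message exchange entirely, and merely illustrates why a shared random coin lets disagreeing binary estimates converge in an expected constant number of rounds. It is not the algorithm presented in the talk.

```python
import random

def run_round(estimates, coin):
    """One toy round: keep unanimous estimates, otherwise adopt the shared coin."""
    if len(set(estimates)) == 1:
        return estimates, True               # already unanimous: decide
    return [coin] * len(estimates), False    # disagreement: everyone adopts the coin

random.seed(1)
estimates = [0, 1, 1, 0]
decided, rounds = False, 0
while not decided:
    estimates, decided = run_round(estimates, random.randint(0, 1))
    rounds += 1

assert decided and len(set(estimates)) == 1
```

In the real algorithm, adversarial processes and asynchrony mean the coin must match the surviving honest estimates by chance, which is what yields an expected (rather than worst-case) round bound; the toy above collapses that subtlety into a single adoption step.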


Eventual Leader Election in Evolving Mobile Networks

Fabíola Greve

Universidade Federal da Bahia (UFBA)

Abstract—Many reliable distributed services rely on eventual leader election to coordinate actions. The eventual leader detector has been proposed as a way to implement such an abstraction. It ensures that each process in the system will eventually be provided with a unique leader, elected among the set of correct processes, in spite of crashes and uncertainties. A number of eventual leader election protocols have been suggested. Nonetheless, as far as we are aware, none of these protocols tolerates a free pattern of node mobility. This talk presents a new protocol for this scenario of dynamic and mobile unknown networks. Fabíola Greve is an Associate Professor at the Universidade Federal da Bahia, working on distributed and dependable computing.
Date: 30-Oct-2014    Time: 11:00:00    Location: 020


Data integration tools for pre-processing biological data

Valéria Magalhães Pequeno

INESC-ID Lisboa and IST

Abstract—The increasing use of Electronic Health Records (EHRs) enables a better analysis of patient data, improving the quality of medical care. EHRs must be processed in order to provide a variety of services to the physician, such as risk classification and summarization. EHRs are usually stored in unstructured text or Excel files containing different data formats and types, missing information, and, sometimes, inconsistent information. Therefore, before analyzing the data, we often need to transform and integrate it. In this presentation, we show some examples of data integration tools that can be used to extract and transform data. As an example, we use an Excel file containing exam information regarding patients with ALS (Amyotrophic Lateral Sclerosis).
Date: 26-Jun-2014    Time: 14:30:00    Location: 336


The Biodegradation and Surfactants Database

Jorge dos Santos Oliveira

INESC-ID Lisboa and IST

Abstract—The Biodegradation and Surfactants Database (BioSurfDB) is a curated relational information system currently integrating 14 metagenomes, 137 organisms, 73 biodegradation-relevant genes, 62 proteins and 6 of their metabolic pathways; 29 documented bioremediation experiments, with specific pollutant treatment efficiencies achieved by surfactant-producing organisms; and a curated list of 46 biosurfactants, grouped by producing organism, surfactant name, class, and reference. Our goal is to gather published and novel information on the identification and characterization of genes involved in oil biodegradation and the bioremediation of polluted environments, and to provide it in a curated way together with a series of computational tools to aid biology studies.
Date: 12-Jun-2014    Time: 14:30:00    Location: 336


Data integration tools for pre-processing biological data - CANCELED

Valéria Magalhães Pequeno


Abstract—The increasing use of Electronic Health Records (EHRs) enables a better analysis of patient data, improving the quality of medical care. EHRs must be processed in order to provide a variety of services to the physician, such as risk classification and summarization. EHRs are usually stored in unstructured text or Excel files containing different data formats and types, missing information, and, sometimes, inconsistent information. Therefore, before analyzing the data, we often need to transform and integrate it. In this presentation, we show some examples of data integration tools that can be used to extract and transform data. As an example, we use an Excel file containing exam information regarding patients with ALS (Amyotrophic Lateral Sclerosis).
Date: 29-May-2014    Time: 14:30:00    Location: 336


A Deep Neural Network Approach to Speech Enhancement

Chin-Hui Lee

Georgia Institute of Technology

Abstract—In contrast to conventional minimum mean square error (MMSE) based noise reduction techniques, we formulate speech enhancement as finding a mapping function between noisy and clean speech signals. In order to handle a wide range of additive noises in real-world situations, a large training set, encompassing many possible combinations of speech and noise types, is first designed. Next, a deep neural network (DNN) architecture is employed as a nonlinear regression function to ensure a powerful modeling capability. Several techniques have also been adopted to improve the DNN-based speech enhancement system, including global variance equalization to alleviate the over-smoothing problem of the regression model, and dropout and noise-aware training strategies to further improve the generalization capability of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework can achieve significant improvements in both objective and subjective measures over MMSE-based techniques. It is also interesting to observe that the proposed DNN approach can well suppress highly non-stationary noise, which is tough to handle in general. Furthermore, the resulting DNN model, trained with artificially synthesized data, is also effective in dealing with noisy speech data recorded in real-world scenarios, without generating the annoying musical artifacts commonly observed in conventional enhancement methods. [Bio] Chin-Hui Lee is a professor in the School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he had 20 years of industrial experience, ending at Bell Laboratories, Murray Hill, New Jersey, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of ISCA.
He has published over 400 papers and 30 patents, and was highly cited for his original contributions with an amazing h-index of 66. He received numerous awards, including the Bell Labs President's Gold Award in 1998. He won the SPS's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition". In 2012 he was invited by ICASSP to give a plenary talk on the future of speech recognition. In the same year he was awarded the ISCA Medal in scientific achievement for “pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition”.
Date: 26-May-2014    Time: 15:30:00    Location: QA1.2 (IST Alameda)
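The core formulation, a network used as a nonlinear regression from noisy to clean speech features, can be sketched as a plain feed-forward pass. The tiny network below is the editor's illustration: the weights are hand-picked, not trained, and the two-element vectors stand in for what would be high-dimensional log-spectral features in the actual system.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v, b):
    """Affine layer: W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def enhance(noisy, W1, b1, W2, b2):
    """Regression from a noisy feature vector to a clean-feature estimate."""
    hidden = relu(matvec(W1, noisy, b1))  # nonlinear hidden layer
    return matvec(W2, hidden, b2)         # linear output layer

# Toy 2-input / 3-hidden / 2-output network with illustrative weights.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b1 = [0.0, 0.0, -0.5]
W2 = [[0.5, 0.0, 0.2], [0.0, 0.5, 0.2]]
b2 = [0.0, 0.0]

out = enhance([1.0, 2.0], W1, b1, W2, b2)
assert len(out) == 2
```

Training would fit the weights by minimizing the error between predicted and clean features over the large noisy/clean corpus described in the abstract; the refinements mentioned (global variance equalization, dropout, noise-aware training) all modify that training stage, not this forward pass.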


Advanced techniques for integrated DC/DC converters

Marcelino Bicho dos Santos, Pedro Alou Cervera

Universidad Politécnica de Madrid

Abstract—Professor Pedro Alou Cervera will present his research activity in the following areas: the PowerSoC project, a European project to fully integrate a DC/DC converter, and advanced techniques to optimize the dynamic response of DC/DC converters. Pedro Alou (M'07) was born in Madrid, Spain, in 1970. He received the M.S. and Ph.D. degrees in Electrical Engineering from the Universidad Politécnica de Madrid (UPM), Spain, in 1995 and 2004, respectively. He has been a Professor at this university since 1997. He has been involved in Power Electronics since 1995, participating in more than 40 R&D projects with industry. He has authored or coauthored over 100 technical papers and holds three patents. His main research interests are in power supply systems, advanced topologies for efficient energy conversion, modeling of power converters, advanced control techniques for high dynamic response, energy management, and new semiconductor technologies for power electronics. His research activity is distributed among industrial, aerospace, and military projects.
Date: 23-May-2014    Time: 09:00:00    Location: 336


FaRM: Fast Remote Memory

Aleksandar Dragojevic

Microsoft Research

Abstract—I will talk about the design and implementation of FaRM, a new main-memory distributed computing platform that exploits RDMA communication to improve both latency and throughput by an order of magnitude relative to state-of-the-art main-memory systems that use TCP/IP. FaRM exposes the memory of machines in the cluster as a shared address space. Applications can allocate, read, write, and free objects in the address space. They can use distributed transactions to simplify dealing with complex corner cases that do not significantly impact performance. FaRM provides good common-case performance with lock-free reads over RDMA, and with support for collocating objects and function shipping to enable the use of efficient single-machine transactions. FaRM uses RDMA both to directly access data in the shared address space and for fast messaging, and is carefully tuned for the best RDMA performance. We used FaRM to build a key-value store and a graph store similar to Facebook's. They both perform well; for example, a 20-machine cluster can perform 160 million key-value lookups per second with a latency of 31 microseconds.
Date: 09-May-2014    Time: 14:00:00    Location: 336
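The lock-free reads mentioned above rest on an optimistic versioning idea that can be sketched in a few lines. The stand-in below is hypothetical and in-process, not FaRM's wire protocol: each object carries a version counter, a reader snapshots the version, copies the data, and retries if a concurrent writer changed the version in between (an odd version marks a write in progress).

```python
class VersionedObject:
    """Optimistic-read object: even version = stable, odd = write in progress."""

    def __init__(self, data):
        self.version = 0
        self.data = data

    def write(self, data):
        self.version += 1   # odd: signal readers that a write is under way
        self.data = data
        self.version += 1   # even: stable again

    def read(self):
        while True:
            v = self.version
            if v % 2:                  # writer active: retry
                continue
            snapshot = self.data
            if self.version == v:      # version unchanged: snapshot is consistent
                return snapshot

obj = VersionedObject({"k": 1})
obj.write({"k": 2})
assert obj.read() == {"k": 2}
```

The appeal for RDMA is that a remote reader can perform the whole read loop with one-sided memory reads, never involving the CPU of the machine that owns the object.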


Integrative biomarker discovery in neurodegenerative diseases: a survey

André Carreiro

INESC-ID Lisboa and IST

Abstract—Data mining has been widely applied in biomarker discovery, resulting in significant findings of different clinical and biological biomarkers. With developments in technology, from genomics to proteomics analysis, a deluge of data has become available, as well as standardized data repositories. Nonetheless, researchers are still facing important challenges in analyzing the data, especially when considering the complexity of the pathways involved in biological processes or diseases. Data from single sources seem unable to explain complex processes, such as the ones involved in brain-related disorders, thus raising the need for a more comprehensive perspective. A possible solution relies on data and model integration, where several data types are combined to provide complementary views, which in turn can result in the discovery of previously unknown biomarkers, by unravelling otherwise hidden relationships between data from different sources. In this work, we review the different single-source types of data used for biomarker discovery in neurodegenerative diseases, and then proceed to provide an overview of recent efforts to perform integrative analysis in these disorders, discussing major challenges and advantages.
Date: 24-Apr-2014    Time: 14:30:00    Location: 336


Fixed-parameter tractable reductions to SAT

Ronald de Haan

TU Wien

Abstract—Modern propositional satisfiability (SAT) solvers perform extremely well in many practical settings and can be used as an efficient back-end for solving NP-complete problems. However, many fundamental problems in Knowledge Representation and Reasoning are located at the second level of the Polynomial Hierarchy (PH) or even higher, and hence for these problems polynomial-time transformations to SAT are not possible, unless the PH collapses. Recent research shows that in certain cases one can break through these complexity barriers by fixed-parameter tractable (fpt) reductions, which exploit structural aspects of problem instances in terms of problem parameters. We develop a general theoretical framework that supports the classification of parameterized problems according to whether or not they admit such an fpt-reduction to SAT. We ground this framework in its application to concrete reasoning problems from various domains. We develop several parameterized complexity classes to provide evidence that in certain cases such fpt-reductions to SAT are not possible. Moreover, we relate these new classes to existing parameterized complexity classes. Additionally, for problems for which there exists a Turing fpt-reduction to SAT, we develop techniques to provide lower bounds on the number of calls to a SAT solver needed to solve these problems.
Date: 02-Apr-2014    Time: 14:00:00    Location: 336


Challenges for embedded systems development: can we have it all?

Luigi Carro

Universidade Federal do Rio Grande do Sul

Abstract—In this talk we discuss the current design challenges for embedded systems, which face pressure from the market, technology, and software development. After discussing the context, we introduce some first steps toward achieving software productivity with high reliability and low energy dissipation. We present RA3, the Resilient Adaptive Algebraic Architecture, which is capable of adapting parallelism exploitation in a time-deterministic fashion to reduce power consumption, while meeting existing real-time deadlines. Furthermore, the architecture provides low-overhead error correction capabilities, through the use of algebraic properties of the operations it performs. We use two real-time industrial case studies to validate the architecture and to show how the adaptive exploitation works. Finally, we present the results of fault-injection campaigns to show the architecture's resilience against soft errors.
Date: 02-Apr-2014    Time: 10:00:00    Location: IST, VA1 (Pavilhão de Civil)


Automatic Detection and Correction of Web Application Vulnerabilities using Data Mining to Predict False Positives

Ibéria Medeiros

Faculdade de Ciências de Universidade de Lisboa

Abstract—Web application security is an important problem on today's internet. A major cause of this status is that many programmers do not have adequate knowledge about secure coding, so they leave applications with vulnerabilities. An approach to solving this problem is to use source code static analysis to find these bugs, but such tools are known to report many false positives, which makes the task of correcting the application hard. This work explores the use of a hybrid of methods to detect vulnerabilities with fewer false positives. After an initial step that uses taint analysis to flag candidate vulnerabilities, our approach uses data mining to predict the existence of false positives. This approach reaches a trade-off between two apparently opposite approaches: humans coding the knowledge about vulnerabilities (for taint analysis) versus automatically obtaining that knowledge (with machine learning, for data mining). Given this more precise form of detection, we perform automatic code correction by inserting fixes in the source code. The approach was implemented in the WAP tool, and an experimental evaluation was performed with a large set of open-source PHP applications. The talk will be a dry run of a paper presentation to be given at the International World Wide Web Conference - WWW 2014.
Date: 21-Mar-2014    Time: 16:00:00    Location: 020
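The taint-analysis step can be illustrated with a toy (the editor's sketch, not the WAP tool): values coming from untrusted sources are marked tainted, taint propagates through string operations, and any tainted value reaching a sensitive sink (here, a SQL query) is flagged as a candidate vulnerability; the `sanitize` helper is hypothetical.

```python
class Tainted(str):
    """A string value that originated from untrusted input."""

def source(value):
    return Tainted(value)                  # entry point: mark as tainted

def concat(a, b):
    out = a + b                            # plain concatenation drops the subclass,
    if isinstance(a, Tainted) or isinstance(b, Tainted):
        return Tainted(out)                # so the taint mark is re-applied here
    return out

def sanitize(value):
    # A sanitizer's output is treated as trusted (naive SQL-quote escaping).
    return str(value).replace("'", "''")

def flags_vulnerability(query):
    return isinstance(query, Tainted)      # tainted data reached the sink

user = source("' OR 1=1 --")
q1 = concat("SELECT * FROM t WHERE id=", user)
q2 = concat("SELECT * FROM t WHERE id=", sanitize(user))
assert flags_vulnerability(q1) and not flags_vulnerability(q2)
```

In the approach described above, flags such as the one raised for q1 are only candidates; the data-mining stage then predicts which of them are false positives before fixes are inserted.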


Extracting academic data and linked data anonymization

Pedro Rijo

INESC-ID Lisboa and IST

Abstract—Data is becoming more valuable each day as more diverse and rich data sources become available, allowing us to discover knowledge in unprecedented ways. IST uses the FénixEdu information system for managing most of its internal data. The system contains data about students, teachers, employees, courses, and all major aspects of IST as an organization. Such data may be useful both for external agents and, more importantly, for IST itself to study our academic environment. The data may be used as input to state-of-the-art IR and KD technologies to extract newer and deeper knowledge about academic agents, allowing us to solve problems in, and better understand, our community. Releasing this kind of data publicly requires an additional step concerning the privacy of the individuals referred to, and, as has been shown, simple de-identification may not be enough to achieve this goal. On the other hand, we must deal with both internal and external data, on top of an evolving environment, where linked-data-based approaches can definitely help us deal with such complexity. In this talk we will discuss a solution for exposing, sharing, and connecting data, information, and knowledge available in the IST information system, taking into consideration privacy and anonymity issues.
Date: 20-Mar-2014    Time: 14:30:00    Location: 336


Aspects of Geospatial Search and Analysis

Dirk Ahlers

European Research Consortium for Informatics and Mathematics

Abstract—Geography helps us to understand and map the world and to navigate in it. Geospatial data and location references help us to understand spatial relations and characteristics of diverse documents. The talk will discuss some aspects of the development of geospatial search engines such as crawling, geoparsing/extraction, geocoding, and analysis. It will further showcase experiences in developing geospatial search in various settings and countries. An emphasis will be put on recent work in gazetteer analysis, looking into quality indicators for the GeoNames dataset, which is widely used for geoparsing and geocoding. Open questions and possible future work will hopefully start a motivating discussion.
Date: 11-Mar-2014    Time: 14:30:00    Location: 336


Network mining based analysis of whole brain functional connectivity

André Chambel

Departamento de Engenharia Informática

Abstract—Mapping the human brain has been a topic of interest for the last few decades. In spite of its incredible complexity, it is now possible to map the brain using a combination of advanced data representations and data processing algorithms, supported by the huge computational power that is available nowadays. In this work we describe an approach for mapping whole-brain functional connectivity. The starting point of our work is a set of high-resolution functional magnetic resonance images (fMRI) obtained with a 7T magnetic field that cover a wider brain volume than usual. The fMRIs are then used to build the so-called brain functional connectivity network. These networks extracted from the brain can be represented as graphs, i.e., a set of nodes (regions) and a set of edges connecting those nodes. With the networks represented as graphs, we apply network mining techniques to them, namely clustering and modularity algorithms that allow us, for instance, to identify functional modules of the brain. Presumably, the increased resolution will make it possible to obtain more detailed information and has the potential to uncover additional structure. Due to the size of the graphs, all the algorithms must be optimized in order to minimize resource usage.
Date: 06-Mar-2014    Time: 14:30:00    Location: 336
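The modularity algorithms mentioned above optimize a quality measure that is simple to state; as a hedged illustration (the editor's, unrelated to the talk's actual pipeline), the sketch below computes Newman's modularity Q for a node partition of a small undirected graph given as an edge list, using the standard formula Q = (1/2m) * sum over same-community pairs of (A_ij - k_i k_j / 2m).

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity of a partition; O(n^2 * |E|), fine for toy graphs."""
    m = len(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    q = 0.0
    nodes = list(degree)
    for i in nodes:
        for j in nodes:
            if community[i] != community[j]:
                continue
            a_ij = sum(1 for e in edges if e in ((i, j), (j, i)))
            q += a_ij - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)

# Two triangles joined by one bridge edge: an obvious two-module structure.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}   # the two triangles
bad = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}    # an arbitrary split
assert modularity(edges, good) > modularity(edges, bad)
```

Partitions that score high on Q correspond to the "functional modules" language in the abstract: groups of regions more densely connected internally than a random graph with the same degrees would predict.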


Computational prediction of microRNA targets in plant genomes

Manuel Reis

Departamento de Engenharia Informática

Abstract—MicroRNAs (miRNAs) are important post-transcriptional regulators and act by recognizing and binding to sites in their target messenger RNAs (mRNAs). They are present in nearly all eukaryotes, in particular in plants, where they play important roles in developmental and stress response processes by targeting mRNAs for cleavage or translational repression. MiRNAs have been shown to have a crucial role in gene expression regulation, but so far only a few miRNA targets in plants have been experimentally validated. Based on the number of identified genes, on the number of experimentally validated miRNAs, and on the fact that one miRNA often regulates multiple genes, a long list of yet unidentified targets is to be expected. Here, we present a novel miRNA target prediction method for plants that incorporates an evolutionary approach. With this approach, we intend to understand whether a transcript shows evidence of a sequence bias towards either eliciting or avoiding target sites for a particular miRNA.
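Plant miRNA target sites are typically near-perfectly complementary to the miRNA, so a first-pass scan can be sketched as follows. This is a hypothetical illustration (sequences and the mismatch threshold are made up), not the evolutionary method of the talk:

```python
# Translate table for RNA complementation.
COMP = str.maketrans("ACGU", "UGCA")

def revcomp(rna: str) -> str:
    """Reverse complement: the sequence a target site should show."""
    return rna.translate(COMP)[::-1]

def candidate_sites(mirna: str, mrna: str, max_mismatches: int = 2):
    """Return (position, mismatches) for near-complementary windows."""
    site = revcomp(mirna)
    n = len(site)
    hits = []
    for i in range(len(mrna) - n + 1):
        mm = sum(1 for a, b in zip(site, mrna[i:i + n]) if a != b)
        if mm <= max_mismatches:
            hits.append((i, mm))
    return hits

mirna = "UGGAGCUCC"
mrna = "AAAGGAGCUCCAAA"
print(candidate_sites(mirna, mrna))   # one perfect site at position 3
```

A genome-wide bias analysis would then compare the observed number of such sites per transcript against an evolutionary null model.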
Date: 20-Feb-2014    Time: 14:30:00    Location: 336


Topology-aware placement and load-balancing

Emmanuel Jeannot


Abstract—The current generation of clusters of NUMA nodes features multicore and manycore processors. Programming such architectures efficiently is a challenge, because numerous hardware characteristics have to be taken into account, especially the memory hierarchy. One appealing idea for improving the performance of parallel applications is to decrease their communication costs by matching the communication pattern to the underlying hardware architecture. In this talk we detail the algorithm and techniques proposed to achieve such a result. First, we gather both the communication pattern information and the hardware details. Then we compute a relevant reordering of the various process ranks of the application. Finally, those new ranks are used to reduce the communication costs of the application. We also developed two load balancers for Charm++ that take topology and communication aspects into account, depending on whether the application is compute-bound or communication-bound. We show that the proposed load-balancing scheme manages to improve the execution times for the two classes of parallel applications.
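The matching idea can be illustrated on a tiny instance. This sketch (matrices and cost model are made up; real tools use heuristics rather than brute force) evaluates process-to-core mappings against a communication matrix and a hardware distance matrix:

```python
from itertools import permutations

# comm[i][j]: traffic between ranks i and j (symmetric, illustrative).
comm = [[0, 9, 1, 1],
        [9, 0, 1, 1],
        [1, 1, 0, 9],
        [1, 1, 9, 0]]

# dist[a][b]: hop cost between cores a and b (two sockets of two cores).
dist = [[0, 1, 4, 4],
        [1, 0, 4, 4],
        [4, 4, 0, 1],
        [4, 4, 1, 0]]

def cost(mapping):
    # mapping[i] is the core hosting rank i; sum traffic * distance.
    return sum(comm[i][j] * dist[mapping[i]][mapping[j]]
               for i in range(4) for j in range(i + 1, 4))

# Brute force works only for tiny n; it places chatty pairs together.
best = min(permutations(range(4)), key=cost)
print(best, cost(best))
```

The best mapping co-locates the heavily communicating rank pairs (0,1) and (2,3) on the same socket.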
Date: 12-Feb-2014    Time: 14:30:00    Location: 336


Design and Implementation of a Domain Specific Language for Next Generation Sequence Analysis

Paulo Monteiro

Departamento de Engenharia Informática

Abstract—Next Generation Sequencing (NGS) is a set of molecular biology technologies which generate, at low cost, many millions of short nucleotide reads. Typical datasets consist of tens of millions of reads, with each read comprising 35-500 basepairs (depending on the technology used, different read sizes can be obtained). There are many tools for handling these datasets. However, they must still be combined to build a full analysis pipeline. Current solutions for building these pipelines are Make-like tools which can handle text files and Unix-like commands. Several GUI-based solutions allow users who are not comfortable with the command line to build and run these pipelines. However, they still operate at the semantic level of Make: file dependencies and transformation commands. Because each problem and each variation on the technology requires a different processing pipeline, it would be impossible to design a single pipeline for every need. This work describes a context-aware tool to support the first phase of NGS analysis.
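The "Make semantics" the abstract refers to (declare dependencies, run steps in order) can be sketched in a few lines. This is a hypothetical mini-DSL, not the tool the talk describes; the step names and toy data are illustrative:

```python
steps = {}

def step(name, deps=()):
    """Decorator registering a pipeline step and its dependencies."""
    def register(fn):
        steps[name] = (tuple(deps), fn)
        return fn
    return register

@step("fastq")
def load():
    return ["ACGT", "ACGG"]          # stand-in for reading a FASTQ file

@step("filtered", deps=("fastq",))
def quality_filter(reads):
    return [r for r in reads if "N" not in r]

@step("aligned", deps=("filtered",))
def align(reads):
    return {r: 0 for r in reads}     # dummy alignment positions

def run(target, cache=None):
    """Resolve dependencies recursively, computing each step once."""
    cache = {} if cache is None else cache
    if target not in cache:
        deps, fn = steps[target]
        cache[target] = fn(*(run(d, cache) for d in deps))
    return cache[target]

print(run("aligned"))
```

A domain-specific language raises this to NGS-level semantics (reads, alignments) instead of files and shell commands.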
Date: 06-Feb-2014    Time: 14:30:00    Location: 336


Self-Stabilizing Leader Election in Population Protocols

Janna Burman

University Paris-South 11 and LRI - Laboratoire de Recherche en Informatique, Orsay, France

Abstract—We consider the fundamental problem of self-stabilizing leader election (SSLE) in the model of population protocols. In this model, an unknown number of asynchronous, anonymous and finite-state mobile agents interact in pairs over a given communication graph. SSLE has been shown to be impossible in the original model. This impossibility can be circumvented by a modular technique augmenting the system with an oracle - an external module abstracting the added assumption about the system. Fischer and Jiang have proposed solutions to SSLE, for complete communication graphs and rings, using an oracle Ω?, called the eventual leader detector. In this work, we present a solution for arbitrary graphs, using a composition of two copies of Ω?. We also prove that the difficulty comes from the requirement of self-stabilization, by giving a solution without an oracle for arbitrary graphs when a uniform initialization is allowed.
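For intuition about the model, here is a simulation of the classic (uniformly initialized, *not* self-stabilizing) leader election population protocol on a complete interaction graph: every agent starts as a leader, and when two leaders interact one demotes itself, so exactly one survives. This is textbook background, not the oracle-based construction of the talk:

```python
import random

random.seed(0)
n = 8
leader = [True] * n          # uniform initialization: all are leaders

interactions = 0
while sum(leader) > 1:
    # The scheduler picks an ordered pair of distinct agents.
    a, b = random.sample(range(n), 2)
    if leader[a] and leader[b]:
        leader[b] = False    # two leaders meet: the responder demotes
    interactions += 1

print(sum(leader), "leader after", interactions, "interactions")
```

Self-stabilization is what breaks this scheme: started from a configuration with zero leaders, no rule can ever create one, which is why an oracle such as Ω? is needed.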
Date: 28-Jan-2014    Time: 14:00:00    Location: 336


The Organization of the Retina and Visual System

Prof Eduardo Fernandez

Instituto de Bioingenieria, Facultad de Medicina, Universidad Miguel Hernandez

Abstract—Understanding the organization of the vertebrate retina has been an important research topic in recent years. Anatomic descriptions of the cell types that constitute the retina, and the understanding of the role of those cells in combination with psychophysical studies, have contributed to understanding how the retina might be organized and how it functions. In this talk, Prof. Eduardo Fernandez will present the most recent advances in understanding the visual system, and their applications for people with impairments, namely the development of BCIs and humanoid robots.
Date: 20-Dec-2013    Time: 14:30:00    Location: 220


Application of RNS to Cryptography

Prof Jean-Claude Bajard

Univ. Paris VI (Pierre et Marie Currie)

Abstract—Residue Number Systems (RNS) are an effective number representation with several advantages over weighted number systems, namely for DSP and cryptography. This talk will present the latest research results on the application of RNS to public-key cryptography, namely to the computation of the Montgomery exponentiation.
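A minimal sketch of the RNS idea (moduli chosen for illustration; real cryptographic implementations use much larger, machine-word-sized moduli): a number is stored as residues modulo pairwise-coprime moduli, arithmetic acts independently per channel, and the Chinese Remainder Theorem converts back:

```python
from math import prod

MODULI = (7, 11, 13)          # pairwise coprime; dynamic range M = 1001

def to_rns(x):
    return tuple(x % m for m in MODULI)

def rns_mul(a, b):
    # Each residue channel multiplies independently (carry-free).
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))

def from_rns(residues):
    # CRT reconstruction back to the weighted representation.
    M = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # modular inverse of Mi mod m
    return x % M

a, b = to_rns(23), to_rns(40)
print(from_rns(rns_mul(a, b)))   # → 920, since 23 * 40 = 920 < 1001
```

The channel independence is what makes RNS attractive for parallel hardware implementations of modular exponentiation.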
Date: 18-Dec-2013    Time: 17:00:00    Location: 220


Everything you always wanted to know about worst-case (but were afraid to ask) ...

Helmut Graeb

Technische Universitaet Muenchen

Abstract—Process corners, corner cases, worst-case parameter sets, ...; there are a lot of myths about certain parameter sets that are supposed to capture some measure of the variability of a circuit manufactured in a semiconductor technology. But what are these corners really? How are they determined? How should the results of a worst-case simulation be interpreted? And how can I get an estimate of the yield, more specifically, the parametric yield? These are questions that every designer of analog and mixed-signal circuits is confronted with in his every-day life of designing complex circuits in ever-advancing technologies with ever-increasing transistor variability. The first part of the talk will give some answers. Constraints are key elements of analog design automation: a mathematical optimization tool would not be applicable if it were not provided with constraints to keep transistors in saturation or to take care of symmetrical sizing, for instance. Interestingly, the netlist of an analog circuit inherently provides a lot of constraints. The second part of the talk presents a method to automatically extract constraints from a given netlist. It consists of two parts. First, an analysis of the hierarchical structure of a circuit is described. Second, a signal path analysis is presented. The overall outcome is a set of constraints for sizing and placement, as well as a construction plan for analog placement. It will be illustrated how to use this outcome in the sizing and placement of analog circuits. Bio: Helmut Graeb got his Dipl.-Ing., Dr.-Ing., and habilitation degrees from Technische Universitaet Muenchen in 1986, 1993 and 2008, respectively. He was with Siemens Corporation, Munich, from 1986 to 1987, where he was involved in the design of DRAMs. Since 1987, he has been with the Institute of Electronic Design Automation, TUM, where he has been the head of a research group since 1993.
His research interests are in design automation for analog and mixed-signal circuits, with particular emphasis on Pareto optimization of analog circuits considering parameter tolerances, analog design for yield and reliability, hierarchical sizing of analog circuits, analog/mixed-signal test design, discrete sizing of analog circuits, structural analysis of analog and digital circuits, and analog layout synthesis. Dr. Graeb has, for instance, served as a Member of the Executive Committee of the ICCAD conference, as a Member or Chair of the Analog Program Subcommittees of the ICCAD, DAC, and DATE conferences, as Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS PART II: ANALOG AND DIGITAL SIGNAL PROCESSING and the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, and as a Member of the Technical Advisory Board of MunEDA GmbH Munich, which he co-founded. He is a Senior Member of IEEE (CAS) and a member of VDE (ITG). He was the recipient of the 2008 prize of the Information Technology Society (ITG) of the Association for Electrical, Electronic and Information Technologies (VDE), of the 2004 Best Teaching Award of the TUM EE Faculty Students Association, and of the 3rd prize of the 1996 Munich Business Plan Contest.
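The parametric-yield question raised in the first part of the talk can be illustrated with a minimal Monte-Carlo sketch. Everything here is a stand-in: the "gain" function replaces a real circuit simulation, and the distributions and specification are invented for illustration:

```python
import random

random.seed(1)

def gain(vth, beta):
    # Stand-in for a circuit simulator: a linear performance model.
    return 100.0 - 40.0 * vth + 5.0 * beta

passing = 0
N = 10_000
for _ in range(N):
    # Sample process parameters from their statistical distributions.
    vth = random.gauss(0.5, 0.05)    # threshold voltage, V
    beta = random.gauss(1.0, 0.1)    # current factor (normalized)
    if gain(vth, beta) >= 80.0:      # performance specification
        passing += 1

print(f"estimated parametric yield: {passing / N:.1%}")
```

Worst-case parameter sets are, in essence, the points of such a distribution that sit on the specification boundary with a prescribed probability.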
Date: 12-Dec-2013    Time: 10:30:00    Location: room EA3, Torre Norte do IST


A data mining approach to study disease presentation patterns in Primary Progressive Aphasia.

Telma Pereira

Departamento de Engenharia Informática

Abstract—Nowadays the world faces an ageing population and the related challenges, such as the healthcare issues raised by diseases more prevalent in the elderly, for instance neurodegenerative diseases. Primary Progressive Aphasia (PPA) is a neurodegenerative disease characterized by a gradual dissolution of language abilities; these patients deserve special attention since they have a higher risk of evolving to dementia. Consequently, discovering the different subtypes of PPA patients is fundamental for the timely administration of pharmaceutical and therapeutic interventions, improving patients' quality of life. This thesis proposes a data mining approach to extract relevant knowledge from clinical data, namely to learn the variants of PPA. Initially, standard clustering algorithms were applied with the purpose of studying the number of groups existing in the dataset and, eventually, the potential existence of new groups, different from the PPA subtypes already defined in the literature. Then, in a second phase, supervised learning techniques were used to analyze patients according to their clinical classification in one of the three PPA variants and to develop a new and accurate classification model. The unsupervised learning analysis pointed to the existence of two main groups in the dataset analyzed in this work. This study included the evaluation of diverse sets of attributes in order to assess which type/set of attributes produced better results. Finally, two new methodologies for classifying patients with PPA were developed, reaching good accuracies on the dataset under study. One of those methodologies enables the identification of instances which (potentially) do not belong to any of the three already defined PPA subtypes.
Date: 05-Dec-2013    Time: 14:30:00    Location: 336


Some results with MWMR registers

H. Fauconnier

University Paris Diderot

Abstract—What is the number of registers required to solve a task? Many years ago, Ellen et al. proved a lower bound of square root of n registers to solve consensus (obstruction-free), but today there is no known consensus algorithm using fewer than n registers. In a system of n processes, if each process has its own SWMR register, it is possible to emulate any number of registers, but what tasks can be solved with fewer than n registers? Before considering this question, what happens when we only have MWMR registers? A trivial way may be to assign each process one MWMR register: given an array C of MWMR registers, C[i] will be assigned to process i. But if the n processes have ids drawn from a very large set of N identifiers, the size of C depends on N, not on n. Renaming algorithms may help, but they use a non-linear (in n) number of MWMR registers. We give a solution without renaming that implements, for each process, a SWMR register using only n MWMR registers. This implementation is only non-blocking, but with 2(n-1) MWMR registers we get a wait-free implementation. Moreover, we prove that n is a lower bound for such an implementation. We also prove that n MWMR registers are sufficient to solve any wait-free task solvable with any number of (MWMR or SWMR) registers. If the number of MWMR registers is less than n, we prove that some tasks may nevertheless be solved (obstruction-free). For example, we prove that 2 registers are necessary and sufficient to solve the set-agreement problem (obstruction-free). This is joint work with C. Delporte, H. Fauconnier, E. Gafni and S. Rajsbaum (ICDCN 2013). A recent extension to the adaptive case has been made jointly with L. Lamport (DISC 2013). Bio: H. Fauconnier received his Ph.D. in 1982 and HDR degree in 2001 in Computer Science from the University Paris-Diderot, after Master's degrees in Mathematics and Computer Science.
He is a top-level expert in fault-tolerant distributed computing and has published papers in many journals such as JACM, Distributed Computing and TOPLAS, and in the top conferences of this area (PODC, DISC, DSN, ICDCS, ...). He has been a program committee member of established conferences in Distributed Computing such as PODC, DISC, IEEE ICDCS, and OPODIS. He is currently at LIAFA, University Paris Diderot.
Date: 04-Dec-2013    Time: 11:00:00    Location: 336


Towards face-to-face conversations with social robots

Joakim Gustafson


Abstract—Bio: Joakim Gustafson, Professor in speech technology at KTH, has been a prolific researcher on multimodal dialogue systems since 1993. He has an industrial background from TeliaSonera where, in addition to research, he was involved in the launching of public speech applications. Gustafson's research activities cover the design and development of multimodal conversational systems, interactional analysis of spontaneous spoken phenomena, conversational phenomena in speech synthesis, development of speech-enabled robots, and data collections of human-computer interactions in public spaces. He has participated in several EU projects such as Onomastica, NICE, MonAmi, IURO, GetHomeSafe and SpeDial. He is currently the principal investigator in two nationally funded three-year research projects: Incremental Text-To-Speech Conversion and Situated Audio-Visual Interaction with Robots. He is also a member of the Editorial Board of the journal Speech Communication.
Date: 03-Dec-2013    Time: 16:00:00    Location: 336


Evaluating differential gene expression using RNA-sequencing data: a case study in host-pathogen interaction upon Listeria monocytogenes infection

Joana Cruz

Departamento de Engenharia Informática

Abstract—Unlike the genome, the cell transcriptome is dynamic and specific to a given cell developmental stage or physiological condition. Understanding the transcriptome is essential for interpreting the functional elements of the genome and revealing the molecular constituents of cells. Recently, developments in high-throughput DNA sequencing methodologies have provided a new way to sequence RNA at unprecedentedly high resolution. This method is termed RNA-Seq and has emerged as the preferred technology for both characterization and quantification of cell transcripts. Bearing this in mind, in this thesis I propose a bioinformatics pipeline to compare two RNA-Seq samples. This pipeline permits biological insight into the analysed samples by extracting the main biological processes that are differentially active among the samples under analysis. Building on this pipeline, I developed a novel methodology to inspect the activation of a given cellular pathway in a time-course RNA-Seq dataset. The evaluation of a Listeria monocytogenes RNA-Seq dataset with the developed tools confirmed their proper functioning. It was possible to identify global changes in the human host transcriptome and associate these changes with different stages of the Listeria monocytogenes infection lifecycle.
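The core of a two-sample comparison can be sketched with toy counts. This is a deliberately simplified stand-in (library-size normalization plus log fold change with a pseudocount), not a substitute for the statistical models used by dedicated differential-expression tools:

```python
import math

# Toy raw read counts per gene in two RNA-Seq samples.
control = {"geneA": 100, "geneB": 50, "geneC": 10}
infected = {"geneA": 100, "geneB": 400, "geneC": 5}

def cpm(counts):
    """Counts-per-million: normalize by library size."""
    total = sum(counts.values())
    return {g: c * 1e6 / total for g, c in counts.items()}

c, i = cpm(control), cpm(infected)

# Log2 fold change with a pseudocount to avoid division by zero.
lfc = {g: math.log2((i[g] + 1) / (c[g] + 1)) for g in control}

for gene, fc in sorted(lfc.items(), key=lambda kv: -abs(kv[1])):
    print(gene, round(fc, 2))
```

Genes with large absolute fold change would then be mapped to biological processes via enrichment analysis.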
Date: 28-Nov-2013    Time: 14:30:00    Location: 336



Miguel Coimbra

Departamento de Engenharia Informática

Abstract—Metagenomics is the study of metagenomes, unprocessed genetic material residing in the most varied sites, without separation into individual organisms. Metagenomic approaches to the study of biological communities are quickly changing our understanding of the function and inter-relationships among living organisms in ecosystems. The rapid advances in metagenomics are largely due to the fast development of high-throughput platforms for deoxyribonucleic acid (DNA) sequencing, which need to be accompanied by significant advances in data analysis techniques. With this work, I intended to develop and apply new data analysis techniques suited to the large amounts of data generated by metagenomics. This document presents a proposal to address the challenges posed by the storage and manipulation of such information and the need to develop new data analysis techniques that can be applied directly to this problem. For this purpose, we aimed to harness the power of parallel computing. The target result of this thesis was MetaGen-FRAME, a metagenomic framework capable of handling heterogeneous data types (from DNA sequences to genome, proteome and metabolome annotations) through the use of different data structures and computational approaches.
Date: 31-Oct-2013    Time: 14:30:00    Location: 336


Unsupervised semantic structure discovery for audio

Bhiksha Raj

Carnegie Mellon University

Abstract—Automatic deduction of semantic event sequences from multimedia requires awareness of context, which in turn requires processing sequences of audiovisual scenes. Most non-speech audio databases, however, are not labeled at a sub-file level, and obtaining (acoustic or semantic) annotations for sub-file sound segments is likely to be expensive. In our work, we introduce a novel latent hierarchical structure that attempts to leverage weakly labeled or unlabeled data to process the observed acoustics and infer semantic import at various levels. The higher layers in the hierarchical structure of our model represent increasingly higher-level semantics.
Date: 31-Oct-2013    Time: 13:00:00    Location: 020


On Multi-class Classification Problems Using Genetic Programming

Vijay Ingalalli

Departamento de Engenharia Informática

Abstract—Genetic Programming (GP) is a field within Evolutionary Computing that has been successful in addressing a variety of problems in data mining and machine learning, not excluding the problems of multi-class classification (mcc). However, its successes have been limited to extending binary GP classifiers to mcc problems, thus leaving a void: the lack of efficient native multi-class classifiers when compared to non-GP classifiers. In this work, I will present a novel algorithm that incorporates some ideas on the representation of the solution space for tree-based GP, laying some foundations for filling this void, which might also lead to future research in this direction. During the presentation, I shall reveal the success and competitiveness of this approach, and discuss future directions.
Date: 24-Oct-2013    Time: 14:30:00    Location: 336


Tracking attention to issues as a way to learn about political systems: An Introduction to the Comparative Agendas Project

Enrico Borghetto

Universidade Nova de Lisboa

Abstract—The importance of studying political agendas - the list of issues political actors devote attention to - cannot be overstated. Attention is a scarce resource in politics and, at the same time, a precondition for every kind of political action. The Comparative Agendas Project (CAP), a network comprising 18 universities from different member states, developed a distinct methodological approach to study the quantitative flow of issue attention through time and institutions. This approach allows measuring streams of influence and power within single political systems and, at the same time, systematically comparing the workings of different political systems. This presentation will provide an overview of the theoretical and methodological approach adopted by the CAP project, as well as other more recent spin-off projects. The analysis of Portuguese agendas is about to kick off in the coming months, and there is no better moment to develop new ideas on how to contribute to and exploit the massive amount of text data available.
Date: 23-Oct-2013    Time: 14:00:00    Location: 336


Quick Hyper-Volume

Luis Russo

Departamento de Engenharia Informática

Abstract—I will present a new algorithm to calculate exact hypervolumes. Given a set of d-dimensional points, it computes the hypervolume of the dominated space. Determining this value is an important subroutine of Multiobjective Evolutionary Algorithms (MOEAs). We analyze the "Quick Hypervolume" (QHV) algorithm theoretically and experimentally. The theoretical results are a significant contribution to the current state of the art. Moreover, the experimental performance is also very competitive compared with existing exact hypervolume algorithms.
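For readers new to the quantity being computed: in two dimensions the hypervolume of the dominated space has a simple exact sweep solution, sketched below for maximization with the reference point at the origin. This illustrates the quantity only; QHV itself is a divide-and-conquer algorithm for general d:

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Exact dominated hypervolume in 2D (maximization)."""
    # Sweep points by decreasing first coordinate; each non-dominated
    # point contributes a rectangular slice above the previous height.
    pts = sorted(points, key=lambda p: p[0], reverse=True)
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y > prev_y:                        # skip dominated points
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front))   # → 6.0
```

In higher dimensions no such linear sweep exists, which is why efficient exact algorithms like QHV matter.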
Date: 10-Oct-2013    Time: 14:30:00    Location: 336


Distributed Computations Using Local Broadcasts

Fabian Kuhn

University of Freiburg

Abstract—We discuss basic distributed computation and information dissemination tasks in networks where as the basic communication primitive, nodes can locally broadcast a bounded sized message to all their neighbors. Such a communication assumption is natural in wireless settings and it is particularly suited to study dynamic networks and networks with unidirectional links. For directed networks, we show that even if the network has diameter 2, as long as this fact is not known to the nodes, computing even simple functions such as the minimum of a bunch of values requires time of order essentially √n, where n is the number of nodes of the network. We also review recent results on the complexity of such basic data aggregation tasks and of simple information dissemination tasks in dynamic networks. Finally, we discuss some novel results showing that in ordinary static, undirected networks, the achievable throughput when performing multiple network-wide broadcasts is tightly connected to the vertex connectivity of the network graph.
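As background for the aggregation task discussed above, here is a round-based simulation of computing the minimum with local broadcasts on an undirected graph with known diameter (the easy case; the talk's lower bound concerns directed networks with unknown diameter). The graph and values are illustrative:

```python
# A path graph on 4 nodes, diameter 3, with an initial value per node.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
value = {0: 7, 1: 3, 2: 9, 3: 5}

diameter = 3
for _ in range(diameter):
    # Synchronous round: every node broadcasts its current value ...
    heard = {u: [value[v] for v in adj[u]] for u in adj}
    # ... and keeps the smallest value it has heard so far.
    value = {u: min([value[u]] + heard[u]) for u in adj}

print(value)   # after diameter rounds, every node holds the minimum
```

When the diameter (or an upper bound on it) is unknown, nodes cannot tell when to stop, which is the crux of the √n lower bound for directed diameter-2 networks.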
Date: 07-Oct-2013    Time: 11:00:00    Location: 336


Parallel efficient alignment of reads for re-sequencing applications

Miguel Coimbra

Departamento de Engenharia Informática

Abstract—In bioinformatics, in the context of resequencing projects, the efficient and accurate mapping of reads to a reference genome is a critical problem. One instance of this problem is the local alignment of pyrosequencing reads produced by the 454 GS FLX system against a reference sequence, an instance for which the software tool TAPyR (Tool for the Alignment of Pyrosequencing Reads) was developed. TAPyR implements a methodology to efficiently solve this problem, which proved to yield results of a quality (both in terms of content and execution speed) higher than those of mainstream applications. With the goal of further improving this platform's results, we produced a parallel implementation of the query and reference sequence access procedures of the original version. Through the use of multithreading, this new version, P-TAPyR, produces considerable reductions in the processing time of queries, scaling with the number of hardware-supported threads (not accounting for hyper-threading) available. For larger data sets, we were able to observe running times roughly 26 times faster than serial execution with 30 executing threads, showing an experimental (progressively decreasing) serial fraction of 0.8% (determined by the Karp-Flatt metric described in a later section). Herein we present the modifications made to this software tool to allow for parallel querying of reads against an indexed reference, which scales proportionally to the number of available physical cores.
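The experimentally determined serial fraction mentioned above is the Karp-Flatt metric: given measured speedup psi on p processors, e = (1/psi - 1/p) / (1 - 1/p). Plugging in the abstract's rounded figures (speedup about 26 on 30 threads) yields a sub-percent value, consistent in magnitude with the reported fraction:

```python
def karp_flatt(speedup: float, p: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)

# Roughly 26x speedup observed with 30 executing threads.
print(f"{karp_flatt(26.0, 30):.4f}")
```

A small, decreasing value of e as p grows indicates that the remaining overhead is not a fixed serial bottleneck, which supports the claim of scaling with the number of physical cores.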
Date: 26-Sep-2013    Time: 11:00:00    Location: 336


Incremental Maintenance of RDF Views of Relational Data

Vânia Vidal

Universidade Federal do Ceará

Abstract—Professor Vânia Vidal will present an incremental maintenance strategy, based on rules, for RDF views defined on top of relational data. The first step relies on the designer to specify a mapping between the relational schema and a target ontology and results in a specification of how to represent relational schema concepts in terms of RDF classes and properties of the designer’s choice. Using the mappings obtained in the first step, the second step automatically generates the rules required for the incremental maintenance of the view.
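The rule-based maintenance idea can be sketched as follows. All names here are made up for illustration: a mapping turns each relational row into RDF triples, and the same mapping drives the rules that keep the materialized view in sync on insert and delete:

```python
EX = "http://example.org/"

def row_to_triples(row):
    """Mapping from a relational 'person' row to RDF triples."""
    s = f"{EX}person/{row['id']}"
    return {(s, "rdf:type", f"{EX}Person"),
            (s, f"{EX}name", row["name"])}

view = set()               # the materialized RDF view

def on_insert(row):        # maintenance rule fired on INSERT
    view.update(row_to_triples(row))

def on_delete(row):        # maintenance rule fired on DELETE
    view.difference_update(row_to_triples(row))

on_insert({"id": 1, "name": "Ada"})
on_insert({"id": 2, "name": "Alan"})
on_delete({"id": 1, "name": "Ada"})
print(sorted(view))        # only the triples for person 2 remain
```

The point of incremental maintenance is exactly this: each source update touches only the triples derived from the affected rows, instead of recomputing the whole view.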
Date: 17-Sep-2013    Time: 14:00:00    Location: 336


Coupling Pattern Recognition and Signal Processing

Ahmed Hussen Abdelaziz

Institut für Kommunikationsakustik, Ruhr-Universität Bochum

Abstract—Signal processing and pattern recognition are often treated as separate problems. However, tight coupling between them can yield significantly improved performance in both of these tasks. In this talk, we will introduce two new approaches for such a stronger coupling, providing more precise input from signal processing to pattern recognition and vice versa. We start with coupling pattern recognition models with signal processing algorithms using a new statistical model, called the twin hidden Markov model (THMM), for speech enhancement. By using the THMM, hidden Markov models (HMMs) can be exploited to enhance speech signals in a recognize-and-synthesize scheme, using the most appropriate features in both recognition and synthesis. After that, we introduce a new approach for coupling signal processing with pattern recognition, called significance decoding (SD). The SD approach is a new uncertainty-of-observation technique that deploys the feature uncertainties estimated by the signal processing algorithm to improve the accuracy of automatic speech recognition under tough environmental conditions. Finally, we combine these two schemes in the context of audio-visual speech recognition in order to enhance its performance in very noisy environments.
Date: 19-Jul-2013    Time: 15:00:00    Location: 020


Identification of Hybrid Time-varying Parameter systems with Particle Filtering and Expectation Maximization

Andras Hartmann

Departamento de Engenharia Informática

Abstract—One limiting assumption of many mathematical models for dynamic systems is that the parameters of the system do not change during the observation period, which however does not necessarily hold in many cases. This is typical for biological and medical systems, where we observe high intra-individual variability in the model parameters. A hybrid time-varying parameter framework is able to capture changes of parameters that may represent a change of state of the individual, for example in HIV-infected patients, changes of conditions in regulatory metabolic networks, or diauxic bacterial growth on mixed sugar medium. Thus, in these scenarios, a subset or even all of the parameters have to be treated as time-varying in order to capture the dynamics of the system. An offline (batch) algorithm that combines particle filtering and expectation maximization is introduced for the identification of such systems. The efficiency of the proposed method is illustrated through simulated and real-world examples.
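The particle-filtering half of the approach can be sketched with a minimal bootstrap filter tracking a time-varying parameter. The model below is made up for illustration (y_t = theta_t + noise, with theta jumping halfway through); it shows why a random-walk proposal lets particles follow parameter switches, but omits the EM layer entirely:

```python
import math
import random

random.seed(42)

# Ground truth: the parameter jumps from 0 to 5 halfway through.
T, N = 60, 500
truth = [0.0] * (T // 2) + [5.0] * (T // 2)
obs = [x + random.gauss(0, 1.0) for x in truth]

particles = [random.gauss(0, 3.0) for _ in range(N)]
estimates = []
for y in obs:
    # Propagate: a random walk lets particles follow parameter jumps.
    particles = [p + random.gauss(0, 0.5) for p in particles]
    # Weight each particle by the Gaussian likelihood of the observation.
    weights = [math.exp(-0.5 * (y - p) ** 2) for p in particles]
    # Multinomial resampling, then record the posterior mean.
    particles = random.choices(particles, weights=weights, k=N)
    estimates.append(sum(particles) / N)

print(round(estimates[-1], 2))   # tracks the post-jump value near 5
```

In the full method, EM alternates such filtering/smoothing passes with updates of the fixed model parameters (e.g., the noise variances assumed known above).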
Date: 19-Jul-2013    Time: 11:00:00    Location: 336


The Brazilian National Institute of Science and Technology for the Web: Towards a Better Understanding of Web Data

Alberto H. F. Laender

Universidade Federal de Minas Gerais

Abstract—The National Institute of Science and Technology for the Web – InWeb is a multi-institutional project supported by the Brazilian Ministry of Science, Technology and Innovation that aims to develop models, algorithms and novel technology to make information distribution and services through the Web more effective and safe. In this talk, we will provide an overview of InWeb activities by describing its main research lines and some ongoing work.
Date: 03-Jul-2013    Time: 11:00:00    Location: 407


Deterministic Scheduling for Replicated Systems

Franz Hauck

Ulm University

Abstract—Deterministic scheduling is a strong requirement for most replication-based systems, as they require deterministic behaviour of replicas, and one source of indeterminism is the scheduling of multiple threads. Often scheduling is avoided altogether by disallowing multiple concurrent threads; for modern multi-core hardware this is a waste of resources. A few algorithms for deterministic user-level scheduling have been developed, e.g., LSA, PDS and MAT. Unfortunately, they all have their killer application at which they perform worst. In the talk I will introduce the problems behind deterministic scheduling and sketch the potential design space of different schedulers. Our aim is to develop an adaptive scheduler that takes application behaviour into account. Finally, I will briefly introduce some of our other work items in the context of fault-tolerant computing, e.g., the Virtual Nodes framework, the Dj deterministic Java runtime, and the COSCA PaaS platform.
Date: 03-Jul-2013    Time: 10:00:00    Location: 020


Spoken Dialogue Systems: Progress and Challenges

Steve Young

University of Cambridge

Abstract—The potential advantages of statistical dialogue systems include lower development cost, increased robustness to noise and the ability to learn on-line so that performance can continue to improve over time. This talk will briefly review the basic principles of statistical dialogue systems, including belief tracking and policy representations. Recent developments at Cambridge in the areas of rapid adaptation and on-line learning using Gaussian processes will then be described. The talk will conclude with a discussion of some of the major issues limiting progress. Bio: Steve Young received a BA in Electrical Sciences from Cambridge University in 1973 and a PhD in Speech Processing in 1978. He held lectureships at both Manchester and Cambridge Universities before being elected to the Chair of Information Engineering at Cambridge University in 1994. He was a co-founder and Technical Director of Entropic Ltd from 1995 until 1999, when the company was taken over by Microsoft. After a short period as an Architect at Microsoft, he returned full-time to the University in January 2001, where he is now Senior Pro-Vice-Chancellor. His research interests include speech recognition, language modelling, spoken dialogue and multi-media applications. He is the inventor and original author of the HTK Toolkit for building hidden Markov model-based recognition systems, and with Phil Woodland he developed the HTK large vocabulary speech recognition system which has figured strongly in DARPA/NIST evaluations since it was first introduced in the early nineties. More recently he has developed statistical dialogue systems and pioneered the use of Partially Observable Markov Decision Processes for modelling them. He also has active research in voice transformation, emotion generation and HMM synthesis.
He has written and edited books on software engineering and speech processing, and he has published, as author and co-author, more than 250 papers in these areas. He is a Fellow of the Royal Academy of Engineering, the IEEE, the IET and the Royal Society of Arts. He served as the senior editor of Computer Speech and Language from 1993 to 2004 and was Chair of the IEEE Speech and Language Processing Technical Committee from 2009 to 2011. In 2004, he received an IEEE Signal Processing Society Technical Achievement Award. He was elected ISCA Fellow in 2008 and was awarded the ISCA Medal for Scientific Achievement in 2010. He is the recipient of the 2013 Eurasip Individual Technical Achievement Award.
Date: 24-Jun-2013    Time: 14:30:00    Location: Anfiteatro do Pavilhão Interdisciplinar, IST Alameda


Unravelling communities of ALS patients using network mining

André Carreiro

Departamento de Engenharia Informática

Abstract—Amyotrophic Lateral Sclerosis (ALS) is a devastating neurodegenerative disease characterized by a usually fast progression of muscular denervation, generally leading to death within a few years from onset. In this context, any significant improvement of the patients' life expectancy and quality of life is of major relevance. Several studies have addressed problems such as ALS diagnosis and, more recently, prognosis. However, these analyses have mostly been restricted to classical statistical approaches used to find the features most associated with a given outcome of interest. In this work we explore an innovative approach to the analysis of clinical data characterized by multivariate time series. We use a distance measure between patients as a reflection of their relationship to build a patient network, which in turn can be studied from a modularity point of view in order to search for communities, or groups of similar patients. Preliminary results show that it is possible to extract relevant information from such groups, each presenting a particular behavior for some of the features (patient characteristics) under analysis.
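The patient-network construction described above can be sketched as follows. The feature vectors, the choice of Euclidean distance, and the edge threshold are all illustrative assumptions, and connected components stand in for the modularity-based community search used in the actual work.

```python
from itertools import combinations
from math import dist

# Toy patient feature vectors (invented). We connect two patients when their
# Euclidean distance falls below a threshold, then take connected components
# of the resulting network as a crude stand-in for modularity-based
# community detection.
patients = {
    "p1": (1.0, 0.9), "p2": (1.1, 1.0), "p3": (5.0, 5.1), "p4": (5.2, 4.9),
}

def build_network(patients, threshold):
    """Adjacency sets: an edge joins patients closer than `threshold`."""
    edges = {p: set() for p in patients}
    for a, b in combinations(patients, 2):
        if dist(patients[a], patients[b]) < threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges

def components(edges):
    """Connected components by depth-first search."""
    seen, comps = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(edges[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps

groups = components(build_network(patients, threshold=1.0))
```

With these toy vectors the network splits into two groups of similar "patients", mirroring the community structure the work extracts from real clinical time series.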
Date: 21-Jun-2013    Time: 11:00:00    Location: 336


Host-pathogen interaction upon infection with Listeria using NGS techniques

Joana Cruz

Departamento de Engenharia Informática

Abstract—Listeria monocytogenes is a model bacterial pathogen which, after internalization, is capable of disrupting a double-membrane vacuole, replicating in the host cytosol, and manipulating the innate response triggered there. Its intracellular lifecycle in the human host provides insight into the dynamics of host-pathogen interactions in general. The identification of host sequences affected during these interactions is paramount to our understanding of how pathogens engineer their cellular environments. The main goal of this project is, therefore, to comprehend how pathogens influence human host cells, by identifying global changes in the host transcriptome and characterizing alterations in the host nuclear architecture. Furthermore, we aim to associate these changes with different stages of the Listeria monocytogenes infection lifecycle. To that end, total RNA was extracted from three different cell populations at four time-points (after 20, 60, 120 and 240 minutes), so that specific stages of the bacterium's lifecycle are represented.
Date: 07-Jun-2013    Time: 11:00:00    Location: 336


Novel semantic approaches in Genetic Programming.

Stefano Ruberto

Departamento de Engenharia Informática

Abstract—Evolutionary algorithms are stochastic optimization techniques based on the principles of natural evolution, and Genetic Programming (GP) belongs to this family. In recent years the study of GP systems has been extended to phenotypic aspects, whereas it was previously focused mainly on genotypic and syntactic aspects. Phenotypic, or semantic, information is used to improve the capacity of GP algorithms to explore the solution space effectively: classifying similar individuals and exploring new semantic areas increases the probability of finding an optimal solution and of escaping local optima. Currently, semantic GP is strictly tied to the evaluation of individuals' behavior in the candidate population, an evaluation mainly obtained through the fitness function itself. This work introduces a new way of measuring semantic similarity between individuals that is more independent from the fitness itself, allowing a fair comparison even when the fitness values involved are very far from each other. This new measure enables a new series of techniques for tackling open problems in GP, such as bloat and over-fitting, and also for preserving phenotypic variety, thereby enhancing performance. Preliminary results will be provided. A new theoretical GP algorithm based on this semantic measure is also introduced, showing its potential advantages. Very early results from a first naive implementation offer interesting insights when compared with state-of-the-art algorithms.
Date: 24-May-2013    Time: 11:00:00    Location: 336


Equilibria in a Repeated Epidemic Dissemination Game

Xavier Vilaça

Departamento de Engenharia Informática

Abstract—Epidemic dissemination protocols are known to be extremely scalable and robust. As a result, they are particularly well suited to support the dissemination of information in large-scale peer-to-peer systems. In such an environment, nodes do not belong to the same administrative domain. On the contrary, many of these systems rely on resources made available by rational nodes that are not necessarily obedient to the protocol. There are two main incentive mechanisms that can be used to deal with rational behavior. One is to rely on balanced exchanges, which is feasible to implement in epidemic protocols where interactions are symmetric. For the asymmetric case, incentives based on a monitoring approach are better suited. Unfortunately, the literature does not provide any meaningful theoretical results for this last type of incentive. In this talk, I will present basic results that establish a tradeoff between the amount of information provided by a monitor and the ability to sustain cooperation among rational nodes, assuming perfect monitoring. Bio: Xavier Vilaça is a PhD student at IST and a researcher in the Distributed Systems Group at INESC-ID. He received an MSc degree in Computer Science and Engineering from IST in 2011 and a BSc, also in Computer Science and Engineering, from the University of Minho in 2009. This work is being presented as a final report for the Complex Network Analysis course of the PhD program in Computer Science and Engineering at IST.
Date: 10-May-2013    Time: 11:00:00    Location: 336


Technical Deep-Dive in a Column-Oriented In-Memory Database

Martin Faust


Abstract—Column-oriented databases are trending in industry (SAP HANA, Vertica) and academia (C-Store, MonetDB, HYRISE) alike. With the recent advances in hardware and the availability of machines with terabytes of RAM, the idea of a main-memory database becomes viable for large installations. The speed of main memory allows us to rethink the classic separation between transactional and analytical systems and thereby provide a single-source-of-truth database system that lets users run analytical queries on up-to-date transactional data instead of a stale OLAP copy. The talk will focus on our findings about the database utilization of large ERP customers and the conclusions that led to the design of an in-memory column store. We will cover the memory hierarchy and its implications for database design, basic data structures, and usage examples that benefit from the ability to run analytical-style queries on transactional data. The talk is a condensed version of our online lecture "In-Memory Data Management", which attracted over 10,000 students when held in September 2012.
Date: 29-Apr-2013    Time: 16:00:00    Location: 336


Novel semantic approaches in Genetic Programming.

Stefano Ruberto

Departamento de Engenharia Informática

Abstract—Evolutionary algorithms are stochastic optimization techniques based on the principles of natural evolution, and Genetic Programming (GP) belongs to this family. In recent years the study of GP systems has been extended to phenotypic aspects, whereas it was previously focused mainly on genotypic and syntactic aspects. Phenotypic, or semantic, information is used to improve the capacity of GP algorithms to explore the solution space effectively: classifying similar individuals and exploring new semantic areas increases the probability of finding an optimal solution and of escaping local optima. Currently, semantic GP is strictly tied to the evaluation of individuals' behavior in the candidate population, an evaluation mainly obtained through the fitness function itself. This work introduces a new way of measuring semantic similarity between individuals that is more independent from the fitness itself, allowing a fair comparison even when the fitness values involved are very far from each other. This new measure enables a new series of techniques for tackling open problems in GP, such as bloat and over-fitting, and also for preserving phenotypic variety, thereby enhancing performance. Preliminary results will be provided. A new theoretical GP algorithm based on this semantic measure is also introduced, showing its potential advantages. Very early results from a first naive implementation offer interesting insights when compared with state-of-the-art algorithms.
Date: 26-Apr-2013    Time: 11:00:00    Location: 336


Named-entity recognition in the past

Gerrit Bloothooft

Universitaet Utrecht

Abstract—This talk will be about "Named-entity recognition in the past": the limited use of grapheme-to-phoneme conversion in this process, and possibilities to automatically learn variation in the spelling of names from rich historical data sources, such as full population vital registers. Bio: Gerrit Bloothooft has worked in the area of Phonetics and Speech Technology since 1978. He contributed to European educational networks in Language and Speech Technology and was an ISCA board member from 1997 to 2005. Over the years, his research interests moved from (singing) voice research to the application of speech technology and computational linguistics to data matching in the past.
Date: 17-Apr-2013    Time: 10:00:00    Location: PA2, IST Alameda


Identification of microRNAs and analysis of their expression in Eucalyptus globulus

Jorge Oliveira

Departamento de Engenharia Informática

Abstract—Portugal is one of the largest producers of pulp derived from Eucalyptus globulus, making it a fundamental species for the country. The selection of adequate genotypes would make the exploitation of cultivation areas more efficient. A key objective is to understand the regulatory mechanisms impacting wood characteristics; here we focus on microRNA-mediated regulation. MicroRNAs are endogenous molecules that act by silencing targeted messenger RNAs. Although approximately 21,000 microRNAs have been identified across many species, none is documented for the Eucalyptus genus. Here, we propose a pipeline that makes use of Cravela, a single-genome miRNA finding tool, and a new NGS data analysis algorithm that provides a novel scoring function to evaluate the expression profile of candidates. This approach produced a short list of candidates, including both conserved and non-conserved sequences. Experimental validation showed amplification in 4 out of 5 candidates chosen from the best-scoring non-conserved sequences.
Date: 12-Apr-2013    Time: 11:00:00    Location: 336


SSL/TLS session-aware user authentication against man-in-the-middle attacks

Rolf Oppliger

eSECURITY Technologies Rolf Oppliger

Abstract—In spite of the fact that SSL/TLS is omnipresent in today's Internet commerce, it is highly vulnerable to man-in-the-middle (MITM) attacks. In this talk, we explain why this is the case and what possibilities one has at hand to protect SSL/TLS-secured Internet commerce against MITM attacks. In particular, we introduce, discuss, and put into perspective a technology called SSL/TLS session-aware (TLS-SA) user authentication, which basically links a user authentication to a particular SSL/TLS session in order to reveal the existence of an MITM. The technology does not protect against malware that takes control after user authentication (a so-called man-in-the-browser attack), so TLS-SA does not stop the general trend towards transaction authentication in addition to user authentication for applications with high security requirements, such as Internet banking.
Date: 10-Apr-2013    Time: 11:00:00    Location: 020


Towards OpenLogos Hybrid Machine Translation

Anabela Barreiro


Abstract—In this presentation, I will describe the OpenLogos machine translation system, its architecture, and its semantico-syntactic representation language (SAL), which is the heart of the system. I will show how OpenLogos has addressed classic problems of rule-based machine translation, such as those related to the ambiguity and complexity of natural language. I will exemplify the quality of translation that OpenLogos is capable of and show why OpenLogos is an ideal platform for a hybrid machine translation solution.
Date: 05-Apr-2013    Time: 15:00:00    Location: 020


INESC-ID Distinguished Lecture Series: Model Checking and the Curse of Dimensionality

Prof. Edmund M. Clarke

Carnegie Mellon University

Abstract—Model Checking is an automatic verification technique for large state transition systems. It was originally developed for reasoning about finite-state concurrent systems, and it has been used successfully to debug complex computer hardware and communication protocols. Now, it is beginning to be used for software verification as well. The major disadvantage of the technique is a phenomenon called the State Explosion Problem, which is impossible to avoid in the worst case. However, by using sophisticated data structures and clever search algorithms, it is now possible to verify state transition systems with astronomical numbers of states.
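At the core of such verification is an explicit-state reachability search over the state transition system, which can be sketched as follows. The toy two-bit transition system and the "bad" predicate are invented for illustration; real systems have state spaces so large that symbolic data structures are needed, which is the state-explosion point the talk makes.

```python
from collections import deque

# Explicit-state reachability: the basic safety-checking loop of a model
# checker. States model a toy pair of bits; a transition flips either bit.
# The "bad" predicate flags the state whose reachability we want to decide.

def successors(state):
    a, b = state
    return [((a + 1) % 2, b), (a, (b + 1) % 2)]  # flip either bit

def reachable(init, bad):
    """Breadth-first search: is any state satisfying `bad` reachable?"""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if bad(s):
            return True
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return False

found = reachable((0, 0), bad=lambda s: s == (1, 1))
```

The `seen` set is exactly what explodes: for n bits it holds up to 2^n states, which is why techniques such as symbolic representations are needed beyond toy examples.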
Date: 13-Mar-2013    Time: 11:00:00    Location: IST Alameda, sala EA1, Lisboa


NLP-triggered, ontology-based KB enrichment strategies

Nuno Silva

Instituto Superior de Engenharia do Porto (ISEP)

Abstract—Publicly available text-based documents (e.g. news, meeting transcripts) are a very important source of knowledge, especially for organizations. These documents refer to domain entities such as persons, places and professional positions, decisions and actions. Querying these documents (instead of browsing, searching and finding) is a very relevant task for anyone in general, and particularly for professionals dealing with knowledge-intensive tasks. Querying text-based documents' data, however, is not supported by common technology. For that, such documents' content has to be explicitly and formally captured into KB facts. Making use of automatic NLP processes is a common approach, but their relatively low precision and recall give rise to data quality problems. Further, the facts existing in the documents are often insufficient to answer complex queries, hence the need to enrich the captured facts with facts from third-party repositories (e.g. public Linked Open Data (LOD) repositories). While this description suggests an integration problem, addressing the issue involves more than that, namely duplicate detection, object mapping, consistency checking, consistency resolution, and semantic and controlled data enrichment. This talk will describe a process for enriching the repository from LOD repositories. This process is triggered by the NLP parsing process and guided by the constraints of the knowledge base's underlying, semantically rich ontology. The ontological constraints are interpreted and adopted as configuration data for the enrichment strategies, which are responsible for actually enriching the knowledge base (i.e. adding new instances and new properties for the instances) according to the interpretation of the constraints.
Date: 01-Mar-2013    Time: 15:00:00    Location: 020


Re-Thinking Web Accessibility

Vicki Hanson

University of Dundee

Abstract—Previous studies of Web accessibility have found little evidence for the impact of the Web Content Accessibility Guidelines, at least over the relatively short time periods examined. This talk presents new data from over 100 top-traffic and government websites over the 14 years since the publication of WCAG 1.0. Automated analyses of WCAG Success Criteria again found high percentages of violations overall. Unlike earlier studies, however, improvements on a number of accessibility indicators were found, with government sites being less likely than top-traffic non-government sites to have accessibility violations. Examination of the causes of success and failure suggests that improvements may be due, in part, to changes in website technologies and coding practices rather than a focus on accessibility per se. Possible contributors to improving accessibility include the use of new browser capabilities to create more sophisticated page layouts, a growing concern with improved page rank in search results, and a shift toward cross-device content design. Understanding these examples may inspire the creation of additional technologies with incidental accessibility benefits. The talk concludes with a look at how adapting even non-compliant Web content can improve accessibility for a broad range of people. Bio: Vicki Hanson is Professor of Inclusive Technologies at the University of Dundee and Research Staff Member Emeritus at IBM Research. She has been working on issues of inclusion for older and disabled people throughout her career, first as a Postdoctoral Fellow at the Salk Institute for Biological Studies. She joined the IBM Research Division in 1986, where she founded and managed the Accessibility Research group.
Her research examines the changing nature of technologies and the motivations and barriers to their use by populations in danger of digital exclusion, focusing on issues related to aging, cognition, and language. Applications she has created have received multiple awards from organizations representing older and disabled users. She is Past Chair of the ACM SIG Governing Board, Past Chair of the ACM Special Interest Group on Accessible Computing (SIGACCESS), and the founder and co-Editor-in-Chief of ACM Transactions on Accessible Computing. Prof. Hanson is a Fellow of the British Computer Society and was named ACM Fellow in 2004 for contributions to computing technologies for people with disabilities. In 2008, she received the ACM SIGCHI Social Impact Award for the application of HCI research to pressing social needs. She currently is the ACM Secretary/Treasurer. She recently received the 2013 Anita Borg "Woman of Vision Award for Social Impact" and was elected Fellow of the Royal Society of Edinburgh.
Date: 25-Feb-2013    Time: 18:30:00    Location: 336


Organizational Learning and Support Tools

André Luis Andrade Menolli

Universidade Estadual do Norte do Paraná

Abstract—Organizational learning is an area that helps companies to improve their processes significantly through the reuse of experiences. For a knowledge-intensive area such as software engineering, it is extremely important that the acquired knowledge be stored and reused systematically. However, making learning possible in software development companies is not an easy task, since it is an area in which processes and knowledge are usually internalized in the minds of employees. Hence, it is necessary to create environments that promote and motivate information sharing and knowledge dissemination. Therefore, this work proposes a semantic collaborative environment based on Web 2.0 tools, learning objects and units of learning, in order to help improve organizational learning in software development teams.
Date: 22-Feb-2013    Time: 17:30:00    Location: 020



Philippe Boula de Mareüil


Abstract—The present work, which focuses on regional, foreign and social accents in French, combines perceptual and acoustic approaches to account for variation due to speakers' geographic and (socio)linguistic backgrounds. It is based on large amounts of data, using measurement tools derived from automatic speech processing techniques to quantify certain trends. The work first aims at modelling the identification and characterisation of regional and foreign accents in French. Perceptual experiments and acoustic analyses were carried out using automatic phoneme alignment, which could include pronunciation variants corresponding to Southern, Belgian, West-African, Maghrebian, English, German and Portuguese accents, among others. In total, over 100 hours of regional- or foreign-accented French were analysed. Some of the most discriminating pronunciation features, such as the realisation of nasal vowels in Southern French or the realisation of schwas (backed and closed) in Portuguese-accented French, were ranked using automatic learning techniques. Since speech conveys both phonemic and prosodic information, the contribution of prosody to the perception of various accents was also examined, with a methodology that included prosody modification/resynthesis techniques. The contribution of prosody was highlighted especially for the so-called banlieue accent, with a sharp pitch fall before a prosodic boundary. Modelling the production and perception of variation in speech is of major importance for understanding how language may evolve. Orientations for future work are proposed, in particular to better take social factors into account and to link accents, speaking styles and expressive speech.
Date: 13-Feb-2013    Time: 16:00:00    Location: 336


INESC-ID DISTINGUISHED LECTURE SERIES: Symbiotic Autonomy: Robots, Humans, and the Web

Prof. Manuela Veloso

Carnegie Mellon University, USA and INESC-ID Lisboa, IST

Abstract—We envision ubiquitous autonomous mobile robots that coexist and interact with humans while performing assistance tasks. Such robots are still far from common, as our environments offer great challenges to robust autonomous robot perception, cognition, and action. In this talk, I present symbiotic robot autonomy, in which robots are aware of their limitations and proactively ask for help from humans, access the web for missing knowledge, and coordinate with other robots. Such symbiotic autonomy has enabled our CoBot robots to move in our multi-floor buildings performing a variety of service tasks, including escorting visitors and transporting packages between locations. I will describe CoBot's fully autonomous and effective mobile robot indoor localization and navigation algorithms, its human-centered task planning, and its symbiotic interaction with humans and with the web. I will further discuss our ongoing research on knowledge learning from our speech-based robot interaction with humans. The talk will be illustrated with results and examples from many hours-long runs of the robots in our buildings.
Date: 11-Feb-2013    Time: 14:00:00    Location: IST Alameda



Sérgio Pequito

Departamento de Engenharia Informática

Abstract—This talk introduces a method to design a distributed sensor network for field reconstruction that is minimal with respect to a communication cost function, given by the sum of the communication between sensors and that of a subset of sensors used for backbone communication. The aim is an observable distributed sensor network: from the measurements collected by the central authority (at most as many as there are sensors), the central authority can recover the initial parameters at the different sensor locations. To achieve this, we must first decide which sensors should communicate, and then design the weights by which each sensor updates its state with those of its neighbors; in other words, the distributed sensor network dynamics. In addition, we must identify a subset of sensors that report their state to a central location, which corresponds to designing the backbone reporting function. The joint design of the sensor network dynamics and the backbone reporting function so as to recover the initial state of the dynamic system justifies the notion of an observable distributed sensor network. We present an efficient algorithm for designing the optimal observable distributed sensor network for a given set of sensors and cost function, together with an illustrative example.
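The notion of observability invoked above can be illustrated with the classical rank test: a linear system x(k+1) = A x(k) with measurements y(k) = C x(k) is observable iff the observability matrix [C; CA; ...; CA^(n-1)] has full rank n, so the initial state can be recovered from the measurement sequence. The matrices below are illustrative, not from the talk.

```python
# Observability rank test for a linear system x(k+1) = A x(k), y(k) = C x(k):
# (A, C) is observable iff the stacked matrix [C; CA; ...; CA^(n-1)]
# has rank n. Implemented in pure Python with Gaussian elimination.

def mat_mul(X, Y):
    """Plain dense matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M, eps=1e-9):
    """Rank via Gauss-Jordan elimination on a copy of M."""
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if abs(M[i][col]) > eps), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        pv = M[r][col]
        M[r] = [v / pv for v in M[r]]
        for i in range(len(M)):
            if i != r and abs(M[i][col]) > eps:
                M[i] = [a - M[i][col] * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def observable(A, C):
    """Build the observability matrix and test whether its rank equals n."""
    n = len(A)
    block, rows = C, [row[:] for row in C]
    for _ in range(n - 1):
        block = mat_mul(block, A)
        rows += block
    return rank(rows) == n
```

For A = [[0, 1], [0, 0]] (sensor 1 reads sensor 2's state), measuring the first component makes the pair observable, while measuring only the second does not, which is the kind of structural question the sensor-placement design above resolves.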
Date: 01-Feb-2013    Time: 11:00:00    Location: 336


Sharing scientific data: a contribution from LNEG

Teresa Ponce de Leão

LNEG - Laboratório de Energia e Geologia

Abstract—Since 2010, the Laboratório de Energia e Geologia (LNEG) has made available the Geoportal, an infrastructure of integrated services supporting the management and visualization of spatial data, which aims to provide georeferenced information in an open environment. The application, which originated in the geosciences' need to share and supply geological cartography and geo-scientific information, was developed entirely in the context of the European INSPIRE directive (INfrastructure for SPatial InfoRmation in Europe), complying with the principles and standards established by that directive, and this approach is currently being extended to other scientific areas. The application offers features such as metadata search, online database search, and a map viewer. In this seminar the Geoportal will be presented, illustrating its capabilities through different examples, from cartography to projects.
Date: 25-Jan-2013    Time: 11:00:00    Location: 020


Ultra Low-Power Circuits

Tuan-Vu Cao

Norwegian University of Science and Technology

Abstract—The presentation comprises the three latest works of Dr. Tuan-Vu Cao, namely: 1) a 9-bit 50 MS/s asynchronous SAR ADC in 28nm CMOS; 2) a frequency-based delta-sigma modulator; and 3) an energy-efficient resonant BFSK transmitter.
Date: 21-Jan-2013    Time: 11:00:00    Location: EA3 at IST


Understanding the mechanisms of virulence and resistance

Felipe Lira

Departamento de Engenharia Informática

Abstract—Infectious diseases remain among the major causes of human death in the world. Several hospital infections are due to opportunistic pathogens: microorganisms that rarely infect healthy people but are a frequent cause of infection in people with underlying diseases who are immunodepressed or debilitated. Environmental bacteria, frequently antibiotic resistant, constitute a large percentage of those pathogens. Our work focuses on understanding the mechanisms of virulence and resistance, as well as possible crosstalk between them, in these pathogens. Within this scope, in the last two years we have been identifying the genes whose mutation changes the phenotype of antibiotic susceptibility. As a result, we have selected nearly three hundred genes for future analysis and are currently studying whether the mutations that challenge intrinsic resistance also alter the virulence of Pseudomonas aeruginosa and Stenotrophomonas maltophilia. We found that mutations in several genes encoding proteins from different categories, including multidrug efflux pumps, two-component systems, metabolic enzymes and global regulators, simultaneously alter the antibiotic susceptibility and the virulence of P. aeruginosa. Another opportunistic pathogen we are working with is S. maltophilia, which is characterized by its intrinsically low susceptibility to several antibiotics. Part of this low susceptibility relies on the expression of chromosomally-encoded multidrug efflux pumps. In addition, a metagenomic approach suggests new pathways to explain the transmission of antibiotic-resistance genes through horizontal gene transfer.
Date: 18-Jan-2013    Time: 11:00:00    Location: 336


Future Many-core Processors: Challenges and Solutions

Pedro Trancoso

University of Cyprus

Abstract—Processor design has evolved considerably in the last years. In order to keep up with Moore's Law, processors became increasingly complex and their power consumption reached unacceptable levels. This led to a paradigm shift to what is currently the de-facto standard: the multi-core processor. Even though these processors are able to offer high performance at a lower power consumption level, they introduce new challenges, particularly as the number of cores per processor increases. It is expected that in the future we will have thousands of cores within a chip and that there will be cores of different characteristics on the same chip. Such processors are known as heterogeneous many-core chips. In this presentation, an overview of past, present, and future research projects dealing with these issues will be given. The focus is on two topics: TFlux, an implementation of the Data-Driven Multithreading execution model, and fine-grain parallelism for different multi-cores and accelerators. In addition, results from different applications and scheduling for the 48-core Intel Single-chip Cloud Computer (SCC) processor will be presented. All projects are unified under a common umbrella: the vision that future heterogeneous many-core processors will be packaged together with a virtualization layer hiding the complexity and managing the resources to exploit the best performance.
Date: 20-Dec-2012    Time: 14:00:00    Location: 020


Challenges in the Application of Molecular and Quantum Mechanics to Biomolecular Problems

University of Göttingen

Abstract—The range of application of quantum mechanical methods has been increasing rapidly in the last few years. Today, one is able to study large biomolecular systems at a level of accuracy which a decade ago was only possible for systems of 5-10 atoms. These developments are an outcome of the increasing computer power available to the quantum chemist, but also of new theories and procedures which have helped remove some of the major bottlenecks in the calculations. In this talk, Prof. Mata will give a short introduction to the research issues that his group has been addressing in this field.
Date: 19-Dec-2012    Time: 09:30:00    Location: 336


Searching Web Archives

Miguel Costa


Abstract—The web is rapidly vanishing, along with our collective memory. At least 77 web archives have been developed, but these remain data silos with great potential yet to unfold. In this seminar, we will present the path followed at FCCN to improve search effectiveness in web archives, addressing an evaluation methodology that encompasses characterizations of the users and the environment to support realistic simulations. We present, for the first time, the search effectiveness of the state-of-the-art technology employed in web archives and the improvements achieved by modeling temporal information and using supervised machine learning algorithms.
Date: 10-Dec-2012    Time: 11:00:00    Location: 020


Cyber-physical MPSoC Systems: Future Multi-Core Architectures for reliable Mobility & Technologies

Prof. Juergen Becker

Karlsruhe Institute of Technology, Karlsruhe, Germany

Abstract—An INESC-ID Distinguished Lecture Series talk.
Date: 05-Dec-2012    Time: 11:00:00    Location: Anfiteatro do Complexo Interdisciplinar, IST


The L2F Spoken Web Search system for MediaEval 2012

Alberto Abad


Abstract—The objective of the “Spoken Web Search” (SWS) task at MediaEval 2012 is to search for audio content within audio content using an audio content query. In this presentation, we introduce both the SWS task and the SWS system developed by INESC-ID’s Spoken Language Systems Laboratory (L2F) for the MediaEval 2012 campaign. The proposed system is composed of the fusion of four individual phonetic-based SWS sub-systems, each exploiting different language-dependent phonetic networks. The main characteristic of the sub-systems is that they use hybrid ANN/HMM connectionist methods for both query tokenization and search. In spite of the simplicity of the approach, very promising results were obtained, ranking among the best systems in the competition.
Date: 30-Nov-2012    Time: 15:00:00    Location: 020


Map Matching :: Novageo Solutions at the 2012 ACM SIGSPATIAL Cup

Sérgio Freitas

Novageo Solutions

Abstract—The 2012 ACM SIGSPATIAL Cup focused on the task of map matching, i.e. the problem of correctly matching a sequence of location measurements to road segments. Participating teams were given access to several vehicle trips recorded with a GPS logger, along with the correct map matching results and a representation of the road network, and they had to develop a map matching program that could match a sequence of location measurements to roads as accurately and as fast as possible. Map matching is a necessary part of in-vehicle navigation systems for determining which road a vehicle is on. More recently, map matching has become particularly important as vehicles are used as probes for measuring road speeds, building statistical models of traffic delays, and understanding the behavior of drivers. In this talk, Eng. Sérgio Freitas will describe the open-source map matching system developed at Novageo Solutions for the 2012 ACM SIGSPATIAL Cup, which ranked 7th out of the 31 participating systems. The system uses a combination of different criteria, such as the geospatial distance to candidate road segments and topological matching between candidate edges. He will present the general architecture of the system, describe its main features, and discuss possible directions for further improvements.
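As a minimal illustration of the geometric criterion mentioned in the abstract (and only that criterion; the Novageo system also uses topological matching), the following Python sketch matches a GPS fix to the nearest road segment. The road names and coordinates are invented for the example.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def match_point(p, segments):
    """Return the id of the road segment geometrically closest to p."""
    return min(segments, key=lambda sid: point_segment_distance(p, *segments[sid]))

# Hypothetical two-road network:
roads = {
    "main_st": ((0, 0), (10, 0)),
    "side_st": ((0, 5), (10, 5)),
}
print(match_point((3, 1), roads))  # a GPS fix near y=0 matches main_st
```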
Date: 12-Nov-2012    Time: 11:00:00    Location: 020


Succinct structures for self-indexing text

Nieves R. Brisaboa

Universidade de Coruña

Abstract—The development of applications that manage large text collections requires indexing methods that allow efficient retrieval over text. Several indexes have been proposed that try to reach a good trade-off between the space needed to store both the text and the index, and search efficiency. Self-indexes have become increasingly popular in recent years. Not only do they index the text, but they also keep enough information to recover any portion of it without storing it explicitly; they therefore actually replace the text. In this talk I will present two useful self-indexes with good properties. They need only about 35% of the space of the plain text, yet they can efficiently answer retrieval queries thanks to their indexing capabilities.
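The specific self-indexes of the talk are not reproduced here, but the principle that an index can replace the text can be illustrated with the Burrows-Wheeler transform, a building block of FM-index-style self-indexes: the transform is invertible, so a structure that stores it implicitly retains the whole text. A naive Python sketch:

```python
def bwt(s):
    """Burrows-Wheeler transform: last column of the sorted rotation matrix."""
    s = s + "\0"  # sentinel that sorts before every other character
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

def inverse_bwt(b):
    """Recover the original text by repeatedly prepending and sorting."""
    table = [""] * len(b)
    for _ in range(len(b)):
        table = sorted(b[i] + table[i] for i in range(len(b)))
    row = next(r for r in table if r.endswith("\0"))
    return row.rstrip("\0")

print(inverse_bwt(bwt("banana")))  # banana
```

Real self-indexes store the transform in compressed form and invert it locally, which is what lets them answer queries in roughly a third of the plain-text space.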
Date: 29-Sep-2012    Time: 11:00:00    Location: 336


Privacy-Preserving Speech and Audio Processing

Bhiksha Raj

Carnegie Mellon University

Abstract—The privacy of personal data has generally been considered inviolable. On the other hand, in nearly any interaction, whether with other people or with computerized systems, we reveal information about ourselves. Sometimes this is intended, for instance when we use a biometric system to authenticate ourselves, or when we explicitly provide personal information in some manner. Often, however, it is unintended; for instance, a simple search performed on a server reveals information about our preferences. An interaction with a voice recognition system reveals to the system our gender, nationality (accent), and possibly emotional state and age. Regardless of whether the exposure of information is intentional or not, it could be misused, potentially putting us at financial, social and even physical risk. These concerns about the exposure of information have spawned a large and growing body of research addressing various issues about how information may be leaked, and how to protect it. One area of concern is sound data, particularly voice. For instance, voice-authentication and voice-recognition systems are becoming increasingly popular and commonplace. However, in the process of using these services, users expose themselves to potential abuse: as mentioned above, the server, or an eavesdropper, may obtain unintended demographic information about the user by analyzing the voice and sell this information, or may edit recordings to create fake recordings the user never spoke. Other such issues can be listed. Merely encrypting the data for transmission does not protect the user, since the recipient (the server) must finally have access to the data in the clear (i.e. decrypted form) in order to perform its processing. In this talk, we will discuss solutions for privacy-preserving sound processing, which enable a user to employ sound- or voice-processing services without exposing themselves to risks such as the above.
We will describe the basics of privacy-preserving techniques for data processing, including homomorphic encryption, oblivious transfer, secret sharing, and secure-multiparty computation. We will describe how these can be employed to build secure "primitives" for computation, that enable users to perform basic steps of computation without revealing information. We will describe the privacy issues with respect to these operations. We will then briefly present schemes that employ these techniques for privacy-preserving signal processing and biometrics. We will then delve into uses for sound, and particularly voice processing, including authentication, classification and recognition, and discuss computational and accuracy issues. Finally we will present a newer class of methods based on exact matches built upon locality sensitive hashing and universal quantization, which enables several of the above privacy-preserving operations at a different operating point of privacy-accuracy tradeoff.
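As a small illustration of one of the primitives listed above, the following Python sketch shows additive secret sharing and a secure sum: each input is split into random shares, and only the aggregate is ever reconstructed. This is a textbook construction, not the specific protocols of the talk.

```python
import random

P = 2**61 - 1  # a public prime modulus

def share(secret, n=3):
    """Split secret into n additive shares mod P; any n-1 shares reveal nothing."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    """Sum all shares mod P to recover the shared value."""
    return sum(shares) % P

# Secure sum: each party adds its shares locally; only the total is revealed.
a_shares, b_shares = share(120), share(80)
sum_shares = [(x + y) % P for x, y in zip(a_shares, b_shares)]
print(reconstruct(sum_shares))  # 200, without either input being revealed
```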
Date: 28-Sep-2012    Time: 14:00:00    Location: 020


FPGA-Based Platform for Real-Time Internet

Maciej Wielgosz

Norwegian University of Science and Technology

Abstract—Distributed Media Play (DMP) is a futuristic telepresence system aimed at providing a near-natural virtual meeting environment between users hosted in collaboration spaces. There are a number of applications for which DMP may prove useful, including, but not limited to, musical sessions, song lessons, distributed opera, multi-player games, near-natural virtual meetings and remote surgery. The collaboration spaces proposed by DMP give the perception, as nearly as possible, that the communicating parties are in the same physical space, by providing 3D auto-stereoscopic multi-view video and multi-channel sound. This requires very high data rates, about 10^3 to 10^5 times higher than currently offered throughput. To achieve this goal, DMP proposes a three-layer network architecture with a combined header for the Application, Transport, Network and Link Flow Control (LFC) layers, called AppTraNetLFC. It is to be implemented in hardware (FPGA) to give the minimum possible end-to-end delay for routing and other network-related functions. The network part of DMP is termed the Real-Time Internet.
Date: 07-Sep-2012    Time: 10:00:00    Location: 336


Speculations in Reliable Distributed Computing

Rachid Guerraoui

EPFL - École Polytechnique Fédérale de Lausanne

Abstract—If we are ever to understand what computers can collectively do, we need a new theory of complexity. Recent evolutions, including the cloud and the multicore, are turning computing ubiquitously distributed, rendering the classical complexity theory of centralized computing at best insufficient. A complexity theory for distributed computing has emerged in the last decades, measuring complexity for each specific model of the networked environment, represented by an adversary that may provoke asynchrony, failures, contention, etc. This "one adversary, one result" approach has led to an exponential proliferation of seemingly unrelated results, none of which captures current practices in the development of distributed applications. Instead, applications rely on speculative algorithms that perform well when the environment behaves nicely and degrade gracefully when the environment is more hostile, thereby considering several adversaries at the same time. With no underlying theory, however, the proposed speculative algorithms lack rigor, and there is anecdotal evidence of their fragility. Moreover, it is usually impossible to predict their behavior or determine whether their limitations are related to fundamental impossibilities or artifacts of specific infrastructures. The goal of this talk is to discuss a glimmer of a theory of speculative distributed computing.
BIO: Rachid Guerraoui is Professor in Computer Science at EPFL, where he directs the Institute of Theoretical Computer Science. He has worked in the past with HP Labs in Palo Alto and with MIT. He is interested in distributed computing, on which he has written a few books and many papers.
Date: 06-Sep-2012    Time: 16:00:00    Location: 020


WhatsUp : a P2P instant news items recommender


Abstract—WhatsUp is an instant news system designed for large-scale networks with no central bottleneck, single point of failure or censorship authority. Users express their opinions about the news items they receive by operating a like-dislike button. WhatsUp’s collaborative filtering scheme leverages these opinions to dynamically maintain an implicit social network and ensures that users subsequently receive news items that are likely to match their interests. Users with similar tastes are clustered using a similarity metric that reflects long-standing and emerging (dis)interests without revealing their profiles to other users. News items are disseminated through a heterogeneous epidemic protocol that (a) biases the choice of targets towards users with similar interests and (b) amplifies dissemination based on the interest in each news item. The push-based and asymmetric nature of the network created by WhatsUp provides natural support for limiting privacy breaches. The evaluation, through large-scale simulations, a ModelNet emulation on a cluster, and a PlanetLab deployment on real traces collected both from Digg and from a survey, shows that WhatsUp provides an efficient trade-off between accuracy and completeness.
Date: 05-Sep-2012    Time: 15:00:00    Location: Auditório INESC, Av. Duque de Ávila, 23


Cleaning data with constraints

Floris Geerts

University of Antwerp

Abstract—In this talk I will survey various approaches for modeling data quality by means of certain kinds of constraints (dependencies). In particular, the consistency and currency of data and the identification of similar objects will be addressed. Furthermore, constraints will be shown to be helpful when repairing the data in an automated way. Finally, future directions of research in data quality will be identified.
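As a concrete example of the kind of constraint involved (an illustration only, not necessarily the dependency classes covered in the talk), the following Python sketch detects violations of a functional dependency zip -> city in a toy relation; such violating groups are exactly what automated repair then has to resolve:

```python
from collections import defaultdict

def fd_violations(rows, lhs, rhs):
    """Return groups of tuples that agree on lhs but disagree on rhs,
    i.e. violations of the functional dependency lhs -> rhs."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in lhs)].add(tuple(row[a] for a in rhs))
    return {k: v for k, v in groups.items() if len(v) > 1}

data = [
    {"zip": "1000", "city": "Lisboa"},
    {"zip": "1000", "city": "Lisbon"},   # inconsistent with the row above
    {"zip": "4000", "city": "Porto"},
]
print(fd_violations(data, ["zip"], ["city"]))
# {('1000',): {('Lisboa',), ('Lisbon',)}}
```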
Date: 27-Jul-2012    Time: 10:00:00    Location: 020


Deeper QA: CMU, Watson, and the Open Advancement of Question Answering

Eric Nyberg

Carnegie Mellon University

Abstract—This talk presents a synopsis of 10 years of research and development at Carnegie Mellon in the area of Question Answering systems, including CMU's collaborations with IBM on the Watson system. Recent work in the open-source development of QA systems for different application domains will also be discussed.
Date: 19-Jul-2012    Time: 17:00:00    Location: Complexo Interdisciplinar, Instituto Superior Técnico


Ensemble pruning via Weighted Accuracy and Diversity

Samuel Zeng

University of Macao - Macao, China

Abstract—Ensemble Pruning (EP) refers to reducing the size of an ensemble prior to combining its predictions. The objective of ensemble pruning is to reduce the memory requirement and accelerate the classification process while preserving or improving the prediction accuracy of the ensemble. A pernicious problem with typical ensemble pruning algorithms is that they take only one of two crucial criteria into account when evaluating ensemble quality: either accuracy or diversity. None of them considers the two simultaneously, nor the interaction between them. Our claim is that accuracy and diversity are mutually constraining factors: assembling only classifiers with high accuracy may degrade the complementarity (diversity) and robustness of the ensemble, whereas assembling classifiers for diversity alone may seriously decrease classification accuracy. We therefore propose Weighted Accuracy and Diversity (WAD), a novel criterion for evaluating the quality of a classifier ensemble, supporting the ensemble pruning task. Our method coordinates accuracy and diversity by means of weighting, producing a single score that represents ensemble quality. The proposed method has been validated in several natural language processing applications, including part-of-speech tagging, shallow parsing and sentence boundary detection.
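The exact WAD formula is not given in this abstract; as a hypothetical sketch of the idea, the following Python function combines mean member accuracy with mean pairwise disagreement through a single weight alpha:

```python
from itertools import combinations

def wad_score(predictions, truth, alpha=0.5):
    """Illustrative weighted accuracy/diversity score (an assumption, not the
    paper's formula): alpha weights mean member accuracy against mean
    pairwise disagreement between members."""
    n = len(truth)
    accuracy = sum(
        sum(p == t for p, t in zip(preds, truth)) / n for preds in predictions
    ) / len(predictions)
    pairs = list(combinations(predictions, 2))
    diversity = sum(
        sum(a != b for a, b in zip(p1, p2)) / n for p1, p2 in pairs
    ) / len(pairs)
    return alpha * accuracy + (1 - alpha) * diversity
```

Pruning then amounts to greedily removing the member whose removal least decreases this score, stopping at the target ensemble size.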
Date: 11-Jul-2012    Time: 16:00:00    Location: 336


Integrated Neurocomputational and Empirical Studies of Learning and Cognition

Tiago V. Maia

Departamento de Engenharia Informática

Abstract—My program of research focuses on the integrated use of neurocomputational models and empirical techniques – most notably brain imaging – to investigate the neural bases of learning and cognition in healthy subjects and the disruption of these processes in psychiatric disorders. This talk will present three examples in which this integrated approach has been crucial to arrive at novel insights that would be out of reach for more classical approaches. In the first example, I will use a standard reinforcement-learning model to show that learning to avoid negative outcomes depends on an internally generated reward signal. The firing of dopamine neurons in the brain represents reward signals, so this model made new predictions concerning patterns of dopamine release during avoidance learning that we have confirmed experimentally. In the second example, we have used model-based functional magnetic resonance imaging (fMRI) – a technique that fits a computational model to behavior and fMRI data – to identify the neural substrates of habit learning in humans. We found that learning a habit depends on the engagement of a specific brain region; in fact, we were able to distinguish participants who learned the habit from those who did not based on whether or not they engaged this region. In the third and final example, using a neurocomputational model in which we manipulated neurotransmitter levels, we found that reduced levels of a specific neurotransmitter produce the behavioral deficits that characterize attention-deficit/hyperactivity disorder (a disorder characterized by inattention, hyperactivity, and impulsivity). The model made several new predictions concerning abnormal patterns of brain connectivity in patients with this disorder, which we have confirmed using fMRI. 
These examples illustrate the power of using neurocomputational models in tight integration with empirical techniques to advance our understanding of the neural bases of high-level cognitive processes in healthy subjects and to elucidate how these processes go awry in psychiatric disorders.
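The first example can be caricatured with a delta-rule update in which successful avoidance delivers an internally generated reward; the toy model below is an assumption for illustration, not the talk's actual reinforcement-learning model:

```python
def learn_avoidance(trials=100, alpha=0.2, internal_reward=1.0):
    """Toy delta-rule sketch: the value of the avoidance response is driven by
    an internally generated reward delivered on each successful avoidance,
    rather than by the mere absence of the aversive outcome."""
    v = 0.0  # learned value of the avoidance response
    for _ in range(trials):
        delta = internal_reward - v  # prediction error on successful avoidance
        v += alpha * delta
    return v

print(round(learn_avoidance(), 3))  # converges toward the internal reward value
```

Without such an internal signal (internal_reward = 0), the value would stay at zero and avoidance could never be sustained, which is the behavioral puzzle the dopamine prediction-error account resolves.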
Date: 29-Jun-2012    Time: 11:00:00    Location: 04


Entropy-based Pruning for Phrase-based Machine Translation

Wang Ling

Carnegie Mellon University, USA and INESC-ID Lisboa, IST

Abstract—Phrase-based machine translation models have been shown to yield better translations than word-based models, since phrase pairs encode the contextual information that is needed for a more accurate translation. However, many phrase pairs do not encode any relevant context, which means that the translation event encoded in such a phrase pair is composed of smaller translation events that are independent from each other and can be found in smaller phrase pairs, with little or no loss in translation accuracy. In this work, we propose a relative entropy model for translation models that measures how likely it is that a phrase pair encodes a translation event derivable from smaller translation events with similar probabilities. This model is then applied to phrase table pruning. Tests show that considerable amounts of phrase pairs can be excluded without much impact on translation quality. In fact, we show that better translations can be obtained using our pruned models, due to the compression of the search space during decoding.
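A simplified sketch of the criterion (with invented probabilities; the paper's exact formulation may differ): a phrase pair whose direct probability is close to the product of its independent sub-phrase probabilities contributes little relative entropy and is a candidate for pruning.

```python
import math

def pruning_score(p_direct, p_composed):
    """Contribution of one phrase pair to the relative entropy between the
    full model (p_direct) and the pruned model that must compose the same
    translation from smaller phrase pairs (p_composed)."""
    return p_direct * math.log(p_direct / p_composed)

# A hypothetical long phrase pair whose translation decomposes into two
# independent word-level pairs with probabilities 0.6 and 0.5:
p_phrase = 0.30
p_composed = 0.6 * 0.5
print(pruning_score(p_phrase, p_composed))  # ~0: the pair is safely prunable
print(pruning_score(0.30, 0.05))            # large: the pair carries real context
```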
Date: 22-Jun-2012    Time: 15:00:00    Location: 336


Translational research on genomics and proteomics: experience of the SING group

Departamento de Engenharia Informática

Abstract—About the SING group at the University of Vigo: The Next Generation Computer Systems Group (SING, Sistemas Informáticos de Nueva Generación) brings together a small number of researchers with the aim of developing intelligent models and deploying them in real environments. The expertise of its members comes from different areas, including previous research in developing symbolic, connectionist and hybrid AI systems, solving security problems, network administration, e-commerce, VoIP, implementing web applications, and developing systems that work with document databases. The projects carried out by the SING group always follow a practical point of view, while taking into consideration the formal aspects needed in any research work. Indeed, the most interesting techniques employed in previous works involve case-based reasoning, artificial neural networks, fuzzy logic, rough sets, intelligent agents and multi-agent systems, etc.
Date: 04-Jun-2012    Time: 17:00:00    Location: 020


Formalization of English Phrasal Verbs

Peter A. Machonis

Florida International University

Abstract—As Sag et al. (2002: 14) state, multiword expressions "constitute a key problem that must be resolved in order for linguistically precise NLP to succeed." This talk presents the results of using the linguistic development environment NooJ, together with manually constructed lexicon-grammar tables of transitive and neutral English phrasal verbs, to automatically recognize these structures, with and without insertion, in large corpora. We tested our grammar on written works, such as 19th century British novels, as well as on an oral corpus consisting of 25 transcribed Larry King Live programs from January 2000, achieving 85% accuracy. In addition to a phrasal verb grammar and a dictionary containing 1,200 English phrasal verbs, we have also added two disambiguation grammars to automatically remove incorrect phrasal verbs from NooJ’s Text Annotation Structures (TAS) and a dictionary to eliminate additional noise originating from compounds and idiomatic expressions. Although our analysis shows how difficult it is to accurately identify English phrasal verbs in large corpora, our study confirms that lexicon-grammar tables and NooJ are indeed powerful linguistic tools that, when used together, can help solve a key problem in Natural Language Processing.
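A toy illustration of the insertion phenomenon (a plain regular expression, nothing like the NooJ lexicon-grammar machinery): matching the phrasal verb "pick up" whether the object is inserted between verb and particle or not.

```python
import re

# Hypothetical single-verb pattern: "pick" in a few inflected forms, with up to
# three inserted words before the particle "up".
PATTERN = re.compile(r"\bpick(?:ed|s|ing)?\b(?:\s+\w+){0,3}?\s+up\b", re.I)

sentences = [
    "She picked the heavy box up.",  # insertion: object between verb and particle
    "Pick up the phone!",            # no insertion
    "He walked up the hill.",        # not a "pick up" construction
]
for s in sentences:
    print(bool(PATTERN.search(s)), "-", s)
```

The disambiguation problem the talk addresses is visible even here: a pattern this crude would also fire on literal uses such as "picked a flower up on the hill", which is why the NooJ approach layers dedicated disambiguation grammars on top.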
Date: 25-May-2012    Time: 15:00:00    Location: 336


Prediction of Escherichia coli single gene deletion mutants by projection to latent pathways


Abstract—In metabolic engineering and synthetic biology, robust models with high predictive power are required. Constraints-based modelling methods such as metabolic flux analysis (MFA), flux balance analysis (FBA), elementary flux modes (EFMs) or extreme pathways (EP) have been widely used. The success of these methods is, however, conditioned by an often insufficient mechanistic knowledge base. In this study, we built upon a previously developed hybrid constraints-based modelling method to develop E. coli models with improved predictive power. In particular, we apply a projection to latent pathways (PLP) method that merges mechanistic and statistical constraints. It may be considered a middle-out modelling approach that combines reliable knowledge and reverse engineering to extract unknown mechanisms from “omics” data sets. The method is applied to predict the central carbon fluxes of several E. coli strains (both wild-type and single gene KO mutants). We show that the central carbon fluxes of several single gene KO E. coli mutants could be predicted with high accuracy from the combined information of gene deletion and environmental conditions.
Date: 25-May-2012    Time: 14:30:00    Location: 020


Epidemic spreading in online and offline social networks

Ciro Cattuto

ISI Foundation

Abstract—The ever increasing adoption of online social networks and mobile technologies allows to measure human behaviors at unprecedented levels of detail and scale. This enables a data-driven investigation of complex processes over social networks, challenging established assumptions and paving the road to new models. In the first part of the talk we will focus on human mobility, and in particular on dynamic networks of face-to-face proximity in indoor environments, measured by using wireless wearable sensors. We will show that simple spreading processes can be used as dynamical probes to expose important features of the interaction patterns such as burstiness and causal constraints. We will argue that in order to correctly model the arrival times of messages propagating over the network, it is useful to abandon the notion of wall-clock time in favor of a node-specific notion of time, defined in terms of activity levels. We will use this insight to build a model of dynamic social network that reproduces in simulation the spreading features observed for empirical data. Finally, we will highlight a few challenges in using empirical human contact networks to simulate more complex epidemic processes. In the second part of the talk we will shift our focus to online social networks and discuss information spreading in the Twitter micro-blogging system. We will discuss the viral adoption of social annotations (hashtags) and track their popularity over time as a proxy for the collective attention of an online community. We will show that hashtag popularity defines discrete classes of collective attention that correspond to diverse social semantics of the hashtags. We will model hashtag adoption as an epidemic susceptible-infectious-recovered (SIR) process, and discuss the interpretation of the measured epidemic parameters for the different hashtag classes.
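The SIR view of hashtag adoption mentioned above can be sketched with a well-mixed simulation (parameter values are illustrative, not those measured in the study):

```python
import random

def sir_epidemic(n=1000, beta=0.3, gamma=0.1, seed_infected=5, rng=None):
    """Well-mixed SIR sketch of hashtag adoption: susceptible users may adopt
    (become infectious) on contact with adopters, then lose interest
    (recover) at rate gamma. Returns the final number of ever-adopters."""
    rng = rng or random.Random(0)
    s, i, r = n - seed_infected, seed_infected, 0
    while i > 0:
        new_i = sum(rng.random() < beta * i / n for _ in range(s))
        new_r = sum(rng.random() < gamma for _ in range(i))
        s, i, r = s - new_i, i + new_i - new_r, r + new_r
    return r

print(sir_epidemic())  # with beta/gamma = 3, the hashtag typically reaches most users
```

Fitting beta and gamma per hashtag is what allows the different popularity classes in the talk to be compared in epidemic terms.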
Date: 23-May-2012    Time: 11:00:00    Location: 020


Assistive Speech Technology

Steve Renals

University of Edinburgh

Abstract—Speech Technology has a lot to offer Augmentative and Alternative Communication (AAC) and Assistive Technologies (AT). In this talk, I will outline some of the work that we have done in this area, focusing on two main areas: the development of speech-based interfaces for older people, which may be used in the home environment; and the development of speech-based interfaces for people with disabilities or motor control diseases that result in disordered speech. Specific research topics will include speech recognition for ageing voices, synthetic speech suitable for older people, the recognition of dysarthric speech, and the development of personalised synthetic voices (voice cloning).
Date: 17-May-2012    Time: 14:00:00    Location: 336


Explicit Scratchpad Memory Management from Object Files to Improve the Energy Efficiency of Embedded Systems

Prof. Dr. José Luís Almada Güntzel

Universidade Federal de Santa Catarina

Abstract—Scratchpad memories (SPMs) have become popular in embedded systems because of their energy efficiency. The literature on SPMs seems to indicate that dynamically changing their contents outperforms static allocation. Although overlay-based (OVB) techniques operating at the source-code level can benefit from multiple hot spots for greater energy savings, they cannot exploit program elements originating from libraries. When operating directly on binaries, however, OVB approaches lead to smaller savings, frequently require dedicated hardware, and sometimes preclude the allocation of data. Moreover, the energy savings reported by all techniques so far ignore the fact that, in systems with caches, the caches should be tuned before allocation to SPM. In this presentation we show experimental evidence that, when non-overlay-based (NOB) methods are used to manipulate binary files, the memory energy savings due to SPM allocation range from 15% to 33% on average, which is as good as or better than the savings reported for OVB approaches operating on binaries. Since these savings (contrary to related work) were measured after fine-tuning the caches, when there is less room for optimization, these results encourage the use of simpler NOB methods to build allocators that can take library elements into account and do not depend on specialized hardware. Experimental results also show that, given the capacity Ct of an equivalent pre-tuned cache, the optimal SPM size lies in [Ct/2, Ct] for 85% of the evaluated programs.
Finally, we present counter-intuitive evidence that, even for cache-based architectures with small SPMs, procedure-level granularity is preferable to basic-block granularity, except in a few applications that combine frequently accessed elements with relatively high miss rates.
Date: 09-May-2012    Time: 11:00:00    Location: 04


Advances in Structured Prediction for Natural Language Processing

André Martins


Abstract—This thesis proposes new models and algorithms for structured output prediction, with an emphasis on natural language processing applications. We advance in two fronts: in the inference problem, whose aim is to make a prediction given a model, and in the learning problem, where the model is trained from data. For inference, we make a paradigm shift, by considering rich models with global features and constraints, representable as constrained graphical models. We introduce a new approximate decoder that ignores global effects caused by the cycles of the graph. This methodology is then applied to syntactic analysis of text, yielding a new framework which we call “turbo parsing,” with state-of-the-art results. For learning, we consider a family of loss functions encompassing conditional random fields, support vector machines and the structured perceptron, for which we provide new online algorithms that dispense with learning rate hyperparameters. We then focus on the regularizer, which we use for promoting structured sparsity and for learning structured predictors with multiple kernels. We introduce online proximal-gradient algorithms that can explore large feature spaces efficiently, with minimal memory consumption. The result is a new framework for feature template selection yielding compact and accurate models.
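One building block of the online proximal-gradient algorithms mentioned, the proximal operator of the L1 regularizer (plain L1 here; the thesis also treats structured, group-wise variants), is simple enough to sketch:

```python
def soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_1: shrink each weight toward zero by
    tau and zero out whatever falls below it. Applied after each gradient
    step, this is what produces sparse (feature-selecting) models."""
    return [max(abs(x) - tau, 0.0) * (1 if x > 0 else -1) for x in w]

print(soft_threshold([0.9, -0.2, 0.05], 0.1))  # small weights are zeroed out
```

In the feature-template-selection setting of the thesis, a group-wise analogue of this operator discards entire templates at once, which is what yields the compact models.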
Date: 04-May-2012    Time: 15:00:00    Location: 336


A Toolbox for Probability Calculus and Optimization

Pedro Miguel Lúcio Melgueira

Universidade de Évora

Abstract—Probability calculus is required in the solution of many engineering problems. While there are software libraries available for linear algebra operations, such as BLAS and LAPACK, libraries specific to probability computations are generally lacking. In this talk, a toolbox for discrete probability calculus that supports an arbitrary number of variables will be presented. This library can then be used to solve common problems such as the application of Bayes' law, HMM state estimation, optimization problems on probability spaces, etc. The work was developed by Pedro Melgueira, of the University of Évora, in the scope of a BII grant.
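A minimal sketch of the kind of operation such a toolbox supports, here the application of Bayes' law over a discrete hypothesis space (the toolbox's actual API is not shown in the abstract, so the interface below is invented):

```python
def bayes_update(prior, likelihood, observation):
    """Discrete Bayes law: posterior(h) is proportional to
    prior(h) * likelihood(observation | h), renormalized to sum to 1."""
    unnormalised = {h: prior[h] * likelihood[h][observation] for h in prior}
    z = sum(unnormalised.values())
    return {h: p / z for h, p in unnormalised.items()}

prior = {"fair": 0.5, "loaded": 0.5}
likelihood = {"fair": {"six": 1 / 6}, "loaded": {"six": 1 / 2}}
posterior = bayes_update(prior, likelihood, "six")
print(posterior)  # seeing a six shifts belief toward the loaded die
```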
Date: 27-Apr-2012    Time: 15:00:00    Location: 04


Dynamic neuroimaging using EEG-fMRI

Patrícia Figueiredo


Abstract—It has recently become possible to record the EEG simultaneously with fMRI, providing whole-brain maps of the hemodynamic (fMRI) correlates of electrophysiological (EEG) activity. The combination of the EEG high temporal resolution with the fMRI high spatial resolution offers a unique opportunity for studying the spatio-temporal dynamics of brain activity noninvasively. Here, I will focus on the application of EEG-fMRI to the study of spontaneous epileptic activity in patients undergoing pre-surgical evaluation. The EEG is used to identify electrical discharges associated with epileptic seizures, as well as interictal epileptiform spikes, and the fMRI correlates of such EEG features can then be used to localize the brain networks involved in the epileptic activity. I will first introduce the basic principles and main challenges of the EEG-fMRI technique. I will then present results regarding the investigation of the link between EEG and fMRI signals, as well as the functional brain connectivity underlying seizure propagation. Appropriate biophysically-inspired models are employed in order to extract the relevant information from the EEG-fMRI data. It is shown that, in this way, important insights may be gained into the spatio-temporal dynamics of epileptic activity.
Date: 27-Apr-2012    Time: 14:30:00    Location: 020


Data Integration Issues in Facilities Management

Paulo Jorge Fernandes Carreira


Abstract—Facilities Management (FM) is receiving increasing attention as the efficient operation of buildings grows in importance for both economic and environmental reasons. The multi-disciplinary nature of FM requires a plethora of different tools to be brought together. Therefore, the integration of the information underlying FM tools is a crucial aspect of the effectiveness of FM as a whole, and has been a source of concern for research and industry alike. Yet, despite these efforts, information integration in FM still faces a number of limitations which, in our view, hinder the development of FM itself. In this talk we will overview the main challenges regarding information integration in this important field and discuss how to tackle them using state-of-the-art data integration techniques.
Date: 26-Apr-2012    Time: 11:00:00    Location: N7.1


Speaker and Content Identification

Xavier Anguera

Telefonica Research

Abstract—In this talk I will cover two of the topics I have been recently working on. First, with regard to speaker identification, I will introduce the use of binary fingerprints to model the voice of a speaker. By projecting standard acoustic vectors onto a special GMM model (representing the speaker acoustic space), high-dimensional binary vectors are obtained, which have proven successful in identifying speakers in speaker verification and diarization tasks. Second, I will talk about current developments in pattern matching approaches that allow content-centric applications to be developed when little or no training data is available for a particular language. In particular, I will describe a query-by-example system that I presented at the MediaEval 2011 evaluation, which uses a novel feature-extraction front-end that I further described in a paper at ICASSP 2012.
Date: 13-Apr-2012    Time: 15:00:00    Location: 336


Hybrid Modeling for Systems Biology: Theory and Practice

Rui Oliveira

Universidade Nova de Lisboa

Abstract—I will give an introduction to hybrid modeling methods for bioprocess and biochemical network modeling. Hybrid methods combine parameter-free modeling with statistical modeling tools, making it possible to blend mechanistic knowledge and statistical relationships into models with improved performance and broader scope. Examples of such techniques are hybrid bioreactor modeling for optimisation and control, hybrid metabolic flux analysis for modeling the formation of complex recombinant products, metabolic pathway analysis constrained by statistical relationships with “omic” data sets, and reverse envirome-guided metabolic reconstruction without knowledge of kinetic parameters. This tutorial aims to (i) give an overview of the theoretical foundations of hybrid modeling for systems biology and (ii) provide an introduction to HYBMOD, a MATLAB toolbox for systems biology hybrid modeling. The theoretical methods will be exemplified through simulation and experimental case studies.
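The blend of mechanistic and statistical parts can be sketched in a few lines. This is our conceptual illustration, not HYBMOD code: the mass balances are mechanistic, while the growth-rate law is a data-driven component (a fitted polynomial standing in for a neural network, trained here on synthetic observations).

```python
import numpy as np

# Conceptual sketch (ours) of a hybrid bioreactor model: dX/dt = mu*X and
# dS/dt = -mu*X/Y are first-principles mass balances, while mu(S) is a
# black-box component fitted to data instead of an assumed kinetic law.

S_obs = np.linspace(0.1, 5.0, 30)              # "measured" substrate levels
mu_obs = 0.5 * S_obs / (0.5 + S_obs)           # "measured" growth rates
mu_fit = np.poly1d(np.polyfit(S_obs, mu_obs, deg=3))   # statistical part

def simulate(X0=0.1, S0=5.0, Y=0.5, dt=0.01, steps=500):
    X, S = X0, S0
    for _ in range(steps):
        mu = max(mu_fit(S), 0.0) if S > 0 else 0.0     # hybrid rate term
        X += mu * X * dt                               # mechanistic balances
        S = max(S - mu * X / Y * dt, 0.0)              # Euler, for brevity
    return X, S

Xf, Sf = simulate()
```

The hybrid structure keeps the conservation laws exact while letting the data decide the kinetics, which is the core idea the tutorial develops.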
Date: 13-Apr-2012    Time: 14:30:00    Location: 020


DMIR Group Seminar: NetDyn - Understanding real large networks, from structure to dynamics

Alexandre P. Francisco

INESC-ID Lisboa and IST

Abstract—For more than 250 years, graph theory has evolved into an important area of research. In recent years, however, with the explosive growth of real networks and structured data sets, a new class of graphs came to light, and a new multidisciplinary area of research on graphs has rapidly developed. The NetDyn project follows this line of research, with the aim of developing new models and tools for studying the structural evolution of large networks and the dynamics of the processes running on them. With these tools, we hope to be able to answer questions about real systems by fitting our models to real data. In this talk, I will present the NetDyn project, in which we plan to study real datasets such as query logs collected from the Yahoo! search engine, social network messages and connections collected from Twitter through the public programmatic streaming API, the yeast regulatory network, and population evolution data collected from several origins through well-known molecular typing methods. All these data sets are known to be scale-free networks and subject to highly dynamic processes.
Date: 29-Mar-2012    Time: 14:00:00    Location: 336


Computational Analysis of Protein Coevolution and Interaction

Fábio Madeira


Abstract—Computational modelling of protein interactions (docking) is an important endeavour because protein complexes are difficult to determine by experimental methods alone. Nevertheless, computational prediction of protein interactions is no trivial task either, and much remains to be done to improve the reliability of protein docking methods. Protein coevolution traces identifiable from the analysis of multiple sequence alignments can help predict interacting contacts and can be used to constrain the search space of docking algorithms, such as BiGGER. Although several methods for detecting coevolution have been developed, only a few have been extended to the study of protein-protein interactions, and not much is known about their ability to identify contact points. We addressed this issue by developing Pycoevol, an integrated system consisting of a set of open-source tools for automating the identification of contact points from inter-protein coevolution.
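One classic coevolution signal that such analyses build on can be written in a few lines. This is our illustration, not Pycoevol code: the mutual information between two alignment columns, maximal when residues covary perfectly and zero when they vary independently.

```python
from collections import Counter
from math import log2

# Illustrative sketch (ours): mutual information between two multiple-
# sequence-alignment columns, a standard signal for coevolving positions.

def column_mi(col_a, col_b):
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum(c / n * log2((c / n) / (pa[x] / n * pb[y] / n))
               for (x, y), c in pab.items())

mi_covarying = column_mi(list("AAAACCCC"), list("GGGGTTTT"))    # 1 bit
mi_independent = column_mi(list("AAAACCCC"), list("GTGTGTGT"))  # 0 bits
```

In practice such raw scores need corrections for phylogeny and alignment bias, which is part of what dedicated tools provide.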
Date: 09-Mar-2012    Time: 14:30:00    Location: 020


Systems analysis and metabolic networks modeling

Rafael Costa


Abstract—Systems biology provides new approaches for in silico metabolic engineering and drug development through the application of analysis, simulation and optimization methods for metabolic models. In silico modeling of cellular metabolism is divided between genome-scale stoichiometric models and small-scale kinetic models. While the former are analyzed under optimality assumptions to predict intracellular microbial fluxes and growth rates, the latter are used to simulate dynamic behaviour. However, there is currently a separation between these two modeling approaches. In this talk, I will discuss my work on the challenges of building novel computational approaches for large, complex metabolic networks of biological systems and of bridging the gap between kinetic and genome-scale stoichiometric models.
Date: 02-Mar-2012    Time: 14:30:00    Location: 020


Semantic Analysis of Streaming Social Data

Vasco Pedro

Instituto Superior Técnico

Abstract—Social data represents an important aspect of human communication. More and more services rely on stream analysis, but analyzing the content of those streams is difficult, since most of the resources available for text analysis are not a good fit for social data. In addition, the amount of data requires a level of scalability that brings particular challenges. In this talk I will give an overview of new techniques for social data analysis and of current efforts for scaling analysis pipelines. Finally, I will describe some practical applications of these technologies and how they are being applied in the market on a daily basis.
Date: 02-Mar-2012    Time: 11:00:00    Location: 336


Coordinating towards a common good

Francisco C. Santos

INESC-ID Lisboa and IST

Abstract—Throughout their lives, humans often engage in cooperative endeavors, ranging from family-related issues to global warming. In all cases, the tragedy of the commons threatens the possibility of reaching the optimal solution associated with global cooperation, a scenario predicted by theory and demonstrated by many experiments. Using the toolbox of evolutionary game theory, I will first address several mechanisms that are able to efficiently promote cooperation. Next, I will discuss an evolutionary and/or social-learning dynamics approach to a broad class of cooperation problems in which attempting to minimize future losses turns the risk of failure into a central issue in individual decisions. We find that decisions within small groups under high risk and stringent requirements for success significantly raise the chances of coordinating actions and escaping the tragedy of the commons. We also offer insights on the scale at which public goods problems of cooperation are best solved. Instead of large-scale endeavors involving most of the population, which, as we argue, may be counterproductive to achieving cooperation, the combination of local agreements within groups smaller than the population at risk significantly raises the probability of success. In addition, our model predicts that, if one takes into consideration that groups of different sizes are interwoven in complex networks of contacts, the chances for global coordination in an overall cooperating state are further enhanced.
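The threshold structure behind these results can be made concrete with a toy payoff function. This is our sketch, following the general shape of collective-risk dilemma models; M, b, c and r denote the cooperator threshold, endowment, contribution fraction and risk of collective loss, and the specific numbers are invented.

```python
# Toy collective-risk payoffs (our sketch): if fewer than M of the group's
# members cooperate, everyone loses the endowment b with probability r;
# cooperators additionally pay the contribution c*b.

def payoffs(k, M=3, b=1.0, c=0.1, r=0.9):
    """Expected payoff of (defector, cooperator) in a group with k cooperators."""
    kept = 1.0 if k >= M else 1.0 - r      # expected fraction of b retained
    return b * kept, b * kept - c * b
```

With k = 2 < M every member expects only 0.1, while with k = 3 a cooperator still nets 0.9: under high risk, tipping a small group past its threshold is worth the contribution, which is the intuition behind the coordination results above.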
Date: 22-Feb-2012    Time: 11:00:00    Location: 04


Ciberescola da Língua Portuguesa: objectives, construction and results

Ana Sousa Martins

Centro de Linguística da Universidade Nova de Lisboa

Abstract—The main objective of this talk is to describe the functionality of the Ciberescola da Língua Portuguesa platform. The Ciberescola da LP comprises interactive exercises and online courses teaching Portuguese, both as a native and as a foreign language. All content, for both the exercises and the courses, is original, produced and managed specifically by teachers and researchers in linguistics and literature, covering the competencies of grammar, reading, writing, listening comprehension and vocabulary. It is an HTML-only solution. Each exercise is assisted by scoring, error signaling and, at a later stage, by solutions, with users having access to a record of the path they have taken. After a description of the general objectives and a brief history of the project's construction, the operations currently available to the user are demonstrated. The talk ends with a report of flaws identified through informal testing and an outline of the changes to be implemented to improve the project.
Date: 10-Feb-2012    Time: 15:00:00    Location: 336


Online Bayesian Time-varying Parameter Estimation of HIV-1 data

Andras Hartmann


Abstract—The importance of a systems-theory-based approach to understanding immunological diseases, in particular HIV-1 infection, is being increasingly recognized. The dynamics of virus infection may be effectively represented by compact state-space models in the form of nonlinear ordinary differential equations (ODEs). Nonlinear Bayesian filtering offers various online tools for the system identification of parametric ODE models. Since parameters may change with time, a relevant question is how well time-varying parameters can be estimated from data. For this purpose, two different filtering methods, the Extended Kalman Filter and the Particle Filter, were applied to state and time-varying parameter estimation. After evaluating the methods on simulated time series, we applied them to long-term clinical datasets. The time-varying parameters estimated from clinical data are consistent with results previously reported for offline algorithms.
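The core trick can be illustrated with a minimal sketch (ours, not the talk's code): the unknown rate k(t) in a toy scalar system dx/dt = -k(t)x + u is appended to the state and modeled as a random walk, so each Extended Kalman Filter update revises both the state x and the parameter k.

```python
import numpy as np

# Minimal sketch (ours) of online time-varying parameter estimation via
# state augmentation: the EKF jointly tracks [x, k] from noisy x readings.

def ekf_track(ys, u=1.0, dt=0.05, q_k=1e-3, r=0.05 ** 2):
    s = np.array([ys[0], 0.5])        # augmented state [x, k], rough k prior
    P = np.eye(2)
    Q = np.diag([1e-6, q_k])          # process noise: k is allowed to drift
    H = np.array([[1.0, 0.0]])        # only x is observed
    ks = []
    for y in ys[1:]:
        x, k = s
        s = np.array([x + (-k * x + u) * dt, k])   # Euler prediction
        F = np.array([[1 - k * dt, -x * dt],       # transition Jacobian
                      [0.0, 1.0]])
        P = F @ P @ F.T + Q
        K = P @ H.T / (P[0, 0] + r)                # Kalman gain
        s = s + K.ravel() * (y - s[0])
        P = (np.eye(2) - K @ H) @ P
        ks.append(s[1])
    return np.array(ks)

# Synthetic check: the true rate jumps from 0.3 to 0.8 halfway through.
rng = np.random.default_rng(0)
n, dt = 400, 0.05
k_true = np.where(np.arange(n) < n // 2, 0.3, 0.8)
x = np.empty(n)
x[0] = 1.0 / 0.3                                   # start near steady state
for t in range(1, n):
    x[t] = x[t - 1] + (-k_true[t] * x[t - 1] + 1.0) * dt
ks = ekf_track(x + rng.normal(0, 0.05, n), dt=dt)
```

On this toy system the estimate settles near 0.3 before the jump and drifts toward 0.8 after it; a Particle Filter handles the same augmentation without the linearization step.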
Date: 27-Jan-2012    Time: 14:30:00    Location: 020


CLIO & ++SPICY - tools for schema mapping

Valéria Magalhães Pequeno


Abstract—In order to share and integrate information from multiple autonomous and heterogeneous data sources in a reliable environment, we must deal with the schema mapping problem. This problem involves both: a) mapping one or more source schemata into another schema (the target); and b) discovering one or more executable transformations (e.g., SQL queries) that transform the source data into the target's structure. In this presentation, we will introduce some concepts related to schema mapping, and then focus on two schema mapping tools: CLIO and ++SPICY.
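Point (b) can be illustrated with a toy executable transformation. The schemas below are invented for illustration; the mapping is materialized as a single SQL query over an in-memory SQLite database.

```python
import sqlite3

# Toy illustration (ours, invented schemas): a schema mapping realized as
# an executable SQL transformation that restructures two source tables
# into the target table.

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE src_person(name TEXT, dept TEXT);        -- source schema
    CREATE TABLE src_dept(dept TEXT, building TEXT);
    CREATE TABLE tgt_employee(name TEXT, building TEXT);  -- target schema
    INSERT INTO src_person VALUES ('ana', 'db'), ('rui', 'ai');
    INSERT INTO src_dept VALUES ('db', 'N7'), ('ai', 'N4');
""")
# The mapping itself: join the sources and populate the target.
con.execute("""
    INSERT INTO tgt_employee(name, building)
    SELECT p.name, d.building
    FROM src_person p JOIN src_dept d ON p.dept = d.dept
""")
rows = sorted(con.execute("SELECT * FROM tgt_employee"))
```

Tools such as CLIO and ++SPICY automate the discovery of such queries from high-level correspondences between the schemas, rather than requiring them to be written by hand.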
Date: 25-Jan-2012    Time: 11:00:00    Location: 336


Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization

Luís Marujo

INESC-ID Lisboa and IST

Abstract—Fast and effective automated indexing is a critical problem for personalized online news aggregation systems, such as News360, Google News, and Yahoo! News. Key phrases, which consist of one or more words and represent the main concepts of a document, are often used for the purpose of indexing. The accuracy of current state-of-the-art automated key-phrase extraction (AKE) systems is in the 30-50% range, which makes improving AKE an urgent problem. In this work, we followed a fairly traditional approach of training a classifier to select an ordered list of the most likely candidate key phrases in a given document. We augmented the process with new features, e.g. the use of signal words and Freebase categories. We also experimented with two forms of document pre-processing that we call light filtering and co-reference normalization. Light filtering removes sentences that are judged peripheral to the document's main content. Co-reference normalization unifies several written forms of the same named entity into a unique form. Finally, we used Amazon's Mechanical Turk (MTurk) service to label documents for training and testing.
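The shape of such a pipeline can be caricatured in a few lines. This is our drastically simplified sketch: a fixed linear score stands in for the trained classifier, and the stopword list and feature weights are invented.

```python
import re
from collections import Counter

# Simplified sketch (ours) of a candidate-then-rank AKE pipeline: generate
# n-gram candidates, compute two classic features (frequency and first-
# occurrence position), and rank by a hand-weighted linear score.

STOP = {"the", "a", "of", "and", "to", "in", "is", "for"}

def candidates(text, max_n=2):
    words = re.findall(r"[a-z']+", text.lower())
    cands = []
    for n in range(1, max_n + 1):           # uni- and bigram candidates
        for i in range(len(words) - n + 1):
            ng = words[i:i + n]
            if ng[0] not in STOP and ng[-1] not in STOP:
                cands.append((" ".join(ng), i / len(words)))
    return cands

def rank_keyphrases(text, top=3):
    cands = candidates(text)
    tf = Counter(c for c, _ in cands)       # feature 1: frequency
    first = {}
    for c, pos in cands:                    # feature 2: earliest position
        first.setdefault(c, pos)
    score = {c: 1.0 * tf[c] + 0.5 * (1 - first[c]) for c in tf}
    return [c for c, _ in sorted(score.items(), key=lambda kv: -kv[1])[:top]]

top_phrases = rank_keyphrases(
    "key phrase extraction helps indexing; "
    "key phrase extraction systems rank candidate phrases")
```

In the actual work the linear score is replaced by a classifier trained on MTurk-labeled documents, with many more features, and light filtering and co-reference normalization are applied before candidate generation.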
Date: 06-Jan-2012    Time: 15:00:00    Location: 336


Deadline-aware scheduling for Software Transactional Memory

Pascal Felber

Universite de Neuchatel

Abstract—Software Transactional Memory (STM) is an optimistic concurrency control mechanism that simplifies the development of parallel programs. Still, the benefits of STM have not yet been demonstrated for reactive applications that require bounded response time for some of their operations. We propose to support such applications by allowing the developer to annotate some transaction blocks with deadlines. Based on previous execution statistics, we adjust the transaction execution strategy by decreasing the level of optimism as the deadline nears, through two modes of conservative execution, without overly limiting the progress of concurrent transactions. Our implementation comprises an STM extension for gathering statistics and implementing the execution-mode strategies. We have also extended the Linux scheduler to disable preemption or migration of threads that are executing transactions with deadlines. Our experimental evaluation shows that our approach significantly improves the chance of a transaction meeting its deadline when its progress is hampered by conflicts.
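The mode-selection idea can be sketched as a toy policy. The thresholds and mode names below are invented for illustration; the point is only the monotone decrease of optimism as slack shrinks and aborts accumulate.

```python
# Toy policy sketch (ours, invented thresholds): pick a transaction's
# execution mode from its deadline slack and the mean duration previously
# observed for that atomic block.

def choose_mode(deadline, now, mean_duration, attempts):
    slack = deadline - now
    if attempts == 0 and slack > 4 * mean_duration:
        return "optimistic"        # plenty of time: run speculatively
    if slack > 2 * mean_duration:
        return "visible-read"      # first conservative mode
    return "irrevocable"           # second conservative mode: cannot abort
```

In the actual system the statistics come from an STM extension and the scheduler additionally shields deadline transactions from preemption and migration.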
Date: 30-Dec-2011    Time: 14:00:00    Location: 020


Tools and medical applications of an evolutionary cell biology

José Pereira-Leal

Instituto Gulbenkian de Ciência

Abstract—In the Computational Genomics Lab we combine the study of evolutionary cell biology with translational, or medical, bioinformatics. We study evolutionary cell biology, i.e. the evolutionary mechanisms underlying the origins and evolution of cellular life and of the complex structures within the cell. We are also very interested in the medical, or translational, applications of bioinformatics and evolutionary genomics, and are conducting collaborative projects on pathogenic bacteria, protozoa and several types of human cancer. We are a small research group that includes biologists, computer scientists, chemists and clinicians.
Date: 06-Dec-2011    Time: 12:30:00    Location: Room C01 of IST (cave do pavilhão central)


Innovation & Entrepreneurship - BES Ventures

BES Ventures

Abstract—An interesting seminar for all researchers, especially those interested in technology transfer and those with ideas with high potential for the creation of technology-based companies.
Date: 05-Dec-2011    Time: 15:00:00    Location: 336


Topic-dependent Language Model Switching for Embedded Speech Recognition

Marcos Santos Pérez

Universidade de Málaga

Abstract—Nowadays, many applications for mobile embedded devices have a voice interface that uses Automatic Speech Recognition (ASR). For multi-domain/topic speech recognition, other researchers propose the use of several recognizers in parallel or in series to improve the WER (Word Error Rate). This kind of approach is not feasible on embedded devices because the recognition time would be prohibitive. Another approach is to use an external server to perform the recognition, at the expense of certain limitations (network availability, latency, etc.). This work focuses on a new approach based on Language Model Switching (LMS), in which topic detection and language model switching are performed within a multipass ASR system running on an embedded device.
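The switching step itself can be caricatured as follows. This is our toy sketch, with invented unigram models and probability floor: score the first-pass transcript under each topic's language model and pick the best one for the second pass.

```python
import math

# Toy sketch (ours) of LM switching: the topic whose unigram LM gives the
# first-pass hypothesis the highest log-probability wins the second pass.

def logprob(words, lm, floor=1e-6):
    return sum(math.log(lm.get(w, floor)) for w in words)

def pick_lm(first_pass, lms):
    return max(lms, key=lambda name: logprob(first_pass, lms[name]))

lms = {
    "talks": {"the": 0.2, "speaker": 0.4, "slide": 0.4},
    "news": {"the": 0.4, "minister": 0.3, "court": 0.3},
}
best = pick_lm(["the", "speaker", "slide"], lms)
```

Because only one model is active per pass, the memory and time cost stays close to a single-recognizer system, which is what makes the approach viable on embedded hardware.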
Date: 02-Dec-2011    Time: 15:00:00    Location: 020


G-Tries: an efficient data-structure for counting subgraphs

Pedro Ribeiro

CRACS & INESC-Porto LA / DCC-FCUP, Universidade do Porto

Abstract—Complex networks are ubiquitous in real-world systems. To help understand their design principles, the concept of network motifs emerged. These are recurrent, overrepresented patterns of interconnections that can be seen as building blocks of networks. Algorithmically, discovering these motifs is a hard problem, which limits their practical applicability. I will give an overview of the state of the art in algorithms for finding these patterns. I will then present a novel data structure, g-tries, designed to efficiently represent a collection of graphs and to search for them as induced subgraphs of another, larger graph. I will explain how it takes advantage of common substructure and how symmetry-breaking conditions can be used to avoid redundant computations. I will also briefly introduce a sampling methodology capable of trading accuracy for even better execution times, and give some notes on the scalability of the methods, showing that they are suitable for parallel computation. Finally, I will show an extensive empirical evaluation of the developed algorithms on a set of diversified complex networks, showing that g-tries can outperform all previously existing competitor algorithms.
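For scale, the baseline such a data structure must beat can be sketched as a naive census. This is our illustration, not the g-trie algorithm: enumerate every 3-node set and classify its induced subgraph; g-tries avoid this kind of redundant work by sharing common substructure across the searched patterns.

```python
from itertools import combinations

# Baseline illustration (ours): naive census of all 3-node induced
# subgraphs of an undirected graph, classified by edge count.

def census3(adj):
    counts = {"triangle": 0, "path": 0}
    for a, b, c in combinations(sorted(adj), 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 3:
            counts["triangle"] += 1
        elif edges == 2:                # connected, but one edge missing
            counts["path"] += 1
    return counts

# A 4-cycle with one chord: two triangles and two induced 3-node paths.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
result = census3(adj)
```

This brute force is cubic in the number of nodes for size-3 patterns and explodes combinatorially for larger motifs, which is exactly where specialized structures and sampling pay off.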
Date: 02-Dec-2011    Time: 14:30:00    Location: 336


Advances in understanding the epidemiology of HIV over the past decade

Professor Sir Roy Anderson FRS, FMedSci

Imperial College London

Abstract—Short biography: Sir Roy is Professor of Infectious Disease Epidemiology in the School of Public Health, Faculty of Medicine, Imperial College London. His recent appointments include Rector of Imperial College London and Chief Scientist at the Ministry of Defence, UK. Sir Roy has also served as Director of the Wellcome Centre for Parasite Infections from 1989 to 1993 (at Imperial College London) and as Director of the Wellcome Centre for the Epidemiology of Infectious Disease from 1993 to 2000 (at the University of Oxford). He is the author of over 450 scientific articles and has sat on numerous government and international agency committees advising on public health and disease control, including the World Health Organisation and UNAIDS. From 1991 to 2000 he was a Governor of the Wellcome Trust. He is currently a Trustee of the Natural History Museum, London, a Governor of the Institute of Government London, a Member of the Singapore National Research Foundation, a Member of the International Advisory Committee of Thailand's National Science and Technology Development Agency, and a Member of the Bill and Melinda Gates Grand Challenges advisory board. He is a non-executive director of GlaxoSmithKline and a member of the International Advisory Board of Hakluyt and Company Ltd. Sir Roy was elected Fellow of the Royal Society in 1986, a Founding Fellow of the Academy of Medical Sciences in 1998, a Foreign Associate Member of the Institute of Medicine at the US National Academy of Sciences in 1999 and a Foreign Member of the French Academy of Sciences in 2009. He was knighted in the 2006 Queen's Birthday Honours.
Date: 28-Nov-2011    Time: 11:00:00    Location: Anfiteatro 58, Edifício Egas Moniz, Faculdade de Medicina da Universidade de Lisboa


Learning to Rank Academic Experts

Catarina Moreira

Instituto Superior Técnico

Abstract—Expert finding is an information retrieval problem that has attracted considerable interest from the scientific community in recent years. However, the current state of the art lacks good approaches for combining different sources of evidence in an optimal way. This work explores the use of Learning to Rank methods as an approach for optimally combining multiple relevance estimators. These estimators are derived from textual content, from graph structures representing citation patterns in the expert community, and from profile information about the experts. This work also explores the effectiveness of rank-aggregation approaches combining several data fusion techniques. Experiments on a collection of academic publications in the computer science field confirm the adequacy of both proposed approaches.
Date: 18-Nov-2011    Time: 15:00:00    Location: 020


Intelligent Systems and Control

Prof. Dr. Thomas A. Runkler

Siemens ICT

Abstract—Intelligent systems and components are able to efficiently manage complexity and uncertainty, while exhibiting high flexibility and robustness. Intelligent systems perceive and understand their environment, learn and adapt their behavior, and act autonomously, even under previously unknown circumstances. Agent technologies, peer-to-peer systems, and grid computing are the basis for self-organization, self-optimization, and self-configuration. Key applications of intelligent systems are complex data analysis, signal processing, monitoring, fault prevention, diagnosis, system modeling, prognosis, robotics and advanced control systems. This presentation gives an overview of the intelligent system activities at Siemens and presents selected industrial project examples. Thomas Runkler received his MS and PhD in electrical engineering from the Technical University of Darmstadt, Germany, in 1992 and 1995, respectively. He has been teaching computer science at the Technical University of Munich, Germany, since 1999, and was appointed honorary professor in 2011. Since 1995 he has been working for Siemens Corporate Technology, currently in the position of the Global Technology Field Leader for Intelligent Systems and Control with about 60 employees located in Germany, USA, Russia, and India. Thomas authored and co-authored more than 150 scientific publications. His main research interests include computational intelligence, data analytics, self-organization and control. He is a Vice Chair of the IEEE Computational Intelligence Society Committee on Technology Transfer and speaker of the Fuzzy Systems and Soft Computing group of the German Association for Computer Science (GI).
Date: 10-Nov-2011    Time: 10:00:00    Location: Room P12, Pav. de Matemática, IST


Mixed-signal fault equivalence: search and evaluation

Marcelino Bicho dos Santos, Nuno Guerreiro

Silicongate, Lda

Abstract—The aim of this paper is to reduce the fault simulation effort required to evaluate test effectiveness in mixed-signal circuits. Exhaustive simulation of basic analog and mixed-signal structures in the presence of individual faults is used to identify potentially equivalent faults. Fault equivalence is then evaluated based on the simulation of all faults in a case study, a DC-DC switched buck converter. The number of transistor stuck-on and stuck-off faults that need to be simulated is reduced to 31% in the structures already processed by the proposed methodology. This approach is a significant contribution towards making mixed-signal fault simulation feasible as part of production test preparation.
Date: 08-Nov-2011    Time: 12:30:00    Location: 020


Test Scheduler: a Design-based Tool for Test Automation

Carlos Beltran Almeida, Marcelino Bicho dos Santos, Aleksandar Ilic, Tiago Manuel Oliveira Henriques Moita


Abstract—Testing today's analogue and mixed-signal integrated circuits is a difficult task. To cope with distinct circuit design characteristics and the lack of standardized test principles, we rely on a test platform based on a novel test methodology. The presented solution bridges the gap between design and test processes by offering an intuitive unified interface. On top of this methodology, in the software platform layer, we propose a software tool developed to assist designers and test engineers in the automatic generation and configuration of test sequences. The proposed tool is capable of detecting test parameters from the input HSPICE-compliant test specification files. Moreover, an extension to the HSPICE format is proposed to express mutually exclusive values for test parameters. The tool also provides the automatic creation of new HSPICE-compliant test sequences based on the selection of test parameters through an intuitive graphical interface. Additionally, it allows parameter reordering in preexisting test sequences. The benefits of using the proposed solution are demonstrated when generating a large number of test sequences, providing a significant reduction of design and test time.
Date: 08-Nov-2011    Time: 11:30:00    Location: 020


PT-STAR - Speech Translation Advanced Research to and from Portuguese

Luísa Coheur


Abstract—Every year, more than a billion Euros is spent translating documents and interpreting speeches for European institutions. Moreover, about half of all Europeans speak only their own language. These two facts alone are a strong motivation for fostering Speech-to-Speech Machine Translation (S2SMT) technologies, which aim at enabling natural language communication between people who do not share a language. S2SMT can be seen as a cascade of three major components: Automatic Speech Recognition, Machine Translation and Text-to-Speech Synthesis. One of the main problems of this multidisciplinary area, however, is the still weak integration between the three components. Hence, the main goal of PT-STAR is to improve speech translation systems for Portuguese by strengthening this integration. During this talk we will present the main achievements of the PT-STAR project so far and show a real-time demo.
Date: 04-Nov-2011    Time: 15:00:00    Location: 020


A Relational Data Mining perspective for Bioinformatics applications

Rui Camacho

Universidade do Porto

Abstract—The talk will have two parts. In the first part, I will give a broad overview of the area of Inductive Logic Programming (ILP) as a promising approach to Relational Data Mining. The advantages of this approach, together with its main applications, will then be presented. In the second part, I will focus on: i) a technique for conceptual clustering in Relational Data Mining that I have been working on recently; ii) the application of ILP to rational Drug Design; and iii) work on using ILP for Protein Folding.
Date: 31-Oct-2011    Time: 12:30:00    Location: Room C01 of IST (cave do pavilhão central)


Behavioral Pattern Detection using Compact and Fast Methods

Nuno Homem


Abstract—This work proposes algorithms and methods for individual behavior detection within very large populations. We consider domains where individual behavior presents some stable characteristics over time, and where individual actions can be observed through events in a data stream. Event patterns are characterized and used as a proxy for individual behavior and actions. As in many domains behavior does not remain static but evolves over time, we adopt the sliding window model, making the assumption that behavior is stable during the considered time window. This work covers the detection of the specific characteristics of an individual and of what distinguishes their behavior from that of all other individuals. Algorithms must have a minimal memory footprint and scale to cope with a huge number of individuals. Providing and keeping results up to date in near real time is also a goal, as in many situations information is only useful for limited periods. Fortunately, approximate answers are adequate for most problems. Some fast and compact methods for diversity analysis are introduced, both for unlimited time and for the sliding window model. Innovative algorithms are proposed to describe and characterize individual event patterns, and these are then used to create an individual event fingerprint. Using that fingerprint, one is able to identify the individual even when identification information is not available. Distinct uses of the fuzzy fingerprint concept are presented for individual identification, which might also be extended to specific behavior identification, classification, profiling, etc., with examples in several domains such as internet traffic analysis, telecommunications fraud detection and text authorship analysis.
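One compact-and-fast building block in this spirit can be sketched as follows. This is our illustration, not the thesis's algorithms: a per-individual event fingerprint maintained with the Space-Saving top-k counter, with an unlabeled stream matched to a known individual by cosine similarity; the streams and event names are invented.

```python
import math

# Illustrative sketch (ours): Space-Saving keeps at most k counters over an
# unbounded stream; on overflow it evicts the minimum-count item and the
# newcomer inherits that (over)count plus one.

def space_saving(stream, k=8):
    counts = {}
    for e in stream:
        if e in counts:
            counts[e] += 1
        elif len(counts) < k:
            counts[e] = 1
        else:
            victim = min(counts, key=counts.get)   # evict the minimum
            counts[e] = counts.pop(victim) + 1     # inherit its count + 1
    return counts

def cosine(a, b):
    dot = sum(v * b.get(e, 0) for e, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

fp_a = space_saving(["mail"] * 30 + ["login"] * 20 + ["web"] * 5)
fp_b = space_saving(["game"] * 40 + ["chat"] * 10)
fp_new = space_saving(["mail"] * 10 + ["login"] * 8 + ["chat"] * 2)
```

Because the summary is bounded by k regardless of stream length, millions of individuals can be fingerprinted with a fixed memory budget, at the price of approximate counts.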
Date: 28-Oct-2011    Time: 15:00:00    Location: 020


Linking Named Entities to the Corresponding Wikipedia Pages

Ivo Anastácio

INESC-ID Lisboa and IST

Abstract—This talk will present the participation of the DMIR/INESC-ID team in the Monolingual Knowledge Base Population (KBP) track at the 2011 Text Analysis Conference (TAC-KBP), for which we developed a supervised learning approach to the task of linking named entities in an English text to globally unique identifiers. Formally, the KBP problem consists in linking named entities (queries), and the respective texts where they occur, to the corresponding entries in a knowledge base (i.e., a subset of the English Wikipedia). If there are no such entries, systems are required to cluster together the queries referring to the same non-KB (NIL) real-world entity. We modeled this problem through three supervised learning tasks, namely (a) ranking candidate knowledge-base disambiguations for each named entity, (b) classifying the top-ranked disambiguations as correct or not (i.e., finding the NIL queries), and (c) classifying pairs of queries with an estimated incorrect disambiguation as referring to the same entity or not, so that the transitive closure of these pairs forms the set of equivalence classes (i.e., the NIL clusters). The talk will detail the features and the learning methods used for modeling the three tasks, and will also present an analysis of the obtained results.
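Tasks (a) and (b) can be caricatured in a few lines. This is our simplification, not the team's system: a string-similarity score stands in for the learned candidate ranker, and a fixed threshold for the learned NIL classifier.

```python
from difflib import SequenceMatcher

# Caricature (ours) of entity linking: rank KB entries by name similarity
# to the query and declare NIL when even the best score is too low.

def link(query, kb, nil_threshold=0.6):
    scored = [(SequenceMatcher(None, query.lower(), e.lower()).ratio(), e)
              for e in kb]
    score, best = max(scored)
    return best if score >= nil_threshold else "NIL"
```

Task (c), NIL clustering, would then group the unlinked queries, for instance by applying a similar similarity measure over query strings and their contexts; the real system learns all three steps from labeled data.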
Date: 18-Oct-2011    Time: 15:30:00    Location: 336


Training with Imperfect Transcripts and Language Model Adaptation for ASR of TED talks

João Miranda


Abstract—In the first part of this talk, we describe a method developed to take advantage of audio with imperfect transcriptions for training acoustic models. Labels that are only approximate, e.g. closed captions, are more common than carefully transcribed speech, and it is therefore useful to be able to exploit these types of transcriptions for training. An iterative algorithm is presented that attempts to improve the transcriptions by inserting or removing filler words and pauses before they are handed to the training process. In the second part of the talk, we describe an information-retrieval-based language model adaptation technique, employed to improve recognition performance on topic-oriented TED lectures, together with a new technique for implicit language model interpolation.
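A standard building block behind such adaptation is linear LM interpolation with the mixture weight fit by EM on held-out text. This is our hedged sketch of that classic technique, not the talk's implicit-interpolation method, and the toy unigram models below are invented.

```python
# Sketch (ours): P(w) = lam*P_topic(w) + (1-lam)*P_bg(w), with lam fit by
# a few EM steps on held-out words.

def interpolate(p_topic, p_bg, lam):
    return {w: lam * p_topic.get(w, 0.0) + (1 - lam) * p_bg.get(w, 0.0)
            for w in set(p_topic) | set(p_bg)}

def em_lambda(held_out, p_topic, p_bg, lam=0.5, iters=20):
    for _ in range(iters):
        # E-step: posterior that the topic model generated each word;
        # M-step: the new weight is the average posterior.
        post = [lam * p_topic.get(w, 1e-9) /
                (lam * p_topic.get(w, 1e-9) + (1 - lam) * p_bg.get(w, 1e-9))
                for w in held_out]
        lam = sum(post) / len(post)
    return lam

p_topic = {"ted": 0.5, "talk": 0.5}
p_bg = {"ted": 0.1, "talk": 0.1, "the": 0.8}
lam = em_lambda(["ted", "talk", "ted", "the"], p_topic, p_bg)
mix = interpolate(p_topic, p_bg, lam)
```

Since most of the held-out words fit the topic model better, EM pushes the weight above one half, biasing recognition toward the lecture's topic while keeping background coverage.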
Date: 30-Sep-2011    Time: 15:00:00    Location: 020


Support for User Involvement in Data Cleaning

Helena Galhardas


Abstract—Data cleaning and ETL processes are usually modeled as graphs of data transformations. The involvement of the users responsible for executing these graphs over real data is important for tuning data transformations and for manually correcting data items that cannot be treated automatically. This talk will describe recent research in which, to better support user involvement in data cleaning processes, data cleaning graphs were equipped with data quality constraints, which help users identify the points of the graph and the records that need their attention, and with manual data repairs, which represent the way users can provide the feedback required to manually clean data items. Some preliminary experimental results will be presented, showing the significant gains obtained with the use of data cleaning graphs.
Date: 26-Sep-2011    Time: 16:00:00    Location: 336


TTS issues in Speech-to-Speech Machine translation

Gopala Krishna Anumanchipalli

Carnegie Mellon University, USA and INESC-ID Lisboa, IST

Abstract—This talk focuses on the speech synthesis issues in Speech-to-Speech Machine Translation (S2SMT). Specifically, I will present the TTS problems (not solutions!) we currently face in the PT-STAR project. I will present some oracle experiments, which may be considered best-case performances, and compare them with the current state of the system. This is joint work with the PT-STAR groups in L2F and CMU. It is very much work in progress, and comments and suggestions are most welcome.
Date: 23-Sep-2011    Time: 15:00:00    Location: 020


Projects "CAMP - Computational Analysis of MicroRNAs in Plants" and "NetDyn: Understanding real large networks, from structure to dynamics"

A. P. Francisco, Paulo G. S. da Fonseca


Abstract—Paulo Fonseca and Alexandre Francisco are researchers at INESC-ID / IST. They are the Principal Investigators of the two projects listed below, approved in the last FCT call, and this Friday they will informally talk about them. CAMP - Computational Analysis of MicroRNAs in Plants, PTDC/EIA-EIA/122534/2010 (speaker: Paulo Fonseca); NetDyn: Understanding real large networks, from structure to dynamics, PTDC/EIA-CCO/118533/2010 (speaker: Alexandre Francisco).
Date: 23-Sep-2011    Time: 14:30:00    Location: 336


On the Implementation of a Secure Musical Database Matching

José Portêlo

INESC-ID Lisboa and IST

Abstract—This paper presents an implementation of a privacy-preserving music database matching algorithm, showing how privacy is achieved at the cost of computational complexity and execution time. The paper presents not only implementation details but also an analysis of the obtained results in terms of communication between the two parties, computational complexity, execution time and correctness of the matching algorithm. Although the paper focuses on a music matching application, the principles can easily be adapted to other tasks, such as speaker verification and keyword spotting.
Date: 09-Sep-2011    Time: 15:00:00    Location: 336


Towards Environmentally Robust Speech Applications

Ramon Fernandez Astudillo


Abstract—The talk will review the work performed in these first ten months and introduce new tools and testing benchmarks. It will also focus on future objectives and will try to motivate potential joint work. This first year focused on speech processing and dynamic adaptation of acoustic models to the environment. Future work will shift towards the use of context and language models for robustness, as well as language processing robustness in the presence of partially missing or unreliable ASR transcriptions. For this reason I find the talk a good opportunity to get feedback on how to steer future research lines and to start joint research. That also means that I will do my best to make the topic as understandable as possible (no stampede of formulas).
Date: 29-Jul-2011    Time: 14:30:00    Location: 336


The Delft Reconfigurable VLIW Processor

Stephan Wong

Technical University of Delft

Abstract—In this presentation, we present the rationale and design of the Delft reconfigurable and parameterized VLIW processor called rho-VEX (rVEX for short). Its architecture is based on the Lx/ST200 ISA developed by HP and STMicroelectronics. We implemented the processor on an FPGA as an open-source softcore and made it freely available. Using the rVEX, we intend to bridge the gap between general-purpose and application-specific processing through parametrization of many architectural and organizational features of the processor. The initial set of parameters includes: the instruction set (number and type of supported instructions), the number and type of functional units (FUs), the issue-width (number of slots), the register file size, and the memory bandwidth. The parameters can be set in a static or dynamic manner to provide the best performance or the best utilization of the available resources on the FPGA. A complete toolchain, including a C compiler and a simulator, is freely available, and any application written in C can be mapped to the rVEX processor. This VLIW processor is able to exploit the instruction-level parallelism (ILP) inherent in an application and make its execution faster compared to a RISC processor system. Recent developments will be presented. The rVEX is currently being further developed within an EU-funded project called ERA: Embedded Reconfigurable Architectures.
Date: 26-Jul-2011    Time: 16:30:00    Location: 336


A Polymorphic Finite Field Multiplier

Saptarsi Das

Indian Institute of Science

Abstract—In this work we present the architecture of a polymorphic multiplier for operations over various extensions of GF(2). We evolved the architecture of a textbook shift-and-add multiplier into the architecture of the polymorphic multiplier through a generalized mathematical formulation. The polymorphic multiplier is capable of morphing itself at runtime to create data-paths for multiplications of various orders. To optimally exploit the resources, we also introduced the capability of sub-word parallel execution in the polymorphic multiplier. The synthesis results of an instance of such a polymorphic multiplier show about 41% savings in area, with 21% degradation in maximum operating frequency, compared to a collection of dedicated multipliers with equivalent functionality. We introduced the multiplier as an accelerator unit for field operations in the coarse-grained runtime-reconfigurable platform REDEFINE. We observed a 2.4x improvement in the performance of the AES algorithm with the multiplier, compared to a software realization of the multiplication kernels.
Date: 22-Jul-2011    Time: 15:00:00    Location: 020
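The textbook shift-and-add multiplier that the polymorphic design generalizes can be sketched in software as follows (an illustrative model, not the hardware of the talk); for concreteness it uses the AES field GF(2^8) with the irreducible polynomial x^8 + x^4 + x^3 + x + 1:

```python
def gf_mul(a: int, b: int, poly: int = 0x11B, width: int = 8) -> int:
    """Shift-and-add multiplication in GF(2^width), reduced modulo 'poly'."""
    result = 0
    for _ in range(width):
        if b & 1:
            result ^= a          # "add" is XOR in a field of characteristic 2
        b >>= 1
        a <<= 1                  # "shift" multiplies a by x
        if a & (1 << width):
            a ^= poly            # reduce whenever the degree reaches 'width'
    return result

# {53} * {CA} = {01}: 0x53 and 0xCA are multiplicative inverses in the AES field
print(hex(gf_mul(0x53, 0xCA)))
```

Changing `poly` and `width` selects a different extension of GF(2), which is the flexibility the polymorphic hardware provides at runtime.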


REDEFINE: Application Synthesis on Reconfigurable Silicon Cores

Prof. S. Nandy

Indian Institute of Science

Abstract—Emerging embedded applications are based on evolving standards (e.g., MPEG2/4, H.264/265, IEEE 802.11a/b/g/n). Since most of these applications run on handheld devices, there is an increasing need for a single-chip solution that can dynamically interoperate between different standards and their derivatives. In order to achieve high resource utilization and low power dissipation, we propose REDEFINE, a Polymorphic ASIC in which specialized hardware units are composed from basic functional units at runtime. It is a "future-proof" custom hardware solution for multiple applications and their derivatives in a domain. In this talk, I will provide a broad overview of the architecture of REDEFINE and its hardware-aware application synthesis framework. REDEFINE comprises an array of Tiles interconnected in a honeycomb network. Each Tile comprises a Compute Element and a Router. In the synthesis process, applications described in a high-level language (e.g., C) are compiled into application sub-structures. Each application substructure is hosted onto a set of Compute Elements on REDEFINE to form a Computational Structure that is a functional equivalent of a hardwired unit. In the application synthesis methodology for REDEFINE, application-specific Computational Structures are composed and destroyed in both space and time for the different application substructures, to support polymorphism in hardware. The characterization, diversity and multiplicity of the functional units in a Compute Element are domain specific. Thus, while the architecture of REDEFINE is application agnostic, the Compute Elements in REDEFINE can be chosen to be domain specific to enable the synthesis of hardware accelerators on Reconfigurable Silicon Cores.
Date: 21-Jul-2011    Time: 14:00:00    Location: Sala de reuniões do DEEC


Pentaho Data Integration and the AgileBI Movement

André Simões

Xpand IT

Abstract—The goal of this presentation is to explain the functionality of the ETL ("Extract-Transform-Load") tool provided by the open-source Pentaho platform (a rather interactive session of questions and curiosities about it is expected). The functionality that allows mapping the reality of ETL development to agile methodologies will also be demonstrated.
Date: 12-Jul-2011    Time: 10:00:00    Location: 336


Towards a mathematical model of risk assessment of biocide induced antibiotic resistance

Joana Coelho

INESC-ID Lisboa and IST

Abstract—Biocides have been widely used for several decades to preserve materials, including food and cosmetics, to decontaminate surfaces, to disinfect instruments, in fabrics and even in toys, for personal hygiene, and to prevent the transmission of infections. Nevertheless, when used in large volumes or at high concentrations, biocides have toxic effects, and excessive use is dangerous for the environment, including animals and humans. Despite this widespread and ever-increasing use of biocides, most bacterial and fungal species remain susceptible, but decreased susceptibility has been reported and occasionally linked to antibiotic resistance, mainly in human and veterinary pathogens. The problem of the development of resistances, together with the possibility of preventing them, has been carefully considered by the EC in the Biocides Directive 98/8/CE, a norm that mandates a high level of protection for the environment and man, and harmonizes the rules for placing active substances and biocidal products on the market within the European Union. This work is developed in the context of the European project BIOHYPO (Proposal No 227258 of the "FP7 Cooperation Work Programme: Food, Agriculture and Fisheries, and Biotechnologies"; Dr. Marco Oggioni, PI). The main goal is the evaluation of the risk of a clinically significant increase or spread of antibiotic resistance in food pathogens due to biocide use. Statistical analyses are performed on a large data set of Staphylococcus aureus in order to gain insight into the real clinical relevance of any antibiotic/biocide co- and cross-resistance.
Date: 08-Jul-2011    Time: 15:00:00    Location: Anf. Qa1.2 – Torre Sul (IST)


Systems-of-Systems Engineering - The Engineering of Multiple Integrated Complex Systems

Rui Santos Cruz

Instituto Superior Técnico

Abstract—This talk provides an overview of the concepts of "Systems Engineering" and "Systems-of-Systems Engineering", based on a review of the following bibliography:
- A. Gorod et al., "System-of-Systems Engineering Management: A Review of Modern History and a Path Forward," IEEE Systems Journal, vol. 2, pp. 484-499, Dec. 2008.
- S. B. Johnson, "Three Approaches to Big Technology: Operations Research, Systems Engineering, and Project Management," Technology and Culture, vol. 38, pp. 891-919, Oct. 1997.
- A. Sousa-Poza et al., "System of systems engineering: an emerging multidiscipline," International Journal of System of Systems Engineering, vol. 1, pp. 1-17, Jan. 2008.
- R. Valerdi et al., "A research agenda for systems of systems architecting," International Journal of System of Systems Engineering, vol. 1, pp. 171-188, Jan. 2008.
Date: 08-Jul-2011    Time: 12:00:00    Location: 020


Conceptual Modelling in Information Systems

Hugo Miguel Álvaro Manguinhas

Instituto Superior Técnico

Abstract—The set of concepts used to describe a particular domain of interest constitutes a conceptualization of that domain (i.e., a conceptual model). In the past decades, several modeling languages have been designed to ultimately express all the constraints of a conceptual model. This paper presents an overview of some of these languages, evaluating their ability to express conceptual models.
Date: 08-Jul-2011    Time: 11:30:00    Location: 020


Research challenges from Free Software Distributions

Prof. Roberto di Cosmo

University Paris VII

Abstract—Free Software distributions, like Debian, RedHat, or Ubuntu, are some of the largest component-based software systems, and they all use packages as their building blocks, together with tools for selecting, installing and removing packages on a running system. Evolving such complex software systems is a daunting task that carries significant challenges: in this talk, after providing a simple formalisation of packages and distributions, we will survey some recent results and algorithms developed to answer questions like "which is the most important package among the 27000 in Debian squeeze?", or "which version change is most likely to have an impact on the system?"
Date: 06-Jul-2011    Time: 09:00:00    Location: sala I do Pav. de Informática II do IST
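One simple way to make "most important package" concrete (an illustrative toy, not the algorithms surveyed in the talk; the package names are invented) is to count, for each package, how many others transitively depend on it:

```python
def reverse_dependents(deps):
    """For each package, count how many packages transitively depend on it.
    'deps' maps a package to the set of packages it directly depends on."""
    def closure(pkg, seen):
        # collect everything reachable from pkg along dependency edges
        for d in deps.get(pkg, ()):
            if d not in seen:
                seen.add(d)
                closure(d, seen)
        return seen
    counts = {p: 0 for p in deps}
    for p in deps:
        for d in closure(p, set()):
            counts[d] = counts.get(d, 0) + 1
    return counts

# hypothetical mini-distribution: everything ultimately needs libc
deps = {
    "editor": {"libgui", "libc"},
    "libgui": {"libc"},
    "shell":  {"libc"},
    "libc":   set(),
}
print(reverse_dependents(deps))
```

On a real distribution, more refined measures (e.g. centrality scores on the dependency graph) are used, but the reverse-dependency count already singles out base libraries.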


Local identifiability of a HIV-1 infection model using a sensitivity approach

João Gonçalo Silva Marques

INESC-ID Lisboa and IST

Abstract—The dynamic modeling of the Human Immunodeficiency Virus 1 (HIV-1) infection is still one of the great challenges in systems biology. The high prevalence of Acquired Immune Deficiency Syndrome (AIDS), known to be caused by HIV, and the fact that no cure has yet been discovered, make this area of study particularly relevant. In this paper, a dynamic model for the HIV-1 infection is analyzed. The sensitivity and identifiability issues are addressed with the purpose of optimizing the time points at which patients' blood samples should be drawn. This paper shows that there are time periods far more informative than others, thus improving parameter identifiability and estimability in the reverse engineering step.
Date: 01-Jul-2011    Time: 14:00:00    Location: 04
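A minimal sketch of the sensitivity idea (illustrative only; the model structure, parameters and values are invented for the example, and the talk's actual model may differ), using a basic target-cell-limited HIV-1 model and finite differences to see at which sample times the viral load is most sensitive to the infection rate:

```python
def simulate(beta, delta=1.0, lam=10.0, d=0.1, p=100.0, c=5.0):
    """Forward-Euler integration of a basic HIV-1 target-cell model:
    target cells T, infected cells I, virus V. Returns V sampled once
    per time unit over 10 time units."""
    dt, steps, stride = 0.001, 10000, 1000
    T, I, V = 100.0, 0.0, 1e-3
    samples = []
    for k in range(steps + 1):
        if k % stride == 0:
            samples.append(V)
        dT = lam - d * T - beta * T * V
        dI = beta * T * V - delta * I
        dV = p * I - c * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
    return samples

def sensitivity(beta, eps=1e-6):
    """Central finite-difference sensitivity dV(t)/dbeta at the sample times."""
    lo, hi = simulate(beta - eps), simulate(beta + eps)
    return [(h - l) / (2 * eps) for l, h in zip(lo, hi)]

s = sensitivity(0.01)
# sample times with large |dV/dbeta| are the most informative for estimating beta
```

Ranking sample times by the magnitude of such sensitivities is one way to choose when blood samples are most worth drawing.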


Onto.PT: a lexical ontology for Portuguese, built automatically

Hugo Gonçalo Oliveira

Universidade de Coimbra

Abstract—Given the landscape of lexical resources for Portuguese and the difficulties inherent in manually building a broad lexical-semantic resource for a language, the Onto.PT project aims to exploit free textual resources, such as dictionaries, thesauri or encyclopedias, in the automatic construction of a lexical ontology for our language. This presentation describes the steps needed to obtain, from text, a lexical ontology structured along the lines of a wordnet, i.e., one in which concepts are represented by groups of synonymous words (synsets) that are, in turn, linked to other concepts through semantic relations. The presentation will be accompanied by examples of the results obtained, included in the first version of this resource.
Date: 22-Jun-2011    Time: 14:30:00    Location: 336


A mixture-of-experts approach to biclustering

José Caldas

Helsinki Institute for Information Technology

Abstract—Biclustering is the unsupervised learning task of mining a data matrix for submatrices, known as biclusters, with desirable properties. For instance, the goal can be to find groups of genes that are co-expressed under particular biological conditions. Many biclustering methods do not allow biclusters to overlap; others do, but need to specify how the biclusters interact at the overlapping regions. It is therefore of interest to devise methods that allow flexible, overlapping bicluster structures while not forcing the practitioner to specify bicluster interaction models. We propose a mixture modelling framework allowing biclusters to overlap but not requiring the practitioner to postulate any parameter interaction models between biclusters. Sharing a similar intuition to mixture-of-experts models, our model allows biclusters to specify partly overlapping regions of expertise in which the biclusters are able to model the data adequately. The uncertainty over assignments of data points to biclusters depends on the membership of data points to these regions of expertise. We perform inference and parameter estimation via a variational expectation-maximization framework. The model is easily adaptable to different data types and compares favorably to other approaches, both in a binary DNA copy number variation data set and in a miRNA expression data set.
Date: 17-Jun-2011    Time: 14:00:00    Location: 020


Parallel Video Coding on Multi-Core Platforms

Svetislav Momcilovic


Abstract—This talk addresses scalable parallelization methods for real-time video coding, considering both conventional H.264/AVC and Distributed Video Coding (DVC), on multi-core platforms such as the most recent general-purpose multi-cores, Graphics Processing Units (GPUs) and the Cell Broadband Engine (Cell/BE).
Date: 17-Jun-2011    Time: 10:00:00    Location: 336


The Statistical Phrase/Accent Model for Intonation Modeling

Gopala Krishna Anumanchipalli

Carnegie Mellon University, USA and INESC-ID Lisboa, IST

Abstract—In this talk I will describe the newly developed statistical phrase accent model for generation of natural and expressive intonation contours in speech synthesis. I will briefly mention the conventional approaches and drawbacks of existing intonation models for speech synthesis. I will introduce the proposed statistical model, an associated training algorithm and performance on some tasks. This is joint work with Dr. Alan Black and Dr. Luis Oliveira.
Date: 08-Jun-2011    Time: 14:30:00    Location: 336


A mixture-of-experts approach to biclustering

José Caldas

Helsinki Institute for Information Technology

Abstract—Biclustering is the unsupervised learning task of mining a data matrix for submatrices, known as biclusters, with desirable properties. For instance, the goal can be to find groups of genes that are co-expressed under particular biological conditions. Many biclustering methods do not allow biclusters to overlap; others do, but need to specify how the biclusters interact at the overlapping regions. It is therefore of interest to devise methods that allow flexible, overlapping bicluster structures while not forcing the practitioner to specify bicluster interaction models. We propose a mixture modelling framework allowing biclusters to overlap but not requiring the practitioner to postulate any parameter interaction models between biclusters. Sharing a similar intuition to mixture-of-experts models, our model allows biclusters to specify partly overlapping regions of expertise in which the biclusters are able to model the data adequately. The uncertainty over assignments of data points to biclusters depends on the membership of data points to these regions of expertise. We perform inference and parameter estimation via a variational expectation-maximization framework. The model is easily adaptable to different data types and compares favorably to other approaches, both in a binary DNA copy number variation data set and in a miRNA expression data set.
Date: 03-Jun-2011    Time: 14:00:00    Location: 020


An online system for the remote treatment of aphasia

Annamaria Pompili

INESC-ID and Universidade de Roma Tor Vergata

Abstract—Aphasia is a deterioration of language function that can affect very diverse aspects of language, at the phonetic, syntactic or semantic level. There are several types of aphasia, each characterized by various symptoms, but all the syndromes share a common problem: difficulty in naming actions and objects. This characteristic, together with the fact that frequent therapy sessions lead to faster rehabilitation, led to the development of the VITHEA project - Virtual Therapist for the Treatment of Aphasia. To guarantee accessibility anywhere and at any time, the system was developed as a Web application. Its software architecture follows the Model-View-Controller (MVC) design pattern and integrates several open-source frameworks for the Java EE platform. The system acts as a therapist, guiding patients through therapy sessions, in which the correctness of the uttered expressions is verified by means of the AUDIMUS automatic speech recognizer.
Date: 01-Jun-2011    Time: 15:30:00    Location: 020


Rich Prior Knowledge in Learning for Natural Language Processing

João Graça


Abstract—We possess a wealth of prior knowledge about most prediction problems, and particularly so for many of the fundamental tasks in natural language processing. Unfortunately, it is often difficult to make use of this type of information during learning, as it typically does not come in the form of labeled examples, may be difficult to encode as a prior on parameters in a Bayesian setting, and may be impossible to incorporate into a tractable model. Instead, we usually have prior knowledge about the values of output variables. For example, linguistic knowledge or an out-of-domain parser may provide the locations of likely syntactic dependencies for grammar induction. Motivated by the prospect of being able to naturally leverage such knowledge, four different groups have recently developed similar, general frameworks for expressing and learning with side information about output variables. These frameworks are Constraint-Driven Learning (UIUC), Posterior Regularization (UPenn), Generalized Expectation Criteria (UMass Amherst), and Learning from Measurements (UC Berkeley). This tutorial describes how to encode side information about output variables, and how to leverage this encoding and an unannotated corpus during learning. We survey the different frameworks, explaining how they are connected and the trade-offs between them. We also survey several applications that have been explored in the literature, including applications to grammar and part-of-speech induction, word alignment, information extraction, text classification, and multi-view learning. Prior knowledge used in these applications ranges from structural information that cannot be efficiently encoded in the model, to knowledge about the approximate expectations of some features, to knowledge of some incomplete and noisy labellings.
These applications also address several different problem settings, including unsupervised, lightly supervised, and semi-supervised learning, and utilize both generative and discriminative models. The diversity of tasks, types of prior knowledge, and problem settings explored demonstrate the generality of these approaches, and suggest that they will become an important tool for researchers in natural language processing. The tutorial will provide the audience with the theoretical background to understand why these methods have been so effective, as well as practical guidance on how to apply them. Specifically, we discuss issues that come up in implementation, and describe a toolkit that provides "out-of-the-box" support for the applications described in the tutorial, and is extensible to other applications and new types of prior knowledge.
Date: 27-May-2011    Time: 15:30:00    Location: 336


Reconfiguration schemes to mitigate faults in automated irrigation channels

Erik Weyer

University of Melbourne

Abstract—The infrastructure associated with an automated irrigation channel contains a large number of actuators (electro-mechanical gates) and water level and gate position sensors. Actuator and sensor faults happen from time to time, and they will lead to loss of water and reduced service to farmers unless corrective action is taken. In this seminar we will look at strategies for dealing with such faults. Sensor faults are dealt with by estimating the value of the signal measured by the faulty sensor and using the estimated signal as input to the controllers. An actuator fault necessitates a relaxation of the control objectives, and techniques for reconfiguring the controller to meet the new objectives will be presented. Experimental results from an operational irrigation channel will be shown, demonstrating that, despite faults, good control performance can be achieved without disruption of the operation of the channel.
Date: 26-May-2011    Time: 14:30:00    Location: IST - Torre Norte, Level 5, room 5.9


System identification and control of irrigation channels

E. Weyer

University of Melbourne

Abstract—Water is an increasingly scarce resource in many parts of the world, and it is important to manage the water resources well. This is of particular importance in networks of irrigation channels where the operational losses are large. In irrigation channels the flows and water levels can be controlled by manipulating the positions of mechanical gates located along the channel, and the water losses can be significantly reduced by automating the operation of the channel networks using closed loop control. In this talk we will briefly present system identification techniques for obtaining models of irrigation channels useful for control design. The control objectives usually involve a trade-off between minimum wastage of water and quality of the water delivery to the farmers. Different control configurations (centralized, decentralized and distributed) for achieving the objectives will be presented, together with control design methods, with emphasis on frequency domain techniques. Experimental results from operational irrigation channels will be presented.
Date: 25-May-2011    Time: 15:30:00    Location: IST - Torre Norte, Level 5, room 5.9


Multi-agent control for coordination in water infrastructures and intermodal transport networks

R. Negenborn

Delft University of Technology

Abstract—In this talk we discuss how multi-agent model predictive control can be used for control of large-scale transport infrastructures. Such a control approach has the potential to operate transport infrastructures closer to their capacity limits, while taking into account the increasingly complex dynamics. We will in this talk particularly focus on applications in the domain of water infrastructures and networks of interconnected transport hubs.
Date: 25-May-2011    Time: 14:30:00    Location: IST - Torre Norte, level 5, room 5.9


Recovering Capitalization and Punctuation Marks on Speech Transcriptions

Fernando Batista

INESC-ID Lisboa and IST and ISCTE

Abstract—This presentation addresses two important metadata annotation tasks involved in the production of rich transcripts: capitalization and recovery of punctuation marks. The main focus of this study concerns broadcast news, using both manual and automatic speech transcripts. Different capitalization models were analysed and compared, indicating that generative approaches capture the structure of written corpora better, while discriminative approaches are suitable for dealing with speech transcripts and are also more robust to ASR errors. The so-called language dynamics have been addressed, and results indicate that capitalization performance is affected by the temporal distance between the training and testing data. Regarding the punctuation task, this study covers the three most frequent marks: full stop, comma, and question mark. Early experiments addressed full-stop and comma recovery, using local features, and combining lexical and acoustic information. Recent experiments also combine prosodic information and extend this study to question marks.
Date: 25-May-2011    Time: 14:30:00    Location: 020
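A minimal illustration of the capitalization task (a toy unigram recaser, far simpler than the generative and discriminative models compared in the talk; the example sentences are invented): learn the most frequent written form of each word from a written corpus, then apply it to a lowercase transcript.

```python
from collections import Counter, defaultdict

def train_caser(corpus_sentences):
    """Learn, for each lowercased token, its most frequent written form.
    Sentence-initial tokens are skipped so their forced capital letters
    do not bias the counts."""
    counts = defaultdict(Counter)
    for sent in corpus_sentences:
        for tok in sent.split()[1:]:
            counts[tok.lower()][tok] += 1
    return {low: c.most_common(1)[0][0] for low, c in counts.items()}

def recase(transcript, model):
    """Recase a lowercase transcript; unknown words are left unchanged."""
    toks = [model.get(t, t) for t in transcript.split()]
    if toks:
        toks[0] = toks[0][0].upper() + toks[0][1:]   # capitalize sentence start
    return " ".join(toks)

corpus = ["The minister visited Lisbon today",
          "Reporters in Lisbon met the minister"]
model = train_caser(corpus)
print(recase("the minister arrived in lisbon", model))
```

Even this trivial model recovers proper nouns seen in training ("lisbon" becomes "Lisbon"), while the temporal-distance effect described in the abstract corresponds to new names appearing after training.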


Distributed control of water channels

José Igreja, Filipe Cadete, Luís Pinto


Abstract—The seminar will be presented by José Igreja, Filipe Cadete and Luís Pinto of the Dynamical Systems Control Group. After a concise review of decentralized model predictive control algorithms, a decentralized predictive algorithm with stability constraints will be presented. Experimental results obtained on the experimental channel of the NuHCC (Univ. de Évora) with this algorithm, and with several versions of a multivariable, decentralized LQG/LTR, will be presented and discussed.
Date: 13-May-2011    Time: 11:00:00    Location: Sala de reuniões do DEEC, Torre Norte, Piso 5, IST Alameda


Robust linear regression methods in association studies

Vanda Lourenço

FCT/UNL, Dep. Mathematics and IST/UTL, CEMAT

Abstract—Motivation: It is well known that data deficiencies, such as coding/rounding errors, outliers or missing values, may lead to misleading results for many statistical methods. Robust statistical methods are designed to accommodate certain types of those deficiencies, allowing for reliable results under various conditions. We analyze the case of statistical tests to detect associations between individual genomic variations (SNPs) and quantitative traits when deviations from the normality assumption are observed. We consider the classical ANOVA tests for the parameters of the appropriate linear model and a robust version of those tests based on M-regression. We then compare their empirical power and level using simulated data with several degrees of contamination. Results: Data normality is nothing but a mathematical convenience. In practice, experiments usually yield data with nonconforming observations. In the presence of this type of data, classical least squares statistical methods perform poorly, giving biased estimates, raising the number of spurious associations and often failing to detect true ones. We show, through a simulation study and a real data example, that the robust methodology can be more powerful and thus more adequate for association studies than the classical approach.
Date: 06-May-2011    Time: 14:00:00    Location: 020
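A sketch of the contrast between least squares and M-regression (illustrative only; Huber weights fitted by iteratively reweighted least squares on invented data with one gross outlier, not the study's actual procedure):

```python
def fit_line(xs, ys, weights):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    b = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys)) \
        / sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    return my - b * mx, b

def huber_regression(xs, ys, k=1.345, iters=50):
    """M-regression with Huber weights, fitted by IRLS.
    The scale is re-estimated each iteration from the MAD of residuals."""
    w = [1.0] * len(xs)
    for _ in range(iters):
        a, b = fit_line(xs, ys, w)
        resid = [y - (a + b * x) for x, y in zip(xs, ys)]
        s = sorted(abs(r) for r in resid)[len(resid) // 2] / 0.6745 or 1.0
        w = [1.0 if abs(r) <= k * s else k * s / abs(r) for r in resid]
    return fit_line(xs, ys, w)

xs = list(range(10))
ys = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1, 100.0]  # last y is an outlier
print(huber_regression(xs, ys))   # slope pulled back toward the true value 2
```

The outlier drags the ordinary least-squares slope far above 2, while the Huber fit progressively downweights it and recovers a slope close to the inlier trend.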


Using perspective schemata and a reference model for helping in the design of data integration systems

Valéria Magalhães Pequeno

Faculdade de Ciências e Tecnologia da UNL

Abstract—Sharing and integrating information across multiple autonomous and heterogeneous data sources has emerged as a strategic requirement in modern business. We deal with this problem by proposing a declarative approach based on the creation of a reference model and perspective schemata. The former serves as a common semantic meta-model, while the latter defines correspondence between schemata. Furthermore, using the proposed architecture, we have developed an inference mechanism which allows the (semi-) automatic derivation of new mappings between schemata from previous ones.
Date: 04-May-2011    Time: 15:30:00    Location: N7.1


Speeding up information extraction using sub-optimal algorithms

Gonçalo Fernandes Simões

INESC-ID Lisboa and IST

Abstract—Information Extraction (IE) proposes techniques capable of extracting, from unstructured text, relevant segments in a given domain and representing them in a structured format. Most of the scientific proposals in IE so far aim at increasing the accuracy of the extraction results. However, the existing IE techniques still have efficiency problems when processing large data volumes. IE optimization aims at executing IE processes as fast as possible with minimal or no impact on the accuracy of the results. In this talk, we will first describe the state of the art in IE optimization. Then, we will present a novel approach for IE optimization. The key idea is to make IE programs faster by using sub-optimal extraction algorithms, which are typically fast but may produce some erroneous results or miss some of the results of traditional algorithms (thus negatively impacting recall and precision). We propose a cost model that evaluates not only the expected execution time of a given IE execution plan but also the quality of the results produced, in terms of the expected numbers of good and bad tuples. Using this cost model, our solution chooses a fast execution plan that fulfills a set of objectives imposed by the user (e.g., minimum number of good tuples desired, minimum precision desired). Finally, we will report preliminary experimental results obtained with two data sets and three IE programs, which show the gains brought by our approach with respect to state-of-the-art solutions.
Date: 15-Apr-2011    Time: 16:00:00    Location: N7.1
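The idea of choosing an execution plan from a cost model with quality objectives can be sketched as follows (an illustrative toy, not the paper's cost model; the plan names and numbers are invented):

```python
def choose_plan(plans, min_good=0.0, min_precision=0.0):
    """Pick the fastest plan whose expected output meets the user's
    quality objectives. Each plan is (name, time, exp_good, exp_bad)."""
    feasible = [
        p for p in plans
        if p[2] >= min_good
        and p[2] / (p[2] + p[3]) >= min_precision
    ]
    return min(feasible, key=lambda p: p[1])[0] if feasible else None

# hypothetical plans: expected runtime, expected good and bad tuples
plans = [
    ("exact",       120.0, 100.0,  2.0),   # slow, near-perfect
    ("approximate",  15.0,  92.0,  9.0),   # fast, some errors
    ("heuristic",     4.0,  70.0, 35.0),   # fastest, noisy
]
print(choose_plan(plans, min_good=90, min_precision=0.9))
```

Relaxing the objectives lets the optimizer pick ever faster, less accurate plans; tightening them forces it back to the exact algorithms.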


Stochastic Modeling of Stem Cell Induction Protocols

Filipe Gracio

INESC-ID Lisboa and IST

Abstract—Generation of pluripotent stem cells starting from adult human cells using induction processes is a technology that has the potential to revolutionize regenerative medicine. However, the production of these so-called iPS cells is still quite inefficient and may be dominated by stochastic effects. In this work we build mass action models of the core circuitry controlling stem cell induction and maintenance. The model includes not only the network of transcription factors NANOG, OCT4, SOX2, but also important epigenetic regulatory features of DNA methylation and histone modifications. We are able to show that the network topology reported in the literature is consistent with the observed experimental behavior of bistability and inducibility. Based on simulations of stem cell generation protocols, we show that cooperative and independent reaction mechanisms have experimentally identifiable differences in the dynamics of reprogramming, and we analyze such differences and their biological basis. It has been argued that stochastic and elite models of stem cell generation represent distinct fundamental mechanisms. The work presented here illustrates the possibility that they instead represent differences in the amount of information we have about the distribution of cellular states before and during reprogramming protocols. We show that unpredictability decreases as the cell moves through the necessary induction stages, and that identifiable groups of cells with elite-like behavior can come about by a stochastic process. We also show how different mechanisms and kinetic properties impact the prospects of improving the efficiency of iPS cell generation protocols.
Date: 15-Apr-2011    Time: 14:00:00    Location: 020


Efficient algorithms for the identification of miRNA motifs in DNA sequences

Nuno D. Mendes

INESC-ID Lisboa and IST

Abstract—In the last decade, a novel gene expression regulatory mechanism was discovered. It is mediated by RNA molecules named miRNAs and it acts by silencing target genes. Despite the advancements in this research field, we are still not able to rigorously characterise miRNA genes in order to identify the sequence, structure or contextual requirements that are needed to obtain a functional miRNA. In this work we present a strategy to sieve through the vast number of stem-loops that can be found in metazoan genomes, significantly reducing the set of candidates while retaining most known miRNA precursors. The approach relies on a combination of robustness measures, on the analysis of precursor structure, and on the incorporation of information about the transcription potential of each candidate. The methodology was applied to the genomes of Drosophila melanogaster and Anopheles gambiae, and to homologs of known precursors in the newly sequenced Anopheles darlingi. We thus obtain a restricted and ordered set of candidates for these organisms which fulfil the established prerequisites. Keywords: miRNA, gene finding, single-genome, robustness, stability, secondary structure
Date: 01-Apr-2011    Time: 14:00:00    Location: 020


Text Mining and Computational Journalism

Mário J. Silva

INESC-ID Lisboa and IST

Abstract—The wealth of information we are currently confronted with demands new journalistic practices that allow news to be monitored, interpreted and summarized, and new models for presenting dynamic, interactive and integrated content. This vision underlies recent work in "computational journalism". Some of the challenges currently facing this area concern (i) automatic content analysis, (ii) automatic analysis of explicit and implicit social networks, (iii) the design of interfaces rich in visualization and interaction, for presenting dynamic and personalized news and for learning implicit relations between news items and communities of readers, and (iv) case studies in production environments to evaluate computational journalism methodologies. This talk will discuss these challenges and how they have begun to be addressed within the REACTION project, a recent initiative of the UTAustin|Portugal program.
Date: 22-Mar-2011    Time: 11:30:00    Location: 336


Computational Methods for DNA Resequencing: A Survey

Francisco Fernandes


Abstract—Recent developments in next-generation sequencing technologies allow constantly increasing throughput and shorter running times while reducing the cost of the sequencing process. This leads to the production of huge amounts of data, which raises important computational challenges due not only to the large volume of information but also to the increase in read length and sequencing errors. Several assembly and mapping tools have recently been developed for generating assemblies from short, unpaired sequencing reads. However, the need for faster and more accurate algorithmic approaches to keep up with the demand of frequently emerging resequencing projects justifies the growing number of short read mapping tools that have surfaced in the last couple of years. In this report we present an overview of state-of-the-art software applications, detailing their algorithms and data structures.
Date: 18-Mar-2011    Time: 17:00:00    Location: 020


Vowel Nasality in Portuguese - Cues for Forensic Speaker Identification

Manuel Domingos

INESC-ID and Centro de Linguística da Universidade de Lisboa

Abstract—The thesis "Nasalidade Vocálica em Português: Pistas para identificação Forense de falantes" (Vowel Nasality in Portuguese: Cues for Forensic Speaker Identification) aims both to establish cues for forensic speaker identification and to discuss the phonological representation of vowel nasality in Portuguese. The thesis analysed the acoustic correlates of nasal vowels in the systems of European Portuguese (EP) and Angolan Portuguese (AP). The frequencies of the first two formants (F1 and F2) and the fundamental frequency (F0) of the five nasal vowels were analysed, in the oral and nasal portions, taking their articulatory characteristics into account. The durations of the three events of each nasal vowel (i.e., oral portion, nasal portion and nasal appendix) were also measured, as were the durations of the closure, the burst and the VOT of the [+voice] and [-voice] stops adjacent to the right. Among the results obtained, the differences in vowel quality and in the duration of the acoustic events analysed in the five nasal vowels proved relevant. The two systems (EP and AP) are thus distinguished by their degrees of aperture, as well as by tongue fronting or retraction, considering the F1 and F2 values of each of the five nasal vowels. As for duration, nasal vowels were found to be longer in AP than in EP. Regarding the phonological representation of nasality, productions were observed that can be interpreted as outputs pointing to the same representation of vowel nasality in both systems. However, some idiosyncratic productions also allowed us to consider the possibility of a nasal consonant homorganic with the following stop occurring in AP.
As for speaker identification, the cues consisted of the particularities of each system and of the speaker's sex, with identification possibilities found both at the level of vowel quality and of the trajectories of the formants and of F0, and at the level of the various durational aspects of the acoustic events considered in the thesis.
Date: 16-Mar-2011    Time: 14:30:00    Location: 336


Feature extraction for content-based recommendation - Mining the long tail

Paula Vaz Lobo


Abstract—The amount of items available for consumption surpasses our processing capabilities. New content (books, news, music, video, etc.) is published every day, far exceeding our capacity to make informed choices. The items we do not know about become potentially useless, because we are not aware of their existence and cannot specifically search for them. Current recommendation systems try to predict what we want to consume. Nevertheless, they quite often tend to recommend popular items, because they are mostly based on ratings. This phenomenon shapes the consumption curve as a Pareto distribution, placing popular rated items in the "head" (the first 20% of the total items) and the unpopular unrated items in the "long tail" (the remaining 80%). Items in the long tail have a recognized interest for smaller groups of people. However, current recommendation systems fail to reveal unpopular items because of rating scarcity. There is a need to help people find interesting unrated items in the long tail. In this thesis we explore textual features of documents in the long tail. We explore document content to find similar documents using a top-N recommendation algorithm. We use semantic similarity (documents about the same subjects) as well as stylometric similarity (documents with similar writing styles) to find documents that are closer to user preferences. Document similarity is measured using the documents' semantic and stylometric features. The combination of these two feature types can improve recommendation novelty and help people find interesting documents in the long tail.
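As an illustration of the kind of top-N recommendation that blends two similarity signals, as the abstract describes, here is a minimal sketch; the vectors, the weighting parameter and the helper names are hypothetical, not the thesis' actual method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_n(profile_sem, profile_sty, docs, n=2, alpha=0.7):
    """Rank documents by a weighted blend of semantic and stylometric
    cosine similarity to the user's profile; alpha weights the blend."""
    scored = sorted(
        docs,
        key=lambda d: -(alpha * cosine(profile_sem, docs[d][0])
                        + (1 - alpha) * cosine(profile_sty, docs[d][1])),
    )
    return scored[:n]

# toy feature vectors: (semantic topic vector, stylometric vector)
docs = {
    "a": ([1, 0, 1], [0, 1]),
    "b": ([0, 1, 0], [0, 1]),
    "c": ([1, 0, 0], [1, 0]),
}
print(top_n([1, 0, 1], [0, 1], docs))  # → ['a', 'c']
```

Note the design point: because the score does not depend on ratings at all, unrated long-tail documents compete on equal footing with popular ones.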
Date: 09-Mar-2011    Time: 14:30:00    Location: 336


Control-based Clause Sharing in Parallel SAT

Youssef Hamadi

Microsoft Research

Abstract—Clause sharing is key to the performance of parallel SAT solvers. However, without limitation, sharing clauses might lead to an exponential blow up in communication or to the sharing of irrelevant clauses. This talk presents two innovative policies to dynamically adjust the size of shared clauses between any pair of processing units. The first approach controls the overall number of exchanged clauses whereas the second additionally exploits the relevance quality of shared clauses.
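The abstract does not spell out the control law, so the following is only a plausible sketch of a per-pair dynamic size-limit policy, loosely inspired by additive-increase/multiplicative-decrease congestion control; all constants and names are invented.

```python
def adjust_limit(limit, received, target, inc=1, dec=0.8, lo=1, hi=64):
    """Additive-increase / multiplicative-decrease on the maximum size of
    clauses a pair of solver threads exchanges: loosen the limit when the
    exchange rate falls below a target throughput, tighten it when the
    pair floods the channel. All constants are illustrative."""
    if received < target:
        limit = min(hi, limit + inc)       # too few clauses shared: loosen
    elif received > target:
        limit = max(lo, int(limit * dec))  # too many: tighten
    return limit

limit = 8
for received in [2, 3, 50, 50, 10]:  # clauses received per control period
    limit = adjust_limit(limit, received, target=16)
print(limit)  # → 7
```

The second policy mentioned in the abstract would additionally weight the adjustment by some relevance measure of the received clauses rather than by their count alone.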
Date: 28-Feb-2011    Time: 16:00:00    Location: 336


A Tutorial on Genetic Programming

Sara Silva


Abstract—Genetic Programming (GP) is the youngest paradigm in Evolutionary Computation, a field of Artificial Intelligence. Created by John Koza in 1992, it can be regarded as a powerful generalization of Genetic Algorithms, but unfortunately it is still poorly understood outside the GP community. The goal of this tutorial is to provide motivation, intuition and practical advice about GP, along with very few technical details.
Date: 15-Feb-2011    Time: 13:00:00    Location: IST, Room PA2


Methods for the Detection of Multilocus Interactions

Orlando Anunciação

INESC-ID Lisboa and IST

Abstract—In recent years there has been intense research to find genetic factors that influence common complex traits. The approach that is commonly followed to discover those associations between genetic factors and complex traits such as diseases is to perform a Genome-Wide Association Study (GWAS). It has been pointed out that there is no single marker for disease risk and no single protective marker but, rather, a collection of markers that confer a graded risk of disease. As an example of this, it has been suggested that many genes with small effects rather than few genes with strong effects contribute to the development of asthma. For human height the heritability explained with SNPs discovered with GWAS is about 5%. However, a recent study showed that it is possible to explain around 45% of the phenotypic variance for height with GWAS data. The problem is that the individual effects of the interacting SNPs are too small to be detected with common statistical methods. This shows that there is a need for powerful methods that are able to consider interactions between SNPs with low marginal effects. In this document we describe a wide range of methods that have been proposed to detect interactions between SNPs in association studies data. We will give examples of statistical methods (explaining also how to deal with the multiple testing problem), search methods (deterministic and stochastic) and machine learning methods.
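One of the simplest statistical approaches the survey covers is an exhaustive pairwise scan scoring each SNP pair by a chi-square statistic on the joint genotype; a minimal sketch on toy data follows (the data and helper names are invented, and significance would be judged against a multiple-testing-corrected threshold).

```python
from itertools import combinations
from collections import Counter

def chi2_stat(table):
    """Pearson chi-square statistic for a contingency table given as a
    dict {(row_label, col_label): count}."""
    rows = {r for r, _ in table}
    cols = {c for _, c in table}
    total = sum(table.values())
    row_sum = {r: sum(v for (rr, _), v in table.items() if rr == r) for r in rows}
    col_sum = {c: sum(v for (_, cc), v in table.items() if cc == c) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_sum[r] * col_sum[c] / total
            if expected:
                stat += (table.get((r, c), 0) - expected) ** 2 / expected
    return stat

def pairwise_scan(genotypes, phenotype):
    """Score every SNP pair by the association of its joint genotype with
    the phenotype; genotypes[i][s] is the genotype of subject s at SNP i.
    A real scan would apply e.g. a Bonferroni-corrected threshold."""
    scores = {}
    for i, j in combinations(range(len(genotypes)), 2):
        table = Counter()
        for s, y in enumerate(phenotype):
            table[((genotypes[i][s], genotypes[j][s]), y)] += 1
        scores[(i, j)] = chi2_stat(table)
    return scores

# toy data: the phenotype is an interaction of SNPs 0 and 1 with no
# marginal effect of either SNP, exactly the situation the talk targets
g = [[0, 0, 1, 1, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 1, 0, 1],
     [0, 0, 0, 0, 1, 1, 1, 1]]
pheno = [1 if a != b else 0 for a, b in zip(g[0], g[1])]
scores = pairwise_scan(g, pheno)
top = max(scores, key=scores.get)
print(top, round(scores[top], 2))  # → (0, 1) 8.0
```

Single-SNP tests score zero on this data, while the joint test on the interacting pair is maximal, which is why pairwise (and higher-order) scans are needed for SNPs with low marginal effects.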
Date: 11-Feb-2011    Time: 14:00:00    Location: 336


Efficient Arithmetic Operators Applied to DSP Architectures

Eduardo Costa

Universidade Católica de Pelotas

Abstract—This presentation is based on the research topics that Professor Eduardo Costa has been working on in Brazil. It is divided into three main topics: Efficient Dedicated Multiplication Blocks for 2's Complement Radix-2m Array Multipliers; Fast Forward and Inverse Transforms for the H.264/AVC Standard Using Hierarchical Adder Compressors; and Radix-2 Decimation in Time (DIT) FFT Implementation Based on a Matrix-Multiple Constant Multiplication Approach. The first topic presents an improvement of the radix-2m binary array multiplier architecture previously proposed in the literature, using a different scheme to optimize the dedicated modules that perform the radix-16 and radix-256 multiplications. In the second topic, efficient adder compressors are used to reduce the computational complexity of the forward and inverse transforms, and compressor-based architectures for the H.264/AVC transforms are developed. Finally, in the third topic, the main goal is the implementation of a fully-parallel radix-2 Decimation in Time (DIT) Fast Fourier Transform (FFT), using Matrix-Multiple Constant Multiplication (M-MCM) at the gate level.
Eduardo da Costa received the five-year engineering degree in electrical engineering from the University of Pernambuco, Recife, Brazil, in 1988, the M.Sc. degree in electrical engineering from the Federal University of Paraiba, Campina Grande, Paraíba, Brazil, in 1991, and the Ph.D. degree in computer science from the Federal University of Rio Grande do Sul, Porto Alegre, Brazil, in 2002. Part of his doctoral work was developed at the Instituto de Engenharia de Sistemas e Computadores (INESC-ID), Lisbon, Portugal. He is currently a Professor with the Departments of Electrical Engineering and Informatics, Catholic University of Pelotas (UCPel), Pelotas, Brazil. He was a post-doc at UFRGS from November 2009 to April 2010 and advises Master's theses in the Program in Computer Science, UCPel. His research interests are VLSI architectures and low-power design. Seminar organized by the ALGOS group.
Date: 08-Feb-2011    Time: 11:00:00    Location: 336


3Hq Feature: A New 3D Shape Descriptor

Denise Guliato

Universidade Federal de Uberlândia

Abstract—Similarity searching based on 3D shape descriptors is an important process in content-based 3D shape retrieval tasks. The development of efficient 3D shape descriptors is still a challenge. We propose a novel approach to characterize 3D shapes based on the Hilbert space-filling curve. Our proposal is invariant under translation and only slightly sensitive to scale and rotation. Experiments were carried out with the Princeton Shape Benchmark. The evaluation of the results indicated higher precision rates when compared to related work.
Date: 08-Feb-2011    Time: 11:00:00    Location: 020


Biclustering-based Classification of Clinical Expression Time Series: A Case Study in Patients with Multiple Sclerosis

André Carreiro


Abstract—In recent years, the constant drive towards a more personalized medicine has led to an increasing interest in temporal gene expression analyses. In fact, considering a temporal aspect represents a great advantage in better understanding disease progression and treatment results at a molecular level. In this work, we analyse multiple gene expression time series in order to classify the response of Multiple Sclerosis patients to the standard treatment with Interferon-β, to which nearly half of the patients reveal a negative response. In this context, obtaining a highly predictive model of a patient's response would definitely improve their quality of life, avoiding useless and possibly harmful therapies for the non-responder group. We propose new strategies for time series classification based on biclustering. Preliminary results achieved a prediction accuracy of 94.23% and reveal potentialities to be further explored in classification problems involving other (clinical) time series.
Date: 04-Feb-2011    Time: 15:00:00    Location: 336


Fault Tolerant Control

Mario Mendes

Instituto Superior de Engenharia de Lisboa (ISEL)

Abstract—A tutorial overview of Fault Tolerant Control, integrated in the AQUANET project and presented by Mário Mendes.
Date: 04-Feb-2011    Time: 10:30:00    Location: Meeting Room of Mechanical Eng. Dep. - IST


Network-Based Disease Candidate Gene Prioritization: Towards Global Diffusion in Heterogeneous Association Networks

Joana P. Gonçalves

INESC-ID Lisboa and IST

Abstract—Disease candidate gene prioritization addresses the association of genes with disease susceptibility. Network-based approaches have successfully exploited the connectivity of biological networks to compute a disease-relatedness score between candidate and known disease genes. Nonetheless, available strategies raise three major concerns: (1) most networks used rely exclusively on curated physical interactions, resulting in poor genome coverage and sparsity issues; (2) devised scores are often local and thus restrict the search to a limited neighborhood around known genes and ignore potentially informative indirect paths; (3) some methods disregard interaction confidence weights, which could confer extra reliability. Results: We hypothesized that capturing disease-relatedness at the interactome scale, based on weighted gene associations integrated from heterogeneous sources, is likely to outperform current methods lacking one of these features, and proposed to combine a particular personalized ranking method with data from STRING. Our claim was confirmed in comparative leave-one-out cross-validation case studies assessing the impact of network density and coverage, score globality and confidence weights on the prioritization of candidate genes for 29 diseases. Finally, the proposed method was applied to Parkinson's disease and proved effective in recovering prior knowledge and unravelling interesting genes that could be linked to several pathological mechanisms of the disease.
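The abstract names "a particular personalized ranking method" without giving it; a commonly used global, confidence-weight-aware choice in this setting is personalized PageRank, sketched here on a toy network. The graph, seed set and parameters are illustrative only, not the paper's data.

```python
def personalized_pagerank(edges, seeds, alpha=0.85, iters=200):
    """Power iteration of personalized PageRank on a weighted, undirected
    association network; the restart mass is spread over the known
    disease genes (seeds), so scores measure global relatedness to them."""
    nodes = sorted({u for e in edges for u in e[:2]})
    nbrs = {u: [] for u in nodes}
    for u, v, w in edges:
        nbrs[u].append((v, w))
        nbrs[v].append((u, w))
    strength = {u: sum(w for _, w in nbrs[u]) for u in nodes}
    restart = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {u: (1 - alpha) * restart[u] for u in nodes}
        for u in nodes:
            for v, w in nbrs[u]:
                nxt[v] += alpha * rank[u] * w / strength[u]
        rank = nxt
    return rank

# toy network: a chain of confidence-weighted associations, seed gene "A"
edges = [("A", "B", 1.0), ("B", "C", 1.0), ("C", "D", 1.0), ("D", "E", 1.0)]
rank = personalized_pagerank(edges, seeds={"A"})
candidates = sorted((g for g in rank if g != "A"), key=rank.get, reverse=True)
print(candidates)  # → ['B', 'C', 'D', 'E']
```

Unlike a purely local score, some rank mass reaches every gene on the chain, so indirect paths to the seed still contribute, which is exactly the globality concern (2) in the abstract.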
Date: 28-Jan-2011    Time: 14:00:00    Location: 336


Morpho-syntactic variation in some varieties of Portuguese. Characteristics of L1 and L2 Portuguese in different learning situations.

Alan Norman Baxter

University of Macao - Macao, China

Abstract—Morpho-syntactic variation in some varieties of Portuguese. Characteristics of L1 and L2 Portuguese in different learning situations.
Date: 26-Jan-2011    Time: 14:30:00    Location: 336


DMIR Group Seminar: Personal Information Management

Daniel Jorge Viegas Gonçalves

INESC-ID Lisboa and IST

Abstract—Personal information management is an increasingly relevant area. The growing amount of information we must deal with every day, on our computers and online, makes its organization and retrieval ever harder. New tools and methodologies are needed to help users manage their information in personally meaningful ways. Two classes of approaches are popular here: information retrieval and information visualization. I will present the area and its specificities, showing examples of applications that attempt to mitigate the problems it addresses.
Date: 26-Jan-2011    Time: 10:00:00    Location: 336


A Computational Device Based on Regulation

Ernesto Costa

Universidade de Coimbra

Abstract—Nature is a great designer and problem solver. The theories posited by Darwin, Mendel and all those who contributed to the modern synthesis, grounded in molecular biology, explain how this can happen. Some decades ago, computer scientists started proposing computational models, called evolutionary algorithms, based on some of the processes used by nature, in order to solve problems that either do not have an analytical solution or are too costly to solve with exact methods. Over time, many complex problems have been satisfactorily solved by those algorithms, even though these nature-inspired heuristic methods are very simplistic and based on a basic separation between genotype and phenotype. In recent years, biological understanding has grown with the comprehension of the multitude of regulatory mechanisms that are fundamental in the processes of both inheritance and development, and some researchers advocate the need to explore this new understanding computationally. One of the outcomes was the Artificial Gene Regulatory model, first proposed by Wolfgang Banzhaf. In this talk, we will present a modification of this model aimed at overcoming some of its limitations, and show experimentally that it is effective in solving a set of benchmark problems. We will also discuss some future developments of the model.
Date: 14-Jan-2011    Time: 14:00:00    Location: 336


Verb subcategorization properties in the Portuguese of São Tomé

Rita Gonçalves

Centro de Linguística da Universidade de Lisboa

Abstract—This talk presents the results of research carried out for a Master's thesis and aims primarily to contribute to the study of the variation and change of Portuguese in Africa, arguing for the emergence of a new language variety in São Tomé and Príncipe. The first part characterizes the linguistic landscape of the archipelago and discusses the historical transition from L2 to L1 Portuguese. The second part analyses some syntactic contexts which, given the frequency of their occurrence, may (come to) constitute characteristics of Oral São Tomé Portuguese (POST). Noteworthy are double-object constructions, the subcategorization of an NP by verbs of directed motion, and the omission of the aspectual marker a, in variation, respectively, with the prepositional ditransitive construction, with the selection of a PP introduced by a preposition different from the one used in the standard, and with the insertion of prepositions in clausal complements.
Date: 12-Jan-2011    Time: 14:30:00    Location: 336


DMIR Group Seminar: Processpedia

António Rito Silva

INESC-ID Lisboa and IST

Abstract—In this presentation, Rito Silva will introduce Processpedia, an overall Business Process Management (BPM) approach that fosters the capture and integration of end users' tacit knowledge. He will identify a set of requirements for a hybrid BPM approach, both top-down and bottom-up, and propose a software architecture that integrates different types of BPM knowledge. Finally, he will describe a new BPM method based on a stepwise discovery of knowledge.
Date: 22-Dec-2010    Time: 11:00:00    Location: N7.1


In-silico strategies in drug design

Nuno Palma

BIAL, Departamento de Investigação e Desenvolvimento

Abstract—In-silico strategies in drug design
Date: 13-Dec-2010    Time: 12:30:00    Location: Room C01, IST (Pavilhão Central)


Real-time link extraction and classification

Bruno Pedro


Abstract—How can links be extracted and classified from a real-time feed of information? This presentation will focus on the challenges of receiving real-time data from Twitter, extracting links and attempting to generate automatic classifications. Addressing this problem involves dealing with different Computer Science areas, starting with the storage of all the received data and ending with the parallel processing power needed to generate results with low latency.
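A minimal sketch of the first step, pulling links out of a stream of short messages; the regex and the trimming rule are illustrative only, not the presented system.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def extract_links(messages):
    """First stage of the pipeline: yield URLs found in a stream of short
    messages (e.g. tweets); trailing punctuation is trimmed. Toy stand-in
    for the real-time extraction described in the talk."""
    for msg in messages:
        for url in URL_RE.findall(msg):
            yield url.rstrip(".,!?")

stream = [
    "check this out http://example.com/a!",
    "no link here",
    "two: https://example.org and http://example.net/x.",
]
print(list(extract_links(stream)))
# → ['http://example.com/a', 'https://example.org', 'http://example.net/x']
```

Because the function is a generator, it processes the feed incrementally, which matters when the input never ends; classification and storage would hang off this same stream.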
Date: 10-Dec-2010    Time: 11:30:00    Location: Anfiteatro A3 - IST TagusPark


José B. Pereira-Leal

Instituto Gulbenkian da Ciência

Abstract—Eukaryotic cells have a complex organization into membrane-delimited organelles. Whereas we have been steadily accumulating mechanistic data on the organization and regulation of these compartments, far less is understood about their origins and evolution. I will discuss our recent work on the evolutionary analysis of three types of intracellular compartments: endosymbiotic, endomembranous, and microtubule-derived. To study evolution at this level we have had to develop new sets of tools, from neutral models at the whole-genome evolution level, to sequence classification methods, to ways of linking molecular information with morphology and databases. I will discuss how we are using these tools to discover new principles and new molecular components. Unexpectedly, this evolutionary approach is leading us to new ways of finding druggable targets and of repositioning existing drugs.
Date: 07-Dec-2010    Time: 12:30:00    Location: Room C01 - IST (Pavilhão Central)


The Vernacular of Angola – Current state of knowledge and perspectives for future research

Liliana Inverno

Universidade de Coimbra

Abstract—This talk aims to summarize the knowledge accumulated over six years of research on the Angolan variety of Portuguese and is divided into two parts. The first part, drawing on linguistic data collected in loco and on a brief review of the literature and of the sociolinguistic context experienced in Angola from the earliest times of contact to the present, offers a detailed description of current knowledge about the linguistic structure of the Angolan variety of Portuguese and of the social, historical and linguistic factors that make it understandable. The second part analyses the limitations of the existing data and studies and reflects on ways to overcome them.
Date: 03-Dec-2010    Time: 15:00:00    Location: 336


Research at Kochi University of Technology, Japan

Shinichi Yamagiwa

Kochi University of Technology

Abstract—Dr. Yamagiwa will present the research work performed at Kochi University of Technology, Japan.
Date: 23-Nov-2010    Time: 10:30:00    Location: 336


Pulsar Navigation

Chris Verhoeven

Technical University of Delft

Abstract—Pulsars have been raising scientific interest for many years, but are rarely used for the benefit of a technical system. They are the best time references in the universe, with well-documented signal properties. This presentation will introduce the potential of pulsar navigation for both space missions and terrestrial applications.
Date: 12-Nov-2010    Time: 17:00:00    Location: IST Alameda FA1


On the Origin of Satellite Swarms

Chris Verhoeven

Technical University of Delft

Abstract—Nano-satellites are simple, affordable spacecraft that enable many new players to enter the world of spacecraft design and operation. This leads to opportunities for new applications, new missions, new science and new business models. This presentation will introduce the swarm concept as the niche for nano-satellites in the coming future.
Date: 11-Nov-2010    Time: 17:00:00    Location: IST Taguspark Anf. 5


Next Generation Search

Ricardo Baeza-Yates

Yahoo! Research Labs

Abstract—We provide our personal vision of what could be the next generation of Web search engines, based on a single premise: people do not really want to search, they want to get tasks done. Hence, the key to a better experience will come from the combination of the deeper analysis of content with the detailed inference of user intent. To achieve this, the main ideas are: (1) in place of the indexing that search engines traditionally perform, we have a content analysis phase that spots entities such as people, places and dates in documents; (2) at query time we assign an intent to the user based on the query and its context; and then (3) we retrieve entities matching the intent and assemble a results page not of documents, but of matching entities and their attributes.
Date: 03-Nov-2010    Time: 16:30:00    Location: VA6 (IST, Pavilhão de Civil)


NLP Research at the Núcleo de Linguística Computacional (NILC)

Maria das Graças Volpe Nunes

USP/São Carlos & NILC, Brasil

Abstract—A brief history of NILC since its foundation will be presented, highlighting its main research and development projects in areas such as writing tools, machine translation and corpus linguistics, among others. Current projects and prospects for future ones will also be addressed.
Date: 22-Oct-2010    Time: 15:00:00    Location: 336


Fully generalized graph cores and applications

A. P. Francisco

INESC-ID Lisboa and IST

Abstract—In this talk, we will discuss graph cores, graph clustering and their application to a real problem. A core in a graph is usually taken as a set of highly connected vertices. Although general, this definition is intuitive and useful for studying the structure of many real networks. Nevertheless, depending on the problem, different formulations of graph core may be required, leading us to the known concept of generalized core. Thus, we study and further extend the notion of generalized core. Given a graph, we propose a definition of graph core based on a subset of its subgraphs and on a subgraph property function. Our approach generalizes several notions of graph core proposed independently in the literature, introducing a general and theoretically sound framework for the study of fully generalized graph cores. Moreover, we discuss emerging applications of graph cores, such as improved graph clustering methods and complex network motif detection. In particular, we discuss an application to query log mining.
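The generalized-core idea, a vertex property function replacing plain degree, can be sketched as a peeling procedure. This is a simplified, unoptimized illustration rather than the talk's framework; with the property function set to the number of live neighbors it reduces to the classical k-core.

```python
def generalized_core(adj, p, t):
    """Peel vertices until every survivor v satisfies p(v, live_nbrs) >= t.
    With p(v, nbrs) = len(nbrs) this is the classical k-core; other
    property functions (e.g. summed edge weights) give generalized cores."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            nbrs = [u for u in adj[v] if u in alive]
            if p(v, nbrs) < t:
                alive.discard(v)
                changed = True
    return alive

# toy graph: a triangle A-B-C with a pendant vertex D hanging off A
adj = {"A": ["B", "C", "D"], "B": ["A", "C"], "C": ["A", "B"], "D": ["A"]}
core = generalized_core(adj, lambda v, nbrs: len(nbrs), 2)
print(sorted(core))  # → ['A', 'B', 'C']
```

For monotone property functions the fixpoint reached is unique regardless of peeling order, which is what makes cores a well-defined structural notion.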
Date: 22-Oct-2010    Time: 14:00:00    Location: 336


Dynamic model identification of Lactococcus lactis metabolism time-series

Andras Hartmann


Abstract—Systems Biology is an emerging field within bioscience that uses holism, a global and integrative perspective, rather than reductionism to explain a biological system's behavior. This approach is particularly useful for quantitatively characterizing and predicting system dynamics. In our application, multivariate time series of Lactococcus lactis metabolite concentrations are measured in perturbation experiments. Prior knowledge about the metabolic network topology is represented in the form of parametrized nonlinear ordinary differential equations. Our goal is to identify appropriate models and parameters for the network. In this talk, two different approaches to parameter estimation will be introduced: Bayesian filtering and unified modeling of glucose uptake. We conclude that the Bayesian approach offers powerful tools for identifying the parameters of such networks, provided identifiability is guaranteed. Taking several different experiments into account may yield model parameters that can describe the system's behavior under various conditions.
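The model-identification loop the abstract describes, simulating a parametrized ODE and scoring parameters against a measured time series, can be sketched with a toy one-metabolite decay model; the model, grid and constants are invented, and the talk's actual methods (Bayesian filtering) are more sophisticated than this grid search.

```python
def simulate(k, u0=10.0, dt=0.1, steps=50):
    """Forward-Euler integration of the toy rate law du/dt = -k * u,
    standing in for one parametrized metabolic ODE."""
    u, traj = u0, [u0]
    for _ in range(steps):
        u += dt * (-k * u)
        traj.append(u)
    return traj

# "measured" series generated with the true (hidden) parameter k = 0.5
observed = simulate(0.5)

def sse(k):
    """Sum of squared errors between a simulation and the observations."""
    return sum((a - b) ** 2 for a, b in zip(simulate(k), observed))

grid = [i / 100 for i in range(1, 101)]  # candidate rate constants
best = min(grid, key=sse)
print(best)  # → 0.5
```

With noisy data and several parameters the error surface is no longer this well-behaved, which is where filtering approaches and the identifiability caveat in the abstract come in.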
Date: 15-Oct-2010    Time: 14:00:00    Location: 336


Detecting mis-recognitions in ASR output

Thomas Pellegrini


Abstract—Detecting incorrect words in automatic transcriptions can be useful for many applications: to mark or discard low-confidence words in automatic news subtitles or transcriptions, to select unsupervised material to train acoustic models, etc. In this talk, I will report experiments in which various statistical classifiers were compared: a baseline Maximum Entropy approach, Conditional Random Fields, and a Markov Chain approach. New features gathered from knowledge sources other than the decoder itself were explored: a binary feature that compares the outputs of two different ASR systems (word by word), a feature based on the number of hits of the hypothesized bigrams, obtained by queries entered into a very popular Web search engine, and finally a feature related to automatically inferred topics at the sentence and word levels. A classification error rate improvement from 13.9% to 12.1% was achieved. Experiments were conducted on European Portuguese and American English broadcast news corpora.
Date: 08-Oct-2010    Time: 15:00:00    Location: 336


Using Excel as user interface on a semantic information system

Pedro Reis


Abstract—Some scientists use Excel as their main application for data storage and analysis. This approach leads to data dispersion and knowledge segregation in organizations, mainly because Excel files are usually stored on personal computers and the data contained in these files cannot be queried. Organizations dealing with constant changes in their knowledge domain, such as the life sciences, have been adopting semantic web technologies to withstand large amounts of data and obtain the flexibility needed to support ontology changes over time with minimal impact on the existing data. By mapping OWL ontologies onto the Excel object model, we demonstrate that it is possible for end users to keep using Excel as a front-end while providing organizations with the means to store data in an aggregated manner, allowing a more thorough data analysis.
Date: 01-Oct-2010    Time: 14:00:00    Location: 336


Distributed and Predictable Software Model Checking

Nuno Claudino Pereira Lopes


Abstract—This talk presents a predicate abstraction and refinement-based algorithm for software verification that is designed for distributed execution on compute nodes that communicate via message passing, as found in today's computer clusters. A successful adaptation of predicate abstraction and refinement from the sequential to the distributed setting needs to address challenges imposed by the inherent non-determinism of distributed computing environments. In fact, our experiments show that up to an order of magnitude variation in running time is common when a naive distribution scheme is applied, often resulting in significantly worse running time than the non-distributed version. We present an algorithm that overcomes this pitfall by making the counterexample selection deterministic in spite of the distribution, while still efficiently exploiting distributed computational resources. Joint work with Andrey Rybalchenko (TU Munich).
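One way to make counterexample selection deterministic regardless of which worker answers first is to pick a canonical minimum over all received candidates; this is an invented illustration of the idea, not necessarily the talk's actual rule.

```python
def pick_counterexample(candidates):
    """Make the refinement step independent of message arrival order by
    always choosing the canonically smallest counterexample path
    (shortest first, then lexicographic). Illustrative only: the talk's
    actual selection criterion is not given in the abstract."""
    return min(candidates, key=lambda path: (len(path), path))

# the same set of counterexamples, delivered in two different orders
a = [("l3", "l7", "err"), ("l1", "err"), ("l2", "err")]
b = [("l2", "err"), ("l1", "err"), ("l3", "l7", "err")]
print(pick_counterexample(a) == pick_counterexample(b))  # → True
```

Since each refinement round then derives its predicates from the same counterexample on every run, the overall running time stops depending on message timing.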
Date: 29-Sep-2010    Time: 13:30:00    Location: 336


Distributed Compensations with Interruption in Long-Running Transactions

Roberto Bruni

Universitá di Pisa

Abstract—(joint work with Anne Kersten, Ivan Lanese, Giorgio Spagnolo) Compensations are a well-known and widely used mechanism for ensuring the consistency and correctness of long-running transactions in the area of databases. More recently, several calculi have emerged in the areas of business process modelling, service-oriented and global computing to provide the necessary formal ground for compensation primitives like those exploited in orchestration languages such as WS-BPEL. The focus of this work is on the compensation policy to select for parallel branches. The choice of the right strategy allows the user to prevent unnecessary actions in case of an abort. In the past, different policies have emerged in cCSP and Sagas. We propose new, optimal, operational and denotational semantics for the parallel kernel of cCSP/Sagas with interruption and prove the correspondence between the two. The new semantics guarantees that distributed compensations may only be observed after a fault has actually occurred.
Date: 22-Sep-2010    Time: 14:00:00    Location: 336


Multiplication Algorithms for Monge Matrices

Luís M. S. Russo

INESC-ID Lisboa and IST

Abstract—In this talk we study algorithms for the max-plus product of Monge matrices. These algorithms exploit the underlying regularities of the matrices to run faster than the general multiplication algorithm. A non-naive solution is to iterate the SMAWK algorithm, and for specific classes there are still more efficient algorithms. We present a new multiplication algorithm (MMT) that is efficient both for general Monge matrices and for specific classes. Theoretical and empirical analysis shows that MMT operates in near-optimal space and time, giving further insight into an open problem proposed by Landau. The resulting algorithms have several applications in bioinformatics; in particular, Monge matrices occur in genome alignment problems.
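The two objects at the heart of the abstract can be illustrated with a short sketch. This is the naive O(n^3) max-plus product that faster algorithms such as MMT improve on, not the MMT algorithm itself, and the Monge-condition convention below is one common choice:

```python
def is_monge_maxplus(A):
    """Check one common (inverse) Monge condition used in max-plus settings:
    A[i][j] + A[i+1][j+1] >= A[i][j+1] + A[i+1][j] for all adjacent i, j."""
    n, m = len(A), len(A[0])
    return all(
        A[i][j] + A[i + 1][j + 1] >= A[i][j + 1] + A[i + 1][j]
        for i in range(n - 1) for j in range(m - 1)
    )

def maxplus_product(A, B):
    """Naive max-plus product: C[i][j] = max_k (A[i][k] + B[k][j])."""
    n, p, m = len(A), len(B), len(B[0])
    return [[max(A[i][k] + B[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]
```

A property the specialized algorithms rely on: the max-plus product of two Monge matrices is again Monge, so the structure is preserved under composition.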
Date: 17-Sep-2010    Time: 14:00:00    Location: 336


Characterful Speech Synthesis

Matthew Aylett


Abstract—Speech synthesis is a key enabling technology for pervasive and mobile computing as well as a key requirement for accessibility. Adding character to synthetic voices is a requirement for effective interaction and for devices that wish to present a coherent branded interface. In this talk I will argue that current approaches to synthesis, and current commercial pressures, make it difficult for many systems to create characterful synthesis. We will present how CereProc's approach differs from the industry standard and how we have attempted to maintain and increase the characterfulness of CereVoice's output (an online demo is available). We will outline the expressive synthesis markup that is supported by the system and how it is expressed in underlying digital signal processing and selection tags. Finally, we will present the concept of second-pass synthesis, where cues can be manually tweaked to allow direct control of intonation style, and where synthesis can be seamlessly mixed with pre-recorded prompts to produce extremely natural output. We will also demonstrate how we can use synthesis to 'clone' celebrity voices, with a brief demonstration of voices copied from George W. Bush. Time permitting, I will also demonstrate some experiments looking at hybrid approaches to parametric/unit-selection synthesis.
Date: 10-Sep-2010    Time: 15:00:00    Location: 336


Modeling the F0 curve for Speech Synthesis

Gopala K. Anumanchipalli

Carnegie Mellon University

Abstract—In this talk I will review some approaches used for modeling the fundamental frequency (F0) contour. I will detail the F0 modeling strategy currently used in Clustergen, CMU's statistical parametric synthesis framework. I will describe our recent work attempting to improve the baseline modeling strategy by incorporating longer-range (syllable and phrase) features into the F0 model. We use the TILT model of intonation for this work; I will briefly describe the TILT framework and mention alternative frameworks used in intonation modeling.
Date: 03-Sep-2010    Time: 15:00:00    Location: 336


Hierarchical Phrase-based Translation with Weighted Finite-State Transducers

Adrià de Gispert

University of Cambridge

Abstract—In this talk I will describe HiFST, the Cambridge University Engineering Department statistical machine translation system. I will review hierarchical phrase-based translation, and explain why an implementation based on Weighted Finite-State Transducers is convenient to avoid search errors in decoding. I will then give an overview of the current research lines of our SMT group, focused on defining appropriate translation grammars, exploiting the vast number of alternative hypotheses encoded in translation lattices via rescoring with large-scale language models, and decoding under Minimum Bayes Risk for system combination.
Date: 20-Jul-2010    Time: 14:00:00    Location: Room FA1 - Pavilhão de Informática I (IST)


Meaning Propagation

Fernando Pereira


Abstract—Information about the meanings of terms in context supports useful inferences in a variety of language processing and information retrieval tasks. We hypothesize that much of that information can be derived from explicit and implicit relationships between terms in the mass of Web content and user interactions with that content. I will describe initial tests of that hypothesis where we use graph label-propagation methods to combine many small pieces of evidence from unstructured and semi-structured Web text to bootstrap broad-coverage instance-class relationships from a few seed examples.
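The label-propagation idea can be sketched minimally. This is an illustrative toy, not the system described in the talk: the graph, the seed labels, and the clamping scheme below are all made up for illustration. Seed nodes carry fixed class distributions that diffuse to their neighbours until the graph converges.

```python
def propagate(edges, seeds, iters=50):
    """edges: {node: [neighbours]} (repeats act as weights);
    seeds: {node: class_label}, clamped during propagation."""
    classes = sorted(set(seeds.values()))
    dist = {n: [0.0] * len(classes) for n in edges}
    for n, c in seeds.items():
        dist[n][classes.index(c)] = 1.0
    for _ in range(iters):
        new = {}
        for n, nbrs in edges.items():
            if n in seeds:                      # keep seed labels fixed
                new[n] = dist[n]
                continue
            agg = [sum(dist[m][k] for m in nbrs) for k in range(len(classes))]
            z = sum(agg) or 1.0                 # normalize to a distribution
            new[n] = [a / z for a in agg]
        dist = new
    return {n: classes[max(range(len(classes)), key=d.__getitem__)]
            for n, d in dist.items() if sum(d) > 0}
```

On a small chain with one 'X' seed and one 'Y' seed, the unlabeled middle nodes take the label of the seed they are more strongly connected to.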
Date: 19-Jul-2010    Time: 15:00:00    Location: Room FA1 - Pavilhão de Informática I (IST)


Structured Prediction Cascades

Ben Taskar

University of Pennsylvania (Upenn)

Abstract—Structured prediction tasks pose a fundamental bias-computation trade-off: the need for complex models to increase predictive power on the one hand, and the limited computational resources for inference in the exponentially-sized output spaces on the other. We formulate and develop structured prediction cascades to address this trade-off: a sequence of increasingly complex models that progressively filter the space of possible outputs. We represent an exponentially large set of filtered outputs using max-marginals and propose a novel convex loss for learning cascades that balances filtering error with filtering efficiency. We provide generalization bounds for error and efficiency losses and evaluate our approach on several natural language and vision problems. We find that the learned cascades are capable of reducing the complexity of inference by up to several orders of magnitude, enabling the use of models which incorporate higher-order dependencies and features and yield significantly higher accuracy.
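The core cascade step can be sketched for a chain model: compute the max-marginal of every (position, state) pair, then keep only the states whose max-marginal clears a threshold. The interpolated max/mean threshold mirrors the cascade idea, but all scores, names, and parameters here are illustrative, not from the paper:

```python
def max_marginals(node, trans):
    """node[i][s]: score of state s at position i; trans[s][t]: transition.
    Returns mm[i][s] = score of the best full sequence passing through (i, s)."""
    n, S = len(node), len(node[0])
    f = [row[:] for row in node]          # f[i][s]: best prefix ending in s at i
    for i in range(1, n):
        for s in range(S):
            f[i][s] = node[i][s] + max(f[i-1][t] + trans[t][s] for t in range(S))
    b = [[0.0] * S for _ in range(n)]     # b[i][s]: best suffix after (i, s)
    for i in range(n - 2, -1, -1):
        for s in range(S):
            b[i][s] = max(trans[s][t] + node[i+1][t] + b[i+1][t] for t in range(S))
    return [[f[i][s] + b[i][s] for s in range(S)] for i in range(n)]

def filter_states(mm, alpha=0.5):
    """Keep (i, s) whose max-marginal clears a threshold interpolating
    between the best score (alpha=1) and the mean score (alpha=0)."""
    best = max(max(row) for row in mm)
    mean = sum(sum(row) for row in mm) / sum(len(row) for row in mm)
    tau = alpha * best + (1 - alpha) * mean
    return [[s for s, v in enumerate(row) if v >= tau] for row in mm]
```

Because every state on the single best path attains the overall maximum score, that path always survives filtering at any alpha <= 1; only provably suboptimal states are pruned before the next, more expensive model runs.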
Date: 19-Jul-2010    Time: 14:00:00    Location: Room FA1 - Pavilhão de Informática I (IST)


Radiation Effects in Integrated Circuits

Jader Alves de Lima Filho

Centro de Tecnologia da Informação Renato Archer

Abstract—Seminar on: basic concepts of radiation effects in MOSFET and BJT devices; TID and SEE effects; shielding; radiation hardening of analog and digital ICs through techniques of (i) layout, (ii) hardware (design and redundancy), and (iii) software; and a new topology of radiation-hardened power transistors, with experimental results.
Date: 13-Jul-2010    Time: 11:00:00    Location: VA2 - Pavilhão de Civil do IST Alameda



Pietro Manzoni

Universidade Politecnica de Valencia

Abstract—Numerous technologies have been deployed to assist and manage transportation. In fact, recent concerted efforts in academia and industry point to a paradigm shift in intelligent transportation systems. Vehicles are expected to carry computing and communication platforms, and will have enhanced sensing capabilities. They will enable new versatile systems that enhance transportation safety and efficiency and will provide infotainment. This talk will provide a brief overview of the approaches, solutions, and technologies across a broad range of projects for vehicular communication systems. Moreover, details about ongoing research activity by the Networking group (GRC) at the Technical University of Valencia will be presented.
Date: 07-Jul-2010    Time: 15:00:00    Location: 336


Predicting Cloze Task Quality for Vocabulary Training

Adam Skory

Carnegie Mellon University

Abstract—Computer generation of cloze tasks still falls short of full automation; most current systems are used by teachers as authoring aids. Improved methods to estimate cloze quality are needed for full automation. We investigated lexical reading difficulty as a novel automatic estimator of cloze quality, against which the co-occurrence frequency of words was compared as an alternative estimator. Rather than relying on expert evaluation of cloze quality, we submitted open cloze tasks to workers on Amazon Mechanical Turk (AMT) and discuss ways to measure the results of these tasks. Results show one statistically significant correlation among the above measures and estimators: that between lexical co-occurrence and Cloze Easiness. Reading difficulty was not found to correlate significantly. We gave subsets of cloze sentences to an English teacher as a gold standard. Sentences selected by co-occurrence and Cloze Easiness were ranked most highly, corroborating the evidence from AMT.
Date: 02-Jul-2010    Time: 15:00:00    Location: 336


Computational Methods for the characterization and detection of protein binding sequences through information theory

Joan Maynou

Universitat Politècnica de Catalunya

Abstract—Regulatory sequence detection is a critical facet of understanding the cell mechanisms that coordinate the response to stimuli. Protein synthesis involves the binding of a transcription factor to specific sequences in a process related to the initiation of gene expression. A characteristic of this binding process is that the same factor binds to different sequences placed all along the genome, so any computational approach faces many difficulties related to the variability observed in the binding sequences. Our work proposes the detection of transcription factor binding sites based on a parametric uncertainty measurement (Rényi entropy). This detection algorithm evaluates the variation in the total Rényi entropy of a set of sequences when a candidate sequence is assumed to be a true binding site belonging to the set.
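The scoring quantity can be sketched directly from its definition. This is an illustrative computation of the column-wise Rényi entropy of a set of aligned sites, not the detection algorithm from the talk; the order alpha and the helper names are assumptions:

```python
import math

def renyi_entropy(p, alpha=2.0):
    """Rényi entropy of order alpha (alpha != 1):
    H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha)."""
    return math.log(sum(q ** alpha for q in p if q > 0)) / (1.0 - alpha)

def motif_entropy(sites, alpha=2.0):
    """Total Rényi entropy over the columns of aligned DNA sequences."""
    total = 0.0
    for col in zip(*sites):
        counts = {b: col.count(b) for b in "ACGT"}
        probs = [counts[b] / len(col) for b in "ACGT"]
        total += renyi_entropy(probs, alpha)
    return total
```

Adding a candidate sequence to the set and recomputing `motif_entropy` then measures how much the candidate perturbs the motif: a true binding site should leave the total entropy nearly unchanged, while an unrelated sequence raises it.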
Date: 02-Jul-2010    Time: 11:00:00    Location: 336


Dynamics of CD4+ T cells in HIV-1 Infection

Ruy M. Ribeiro

Los Alamos National Laboratory

Abstract—Mathematical modeling is becoming established in the immunologist’s toolbox as a method to gain insight into the dynamics of the immune response and its components. No more so than in the case of the study of human immunodeficiency virus (HIV) infection. I will review different areas of the study of the dynamics of CD4+ T-cells in the setting of HIV, where modeling played important and diverse roles in helping us understand CD4+ T-cell homeostasis and the effect of HIV infection on T-cell dynamics, and the processes of T-cell production and destruction.
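The simplest model in this literature, the target-cell-limited model, couples healthy CD4+ T cells, infected cells, and free virus through three ODEs. The sketch below integrates it with forward Euler; the parameter values are illustrative placeholders, not figures from the talk:

```python
# dT/dt = lam - d*T - beta*T*V      (healthy CD4+ T cells)
# dI/dt = beta*T*V - delta*I        (productively infected cells)
# dV/dt = p*I - c*V                 (free virus)

def simulate(T0=1e6, I0=0.0, V0=1e-3, lam=1e4, d=0.01,
             beta=2e-7, delta=1.0, p=100.0, c=23.0,
             dt=0.001, days=30):
    """Forward-Euler integration of the target-cell-limited model;
    all parameter values here are illustrative."""
    T, I, V = T0, I0, V0
    for _ in range(int(days / dt)):
        dT = lam - d * T - beta * T * V
        dI = beta * T * V - delta * I
        dV = p * I - c * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
    return T, I, V
```

Fitting the decay of V to patient data after the start of therapy is, in broad strokes, how the turnover rates of infected cells and free virus were originally estimated in this field.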
Date: 24-Jun-2010    Time: 14:00:00    Location: 336


Learning words and speech units through natural interactions

Jonas Hörnstein

Instituto de Sistemas e Robótica

Abstract—According to the ecological and emergent approach to language learning, infants are able to learn word-like patterns without preprogrammed linguistic knowledge such as phonemes. It is believed that the first words are learned using relatively simple pattern-matching techniques; phonemes instead emerge as the vocabulary grows and statistical models are needed to handle the increasing complexity. This work shows how pattern-matching techniques can be used to create an initial set of words through the natural interaction between an infant and its caregiver. It also shows how a statistical "phoneme" model can emerge from this initial set of words in an unsupervised way. The learning techniques are implemented and demonstrated on a humanoid robot.
Date: 08-Jun-2010    Time: 16:00:00    Location: 04


Challenges and Directions in the Multicore Era

Cliff Click

Azul Systems

Abstract—Available core counts are going up, up, up! Intel is shipping quad-core chips; Sun's Rock has (effectively) 64 CPUs and Azul's hardware nearly a thousand cores. How do we use all those cores effectively? The JVM proper can directly make use of a small number of cores (JIT compilation, profiling), and garbage collection can use about 20 percent more cores than the application is using to make garbage, but this hardly gets us to four cores. Application servers and transactional J2EE/bean applications scale well with thread pools to about 40 or 60 CPUs, and then internal locking starts to limit scaling. Unless your application (such as a data mining, risk analysis, or, heaven forbid, Fortran-style weather-prediction application) has embarrassingly parallel data, how can you use more CPUs to get more performance? How do you debug the million-line concurrent program? Locking paradigms (lock ranking, visual inspection) appear to be nearing the limits of program sizes that are understandable and maintainable. Transactions, the hot new academic solution to concurrent-programming woes, have their own unsolved issues (open nesting, wait, livelock, significant slowdowns without contention). Neither locks nor transactions provide compiler support for keeping the correct variables guarded by the correct synchronization, such as atomic sets. Application-specific programming, such as stream programming or graphics, is, well, application-specific. Tools (debuggers, static analyzers, profilers) and libraries (JDK concurrent utilities) are necessary but not sufficient. Where is the general-purpose concurrent programming model? This session's speaker claims that we need another revolution in thinking about programs. Bio: With more than twenty-five years' experience developing compilers, Cliff serves as Azul Systems' Chief JVM Architect.
Cliff joined Azul in 2002 from Sun Microsystems where he was the architect and lead developer of the HotSpot Server Compiler, a technology that has delivered dramatic improvements in Java performance since its inception. Previously he was with Motorola where he helped deliver industry leading SpecInt2000 scores on PowerPC chips, and before that he researched compiler technology at HP Labs. Cliff has been writing optimizing compilers and JITs for over 20 years. He is invited to speak regularly at industry and academic conferences including JavaOne, JVM, ECOOP and VEE; serves on the Program Committee of many conferences (including PLDI and OOPSLA); and has published many papers about HotSpot technology. Cliff holds a PhD in Computer Science from Rice University.
Date: 02-Jun-2010    Time: 16:00:00    Location: Anfiteatro do Complexo Interdisciplinar (IST)


Mining the Web 2.0 to Improve Search

Ricardo Baeza-Yates

Yahoo Research and University Pompeu Fabra

Abstract—There are several semantic sources that can be found in the Web that are either explicit, e.g. Wikipedia, or implicit, e.g. derived from Web usage data. Most of them are related to user generated content (UGC) or what is called today the Web 2.0. In this talk we show several applications of mining the wisdom of crowds behind UGC to improve search. We will show live demos to find relations in the Wikipedia or to improve image search as well as our current research in the topic. Our final goal is to produce a virtuous data feedback circuit to leverage the Web itself.
Date: 01-Jun-2010    Time: 11:00:00    Location: QA1.2 (Torre Sul)


Towards a Coding Style for Scalable Nonblocking Data Structures

Cliff Click

Azul Systems

Abstract—Nonblocking (NB) algorithms are something of a Holy Grail of concurrent programming--typically very fast, even under heavy load, and they come with hard guarantees about forward progress. The downside is that they are very hard to get right. I have been working on writing some nonblocking utilities over the last year (open sourced on SourceForge in the high-scale-lib project) and have made some progress toward a coding style that can be used to build a variety of NB data structures: hash tables, sets, work queues, and bit vectors. These data structures scale much better than even the concurrent JDK utilities while providing the same correctness guarantees. They usually have similar overheads at the low end while scaling incredibly well on high-end hardware. The coding style is still very immature but shows clear promise. It stems from a handful of basic premises: you don't hide payload during updates; any thread can complete (or ignore) any in-progress update; use flat arrays for quick access and broadest-possible striping; and use parallel, concurrent, incremental array copy. At the core is a simple state-machine description of the update logic. Bio: With more than twenty-five years' experience developing compilers, Cliff serves as Azul Systems' Chief JVM Architect. Cliff joined Azul in 2002 from Sun Microsystems where he was the architect and lead developer of the HotSpot Server Compiler, a technology that has delivered dramatic improvements in Java performance since its inception. Previously he was with Motorola where he helped deliver industry leading SpecInt2000 scores on PowerPC chips, and before that he researched compiler technology at HP Labs. Cliff has been writing optimizing compilers and JITs for over 20 years.
He is invited to speak regularly at industry and academic conferences including JavaOne, JVM, ECOOP and VEE; serves on the Program Committee of many conferences (including PLDI and OOPSLA); and has published many papers about HotSpot technology. Cliff holds a PhD in Computer Science from Rice University.
Date: 31-May-2010    Time: 16:00:00    Location: Anfiteatro do Complexo Interdisciplinar (IST)


Controlling Complexity in Part-of-Speech Induction

João Graça


Abstract—We consider the problem of fully unsupervised learning of part-of-speech tags from unlabeled text, without assuming a word-tag dictionary. The standard Hidden Markov Model (HMM) fit via Expectation Maximization (EM) performs quite poorly, due in large part to the weakness of its inductive bias and excessive model capacity. We address these problems by reducing its capacity via parametric and non-parametric constraints: eliminating parameters for rare words, adding morphological and orthographic features and enforcing word-tag association sparsity. We propose a simple model and an efficient learning algorithm, which are not much more complex than training using standard EM. Our experiments on six languages (Bulgarian, Danish, English, Portuguese, Spanish, Turkish) achieve dramatic improvements over state-of-the-art results: 11% average absolute increase in aligned tagging accuracy.
Date: 28-May-2010    Time: 14:00:00    Location: 04


Speedpath Analysis Under Parametric Timing Models

Luis Guerra e Silva


Abstract—The clock frequency of a digital IC is limited by its slowest paths, designated speedpaths. Given the extreme complexity involved in modeling modern IC technologies, the speedpath predictions provided by timing analysis tools are often not correct. Therefore, several practical techniques have recently been proposed for design debugging that combine silicon stepping of improved versions of a circuit with subsequent correlation between measured and predicted data. Addressing these issues, this talk proposes a set of techniques that enable the designer to obtain reduced subsets of paths guaranteed to contain all the speedpaths of a given circuit or block. Such subsets can be computed either from timing models, prior to fabrication, or by incorporating actual delay measurements from fabricated instances.
Date: 28-May-2010    Time: 11:00:00    Location: 04


SITIU: The Portuguese Version of Let's Go!

José Lopes


Abstract—In this talk I will present the preliminary version of SITIU (Serviço de Informação de Transportes Intra-Urbano), the Portuguese version of the Let's Go! dialogue system developed at Carnegie Mellon University. The talk will focus on the technical issues involved in adapting this platform to a new language and on the modifications made to include different speech recognition and synthesis engines. To our knowledge, this is the first time that this freely distributed dialogue system has been used in a language other than English. Finally, I will talk about the implementation and research issues still pending and the plans for the further development of SITIU.
Date: 21-May-2010    Time: 15:00:00    Location: 336


Global Tolerance of Biochemical Systems and its Design Implications

Pedro Coelho

University of California at Davis

Abstract—The ability of organisms to survive under a multitude of conditions is readily apparent. This robustness in performance is difficult to precisely characterize and quantify. At a biochemical level, it leads to physiological behavior when the parameters of the system remain within some neighborhood of their normal values. However, this behavior can change abruptly, often becoming pathological, as the boundary of the neighborhood is crossed. Currently, there is no generic approach to identifying and characterizing such boundaries. We address the problem by introducing a method that involves quantitative concepts for boundaries between regions and “global tolerance”. To illustrate the power of these concepts, we analyzed a large class of biological modules called moiety-transfer cycles and characterized the specific case of the NADPH redox cycle in human erythrocytes, which is involved in conferring resistance to malaria. Our results show that the wild-type system operates well within a region of “best” local performance that is surrounded by “poor” regions.
Date: 21-May-2010    Time: 14:00:00    Location: 336


“On-the-Fly Model Checking for Regular Alternation-Free Mu-Calculus” and “One Interface to Serve Them All”

Radu Mateescu and Jaco van de Pol

INRIA Rhône-Alpes / University of Twente

Abstract—"On-the-Fly Model Checking for Regular Alternation-Free Mu-Calculus": Model checking is a successful technique for automatically verifying concurrent finite-state systems. When designing a model checker, a good compromise must be made between the expressive power of the property description formalism, the complexity of the model-checking problem, and the user-friendliness of the interface. The logic we adopted is the regular alternation-free mu-calculus, an extension of the alternation-free mu-calculus with ACTL-like action formulas and PDL-like regular expressions, allowing a concise and intuitive description of safety, liveness, and (some) fairness properties over labelled transition systems (LTSs). The model-checking method consists in reformulating the verification problem as the local resolution of a boolean equation system (BES), which is carried out by linear-time algorithms based on various strategies (depth-first search, breadth-first search, etc.). These algorithms also generate full diagnostic information (examples and counterexamples) illustrating the truth value of temporal formulas. This method is at the heart of the EVALUATOR 3.5 model checker of the CADP toolbox, developed using the generic OPEN/CAESAR environment for on-the-fly verification. The BES resolution algorithms are provided by the CAESAR_SOLVE library of OPEN/CAESAR. "One Interface to Serve Them All": I will explain a new interface for model-checking tools, which makes it possible to connect many model-checking algorithms (explicit, distributed, symbolic, multi-core) to various input languages (Promela, mCRL, DVE). Several generic optimizations can also be applied at the interface. Additionally, I will demonstrate the application of model checking to safety requirements of railway interlocking systems, in a project led by the UIC (Union Internationale des Chemins de Fer).
Date: 17-May-2010    Time: 15:00:00    Location: IST, Anfiteatro QA (Torre Sul)


Voltage-mode Quaternary FPGAs: An Evaluation of Interconnections

Cristiano Lazzari


Abstract—This work presents a study of FPGA interconnections and evaluates their effects on voltage-mode binary and quaternary FPGA structures. FPGAs are widely used due to their fast time-to-market and reduced non-recurring engineering costs in comparison to ASIC designs. Interconnections play a crucial role in modern FPGAs, because they dominate delay, power and area. The use of multiple-valued logic allows the number of signals in the circuit to be reduced, hence providing a means to effectively curtail the impact of interconnections. The most important characteristics of the results are the reduced fanout, the smaller number of wires and the shorter wire lengths presented by the quaternary devices. We use a set of arithmetic circuits to compare binary and quaternary implementations. This work presents a first step toward developing quaternary circuits by mapping arbitrary binary random logic onto quaternary devices.
Date: 14-May-2010    Time: 11:00:00    Location: 04


Analysis of interrogatives in different domains

Helena Moniz


Abstract—The aim of our study is twofold: to quantify the distinct interrogative types in different contexts and to discuss the weight of the linguistic features that best describe these structures, in order to model interrogatives in speech. In European Portuguese, as in other languages, interrogatives may be subclassified into yes-no questions, wh-questions and tags. These distinctions are accompanied by lexico-syntactic and prosodic features. Yes-no questions have the same syntactic structure as declaratives and may be differentiated by intonation contours; wh-questions contain wh-words, which make them recognizable; and tags also have distinctive forms, e.g., declarative sentence + negative particle + verb. State-of-the-art studies on sentence boundary detection and punctuation have discussed the relative weight of the previously mentioned features. Shriberg et al. (1998; 2008) report that prosodic features are more significant than lexical ones and that better results are achieved when combining both; Wang and Narayanan (2004) claim that results based only on prosodic properties are quite robust; Boakye et al. (2009), analyzing meetings, state that lexico-syntactic features are the most important ones for identifying interrogatives. This raises the following question: does the weight of the features depend on the nature of the corpus and on the most characteristic types of interrogative in each? This study addresses that question, using three distinct corpora for European Portuguese: broadcast news (61h, 449k words), classroom lectures (27h, 155k words), and map-task dialogues (7h, 61k words). Results will also be presented for newspaper text (148M words).
Date: 07-May-2010    Time: 15:00:00    Location: 336


Computer Architecture: an experience on performance and power

Filipa Duarte, PhD


Abstract—In this presentation, I will introduce a memory copy hardware accelerator (MCHA) and the benefits of its usage in a multiprocessor platform supporting an explicit send/receive programming model. The MCHA redirects the destination address to the original data in the cache through an indexing table tightly coupled with the cache. It therefore avoids cache pollution, as the copied data never overwrites the original data or vice versa. Moreover, the MCHA reduces the number of instructions executed when performing a memory copy compared with the classical software implementation. As the copy is performed in hardware, it is faster, providing a speedup of 2.97. In the second part of the presentation, I will introduce a biomedical system developed at IMEC targeting the ECG application. At the centre of the system is the CoolFlux BSP processor, which processes heart-beat signals collected from several sensors and sends the reports of the analysis to the radio interface. The system has aggressive power and clock management in order to reduce its power consumption. I will also introduce the circuit-level techniques as well as the architecture optimizations used to reach the expected power consumption.
Date: 30-Apr-2010    Time: 14:00:00    Location: 336


From Assembling Short DNA Reads to Protein Sequencing by Assembling Mass Spectra

Pavel Pevzner

University of California at San Diego (UCSD)

Abstract—Increasing read length is viewed as the crucial condition for fragment assembly with next-generation sequencing technologies. However, introducing mate-paired reads (separated by a gap of length GapLength) opens a possibility to transform short mate-pairs into long mate-reads of length approximately GapLength, and thus raises the question as to whether the read length (as opposed to GapLength) even matters. We describe a new tool for assembling mate-paired short reads and use it to analyze the question of whether the read length matters. We further complement the ongoing experimental efforts to maximize read length by a new computational approach for increasing the effective read length. While the common practice is to trim the error-prone tails of the reads, we present an approach that substitutes trimming with error correction using repeat graphs. An important and counterintuitive implication of this result is that one may extend sequencing reactions that degrade with length "past their prime" to where the error rate grows above what is normally acceptable for fragment assembly. If time allows, we will further address the problem of sequencing molecules that are not directly inscribed in the genomes (e.g., antibodies or antibiotics-like non-ribosomal peptides) and propose to assemble them from tandem mass spectra. We show that our Eulerian approach to DNA sequencing can be generalized to Shotgun Protein Sequencing (SPS). We illustrate applications of SPS to sequencing of snake venoms (collaborations with Karl Clauser at Broad Institute) and antibodies (collaboration with Jennie Lill at Genentech). We further show how mass-spectrometry enables de novo sequencing of peptide-like natural products. This is a joint work with Nuno Bandeira (UCSD), Mark Chaisson (Pacific Biosciences) and Dima Brinza (Life Technologies).
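The Eulerian approach to assembly mentioned above can be sketched with its simplest relative, the de Bruijn graph: reads are broken into k-mers, each k-mer becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix, and assembly becomes a search for an Eulerian path over those edges. This is a textbook sketch, not the tool from the talk:

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Build the de Bruijn graph: one edge per k-mer occurrence."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)

def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    g = {u: list(vs) for u, vs in graph.items()}
    out_deg = {u: len(vs) for u, vs in g.items()}
    in_deg = defaultdict(int)
    for vs in g.values():
        for v in vs:
            in_deg[v] += 1
    # Start at a node with one more outgoing than incoming edge, if any.
    start = next((u for u in g if out_deg[u] - in_deg[u] == 1), next(iter(g)))
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if g.get(u):
            stack.append(g[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]
```

Spelling out the path (first node plus the last character of each subsequent node) reconstructs the sequence; repeats in the genome show up as nodes traversed more than once, which is where the repeat-graph machinery of the talk takes over.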
Date: 26-Apr-2010    Time: 15:00:00    Location: IST, Room VA3 (Pavilhão de Civil)


A Data Mining Approach for the detection of High-Risk Breast Cancer Groups

Orlando Anunciação


Abstract—It is widely agreed that complex diseases are typically caused by the joint effects of multiple genetic variations rather than a single one. These genetic variations may show very little effect individually but a strong effect if they occur jointly, a phenomenon known as epistasis or multilocus interaction. In this seminar, we explore the applicability of decision trees to this problem. A case-control study was performed, composed of 164 controls and 94 cases with 32 SNPs available from the BRCA1, BRCA2 and TP53 genes, together with information about tobacco and alcohol consumption. We used a decision tree to find a group with high susceptibility to breast cancer. Our goal was to find one or more leaves with a high percentage of cases and a small percentage of controls. To statistically validate the association found, permutation tests were used. We found a high-risk breast cancer group composed of 13 cases and only 1 control, with a Fisher exact test value of 9.7 * 10^-6. After running 10,000 permutation tests we obtained a p-value of 0.017. These results show that it is possible to find statistically significant associations with breast cancer by deriving a decision tree and selecting the best leaf.
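The kind of significance test reported above is easy to reproduce from first principles. The sketch below is a plain one-sided Fisher exact test via the hypergeometric distribution, applied to the reported leaf (13 cases, 1 control) against the rest of the cohort (81 cases, 163 controls); it may differ slightly from the exact variant the authors computed (e.g. two-sided):

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact test on the 2x2 table [[a, b], [c, d]]:
    probability of seeing at least `a` cases in a leaf of size a+b,
    given the row and column totals, under the hypergeometric null."""
    row1, col1, n = a + b, a + c, a + b + c + d
    return sum(comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)
               for k in range(a, min(row1, col1) + 1))

p = fisher_one_sided(13, 1, 81, 163)   # reported leaf vs. rest of cohort
```

The separate permutation test guards against the multiple-testing effect of having searched over many possible trees: the leaf-level Fisher p-value is tiny, but the fair question is how often an equally good leaf appears when the case/control labels are shuffled, which is what the 0.017 figure answers.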
Date: 09-Apr-2010    Time: 14:00:00    Location: 336


Recent improvements in the PT-STAR Project

Tiago Luís, Wang Ling


Abstract—This seminar is divided into two parts. The first part will focus on the recent enhancements in the translation system. Among the experiments are the evaluation of the improvements in translation quality brought by the introduction of a named entity recognizer, the study of the impact of language models on translation, and the initial results of using confusion networks as input to the translator. In the second part we present the system that collects articles from news websites, and explain how this information will be used to improve the language models of the speech recognizer.
Date: 26-Mar-2010    Time: 15:00:00    Location: 336


Human Immunodeficiency Virus (HIV) Dynamic Modeling and Antiretroviral Treatment Analysis

Ana Calhau, Constança Roquette, Teresa Cordeiro

Instituto Superior Técnico

Abstract—The emergence of the Acquired Immune Deficiency Syndrome (AIDS) raised new problems and concerns worldwide. HIV/AIDS is now a global disease which greatly influences people’s lives, especially in developing countries, and controlling this type of disease has a significant socio-economic impact. The HIV virus preferentially attacks CD4+ T immune cells, incorporating its DNA, previously reverse-transcribed from viral RNA, into the cells’ genome. Antiretroviral treatments act at different stages of HIV infection, decreasing the organism’s viral load. The simplest dynamic models of HIV infection relate the concentrations of healthy and infected CD4+ T cells with the viral load. Currently there are more complex models which involve more state variables and parameters. For a better knowledge of the disease it is essential to cross information obtained through these mathematical models with data from infected people. The goal of this project is to understand HIV’s complexity and to explore its dynamics through a mathematical model based on nonlinear differential equations. Biomedical Engineering is expected to have a crucial role in the development of new tools and techniques for discovering a potential AIDS cure.
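A minimal sketch of the simplest class of models mentioned above: the target-cell-limited model with three state variables (healthy cells T, infected cells I, free virus V), integrated with forward Euler. All parameter values and initial conditions are illustrative, not fitted to data.

```python
def simulate(beta=0.01, lam=10.0, d=0.1, delta=0.5, p=10.0, c=3.0,
             T0=100.0, I0=0.0, V0=1e-3, dt=0.01, t_end=200.0):
    """Forward-Euler integration of a basic target-cell-limited HIV model:
       dT/dt = lam - d*T - beta*T*V   (healthy CD4+ T cells)
       dI/dt = beta*T*V - delta*I     (infected cells)
       dV/dt = p*I - c*V              (free virus)
    Parameter values are illustrative only."""
    T, I, V = T0, I0, V0
    for _ in range(int(t_end / dt)):
        dT = lam - d * T - beta * T * V
        dI = beta * T * V - delta * I
        dV = p * I - c * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
    return T, I, V

T_end, I_end, V_end = simulate()
```

With these toy values the basic reproduction number exceeds one, so the infection takes off from a tiny inoculum and settles near a viral set point; antiretroviral therapy is typically modeled as a reduction of beta or p.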
Date: 26-Mar-2010    Time: 14:00:00    Location: 336


Usability Evaluation

Alfredo Ferreira


Abstract—The success of a computer system depends to a great extent on its acceptance by users, whose opinion of the system is strongly conditioned by its usability. The usability of a system depends not only on the set of functionalities it offers, but also on the way the user interacts with it. Several techniques currently exist for evaluating the usability of a solution. In this seminar we will focus on some of these techniques, namely evaluation with users.
Date: 19-Mar-2010    Time: 14:00:00    Location: 336


In silico Metabolic Engineering

Miguel Rocha

Universidade do Minho

Abstract—Metabolic Engineering (ME) deals with designing organisms with enhanced capabilities regarding the productivities of desired compounds. This field has received increasing attention within the last few years due to the extraordinary growth in the adoption of white or industrial biotechnological processes for the production of bulk chemicals, pharmaceuticals, food ingredients and enzymes, among other products. Many different approaches have been used to aid in ME efforts that take available models of metabolism together with mathematical tools and/or experimental data to identify metabolic bottlenecks or targets for genetic engineering. Our conceptual framework in the development of tools for in silico ME relies on three layers: accurate mathematical models (stoichiometric models, regulatory networks, dynamic models), good simulation methods (e.g. steady state simulations with flux balance analysis, Boolean network simulation, numerical integration of ODEs) and robust optimization algorithms. This framework gave rise to the OptFlux platform, an open-source, user-friendly and modular software aimed at being the reference computational platform for ME applications. Indeed, the rational design of microbial strains has so far been limited to the developers of the techniques, since a platform providing a user-friendly interface to perform such tasks was not previously available. OptFlux aims to change this situation by providing the following features: it is freely available, open-source, user-friendly, modular and compatible with standards such as the Systems Biology Markup Language (SBML) and the layout information of CellDesigner. The main methods allow the simulation of both wild-type and mutant organisms (using Flux Balance Analysis or other methods) and optimization tasks, i.e., the identification of ME targets, which can be performed with metaheuristics such as Evolutionary Algorithms and Simulated Annealing.
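The steady-state simulation step (flux balance analysis) is, in its standard form, a linear program over the stoichiometric matrix S and the flux vector v:

```latex
\max_{v \in \mathbb{R}^{n}} \; c^{\top} v
\qquad \text{s.t.} \qquad
S\,v = 0, \qquad v^{\min} \le v \le v^{\max}
```

where the objective c typically selects the biomass reaction, and gene knockouts are simulated by constraining the corresponding fluxes to zero. (A summary of the standard method, not necessarily the exact variant implemented in OptFlux.)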
Date: 12-Mar-2010    Time: 14:00:00    Location: 336


Fail-aware untrusted storage (FAUST)

Christian Cachin

IBM Research - Zurich

Abstract—We consider a set of clients collaborating through an online service provider that is subject to attacks, and hence not fully trusted by the clients. We introduce the abstraction of a fail-aware untrusted service, with meaningful semantics even when the provider is faulty. In the common case, when the provider is correct, such a service guarantees consistency (linearizability) and liveness (wait-freedom) of all operations. In addition, the service always provides accurate and complete consistency and failure detection. We illustrate our new abstraction by presenting a Fail-Aware Untrusted STorage service (FAUST). Existing storage protocols in this model guarantee so-called forking semantics. We observe, however, that none of the previously suggested protocols suffice for implementing fail-aware untrusted storage with the desired liveness and consistency properties (at least wait-freedom and linearizability when the server is correct). We present a new storage protocol, which does not suffer from this limitation, and implements a new consistency notion, called weak fork linearizability. We show how to extend this protocol to provide eventual consistency and failure awareness in FAUST. Joint work with Alexander Shraer and Idit Keidar.
Date: 03-Mar-2010    Time: 15:00:00    Location: 336


Speaker Verification Experiments on the NIST SRE Database

Jordi Luque

L2F and TALP

Abstract—In this talk, we will present results from the work we have carried out in recent months at the L2F Laboratory. The talk complements previous presentations, showing speaker verification results reported on a subset of the NIST SRE 2008 evaluation. These experiments are the core of the development of the speaker verification system that will be submitted to the upcoming NIST SRE 2010 evaluation. The performance of several previously presented systems will be compared, among them: the classical GMM-UBM approach, the Gaussian Super Vectors technique, a GMM-UBM system with the Joint Factor Analysis compensation technique, and a newly developed system based on a Speaker-Dependent Transformation Network applied to an MLP-based phone classifier. In addition, some notes on score normalization techniques will be presented.
Date: 26-Feb-2010    Time: 15:00:00    Location: 336


CHE - Evolutionary Algorithms for Cluster Geometry Optimization

Francisco B. Pereira

Universidade de Coimbra

Abstract—CHE is a joint research project involving the Evolutionary and Complex Systems group (ECOS-CISUC) and the Coimbra Chemistry Centre from the University of Coimbra. It aims to develop effective bio-inspired algorithms for difficult optimization problems from the theoretical chemistry area. This talk will focus on cluster geometry optimization. In this problem, the goal is to determine the structural organization for a set of atoms or molecules that minimizes the total potential energy. Determining the relative position of the particles that compose a cluster is essential, as it helps to understand the chemical properties of the aggregate. In this talk we present a simple and unbiased evolutionary algorithm that can effectively tackle hard cluster geometry optimization problems. Additionally, we will identify some key components that are essential to improve the performance of the optimization method.
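For a concrete, drastically simplified instance of the problem: a (1+1) evolution strategy relaxing a 3-atom Lennard-Jones cluster. This is a generic sketch, not the algorithm of the talk; the starting geometry, mutation width and iteration budget are arbitrary.

```python
import math
import random

def lj_energy(pos):
    """Total Lennard-Jones potential energy (reduced units) of a cluster
    given as a flat coordinate list [x0, y0, z0, x1, ...]."""
    n = len(pos) // 3
    e = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r2 = sum((pos[3 * i + k] - pos[3 * j + k]) ** 2 for k in range(3))
            inv6 = 1.0 / r2 ** 3
            e += 4.0 * (inv6 * inv6 - inv6)
    return e

def one_plus_one_es(pos, iters=3000, sigma=0.1, seed=0):
    """Single parent; accept a Gaussian perturbation only if it lowers
    the energy (elitist selection)."""
    rng = random.Random(seed)
    best, best_e = list(pos), lj_energy(pos)
    for _ in range(iters):
        cand = [x + rng.gauss(0.0, sigma) for x in best]
        e = lj_energy(cand)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e

start = [0.0, 0.0, 0.0,  1.5, 0.0, 0.0,  0.0, 1.5, 0.0]  # 3 atoms
best_pos, best_e = one_plus_one_es(start)
```

The global minimum for three atoms is an equilateral triangle at the pair distance 2^(1/6), with energy -3 in reduced units; even this naive strategy gets close, while larger clusters require the population-based operators discussed in the talk.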
Date: 26-Feb-2010    Time: 14:00:00    Location: 336


Winter Workshop of the Distributed Systems Group


Abstract—An informal event of the Distributed Systems Group to present the R&D activities of the group.
Date: 23-Feb-2010    Time: 09:00:00    Location: IST Alameda - sala FA-1


A Music Classification Method Based on Timbral Features

Thibault Langlois, Gonçalo Marques


Abstract—We present a method for music classification based solely on the audio contents of the music signal. More specifically, the audio signal is converted into a compact symbolic representation that retains timbral characteristics and accounts for the temporal structure of a music piece. Models that capture the temporal dependencies observed in the symbolic sequences of a set of music pieces are built using a statistical language modeling approach. The proposed method is evaluated on two classification tasks (Music Genre classification and Artist Identification) using publicly available datasets. Finally, a distance measure between music pieces is derived from the method and examples of playlists generated using this distance are given. The proposed method is compared with two alternative approaches which include the use of Hidden Markov Models and a classification scheme that ignores the temporal structure of the sequences of symbols. In both cases the proposed approach outperforms the alternatives.
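The language-modeling step can be illustrated with a toy add-one-smoothed bigram classifier over symbol sequences. The alphabet, the two "genres" and their training sequences below are invented for illustration.

```python
import math
from collections import defaultdict

class BigramModel:
    """Add-one-smoothed bigram model over a fixed symbol alphabet."""
    def __init__(self, alphabet):
        self.alphabet = alphabet
        self.counts = defaultdict(int)    # (prev, cur) -> count
        self.totals = defaultdict(int)    # prev -> count

    def train(self, sequences):
        for seq in sequences:
            for prev, cur in zip(seq, seq[1:]):
                self.counts[(prev, cur)] += 1
                self.totals[prev] += 1

    def log_prob(self, seq):
        v = len(self.alphabet)
        return sum(math.log((self.counts[(p, c)] + 1) / (self.totals[p] + v))
                   for p, c in zip(seq, seq[1:]))

alphabet = "abc"
rock = BigramModel(alphabet); rock.train(["aababab", "ababba"])
jazz = BigramModel(alphabet); jazz.train(["accbcc", "cccacb"])

def classify(seq):
    """Pick the class whose bigram model gives the sequence the
    highest log-likelihood."""
    return max([("rock", rock), ("jazz", jazz)],
               key=lambda kv: kv[1].log_prob(seq))[0]
```

In the paper's setting the symbols come from a timbre-based quantization of the audio; the same per-class likelihood comparison also yields the distance measure used for playlist generation.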
Date: 19-Feb-2010    Time: 15:00:00    Location: 336


Novelty and Evolution in Biological, Chemical and Random Reaction Networks

Pietro Speroni di Fenizio

FCT, Universidade de Coimbra

Abstract—We shall first investigate how reaction networks lie at the base of various disciplines. We will then learn a method (Chemical Organization Theory) that helps us to study novelty in reaction networks, and see what results this method gives us in various examples of reaction networks, both artificial and natural: from artificial chemistries to simulated ecologies, agent-based models, and chemistry. The talk will also touch on, and begin to frame, the (unsolved) problem of how to generate reaction networks that can sustain evolution.
Date: 19-Feb-2010    Time: 14:00:00    Location: 336


Phone Recognition and Language Modeling for Variety Identification

Oscar Koller


Abstract—This talk will introduce the phonotactic approach "Phone Recognition and Language Modeling" (PRLM) for language/variety identification. After a detailed view of this token-based method, I will present the use of a specialized phone recognizer to differentiate African Portuguese from European Portuguese in a highly accurate way. In contrast to other PRLM-based methods, the tokenizer combines distinctive knowledge about the differences between the target varieties. This knowledge is introduced into an MLP phone recognizer by training the two varieties’ mono-phonemes as contrasting phoneme-like classes within a single tokenizer. Significant improvements in terms of identification rate and computational cost were achieved compared to conventional single-tokenizer PRLM-based systems and to the combination of up to five parallel PRLM identifiers.
Date: 19-Feb-2010    Time: 14:00:00    Location: 336


Formal verification techniques: model checking in systems biology

Pedro T. Monteiro


Abstract—The study of biological networks has led to the development of increasingly large and detailed models. While whole-cell models are not on the horizon yet, complex networks underlying specific cellular processes have been modeled in detail. The study of these models by means of analysis and simulation tools leads to large amounts of predictions, typically time-courses of the concentration of several dozens of molecular components in a variety of physiological conditions and genetic backgrounds. This raises the question of how to make sense of these simulations, that is, how to obtain an understanding of the way in which particular molecular mechanisms control the cellular process under study, and how to identify interesting predictions of novel phenomena that can be confronted with experimental data. Formal verification techniques based on model checking provide a powerful technology to keep up with this increase in scale and complexity. The basic idea underlying model checking is to specify dynamical properties of interest as statements in temporal logic, and to use model-checking algorithms to automatically and efficiently verify whether or not the properties are satisfied by the model. The application of model-checking techniques is hampered, however, by several key issues described in this thesis. First, the systems biology domain brought to the fore a few properties of the network dynamics that are not easily expressed using classical temporal logics, like Computation Tree Logic (CTL) and Linear Time Logic (LTL). On the one hand, questions about multistability are important in the analysis of biological regulatory networks, but difficult (or impossible) to express in LTL. On the other hand, CTL is capable of dealing with branching time, important for multistability and other properties of non-deterministic models, but it does not do a good job when faced with questions about cycles in a Kripke structure.
Second, the problem of posing relevant and interesting questions is critical in modeling in general, but even more so in the context of applying model-checking techniques, because it is not easy for non-experts to formulate queries in temporal logic. Finally, most existing modelling and simulation tools are not capable of applying model-checking techniques in a transparent way. In particular, they do not hide from the user the technical details of installing the model checker, exporting the model and the query in a suitable format, calling the model checker, and importing the results it produces (the true/false verdict and the witness/counterexample). This report starts by describing the basic concepts of formal verification, introducing the data structures that represent the possible behaviors of a dynamical system, as well as the different types of temporal logics needed to encode the properties. It also presents the generic model-checking problem, which consists in determining whether a given system satisfies a given set of properties. Some recent examples of the application of model-checking techniques in systems biology are also presented, as well as the current problems and limitations that need to be addressed. Finally, some concluding remarks point out directions towards a better integration of the formal verification and systems biology fields.
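The multistability question discussed above is inherently branching-time. On an explicit state graph, the CTL operator EF can be computed as the usual least fixpoint, which suffices to check that two distinct attractors are both reachable from an initial state; the toy Kripke structure below is invented for illustration.

```python
def ef(states, trans, targets):
    """States satisfying EF(targets): least fixpoint X = targets union pre(X),
    computed by backward chaining until no state is added."""
    x = set(targets)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in x and any(t in x for t in trans.get(s, ())):
                x.add(s)
                changed = True
    return x

# Toy nondeterministic system with two self-looping attractor states.
states = {"init", "s1", "s2", "a1", "a2"}
trans = {"init": ["s1", "s2"], "s1": ["a1"], "s2": ["a2"],
         "a1": ["a1"], "a2": ["a2"]}

# Bistability from "init": both EF(a1) and EF(a2) hold there,
# a branching-time property that LTL cannot state.
bistable = ("init" in ef(states, trans, {"a1"})
            and "init" in ef(states, trans, {"a2"}))
```

Production model checkers compute such fixpoints symbolically (e.g. with BDDs) rather than state by state, but the semantics is the same.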
Date: 12-Feb-2010    Time: 14:00:00    Location: 336


Lexicon extraction from bilingual comparable corpora

Luís Carvalho


Abstract—Parallel corpora are an expensive resource for Machine Translation systems. Since it has been shown that patterns of co-occurring words are preserved even in unrelated texts of different languages, non-parallel texts have become part of these systems alongside parallel corpora. Comparable corpora are a specific type of non-parallel texts with a high level of comparability, that is, they cover the same subject and have similar time windows and sizes. This type of corpora is preferred over parallel corpora not only due to its abundance, but also because it is easily accessible via the web. The objective of this work is to build a bilingual lexicon from a source language to a target language using comparable corpora. For that purpose, the system is composed of two modules: one is responsible for the detection of cognate words using different approaches such as verbatim detection, rule-based detection, non-rule-based detection and sound-based detection. The potential equivalents collected are extracted using similarity measures. The other module exploits a characteristic found in comparable texts: context preservation across the corpora, that is, the context of a given word in the source language tends to be similar to the context of its translation in the target language. For each word, co-occurrences of context words are counted and stored in context vectors, which are then compared with all target vectors using similarity measures. Combined, these modules may form an efficient platform for automatic translation between equivalents of two languages in the creation of a bilingual lexicon.
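The context-vector module can be sketched as follows: count co-occurrences within a window, project the source-language vector through a small seed lexicon, and compare it with target-language vectors by cosine similarity. The toy corpora and seed dictionary are invented for illustration.

```python
import math
from collections import Counter

def context_vector(corpus, word, window=2):
    """Co-occurrence counts of tokens within `window` positions of `word`."""
    vec = Counter()
    for sent in corpus:
        for i, tok in enumerate(sent):
            if tok == word:
                for j in range(max(0, i - window),
                               min(len(sent), i + window + 1)):
                    if j != i:
                        vec[sent[j]] += 1
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda w: math.sqrt(sum(x * x for x in w.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

en = [["the", "cat", "drinks", "milk"], ["the", "cat", "sleeps"],
      ["a", "dog", "barks"]]
pt = [["o", "gato", "bebe", "leite"], ["o", "gato", "dorme"]]
seed = {"o": "the", "bebe": "drinks", "leite": "milk", "dorme": "sleeps"}

# Project the source-language context vector through the seed lexicon,
# then compare with target-language context vectors.
gato = Counter({seed[w]: c for w, c in context_vector(pt, "gato").items()
                if w in seed})
sim_cat = cosine(gato, context_vector(en, "cat"))
sim_dog = cosine(gato, context_vector(en, "dog"))
```

Real systems weight the counts (e.g. with log-likelihood or TF-IDF scores) before comparison, but the ranking principle is the one shown: the correct translation's context vector is the closest one.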
Date: 08-Feb-2010    Time: 16:00:00    Location: 336


PT-STAR: speech translation

Nuno Grazina


Abstract—Global communication and understanding play an increasingly vital role in shaping our economic and social environment. However, language differences are a great barrier to achieving true global understanding and knowledge sharing. Human translators are often unavailable or prohibitively expensive, and cannot deliver much-needed information in a timely and usable manner. Thus, a system capable of automatically performing translations with the same accuracy as a human being would greatly benefit true cross-lingual communication. The field of Spoken Language Translation aims at developing such systems but, despite large improvements in the last few years, still falls short of this goal for the majority of languages and real scenarios. This work’s objective is to build automatic translation systems for unlimited domains such as Broadcast News, lectures and presentations, for the specific case of the Portuguese-English language pair. Since the most popular Machine Translation paradigm, and the one best suited to unlimited domains, is based on statistical techniques, high-quality language resources are needed to enable the systems to produce accurate translations. For the Portuguese-English language pair, such resources are very scarce and difficult to obtain. Several approaches that have been proposed to improve translation results will also be discussed and applied in this work. This document defines the context in which this work is developed, presents its objectives and related work, and describes how the objectives have been met so far and how they will be met in the future.
It contains a historical overview of the Spoken Language Translation field, focused primarily on Machine Translation, presenting the evolution of the main technologies and resources involved in translating spoken language, followed by a review of several state-of-the-art techniques aimed at improving translation accuracy. This work has followed, and will continue to follow, some of these approaches and technologies.
Date: 08-Feb-2010    Time: 15:00:00    Location: 336


Intellectual Property Seminar

Engª Sofia Mendes


Abstract—Participants in this course will be able to understand and identify the general concepts relating to the patentability requirements necessary for granting patent and utility model requests, to understand the differences between patent and utility model requirements, and to grasp the possible reasons for refusal based on these requirements. REGISTRATION is mandatory by email (
Date: 02-Feb-2010    Time: 14:00:00    Location: 336


Recent advances in language and speaker recognition: Compensation methods, the Joint Factor Analysis

Jordi Luque

L2F and TALP

Abstract—A considerable amount of promising methods for language and speaker recognition have been proposed in the most recent NIST language (LRE) and speaker (SRE) recognition evaluation workshops. In this talk we will focus on the problem of compensation for several sources of variability, such as speaker or session, and we will introduce Joint Factor Analysis (JFA) modeling. We will give an explanation of the JFA model and a brief account of the algorithms needed to carry out a JFA of speaker and session variability in a training set in which each speaker is recorded over many different channels. JFA is a model of speaker and session variability in Gaussian mixture models (GMMs) and is capable of performing at least as well as fusions of multiple systems of other types. The JFA technique makes use of the supervector form for modeling. It assumes that a speaker- and channel-dependent supervector (M) can be decomposed into a sum of two statistically independent supervectors: a speaker supervector (s) and a channel supervector (c). In addition, JFA assumes that all speaker-dependent supervectors are contained in the affine space defined by the eigenvoices, the directions of speaker variability, which generate the "speaker space". In turn, the channel variability is confined to the "channel space" defined by the eigenchannels.
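In the usual notation, the decomposition described above is written as

```latex
M = s + c, \qquad s = m + V\,y + D\,z, \qquad c = U\,x
```

where m is the UBM mean supervector, the columns of V are the eigenvoices, the columns of U are the eigenchannels, D is a diagonal residual term, and x, y, z are standard-normal latent factors. (This is the commonly cited form; estimation details vary between implementations.)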
Date: 29-Jan-2010    Time: 15:00:00    Location: 336


Project "EnviGP - Improving Genetic Programming for the Environment and Other Applications"

Sara Silva


Abstract—EnviGP is a FCT project involving INESC-ID, the University of Coimbra, the Tropical Research Institute in Lisbon, and the University of Milano-Bicocca, Italy. Genetic Programming (GP) is a population-based search procedure that, although powerful and versatile, still faces a few obstacles to its fully successful usage in the real world. After a brief introduction to GP, the main subjects of the project are informally addressed: bloat, overfitting, complexity, and the still poorly understood relationship between them. Current results and work in progress are described and illustrated with examples. All these subjects are addressed from the practical point of view of the non computer science practitioners that share the same goal as the EnviGP project: having GP provide accurate, simple, and effectively useful/usable solutions to their real-life problems. Discussion is highly encouraged. Among the audience there will be team members from the Tropical Research Institute and from the University of Milano-Bicocca.
Date: 29-Jan-2010    Time: 14:00:00    Location: 336


Recent advances in language and speaker recognition: Gaussian Super Vectors and compensation methods

Alberto Abad, Jordi Luque

L2F and TALP

Abstract—A considerable amount of promising methods for language and speaker recognition have been proposed in the most recent NIST language (LRE) and speaker (SRE) recognition evaluation workshops. One of the most widely accepted approaches consists of  combining both Gaussian mixture models (GMM) and Support Vector Machines (SVM). A classical GMM-UBM (Universal Background Model) approach is used to obtain an adapted model for each training utterance. Then, Gaussian means of these adapted models are stacked in a super-vector form to train a SVM for each different target language or client speaker. During identification, super-vectors are extracted from a model adapted to the test utterance and used to obtain a classification with the SVMs previously trained. This approach is generally known as Gaussian Super Vectors (GSV). In addition to the GSV approach, most recent efforts have been devoted to the problem of compensation to different sources of variability such as session, channel, speaker, and so on. Two of the most outstanding compensation methods are the Nuisance Attribute Projection (NAP) and the Joint Factor Analysis (JFA). In this talk, we are going to explain the conventional GMM-UBM approach and how it is related to the GSV method. A detailed explanation of the GSV method and some variations of it will be presented. We will also introduce some of the most recent compensation methods mentioned above. Experimental results on LRE and SRE corpora will be enclosed to better characterize the techniques presented.
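The mean-adaptation step behind GSV can be sketched in one dimension: relevance-MAP moves each UBM component mean toward the data it explains, and the adapted means are stacked into a supervector. The two-component UBM, the utterance and the relevance factor below are toy values.

```python
import math

def gauss(x, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def map_adapt_means(data, means, variances, weights, r=16.0):
    """Relevance-MAP adaptation of GMM means (one pass):
       m_k' = (n_k * xbar_k + r * m_k) / (n_k + r),
    where n_k is the soft count and xbar_k the soft data mean of
    component k."""
    K = len(means)
    n = [0.0] * K          # soft occupation counts
    s = [0.0] * K          # soft first-order sums
    for x in data:
        lik = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
        tot = sum(lik)
        for k in range(K):
            g = lik[k] / tot           # responsibility of component k
            n[k] += g
            s[k] += g * x
    return [(s[k] + r * means[k]) / (n[k] + r) for k in range(K)]

ubm_means = [-2.0, 2.0]
variances = [1.0, 1.0]
weights = [0.5, 0.5]
utterance = [2.5, 3.0, 2.8, 3.2]       # toy data near the second component
supervector = map_adapt_means(utterance, ubm_means, variances, weights)
```

With many components and multi-dimensional features, these stacked adapted means form the GSV feature handed to the per-class SVMs; components that saw little data stay close to their UBM means, which is exactly the regularization the relevance factor provides.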
Date: 22-Jan-2010    Time: 15:00:00    Location: 336


Wideband CMOS Receivers Exploiting Noise and Distortion Cancelling

Eric A. M. Klumperink

University of Twente

Abstract—Wide-band LNAs suffer from a fundamental trade-off between noise figure NF and source impedance matching, which limits NF to values typically above 3 dB. Some years ago, we proposed a feed-forward noise canceling technique to break this trade-off. This presentation reviews the principle of the technique and its key properties. Later, we realized that it is also possible to exploit the technique to cancel the distortion contribution of the matching device. In parallel, other research groups have also exploited the noise canceling technique. This presentation reviews various proposed circuits and the obtained results.
Date: 22-Jan-2010    Time: 14:30:00    Location: IST (Alameda) Sala VA-1


How to Complete an Interactive Configuration Process?

Mikoláš Janota

University College Dublin

Abstract—When configuring customizable software, it is useful to provide interactive tool-support that ensures that the configuration does not breach given constraints. But, when is a configuration complete and how can the tool help the user to complete it? I will formalize this problem and relate it to concepts from non-monotonic reasoning well-researched in Artificial Intelligence. The results are interesting for both practitioners and theoreticians. Practitioners will find a technique facilitating an interactive configuration process and experiments supporting feasibility of the approach. Theoreticians will find links between well-known formal concepts and a concrete practical application.
Date: 15-Jan-2010    Time: 11:00:00    Location: 336


Concise Integer Linear Programming Formulations for Dependency Parsing

André Martins


Abstract—We formulate the problem of non-projective dependency parsing as a polynomial-sized integer linear program. Our formulation is able to handle non-local output features in an efficient manner; not only is it compatible with prior knowledge encoded as hard constraints, it can also learn soft constraints from data. In particular, our model is able to learn correlations among neighboring arcs (siblings and grandparents), word valency, and tendencies toward nearly-projective parses. The model parameters are learned in a max-margin framework by employing a linear programming relaxation. We evaluate the performance of our parser on data in several natural languages, achieving improvements over existing state-of-the-art methods.
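A simplified rendering of the arc-factored core of such a formulation (the full model adds a polynomial number of flow constraints enforcing that the selected arcs form a tree, plus extra indicator variables linked to the arc variables for sibling and grandparent interactions):

```latex
\max_{z \in \{0,1\}^{|A|}} \;\; \sum_{a \in A} \theta_a\, z_a
\qquad \text{s.t.} \qquad
\sum_{h \,:\, \langle h, m \rangle \in A} z_{\langle h, m \rangle} = 1
\quad \text{for every modifier } m
```

Here A is the set of candidate dependency arcs, each z variable indicates whether arc a is in the parse, and each score theta_a is learned; the constraint says every word takes exactly one head. This is a hedged sketch of the general shape, not the paper's exact constraint set.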
Date: 08-Jan-2010    Time: 15:00:00    Location: 336


Motif representation and discovery

Alexandra M. Carvalho


Abstract—An important part of gene regulation is mediated by specific proteins, called transcription factors (TF), which influence the transcription of a particular gene by binding to specific sites on DNA sequences, called transcription factor binding sites (TFBS). Such binding sites are relatively short stretches of DNA, normally 5 to 25 nucleotides long. A commonly used representation of TFBSs is the position-specific scoring matrix (PSSM), which assumes independence of nucleotides in the binding site. Recently, some works have argued for non-additivity in protein-DNA interactions, making way for more complex models that account for nucleotide interactions. We propose to model TFBSs, representing nucleotide interactions, with consistent k-graph Bayesian networks (where k is the maximum number of interactions between nucleotides), jointly with a set of features, directly scored from each base sequence, which appear to be relevant for TFBS characterization. The model is flexible enough to incorporate any set of features scored from base sequences. We consider discriminative learning of such models, since it outperforms generative learning in the context of classification with a large set of features.
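For contrast with the proposed interaction-aware models, the baseline PSSM scores a candidate site by summing independent per-position log-odds against a background distribution. The aligned sites, pseudocount and uniform background below are invented for illustration.

```python
import math

BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def make_pssm(sites, pseudo=1.0):
    """Log-odds PSSM from aligned binding sites, with pseudocounts to
    avoid zero probabilities."""
    length = len(sites[0])
    pssm = []
    for i in range(length):
        col = [s[i] for s in sites]
        scores = {}
        for b in "ACGT":
            p = (col.count(b) + pseudo) / (len(sites) + 4 * pseudo)
            scores[b] = math.log2(p / BACKGROUND[b])
        pssm.append(scores)
    return pssm

def score(pssm, seq):
    """Sum of independent per-position log-odds (the PSSM assumption)."""
    return sum(pos[b] for pos, b in zip(pssm, seq))

sites = ["TATAA", "TATAT", "TACAA", "TATAA"]   # invented aligned sites
pssm = make_pssm(sites)
```

The independence assumption is exactly what the additive score encodes; the k-graph Bayesian networks of the talk replace these per-position terms with conditional probabilities over interacting positions.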
Date: 08-Jan-2010    Time: 14:30:00    Location: 04


Fast Kullback-Leibler Optimization Algorithm: Software Library Implementation

Eugéne Suter

Universidade de Évora

Abstract—This talk presents the results of a one-year BII grant devoted to the implementation of natural gradient algorithms for probability distributions, and more specifically to Kullback-Leibler optimisation. A software library was developed in the C language, taking advantage of multicore CPUs and NVIDIA GPUs (with CUDA). Problems tackled include the relationship between memory hierarchy and data organisation, loop optimisation and parallel processing. Numerical issues arising in large problems and some performance tweaks are also considered. Benchmarks comparing several approaches are shown.
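As a minimal stand-in for the optimisation task (in Python rather than C, and using the plain rather than the natural gradient): minimizing KL(p || q) over a categorical distribution q parameterized by softmax logits, where the gradient with respect to each logit is simply q_j - p_j.

```python
import math

def softmax(theta):
    mx = max(theta)                     # shift for numerical stability
    exps = [math.exp(t - mx) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def minimize_kl(p, steps=2000, lr=0.5):
    """Gradient descent on the logits; for the softmax family,
    dKL/dtheta_j = q_j - p_j."""
    theta = [0.0] * len(p)
    for _ in range(steps):
        q = softmax(theta)
        theta = [t - lr * (qj - pj) for t, qj, pj in zip(theta, q, p)]
    return softmax(theta)

p = [0.7, 0.2, 0.1]
q = minimize_kl(p)
```

The library discussed in the talk targets much larger distributions, where the data layout, loop structure and GPU offloading of exactly these softmax/accumulation kernels dominate the running time.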
Date: 11-Dec-2009    Time: 14:00:00    Location: 04


A Physiological model for human patients subject to anaesthesia

Tiago Jorge


Abstract—The seminar describes a model for patients under anaesthesia comprising the depth of anaesthesia, neuromuscular blockade, and the interaction of analgesic drugs with the cardiovascular system.
Date: 10-Dec-2009    Time: 16:00:00    Location: 336


O Arquivo da Web Portuguesa

Daniel Coelho Gomes


Abstract—The Arquivo da Web Portuguesa (AWP, Portuguese Web Archive) is a project of the Fundação para a Computação Científica Nacional whose main objective is the preservation of information published on the Portuguese web. (…) This project of the Fundação para a Computação Científica Nacional (FCCN) aims to create an archival system for Portuguese web content, whose mission will be to periodically collect, store and preserve the published information. The first phase of the Archive's development started in January 2008 and is expected to finish within 2 years. However, maintaining a system of this nature and preserving the archived information is a task that should be perpetuated afterwards. (…) The Portuguese web is understood as all content hosted under the .pt domain. In a first phase, the goal is to archive only content hosted under this national domain, although later all pages written in Portuguese may come to be covered.
Date: 09-Dec-2009    Time: 14:30:00    Location: Sala F4 do Pavilhão Informática I – IST / Alameda


H.264 video encoding tools and the development of efficient hardware architectures

Vagner Rosa

Universidade Federal do Rio Grande do Sul

Abstract—The H.264/AVC (Advanced Video Coding) standard, or MPEG-4 Part 10, is the current state of the art in video coding, providing the highest compression ratios achievable by an internationally standardized video coder (ISO/IEC and ITU-T). These compression capabilities are achieved by a well-refined set of coding tools. However, further improvements were required, and H.264 has already been revised twice: to provide new profiles for fidelity extension and professional applications in 2005, and to add scalability support (H.264/SVC - Scalable Video Coding) in 2007. A third revision under development by the JVT (Joint Video Team) aims to support video sources with multiple views (H.264/MVC - Multi-view Video Coding). This lecture presents an overview of the coding tools used by H.264/AVC and some algorithmic optimizations developed by the UFRGS (Brazil) team toward hardware architecture development for real-time high-definition encoders and decoders. Vagner Rosa is an assistant professor at the Federal University of Rio Grande (FURG, Rio Grande, Brazil) and currently a full-time PhD student at the Federal University of Rio Grande do Sul (UFRGS, Porto Alegre, Brazil). His thesis work is the development of hardware architectures for video encoding according to the H.264 standard. The seminar is organized by the ALGOS group.
Date: 09-Dec-2009    Time: 11:00:00    Location: 336


Management and analysis of heterogeneous biological data : how the web can help

Ana T. Freitas


Abstract—The World Wide Web has revolutionized how researchers from various disciplines collaborate over long distances. Nowhere is this more important than in the Life Sciences, where interdisciplinary approaches are becoming increasingly powerful as a driver of both integration and discovery. In this talk I will focus on new data management solutions for the Life Sciences, outlining the key features desired of a web-based data management system. Examples of Web 2.0 applications, data standards, and semantic web projects in the Life Sciences will be presented.
Date: 04-Dec-2009    Time: 14:00:00    Location: 336


A View On Adaptive and Dependable Distributed Systems

Raimundo Macêdo

Universidade Federal da Bahia (UFBA)

Abstract—Given a distributed system configuration (processes, communication channels, and related properties), a conventional dependable distributed system is, by definition, adaptive with regard to a number of component failures that may occur. However, when the system configuration can change over time, the dependability mechanisms may have to adapt to new configurations with possibly degraded (or upgraded) dependability levels. Further complications arise when, besides the system configuration, application requirements also change at run-time. In such a dynamic environment, distributed systems are required to be adaptive at run-time with respect to both application requirements and system configurations. This talk briefly presents the approaches being investigated at LaSiD (Distributed Systems Laboratory) at UFBA to design and implement such systems, at both the system-model and architectural levels.

Bio: Full Professor in the Department of Computer Science, Institute of Mathematics, Federal University of Bahia (UFBA). Coordinator of the Distributed Systems Laboratory (LaSiD) and of the doctoral programme in Computer Science. Educated in Computer Science: undergraduate degree at UFBA, MSc at UNICAMP (Brazil), and PhD at the University of Newcastle (England).
Date: 26-Nov-2009    Time: 14:00:00    Location: IST Taguspark, 1.38


BICS-Based March Test for Resistive-Open Defects Detection in SRAM

Fabian Vargas

PUCRS - Pontifícia Univ. Católica do Rio Grande do Sul

Abstract—Nowadays, embedded Static Random-Access Memories (SRAMs) can occupy a significant portion of the chip area and contain hundreds of millions of transistors. Due to technology scaling, the SRAM functional fault models traditionally applied in memory testing have become insufficient to correctly reproduce the effects produced by some defects generated during the manufacturing process. In this seminar, we investigate the possibility of using Built-In Current Sensors (BICSs) in combination with an optimized March algorithm to detect static faults associated with resistive-open defects. Experimental results obtained through electrical simulations validate the proposed technique, demonstrating its viability and effectiveness.
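As a rough illustration of how a March test sweeps a memory, the sketch below runs generic March C- style elements over a toy memory model with an injected stuck-at-0 cell. The element sequence and the fault model are illustrative only; the seminar's BICS-assisted algorithm targets resistive-open static faults that plain March tests can miss.

```python
# Toy March memory test (generic March C- style elements, not the
# BICS-assisted algorithm from the seminar). Each element walks the
# address space in a fixed order, reading an expected value and writing
# its complement; any read mismatch flags the cell as faulty.
def march_test(mem):
    n, faults = len(mem), set()

    def check(addr, expected):
        if mem.read(addr) != expected:
            faults.add(addr)

    for a in range(n):            # up: w0
        mem.write(a, 0)
    for a in range(n):            # up: r0, w1
        check(a, 0); mem.write(a, 1)
    for a in range(n):            # up: r1, w0
        check(a, 1); mem.write(a, 0)
    for a in reversed(range(n)):  # down: r0, w1
        check(a, 0); mem.write(a, 1)
    for a in reversed(range(n)):  # down: r1, w0
        check(a, 1); mem.write(a, 0)
    for a in range(n):            # up: r0
        check(a, 0)
    return faults

class FaultyMem:
    """Bit-per-address memory with optional stuck-at-0 cells injected."""
    def __init__(self, n, stuck_at_zero=()):
        self.cells, self.stuck = [0] * n, set(stuck_at_zero)
    def write(self, a, v):
        self.cells[a] = 0 if a in self.stuck else v
    def read(self, a):
        return self.cells[a]
    def __len__(self):
        return len(self.cells)

print(march_test(FaultyMem(8, stuck_at_zero={3})))  # {3}
```

A fault-free memory passes every element; the stuck cell is caught the first time a read expects the value it cannot hold.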
Date: 23-Nov-2009    Time: 15:30:00    Location: 336


Optimization and Control for Metabolic Networks

Alexandre Domingues


Abstract—The increasing availability of metabolic network models and data poses new challenges for optimization. Due to the high level of complexity and uncertainty associated with these networks, the available models often lack the detail and reliability required to determine proper optimization strategies. A possible approach to overcome this limitation is the combination of kinetic and stoichiometric models. In the first part of this talk, three control optimization methods (direct optimization, and bi-level optimization using two different inner-optimization procedures), with different levels of complexity and assuming various degrees of process information, are presented and their results compared on a prototype network. The results show that bi-level optimization provides a good approximation for networks with incomplete kinetic information. The process of formulating metabolic network models and estimating their parameters is complex, and there is no established framework for obtaining valid solutions. In the second part of the talk, a procedure to estimate parameters using data sets from different experiments is presented. The procedure is illustrated by a case study on the effect of nisin on mannitol production by Lactococcus lactis. The results obtained are encouraging, providing a consistent estimate of the model parameters.
Date: 20-Nov-2009    Time: 14:00:00    Location: 336


A Residue Approach to the Finite Field Arithmetics

Jean-Claude Bajard

Univ. Paris VI (Pierre et Marie Currie)

Abstract—Finite field arithmetic is one of the challenges in current computer arithmetic. It occurs, in particular, in cryptography, where the needs grow with the evolution of technologies and of attacks. In our research, we have proposed different systems based on residue representations, covering several kinds of finite fields. For each of them, specific properties of the representation are exploited to ensure efficiency, both in terms of performance and of robustness against side-channel attacks. In this talk, we deal with three similar approaches: the first is dedicated to prime fields using residue number systems; the second concerns extension fields of characteristic two; the last addresses medium-characteristic finite fields. The main interest of these systems is their inherent modularity, well suited to circuit implementations.
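The core idea of a residue number system (the representation behind the prime-field approach mentioned above) can be sketched in a few lines: pick pairwise-coprime moduli, operate independently on each residue channel, and recover the integer with the Chinese Remainder Theorem. The moduli and values below are illustrative only.

```python
# Residue-number-system (RNS) sketch: a number is held as its residues
# modulo pairwise-coprime moduli, so addition and multiplication act
# independently (and, in hardware, in parallel) on each channel.
from math import prod

MODULI = (3, 5, 7)  # pairwise coprime; dynamic range = 3 * 5 * 7 = 105

def to_rns(x):
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def rns_mul(a, b):
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))

def from_rns(r):
    # Chinese Remainder Theorem reconstruction
    M = prod(MODULI)
    x = 0
    for ri, mi in zip(r, MODULI):
        Mi = M // mi
        x += ri * Mi * pow(Mi, -1, mi)  # modular inverse of Mi mod mi
    return x % M

x = rns_mul(to_rns(17), to_rns(5))
print(from_rns(x))  # 85, i.e. 17 * 5 recovered from the residues
```

The carry-free, per-channel operations are what make RNS attractive for circuit implementations and for masking side-channel leakage.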
Date: 17-Nov-2009    Time: 14:00:00    Location: 04


Security: Enabling the Reliability of IP Telephony


Abstract—By Dr. François Cosquer, CTO, Security and Technology Strategy, Alcatel-Lucent Enterprise Business Group. Voice communication is a critical business application. In the past, telephony benefited from a dedicated, highly reliable infrastructure. When migrating voice communications to IP, the prime concern is how reliable the service will be, both in terms of Quality of Service (QoS), for example SLA metrics, and in terms of Quality of Experience (QoE) as perceived by users. Issues such as performance, availability, and confidentiality that are often taken for granted in the conventional TDM PBX world need to be reassessed in the converged IP environment. The reliability of IP telephony (not only dropped calls, delay, and audio quality, but also service outages, downtime, and availability) is a function of the network design architecture, the expected traffic load, and the many components of the IP infrastructure involved in delivering the service. Security is one of these components, but it is often perceived as simply a suite of threat mitigation features or systems against malicious attacks and misconfigurations. The talk will discuss how security must play a broader role in enabling the reliability of IP telephony.

Bio: Dr. François Cosquer is CTO Security and Technology Strategist for the Alcatel-Lucent Enterprise Business Group. Over the past 18 years, he has held senior positions with research institutions, equipment vendors, and telecommunications operators. He draws on extensive experience in security architecture, networking, operating systems, middleware, and multimedia applications. He has been a speaker, panelist, and chair at key industry events and conferences. François graduated in Electronics and Computing and holds an MSc in Computer Science and a PhD in Computer Engineering. He currently serves as Adjunct Professor at the Faculty of Engineering and Computer Science, Concordia University, Montreal.
Date: 11-Nov-2009    Time: 10:00:00    Location: 336


High-Voltage-Enabled Analog/RF Circuit Techniques for Nanoscale CMOS

Pui-In (Elvis) Mak

University of Macao - Macao, China

Abstract—Technology downscaling has led to a continuous reduction of the supply voltage (VDD) to maintain device reliability. High-voltage-(HV)-enabled analog/RF circuits have emerged as a feasible alternative for coping with nanometer technologies at low cost. An elevated VDD directly opens up much flexibility in defining circuit topologies while preserving sufficient voltage headroom for signal swing. Evidently, design-for-reliability is essential to avoid overstressing each device. This lecture starts by introducing the concept of HV-enabled analog/RF circuits. State-of-the-art works are discussed to justify their advantageous features over a wide range of aspects. Our recently proposed 2xVDD RF techniques for enhancing circuit performance without compromising reliability are then presented. An HV-enabled mobile-TV tuner RF front-end has been fabricated in 90-nm CMOS as a proof-of-concept prototype. It includes a cascode-cascade inverter-based low-noise amplifier, a linearized programmable C-2C attenuator with a reliable overdrive control, a gain roll-off compensation path, and dual cascode I/Q mixer drivers. Stress-conscious circuit topologies and gate-drain-source engineering techniques enable reliable 2-V operation with standard 1-V thin-oxide transistors.
Date: 04-Nov-2009    Time: 17:00:00    Location: VA2 (Pav. Civil - IST Alameda)


Power and Delay Comparison of Binary and Quaternary Arithmetic Circuits

Cristiano Lazzari


Abstract—Interconnections play a crucial role in deep-submicron designs, since they dominate delay, power, and area. This is especially critical for modern million-gate FPGAs, where as much as 90% of the chip area is devoted to interconnections. Multiple-valued logic allows a reduction in the required number of signals in the circuit, and can therefore serve as a means to effectively curtail the impact of interconnections. We present a comparison of binary and quaternary implementations of arithmetic modules based on lookup-table structures using voltage-mode circuits. Our assessment demonstrates that significant power reduction is possible through the use of quaternary structures, with very low delay penalties.
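To see why multiple-valued logic reduces signal count, consider radix-4 digits: each quaternary signal carries two bits, so an n-bit operand needs only n/2 wires. A minimal software model of quaternary addition follows (values are illustrative; this is not the voltage-mode circuitry from the talk).

```python
# Radix-4 ("quaternary") digit arithmetic: one quaternary digit encodes
# two binary bits, halving the number of signals between modules.
def to_quaternary(x, digits):
    # Split x into radix-4 digits, least-significant digit first.
    return [(x >> (2 * i)) & 0b11 for i in range(digits)]

def quaternary_add(a, b):
    # Ripple-carry addition one radix-4 digit at a time.
    out, carry = [], 0
    for da, db in zip(a, b):
        s = da + db + carry
        out.append(s & 0b11)   # keep the low radix-4 digit
        carry = s >> 2         # carry into the next digit
    return out, carry

a, b = to_quaternary(27, 4), to_quaternary(9, 4)  # 8-bit values, 4 digits
s, c = quaternary_add(a, b)
val = sum(d << (2 * i) for i, d in enumerate(s)) + (c << 8)
print(val)  # 36, i.e. 27 + 9
```

The same sum that would ripple through 8 binary full adders here takes 4 quaternary digit additions, which is the signal-count saving the abstract refers to.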
Date: 03-Nov-2009    Time: 11:30:00    Location: 336


Hacking life: how to build a new life form in your computer

Arlindo L. Oliveira


Abstract—Synthetic biology is a new field of research that combines computer models of biological systems with DNA synthesis and genetic engineering techniques in order to design and build new biological functions, systems and organisms. While still in its infancy, this area of research is expected to develop rapidly, so that very soon researchers, companies and hackers will be able to design, build and release in the wild new organisms. In this talk, I will address some questions and challenges posed by this technology, and, in particular, the role that will be played by research areas such as Systems Biology, Bioinformatics and Information Systems in the design of artificial life forms.
Date: 23-Oct-2009    Time: 14:00:00    Location: 336


Solving Implicit Problems and Using Cyclic Graphs for Graphics

Brian Wyvill

University of Victoria

Abstract—The talk will be divided into two parts: the first discusses implicit blending, the second cyclic scene graphs. Blending is both the strength and the weakness of implicit surfaces. While it gives them the unique ability to smoothly merge into a single, arbitrary shape, it makes implicit modelling hard to control, since implicit surfaces blend at a distance, in a way that depends heavily on the slope of the field functions that define them. We have found that, to be more intuitive and easier to control, blends should be located where two objects intersect, while enabling other parts of the objects to come as close to each other as desired without being deformed. Our solution relies on automatically defined blending regions around the intersection curves between two objects. Outside these volumes, a clean union of the objects is computed thanks to a new operator that guarantees the smoothness of the resulting field function; meanwhile, a smooth blend is generated inside the blending regions. This part describes joint work with French researchers Marie-Paule Cani, Loic Barthe, and Adrien Bernhardt.

The second half of the talk describes work on scene graphs. Conventional scene graphs use directed acyclic graphs. We investigate scene graphs with recursive cycles for defining graphical scenes. This permits both conventional scene graphs and iterated function systems within the same framework, and opens the way for other definitions not possible with either. We explore several mechanisms for limiting the implied recursion in cyclic graphs, including both global and local limits. The approach permits a range of possibilities, including scenes with carefully controlled and locally varying recursive depth. It has applications in art and design, and opens up interesting avenues for future research. This part describes work done with Prof. Neil Dodgson, University of Cambridge.

Bio: Brian Wyvill graduated from the University of Bradford, UK, with a PhD in computer graphics in 1975. As a post-doc he worked at the Royal College of Art and helped make some animated sequences for the Alien movie. He emigrated to Canada in 1981, where he has been working in the area of implicit modeling, sometimes with his brother Geoff Wyvill (University of Otago). He is also interested in sketch-based modeling and NPR, and enjoys combining these areas of research. In 2007 Brian took up an appointment as Professor and Canada Research Chair at the University of Victoria, British Columbia.
Date: 15-Oct-2009    Time: 12:00:00    Location: 336


SMART-System - Metadata-based Sports Video Database, its Development and Experience

Chikara Miyaji

Japan Institute of Sports Sciences

Abstract—The SMART-system is a video database developed by the Japan Institute of Sports Sciences (JISS). More than 12 NSFs are using the system, and 39,601 video files and 380,129 metadata records had been archived as of August 2009. The SMART-system is a metadata-based movie database specialized for sports performance training and coaching. Its main characteristics are: (1) a streaming technology enhanced for sports movement analysis; (2) a metadata-based searching and requesting system specialized for sports; (3) a coaching annotation system; and (4) an authentication system for distributed streaming servers. In this talk, the technical aspects and internal structure of the system will be explained, together with the experience of providing this software system, in particular the sports-specific difficulties encountered and their solutions.
Date: 13-Oct-2009    Time: 15:00:00    Location: 336


Preparing a cyanobacterial chassis for H2 production: a synthetic biology approach

Catarina Pacheco

Institute for Molecular and Cell Biology (IBMC)

Abstract—Molecular hydrogen (H2) is an environmentally clean energy carrier that can be a valuable alternative to today's limited fossil fuel resources. The BioModularH2 project aims at designing reusable, standardized molecular building blocks that, integrated into a “chassis”, will result in a photosynthetic bacterium containing engineered chemical pathways for competitive, clean, and sustainable hydrogen production. The unicellular cyanobacterium Synechocystis sp. PCC 6803 (Synechocystis) is being used as the photoautotrophic “chassis” for this project. To prepare the chassis for optimal H2 production, the native Synechocystis bidirectional hydrogenase was inactivated. Later on, a synthetic circuit containing a highly efficient heterologous hydrogenase will be introduced into the “chassis”. Because hydrogenases are sensitive to molecular oxygen, and to provide the anaerobic environment required for optimal heterologous hydrogenase activity, synthetic oxygen-consuming devices are being prepared, based on native and heterologous enzymes that use O2 as substrate, and will subsequently be tested. Finally, the integration of the designed synthetic circuits into the “chassis” will provide an anaerobic environment within the cell for an optimized and highly active hydrogenase.
Date: 09-Oct-2009    Time: 14:00:00    Location: 336


Power Macro-Modelling using an Iterative LS-SVM Method

J. Monteiro


Abstract—In this talk I will describe a new method for power macromodeling of functional units for high-level power estimation, based on Least-Squares Support Vector Machines (LS-SVM). The method improves the already good modeling capabilities of the basic LS-SVM approach in two ways. First, a modified norm is used that takes into account, in the computation of the kernels, the weight of each input on global power consumption. Second, an iterative method is proposed in which new data points are selectively added as support vectors to increase the generalization of the model. The macromodels obtained provide not only excellent accuracy on average (close to 1% error) but, more importantly, thanks to the proposed modified kernels, a maximum error reduced to values close to 10%.
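For reference, basic LS-SVM regression (the starting point the talk builds on, before the modified norm and iterative support-vector selection) reduces training to a single linear system rather than a quadratic program. A self-contained sketch with an RBF kernel and illustrative data:

```python
# Basic LS-SVM regression sketch (illustrative; the talk's method adds
# input-weighted kernels and iterative support-vector selection on top).
# Training solves one linear system:
#   [ 0   1^T         ] [b]   [0]
#   [ 1   K + I/gamma ] [a] = [y]
import math

def rbf(u, v, s=0.3):
    return math.exp(-(u - v) ** 2 / (2 * s * s))

def solve(A, rhs):
    # Naive Gauss-Jordan elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def lssvm_fit(xs, ys, gamma=100.0):
    n = len(xs)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = A[i + 1][0] = 1.0          # bias constraint row/col
        for j in range(n):
            A[i + 1][j + 1] = rbf(xs[i], xs[j]) + (1.0 / gamma if i == j else 0.0)
    sol = solve(A, [0.0] + list(ys))
    b, alpha = sol[0], sol[1:]
    return lambda x: b + sum(a * rbf(xi, x) for a, xi in zip(alpha, xs))

xs = [i / 10 for i in range(11)]
f = lssvm_fit(xs, [x * x for x in xs])   # learn y = x^2 on [0, 1]
print(f(0.5))  # close to 0.25
```

Every training point contributes a support value alpha_i, which is exactly why the talk's selective addition of support vectors matters for model size.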
Date: 06-Oct-2009    Time: 11:30:00    Location: 336


Observability-based Coverage-directed Path Search using Pseudo-Boolean Optimization

J. Monteiro


Abstract—In this talk, I will address the problem of finding a minimal set of execution paths that achieves a user-specified level of observability coverage. Under this metric, a program statement is only considered covered if its execution influences some output. Pseudo-Boolean Optimization (PBO) is used to model the problem of finding the paths that are most likely to increase code coverage. Generated paths are then validated to check for feasibility. This methodology was implemented in a fully functional tool capable of handling real programs written in the C language.
Date: 29-Sep-2009    Time: 11:30:00    Location: 04


Neurodynamic Optimization with Its Application for Model Predictive Control

Jun Wang

Chinese University of Hong-Kong

Abstract—Optimization problems arise in a wide variety of scientific and engineering applications, and become computationally challenging when optimization has to be performed in real time to optimize the performance of dynamical systems. For such applications, classical optimization techniques may not be competent due to the problem dimensionality and the stringent requirements on computational time. One very promising approach to dynamic optimization is to apply artificial neural networks. Because of the inherently parallel and distributed information processing in neural networks, the convergence rate of the solution process does not decrease as the size of the problem increases. Neural networks can be implemented physically in dedicated hardware such as ASICs, where optimization is carried out in a truly parallel and distributed manner. This feature is particularly desirable for dynamic optimization in decentralized decision-making situations, which arise frequently in control and robotics. In this talk, I will present a historical review and the state of the art of neurodynamic optimization models, together with selected applications in robotics and control. Starting from the motivation for neurodynamic optimization, we will review various recurrent neural network models for optimization. Theoretical results about the stability and optimality of the neurodynamic optimization models will be given, along with illustrative examples and simulation results. It will be shown that many problems in control systems, such as model predictive control, can be readily solved using the neurodynamic optimization models. Specifically, linear and nonlinear model predictive control based on neurodynamic optimization will be delineated.
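The essence of neurodynamic optimization can be conveyed with a toy example: let a continuous-time system whose state derivative is the negative gradient of the cost settle at the minimizer. The quadratic cost and the Euler integration below are illustrative only, not one of the recurrent network models from the talk.

```python
# Gradient-flow "network" minimising f(x) = 0.5 x'Qx + c'x: the state
# evolves by dx/dt = -(Qx + c) and its equilibrium is the optimum.
Q = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -4.0]          # optimum at x = (1, 2), where Qx + c = 0

x = [0.0, 0.0]
dt = 0.01                 # Euler step for the continuous-time dynamics
for _ in range(2000):
    grad = [sum(Q[i][j] * x[j] for j in range(2)) + c[i] for i in range(2)]
    x = [x[i] - dt * grad[i] for i in range(2)]

print(x)  # converges to approximately [1.0, 2.0]
```

In hardware, each state variable would be one analog integrator, which is why convergence time is essentially independent of problem size.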
Date: 29-Sep-2009    Time: 11:00:00    Location: 336


Apt-pbo: Solving the Software Dependency Problem using Pseudo-Boolean Optimization

Paulo Trezentos


Abstract—The installation of software packages (on Linux as well as on other package-driven platforms, such as Eclipse plugins) depends on the correct resolution of dependencies and conflicts between packages. As an NP-complete problem, this is a hard task that today's technology does not address in an acceptable way. This seminar introduces a new approach to solving the software dependency problem in a Linux environment, devising a way to solve dependencies according to available packages and user preferences. We present the “apt-pbo” tool, the first available tool that solves dependencies in a complete and optimal way. The contribution is threefold. Our main finding is an efficient encoding of the dependencies and conflicts as a pseudo-Boolean optimization problem, without the need for extra ILP or SAT steps. Second, we achieve this goal without sacrificing performance, a critical issue for a tool with user interaction. Finally, the developed tool is available under a free license, allowing enhancement and benchmarking.
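The flavor of such an encoding can be conveyed with a toy instance (an illustration of the general idea, not apt-pbo's actual encoding): packages become 0/1 variables, dependencies and conflicts become linear constraints over them, and an objective, here "fewest packages installed", picks among the consistent solutions. A brute-force search stands in for the PBO solver.

```python
# Toy pseudo-Boolean view of dependency resolution. Hypothetical
# packages: "app" requires (libA1 OR libA2) and libB; libA1 conflicts
# with libA2. Constraints are the linear inequalities in the comments.
from itertools import product

PKGS = ["app", "libA1", "libA2", "libB"]

def feasible(s):
    if s["app"] and not (s["libA1"] or s["libA2"]):
        return False          # app -> libA1 + libA2 >= 1
    if s["app"] and not s["libB"]:
        return False          # app -> libB >= 1
    if s["libA1"] and s["libA2"]:
        return False          # conflict: libA1 + libA2 <= 1
    return True

best = None
for bits in product([0, 1], repeat=len(PKGS)):
    s = dict(zip(PKGS, bits))
    if s["app"] and feasible(s):      # the user asked to install "app"
        cost = sum(bits)              # objective: minimise installed packages
        if best is None or cost < best[0]:
            best = (cost, s)

print(best)  # a minimal consistent installation containing "app"
```

A real PBO solver explores this space with learning and bounding instead of enumeration, which is what makes the approach practical at distribution scale.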
Date: 25-Sep-2009    Time: 13:00:00    Location: 336


Next-generation sequencing (for dummies)

Paulo G. S. da Fonseca


Abstract—We present the basics of the new high-throughput sequencing technologies and discuss some of their applications and associated research problems from a bioinformatics perspective.
Date: 10-Sep-2009    Time: 14:00:00    Location: 336


Dynamic Programming Optimization of Multi-rate Multicast Video -Streaming Services

Nestor Michael C. Tiglao


Abstract—In large-scale IP Television (IPTV) and Mobile TV distribution, the video signal is typically encoded and transmitted in several quality streams, over IP multicast channels, to several groups of receivers, which are classified in terms of their reception rate. As the number of video streams is usually constrained by both the number of TV channels and the maximum capacity of the content distribution network, it is necessary to find the selection of video stream transmission rates that maximizes overall user satisfaction. In this presentation, we will present and discuss the Dynamic Programming Multi-rate Optimization (DPMO) algorithm, which efficiently solves this optimization problem.
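The underlying selection problem can be stated compactly: choose K stream rates so that each receiver joins the fastest stream it can sustain and total satisfaction is maximized. The sketch below solves a toy instance exhaustively; the talk's DPMO algorithm solves the same problem efficiently with dynamic programming, and the rates and receivers here are illustrative.

```python
# Toy multi-rate stream selection: pick K encoding rates from a
# candidate set; each receiver's satisfaction is the fastest chosen
# rate it can sustain (0 if none). Brute force stands in for DPMO.
from itertools import combinations

receivers = [300, 300, 800, 1200, 1500]   # sustainable rates, kbit/s
candidates = [300, 600, 800, 1200, 1500]  # possible encoding rates
K = 2                                     # channel budget: 2 streams

def utility(rates):
    total = 0
    for cap in receivers:
        usable = [r for r in rates if r <= cap]
        total += max(usable, default=0)   # join fastest sustainable stream
    return total

best = max(combinations(candidates, K), key=utility)
print(best, utility(best))  # (300, 1200) with total utility 3300
```

Note the trade-off the objective captures: a single high rate strands slow receivers, while a single low rate wastes fast ones; the optimum mixes both.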
Date: 28-Jul-2009    Time: 11:00:00    Location: 336


In Search of Shapes

Karthik Ramani

Purdue University, USA

Abstract—Technological advances in science and engineering in recent years have resulted in a great explosion of data of all kinds. Recent scientific research and technological investments have significantly improved the information retrieval process for textual content. However, the vast quantity of available 3D content keeps growing, and no mechanisms exist for users to navigate it successfully. The increasing quantity of 3D data has led to the need for further research in order to understand, analyze, and perform operations on 3D content, which can further advance discovery and innovation in many fields, including biology, engineering design, medicine, and computer vision (laser scanning). Traditional approaches such as keywords, annotation, and navigation are by themselves insufficient to describe or search 3D shapes intuitively. Reusing and sharing the knowledge embedded in 3D shapes is an important way to accelerate the design process, improve product quality, and reduce costs. Two representations for search are presented. The first approach uses a 2.5D spherical function and then employs a fast spherical harmonics transformation to obtain a rotation-invariant descriptor. The second method represents the shape of a 2D drawing, from a statistical perspective, as a distance distribution between pairs of randomly sampled points. Both representations have many valuable advantages: they are invariant to affine transformation, insensitive to noise or cracks, simple, and fast. A new design pattern is introduced in which a highly interactive sketch-based user interface and retrieval are combined seamlessly with user suggestions. In addition, the interface is integrated with a constraint solver for freehand sketches, part-class suggestion, and navigation of large repositories. An application (FEAsy) for structural analysis in early design through sketching is demonstrated. I conclude with recent research in proteomics using least median of squares and inner distance for flexible protein structure/shape searching.

About the Author: Karthik Ramani is a Professor in the School of Mechanical Engineering at Purdue University. He earned his B.Tech from the Indian Institute of Technology, Madras, in 1985, an MS from The Ohio State University in 1987, and a Ph.D. from Stanford University in 1991, all in Mechanical Engineering. He has worked as a summer intern at Delco Products, Advanced Composites, and as a summer faculty intern at Dow Plastics, Advanced Materials. He was awarded the Dupont Young Faculty Award, the National Science Foundation (NSF) Research Initiation Award, the NSF CAREER Award, the Ralph Teetor Educational Award from the Society of Automotive Engineers, the Outstanding Young Manufacturing Engineer Award from the Society of Manufacturing Engineers, and the Ruth and Joel Spira Award for outstanding contributions to the Mechanical Engineering curriculum. In 2002, he was recognized by Purdue University through a University Faculty Scholars Award and won the NSF Partnership for Innovation award. In 2005 he won the Discovery in Mechanical Engineering Award for his work in shape search. In 2006 he was a finalist for the Innovation of the Year award from the State of Indiana. He developed many successful new courses (Computer-Aided Design and Prototyping, Product and Process Design) and co-developed an intellectual property course. In 2007 he won the only Research Excellence Award in the College of Engineering at Purdue University. He also serves as technology advisor at Imaginestics, which has launched the world's first online shape-based search engine for the engineering industry. His interests are in the design of shapes and the representation and search of shapes, especially computer support for early and conceptual design. He serves on the editorial boards of the Elsevier journal Computer-Aided Design and the ASME Journal of Mechanical Design. His current work is supported by the NSF (CISE), the National Institutes of Health, the NSF Partnership for Innovation, and General Electric.
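The pairwise-distance representation mentioned in the abstract can be sketched quickly: sample random point pairs from a shape, histogram their distances, and compare shapes by comparing the normalized histograms, which are invariant to translation and rotation. The shapes and parameters below are illustrative, not the talk's implementation.

```python
# Distance-distribution ("D2"-style) shape signature sketch.
import math
import random

def d2_signature(points, pairs=2000, bins=10, seed=0):
    rnd = random.Random(seed)            # fixed seed: deterministic signature
    dists = []
    for _ in range(pairs):
        (x1, y1), (x2, y2) = rnd.sample(points, 2)
        dists.append(math.hypot(x1 - x2, y1 - y2))
    dmax = max(dists)
    hist = [0] * bins
    for d in dists:
        hist[min(int(bins * d / dmax), bins - 1)] += 1
    return [h / pairs for h in hist]     # normalized distance distribution

# Two toy "shapes": a filled unit square and a horizontal segment.
square = [(x / 10, y / 10) for x in range(11) for y in range(11)]
segment = [(x / 100, 0.0) for x in range(101)]

sig_sq, sig_seg = d2_signature(square), d2_signature(segment)
print(sig_sq != sig_seg)  # the signatures separate the two shapes
```

Because the signature depends only on inter-point distances, rotating or translating the sampled shape leaves it unchanged, which is the property the abstract highlights.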
Date: 27-Jul-2009    Time: 11:30:00    Location: FA1 (DEI)


Single nucleotide polymorphisms characterization in a Portuguese Caucasian breast cancer and control population

Bruno Costa Gomes

Departamento de Genética / FCM / UNL

Abstract—Cancer is a complex somatic genetic disease caused mainly by environmental factors. However, a few inherited mutations in some critical genes can be associated with cancer development. Breast cancer accounts for one in four of all female cancers, making it the leading cause of cancer deaths in women in the western world. Numerous epidemiological factors affect the likelihood of developing breast cancer, but no other predictor is as powerful as an inherited mutation in the tumour-suppressor genes BRCA1 or BRCA2; TP53 is deemed a plausible candidate as well. Hereditary breast cancer accounts for only 5–10% of all breast cancer cases, and individuals carrying mutations in one of these genes have a 40–80% chance of developing breast cancer, making these mutations the strongest breast cancer predictors known. The other 90–95% of breast cancer cases are sporadic and occur in women in the absence of mutations in the aforementioned susceptibility genes. The identification of a plausible cause for these sporadic cases is therefore a challenging task. Recent evidence shows that there are probably background genetic factors that contribute to the development of sporadic breast cancer, such as single nucleotide polymorphisms (SNPs). The emergence of comprehensive high-density maps of SNPs and of affordable genotyping platforms has made association studies possible. Due to linkage disequilibrium, a panel of a few hundred thousand reporter SNPs (tSNPs) can be used as tags for the majority of the millions of common variants in the genome. Statistical approaches have been extensively used to infer haplotypes from diploid population data. An alternative, less explored, approach is the pure-parsimony approach: it finds a solution to the haplotype inference problem that minimizes the total number of distinct haplotypes used, exploiting the well-known fact that haplotypes are, in general, much less numerous than genotypes. In order to obtain real data for developing satisfiability models and algorithms for haplotype inference by pure parsimony, a set of breast cancer patients and control populations was genotyped. To this end, approximately 100 breast cancer patients were recruited in the oncology units of several Lisbon hospitals. Each cancer patient was matched, when possible, with two healthy control individuals of the same age, tobacco smoking status, and alcohol consumption habits. A second control population (about 50 individuals), characterized by the absence of breast cancer, was also used to help ascertain the possible role of the gene polymorphisms under study. This population was identified in the Indian reserve of Sangradouro (Mato Grosso, Brazil), where the predominant ethnic group is Xavante. For the cancer and control populations, 7 SNPs in BRCA1, 19 SNPs in BRCA2, and 6 SNPs in TP53 were genotyped using real-time PCR, in particular TaqMan® SNP Genotyping Assays from Applied Biosystems. Since the majority of the genotyped SNPs tag other ones, the real number of SNPs analyzed is much larger than the 32 directly analyzed in the three genes. These experiments are expected to give sufficient data to clarify the effect of SNP variation on breast cancer susceptibility, and to explain specific characteristics of the populations under study that are of great interest to science.
Date: 24-Jul-2009    Time: 14:00:00    Location: 336


ARMS - Automatic Residue-Minimization based Sampling for Multi-Point Modeling Techniques

Jorge F. Villena


Abstract—This talk will describe an automatic methodology for optimizing sample-point selection within the framework of model order reduction (MOR). The procedure, based on maximizing the dimension of the subspace spanned by the samples, iteratively selects new samples in an efficient and automatic fashion, without computing the new vectors and with no prior assumptions about the system behavior. The scheme is general, valid in one and in multiple dimensions, and applicable to multi-point nominal MOR approaches as well as to multi-dimensional-sampling-based parametric MOR methodologies. The talk will also introduce an integrated algorithm for multi-point MOR, with automatic sample and order selection based on transfer-function error estimation. Results on a variety of industrial examples demonstrate the accuracy and robustness of the technique.

Jorge Fernandez Villena received a degree in Telecommunication Engineering from the E.S.T.I.I.T. at Universidad de Cantabria, Spain, in 2005. He is currently working towards a Ph.D. degree in Electrical and Computer Engineering at the Instituto Superior Técnico, Technical University of Lisbon, Portugal, and is a researcher in the ALGOS Group at INESC-ID. His research interests include integrated-circuit interconnect modeling and simulation, with emphasis on numerical algorithms for parametric model order reduction.
Date: 21-Jul-2009    Time: 11:30:00    Location: 336


Toward Energy-efficient Computing

David Brown

Sun Microsystems Inc.

Abstract—As a result of both the increased average power consumed by a single system and the rapid growth in the total number of computer systems deployed, energy consumption by computers and related technologies is growing at an exponential rate analogous to Moore’s Law. Energy use has become a consequential factor in the design of contemporary computer systems. This talk frames the energy problem in general, looking at its current implications in the computing space. I’ll introduce several basic technologies that may help us manage power use on modern computing platforms, then describe some recent experience in their application as seen from my vantage point at Sun. The conclusion is that while some of these mechanisms are enabling, they seem far from sufficient to realise optimal energy use in computing. How should the energy problem be framed more specifically for computer system designers? I will give a simple vision for energy-efficient computing, describe a number of the elements that appear necessary if we are to solve the problem along those lines, and suggest some likely avenues of research.
About the Author: David Brown is presently working on the Solaris operating system’s core power management facilities, with particular attention to Sun’s x64 hardware platforms. Earlier at Sun he led the Solaris ABI program: a campaign to develop and deliver a practical approach to binary compatibility for applications built on Solaris.
Before coming to Sun, Dave was a member of the research staff at Stanford University, where he worked with Andy Bechtolsheim on the prototype SUN Workstation; he was later a founder of Silicon Graphics, where he developed early system and network software and designed a floating point accelerator; and he subsequently established the Workstation Systems Engineering Group for DEC in Palo Alto along with Steve Bourne, where he built the team that developed the graphics architecture applied in DEC’s MIPS workstations and the PixelStamp and PixelVision subsystems. Dave’s technical background is computer systems (operating systems and networking) and architecture, with some specific attention to the design of high-performance interactive graphics systems. Dave received a Ph.D. in Computer Science from Cambridge University for a dissertation which introduced the “Unified Memory Architecture” approach to integrating high-performance graphics subsystems in a general-purpose computing architecture. This idea is now widely applied, notably in current Intel processor and memory system architectures.
Date: 20-Jul-2009    Time: 14:30:00    Location: 336


Taking the Turn — Or Not: Turn Management in Spoken Dialogue Systems

Julia Hirschberg

Columbia University

Abstract—Listeners have many options in dialogue: they may interrupt the current speaker, take the turn after the speaker has finished, remain silent and wait for the speaker to continue, or backchannel to indicate that they are still listening, without taking the turn. I will discuss three of these options which are particularly difficult, yet particularly important, for systems to distinguish in Spoken Dialogue Systems: taking the turn vs. backchanneling vs. remaining silent and letting the speaker continue. How can the system determine which option the user is choosing? How can the system decide which option it should choose and how best to signal this to the user? I will describe results of an empirical study of these phenomena in the context of a larger study of human-human turn-taking behavior in the Columbia Games Corpus. This is joint work with Agustín Gravano (University of Buenos Aires).
Date: 17-Jul-2009    Time: 15:00:00    Location: V1.17, Civil Engineering building, IST-Alameda


CSI: are Mendel's data too good to be true?

Ana Pires

Instituto Superior Técnico

Abstract—Gregor Mendel (1822-1884) is almost unanimously recognized as the founder of modern genetics. However, long ago, a shadow of doubt was cast on his integrity by another eminent scientist, the statistician and geneticist Sir Ronald Fisher (1890-1962), who questioned the honesty of the data that form the core of Mendel's work. This issue, nowadays called the Mendel-Fisher controversy, can be traced back to 1911, when Fisher first presented his doubts about Mendel's results, though he only published a paper with his analysis of Mendel's data in 1936. A large number of papers have been published about this controversy, culminating with the publication in 2008 of a book (Franklin et al., Ending the Mendel-Fisher controversy) aiming at ending the issue and definitively rehabilitating Mendel's image. However, quoting from Franklin et al., the issue of the "too good to be true" aspect of Mendel's data found by Fisher still stands. We have submitted Mendel's data and Fisher's statistical analysis to extensive computations and simulations, attempting to discover a hidden explanation or hint that could help answer the questions: is Fisher right or wrong, and if Fisher is right, is there any reasonable explanation for the "too good to be true" data, other than deliberate fraud? In this talk some results of this investigation and the conclusions obtained will be presented.
Date: 17-Jul-2009    Time: 14:00:00    Location: 336


Data Parallel Acceleration of Decision Support Queries Using Cell/BE and GPUs

Pedro Trancoso

Department of Computer Science, University of Cyprus

Abstract—Decision Support System (DSS) workloads are known to be among the most time-consuming database workloads, processing large data sets. Traditionally, DSS queries have been accelerated using large-scale multiprocessors. This work analyzes the benefits of using high-performance/low-cost processors such as GPUs and the Cell/BE to accelerate DSS query execution. In order to achieve this goal we propose data-parallel versions of the original database scan and join algorithms. For this work we use the RapidMind platform to code our algorithms, as it offers the possibility of executing the same program on both the Cell/BE and GPUs. In our experimental results we compare the execution of three queries from the standard DSS benchmark TPC-H on two systems with two different GPU models of the same generation (8500 and 8800), a system with the Cell/BE processor, and a system with dual quad-core Xeon processors. The results show that parallelism can be well exploited by the GPUs. The speedup values observed were up to 21x compared to a single processor system. In addition, for most cases, GPU performance surpassed the general-purpose multi-core.
Date: 30-Jun-2009    Time: 14:00:00    Location: 336


Transaction Activation Scheduling Support for Transactional Memory

Gilles Muller


Abstract—Transactional Memory (TM) is considered one of the most promising paradigms for developing concurrent applications. TM has been shown to scale well on multiple cores when the data access pattern behaves “well,” i.e., when few conflicts are induced. In contrast, data patterns with frequent write sharing, with long transactions, or when many threads contend for a smaller number of cores, produce numerous aborts. These problems are traditionally addressed by application-level contention managers, but they suffer from a lack of precision and provide unpredictable benefits on many workloads. In this talk, we propose a system approach in which the scheduler tries to avoid aborts by preventing conflicting transactions from running simultaneously. We use a combination of several techniques to reduce the odds of conflicts: (1) avoiding preempting threads running a transaction until the transaction completes, (2) keeping track of conflicts and delaying the restart of a transaction until conflicting transactions have committed, and (3) keeping track of conflicts and only allowing a thread with conflicts to run at low priority. Our approach has been implemented in Linux for Software Transactional Memory (STM) using a shared memory segment to allow fast communication between the STM library and the scheduler. It only requires small and contained modifications to the operating system. Experimental evaluation demonstrates that our approach significantly reduces the number of aborts while improving transaction throughput on various workloads.
Date: 29-Jun-2009    Time: 14:00:00    Location: 336


Language Technologies and CALL

Maxine Eskenazi

Carnegie Mellon University

Abstract—Language Technologies are still in their early stages of use as far as Computer-Assisted Language Learning (CALL) is concerned. But some notable progress has been made and some systems have appeared. This talk will discuss three applications that I have developed: NativeAccent, Let's Go Lexical Entrainment, and REAP. We will look at how each one uses language technologies. We will also discuss the challenges that each system overcomes when it combines the need to serve the language learner well with the present state of each technology. We will also examine the learning science questions that are raised when the interface between the technology and the student is created.
Date: 25-Jun-2009    Time: 14:00:00    Location: VA2, IST-Alameda


Test of NoCs and NoC-based Systems-on-chip

Érika Fernandes Cota

Universidade Federal do Rio Grande do Sul

Abstract—This tutorial presents an overview of the issues related to the test of NoC-based systems. First, the main issues related to the test of complex SoCs are reviewed, along with a brief summary of the main techniques proposed so far. Then, the challenges of testing NoC-based systems are discussed and current test strategies are presented: re-use of the network for core testing, test scheduling for NoC reuse, test access methods and interfaces, efficient re-use of the network, and power-aware and thermal-aware NoC-based SoC testing. In addition, the challenges and solutions for NoC (interconnects, routers, and network interface) test and diagnosis are presented. Érika Fernandes Cota obtained her Ph.D. degree in Computer Science at UFRGS - Federal University of Rio Grande do Sul in 2003. She is currently an assistant professor at the Federal University of Rio Grande do Sul. She has 8 articles published in specialized journals and over 40 conference papers, and has co-supervised 2 master's dissertations. She works in computer science, with emphasis on hardware and software IC testing, and has co-authored scientific papers with 36 other researchers. Her most significant scientific and technological areas are test, BIST, embedded self-test, fault-tolerant systems, integrated systems and embedded systems.
Date: 22-Jun-2009    Time: 15:30:00    Location: 336


What can we do with a multitude of genome sequences?

Martin Tompa

University of Washington

Abstract—There are currently more than 600 bacterial species and 28 vertebrate species, ranging from primates to fishes, for which we know (nearly) their entire DNA sequences. These numbers will continue to increase rapidly over the next few years. Comparing these genome sequences has emerged as one of the most important areas of computational biology. For example, one way to predict functional portions of the human genome is to search among related genomes for sequences that appear to be remarkably similar due to purifying selection. I will discuss and demonstrate some of the methods and tools for such an approach, as well as some of the challenges and unsolved problems.
Date: 19-Jun-2009    Time: 11:00:00    Location: 336


Speech Synthesis: past, present and future and how it mirrors speech processing development in general

Alan W Black

Carnegie Mellon University

Abstract—This talk will look at the past, present and future of speech synthesis and how it relates to speech processing development in general. Specifically, I will outline the advances in synthesis technology, drawing analogies to developments in other speech and language processing fields (e.g. ASR and SMT) where knowledge-based techniques gave way to data-driven techniques, which in turn have both pushed machine learning technologies forward and later re-introduced higher-level knowledge into our data-driven approaches. We will give overviews of diphone, unit selection, statistical parametric synthesis and voice morphing technologies, and of how synthesis can be optimized for the desired task. We will also address issues of evaluation, both in isolation and when embedded in real tasks. (NOTE: the talk will take place at VA4, Pavilhão de Civil, IST)
Date: 05-Jun-2009    Time: 15:00:00    Location: 336


An Introduction to C++ in One Hour ... for Those Who Already Know C

Luis Guerra e Silva

Departamento de Engenharia Informática

Abstract—This presentation is a very brief introduction to C++ for C programmers. Introduction to the language: classes, instantiation, namespaces, overloading and inheritance. The STL library.
Date: 02-Jun-2009    Time: 12:30:00    Location: 336


Information Extraction: Knowledge Discovery from Text

Ralph Grishman

New York University

Abstract—Much of the information on the Web is encoded as text, in a form which is easy for people to use but hard for computers to manipulate. The role of information extraction is to make the structure of this information explicit, by creating database entries capturing specified types of entities, relations, and events in the text. We consider some of the challenges of information extraction and how they have been addressed. In particular, we consider what knowledge is required and how the means for creating this knowledge have developed over the past decade, shifting from hand-coded rules to supervised learning methods and now to semi-supervised and unsupervised techniques.
Date: 25-May-2009    Time: 14:00:00    Location: 336


Modeling and control of microflow sensors based on temperature measurement

Milan Adamek

Thomas Bata University in Zlin

Date: 21-May-2009    Time: 16:30:00    Location: Sala de reuniões do DEEC - Torre Norte, piso 5


DFY/DFM - design for yield and manufacturability

Prof. Hans Zapf

University of Applied Sciences Munich

Abstract—More information available soon.
Date: 15-May-2009    Time: 11:30:00    Location: 336


Semantic web applications to variable discovery in the life sciences: a cloudy future?

Jonas S. Almeida

University of Texas M.D.Anderson Cancer Center

Abstract—We do not yet have a semantic web as such; instead we have a collection of semantic web technologies. These technologies have recently started to deliver on their promise of an interoperable world, particularly in data-driven initiatives that integrate data management with its analysis. In this presentation we will describe our own travails with identifying and putting to use data-driven representations of biomolecular repositories for biomarker studies. The examples will include user-driven “incubation” of data models using a software prototype that manages their representation as dyadic predicates. This solution falls into the generic W3C Resource Description Framework (RDF). As this prototype is used by other groups as the server-side partner of client-side computational statistics applications, we find that cloud computing presents a better solution for deployment of web data services. The different models of cloud computing will be briefly overviewed in the life sciences context.
Date: 13-May-2009    Time: 15:30:00    Location: 336


Model checking in systems biology: an introduction

Pedro T. Monteiro

INRIA Rhône-Alpes

Abstract—The study of gene regulatory networks, as well as other biological networks, has recently yielded an increase in the number and detail of available models describing specific intracellular processes. The study of these models by means of analysis and simulation tools leads to numerous predictions representing the possible behaviours of the system. In order to validate these predictions one must confront them with experimental data. Performed manually, this comparison may prove impracticable and error-prone, leading to a growing need for an automatic and scalable method to perform this task. The formal verification field provides powerful methods for the analysis of large models. Some issues still need to be addressed, though, in order to achieve a good integration of formal verification tools into the modeling practice in systems biology. We present the CTRL temporal logic, which is powerful enough to express biological properties (multistability and oscillations), as well as an implementation of a set of temporal logic patterns for systems biology and its integration with the GNA modeling tool. We also propose a web-server based architecture to integrate modeling and simulation tools with model-checking tools, and its application to the analysis of the carbon starvation response model in E. coli.
Date: 08-May-2009    Time: 11:00:00    Location: 336


Minimal Perfect Hashing: A Competitive Method for Indexing Internal Memory

Guilherme Menezes

Universidade Federal de Minas Gerais

Abstract—A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values. Since no collisions occur, each key can be retrieved from a hash table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are widely used for memory efficient storage and fast retrieval of items from static sets. Differently from other hashing schemes, MPHFs completely avoid the problem of wasted space and wasted time to deal with collisions. Until recently, the amount of space to store an MPHF description for practical implementations found in the literature was O(log n) bits per key and therefore similar to the overhead of space of other hashing schemes. Recent results on MPHFs presented in the literature changed this scenario: an MPHF can now be described by approximately 2.6 bits per key. The objective of this work is to show that MPHFs are, after the new recent results, a good option to index internal memory when static key sets are involved and both successful and unsuccessful searches are allowed.
Date: 30-Apr-2009    Time: 15:30:00    Location: N7.1


DNA Sequence Alignment - A brief overview on computational algorithms and architectures

Nuno Sebastião


Abstract—This talk provides an introductory overview to DNA sequencing, as well as to the algorithms and architectures used for sequence alignment. The presentation will start with a brief introduction to the DNA sequencing process. Afterwards, a description of the optimal and heuristic algorithms for sequence alignment will be presented, as well as the data structures that usually support them. Special attention will be put on approximate string matching algorithms, due to the considerable speedup that may be obtained by using this type of search. Finally, some tools available for biological sequence comparison and for DNA re-sequencing will be presented, as well as some of the hardware structures used to further speed up the alignment process.
Date: 24-Apr-2009    Time: 11:00:00    Location: 336
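As one concrete instance of the optimal alignment algorithms the abstract mentions, here is a minimal Python sketch of Smith-Waterman local alignment scoring (our own illustration; the scoring parameters are arbitrary choices, and real aligners add traceback, affine gaps, and substitution matrices):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b,
    via the Smith-Waterman dynamic-programming recurrence."""
    rows = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            rows[i][j] = max(0,                      # restart: local alignment
                             rows[i - 1][j - 1] + s,  # match/mismatch
                             rows[i - 1][j] + gap,    # gap in b
                             rows[i][j - 1] + gap)    # gap in a
            best = max(best, rows[i][j])
    return best
```

The O(|a|·|b|) cost of this dynamic program is precisely what motivates the heuristic methods and hardware accelerators discussed in the talk.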


Scoring functions for learning Bayesian networks

Alexandra M. Carvalho


Abstract—The aim of this work is to benchmark scoring functions used by Bayesian network learning algorithms in the context of classification. We considered both information-theoretic scores, such as LL, AIC, BIC/MDL, NML and MIT, and Bayesian scores, such as K2, BD, BDe and BDeu. We tested the scores in a classification task by learning the optimal TAN classifier with benchmark datasets. We conclude that, in general, information-theoretic scores perform better than Bayesian scores.
Date: 23-Apr-2009    Time: 16:00:00    Location: 04
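Two of the scores compared above, LL and BIC/MDL, can be sketched for fully observed binary data (a minimal illustration of ours with maximum-likelihood CPT estimates, not the authors' implementation; `parents[v]` lists the parent variables of node v):

```python
from math import log

def loglik(data, parents):
    """Log-likelihood of fully observed binary data under a DAG,
    with maximum-likelihood CPT estimates. Rows are 0/1 tuples."""
    counts = {}
    for row in data:
        for v, ps in parents.items():
            key = (v, tuple(row[p] for p in ps))  # node + parent configuration
            c = counts.setdefault(key, [0, 0])
            c[row[v]] += 1
    ll = 0.0
    for (_v, _cfg), (n0, n1) in counts.items():
        n = n0 + n1
        for c in (n0, n1):
            if c:
                ll += c * log(c / n)  # c * log(ML estimate of P(v=val | cfg))
    return ll

def bic(data, parents):
    """BIC/MDL score: LL minus (log N / 2) * number of free parameters.
    A binary node has one free parameter per parent configuration."""
    n_params = sum(2 ** len(ps) for ps in parents.values())
    return loglik(data, parents) - 0.5 * log(len(data)) * n_params
```

On data where two variables always agree, the structure with an edge between them gets a higher BIC score than the empty structure: the likelihood gain outweighs the extra-parameter penalty.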


Social Computing in Education

Irwin King

Chinese University of Hong Kong

Abstract—The Web has changed the landscape of how humans interact socially. With the advent of Web 2.0, Social Computing has become a new paradigm in the ways we communicate, teach, learn, and educate. In this talk, I will first introduce Social Computing and its various platforms such as blogs, wikis, mashups, social bookmarks, etc. and how they can be used in the education environment for teaching and learning. I will highlight other interesting issues such as security, trust, etc. in using these technologies for learning. Lastly, we examine some current challenges and potential future promises of Social Computing in education. Brief profile: Dr. Irwin King is currently with the Chinese University of Hong Kong. He received the BSc degree in Engineering and Applied Science from the California Institute of Technology and his MSc and PhD degrees in Computer Science from the University of Southern California. Dr. King's research interests include machine learning, web intelligence and social computing, and multimedia processing. In these research areas, Dr. King has published over 150 refereed journal and conference manuscripts. In addition, he has contributed over 20 book chapters and edited volumes.
Date: 17-Apr-2009    Time: 14:00:00    Location: EA5


O rio da minha aldeia: from Recife to Lyon and Lisbon

Paulo G. S. da Fonseca


Abstract—In this seminar I will be honoured to present myself to the local community and talk about the joys of cities with beautiful riverside landscapes. Incidentally, I might be caught talking about my research interests concerning the characterisation of conserved functional gene modules from heterogeneous high throughput data.
Date: 16-Apr-2009    Time: 16:00:00    Location: 336


Probabilistic retrieval and visualization of biologically relevant microarray experiments

José Caldas

Helsinki University of Technology

Abstract—As ArrayExpress and other repositories of genome-wide experiments are reaching a mature size, it is becoming more meaningful to search for related experiments, given a particular study. We introduce methods that allow for the search to be based upon measurement data, instead of the more customary annotation data. The goal is to retrieve experiments in which the same biological processes are activated. This can be due either to experiments targeting the same biological question, or to as-yet unknown relationships. We use a combination of existing and new probabilistic machine learning techniques to extract information about the biological processes differentially activated in each experiment, to retrieve earlier experiments where the same processes are activated, and to visualize and interpret the retrieval results. Case studies on a subset of ArrayExpress show that, with a sufficient amount of data, our method indeed finds experiments relevant to particular biological questions. Results can be interpreted in terms of biological processes using the visualization techniques.
Date: 15-Apr-2009    Time: 16:00:00    Location: 336


On the Efficient Reduction of Complete EM based Parametric Models

Jorge F. Villena


Abstract—Due to higher integration and increasing frequency based effects, full Electromagnetic Models (EM) are needed for accurate prediction of the real behavior of integrated passives and interconnects. Furthermore, these structures are subject to parametric effects due to small variations of the geometric and physical properties of the inherent materials and manufacturing process. Accuracy requirements lead to huge models, which are expensive to simulate and this cost is increased when parameters and their effects are taken into account. This paper presents a complete procedure for efficient reduction of realistic, hierarchy aware, EM based parametric models. Knowledge of the structure of the problem is explicitly exploited using domain partitioning and novel electromagnetic connector modeling techniques to generate a hierarchical representation. This enables the efficient use of block parametric model order reduction techniques to generate block-wise compressed models that satisfy overall requirements, and provide accurate approximations of the complete EM behaviour, which are cheap to evaluate and simulate.
Date: 08-Apr-2009    Time: 14:30:00    Location: 336


A MILP-based Approach to Path Sensitization of Embedded Software

José Carlos Campos Costa


Abstract—We propose a new methodology based on Mixed Integer Linear Programming (MILP) for determining the input values that will exercise a specified execution path in a program. In order to seamlessly handle variable values, pointers and arrays, and variable aliasing, our method uses memory addresses for variable references. This implies a dynamic methodology where all decisions are taken as the program executes. During this execution, we gather constraints for the MILP problem, whose solution will directly yield the input values for the desired path. We present results that demonstrate the effectiveness of this approach. This methodology was implemented in a fully functional tool that is capable of handling medium-sized real programs specified in the C language. Our work is motivated by the complexity of validating embedded systems and uses an approach similar to an existing HDL functional vector generation method. We are currently integrating this method with the mentioned hardware method. The joint solution of the MILP problems will provide a hardware/software co-validation tool.
Date: 08-Apr-2009    Time: 14:00:00    Location: 336


Beyond Edman Degradation: Algorithmic De novo Protein Sequencing of Monoclonal Antibodies

Nuno Bandeira

University of California, San Diego

Abstract—The characterization and engineering of monoclonal antibodies is usually preceded by time-consuming Edman/cDNA sequencing steps for determination of the heavy and light chain sequences – a low-throughput pipeline that does not address post-translational modifications. In a departure from these platforms, we have developed the Comparative Shotgun Protein Sequencing (CSPS) suite of algorithms – a mass spectrometry based protein sequencing approach resulting in over 95% sequence coverage and automatic discovery of unexpected post-translational modifications. In contrast with the current multiple-week duration of typical sequencing projects, CSPS delivers additional functionality while reducing the time required to sequence an antibody to under 72 hours, a dramatic reduction as compared to the average 2-4 months for classical Edman sequencing of an entire antibody. While we demonstrate CSPS on monoclonal antibodies, the underlying techniques are not antibody-specific and the results indicate that CSPS has the potential to be a disruptive technology for all protein sequencing applications.
Date: 07-Apr-2009    Time: 14:00:00    Location: 336


Elements for a Comparative Study of the Rhythmic Typology of Portuguese

Plínio A. Barbosa

Instituto de Estudos da Linguagem/Unicamp

Abstract—Interest in the typological classification of the rhythms of the world's languages has occupied prosodic studies ever since Lloyd-James opposed machine-gun (syllable-timed) languages to Morse-code (stress-timed) languages when referring to American Spanish and American English. What seems clear to the ear, namely that the rhythms of these languages, and of others such as French vs. English or Spanish vs. Arabic, are distinct, has resisted objective measurement since the 1970s. Since 1999, with Ramus's paper, several segment-based methodologies have been employed to classify language rhythms. The goal of this talk is to present the general picture of research in the area, and to contrast two types of prosody-based classification for typologizing the rhythms of Brazilian and European Portuguese varieties. Knowing how these varieties manipulate the duration of syllable-sized units can shed light on how to design the prosodic modules of speech synthesis and recognition systems, or even on adapting systems from one variety to another.
Date: 13-Mar-2009    Time: 15:00:00    Location: 336


Parameter Tuning in SVM-Based Power Macro-Modeling

António Gusmão


Abstract—We describe a methodology that uses support vector machines (SVMs) to determine simpler and better-fit power macromodels of functional units for high-level power estimation. The basic approach is first to obtain the power consumption of the module for a large number of points in the input signal space. Least-Squares SVMs are then used to compute the best model fitting this set of points. We have performed extensive experiments in order to determine the best parameters for the kernels. Based on this analysis, we propose an iterative method of improving the model by selectively adding new support vectors and increasing the sharpness of the model. The macromodels obtained confirm the excellent modelling capabilities of the proposed kernel-based method, providing excellent accuracy both on maximum error (close to 17%) and on average error (2%), which represents an improvement over the state of the art. Furthermore, we present an analysis of the dynamic range of power consumption for the benchmark circuits, which confirms that the model is able to accommodate circuits exhibiting a more skewed power distribution.
Date: 06-Mar-2009    Time: 14:00:00    Location: 336


Programming Distributed Systems: an Introduction to MPI

J. Monteiro


Abstract—The Message Passing Interface Standard (MPI) is a standard defined by the MPI Forum, which has over 40 members, including vendors, researchers, software library developers, and users. The goal of MPI is to establish a portable, efficient, and flexible standard for writing distributed programs based on message passing. MPI is not an IEEE or ISO standard, but it has become the de facto "industry standard" for writing message-passing programs on HPC platforms. In this talk I will present a gentle introduction to MPI. I will start by discussing the message-passing programming model, then cover the basics of MPI and present point-to-point communication and collective operations. "I am not sure how I will program a Petaflop machine, but I am sure that I will need MPI somewhere." - Horst D. Simon
Date: 20-Feb-2009    Time: 14:00:00    Location: 336


Estimating Local Ancestry in Admixed Populations

Eran Halperin

International Computer Science Institute (ICSI)

Abstract—Large-scale genotyping of SNPs has shown great promise in identifying markers that could be linked to diseases. One of the major obstacles in performing these studies is that the underlying population sub-structure can produce spurious associations. Population sub-structure can be caused by the presence of two distinct sub-populations or by a single pool of admixed individuals. In this talk, I will focus on the latter, which is significantly harder to detect in practice. New advances in this research direction are expected to play a key role in identifying loci which differ among populations and are still associated with a disease. Furthermore, the detection of an individual's ancestry has important medical implications. I will describe two methods that we have recently developed to detect admixture, or the locus-specific ancestry, in an admixed population. We have run extensive experiments to characterize the important parameters that have to be optimized when considering this problem; I will describe the results of these experiments in the context of existing tools such as SABER and STRUCTURE.
Date: 29-Jan-2009    Time: 11:00:00    Location: 336


Programming Multicores

J. Monteiro


Abstract—As the microprocessor industry switches from increasing clock frequencies and implicit instruction-level parallelism to multicores, the free ride for programmers is over. In order to take advantage of the continuing increase in computational power, software development needs to address explicit parallelism. In this talk I will present an introduction to multicore programming, focusing on OpenMP and NUMA. I will cover basic material, but in-depth discussion of particular topics is welcome.

Seminar organized by the ALGOS group.
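The explicit parallelism the talk refers to can be sketched as a loop whose iterations are split across cores and then combined. This is a hedged Python illustration of what an OpenMP `parallel for` with a `reduction(+)` clause does; real OpenMP is compiler pragmas in C/C++/Fortran, and the data and worker count here are made up.

```python
from concurrent.futures import ProcessPoolExecutor

def partial_dot(args):
    # One chunk of the loop: the iterations one OpenMP thread would
    # receive under a static schedule.
    a, b, lo, hi = args
    return sum(a[i] * b[i] for i in range(lo, hi))

def parallel_dot(a, b, workers=4):
    n = len(a)
    chunks = [(a, b, k * n // workers, (k + 1) * n // workers)
              for k in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Summing the partial results plays the role of reduction(+).
        return sum(pool.map(partial_dot, chunks))

a = list(range(1000))
b = [2] * 1000

if __name__ == "__main__":
    print(parallel_dot(a, b))  # 2 * sum(0..999)
```

Processes are used instead of threads because CPython's GIL serializes pure-Python threads; in C with OpenMP, threads sharing memory would be the natural choice, which is exactly where the NUMA placement issues mentioned in the abstract appear.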
Date: 23-Jan-2009    Time: 14:00:00    Location: 04


Challenges in the Application of Quantum Mechanics to Biomolecular Problems

Ricardo Mata

Faculdade de Ciências da Universidade de Lisboa

Abstract—The range of application of quantum mechanical methods has been increasing rapidly in the last few years. Today, one is able to study large biomolecular systems at a level of accuracy which a decade ago was only possible for 5-10 atoms. These developments are an outcome of the increasing computer power available to the quantum chemist, but also of new theories and procedures which have helped remove some of the major bottlenecks in the calculations. In this talk, a short introduction to the methods of molecular and quantum mechanics will be given, also addressing their coupling in multi-level calculations. The application of these methods to the study of an enzymatically catalyzed reaction will be discussed, with a focus on the major computational bottlenecks as well as on accuracy. Finally, I will review some of the latest developments on the use of heterogeneous acceleration in the field, namely with Nvidia GPUs and ClearSpeed processors.
Date: 21-Jan-2009    Time: 14:00:00    Location: 336


Modelling HIV-1 Evolution under Drug Selective Pressure

Anne-Mieke Vandamme

Katholieke Universiteit Leuven

Abstract—This talk will address methods for the analysis and modeling of HIV evolution, including phylogenetics and the relationship between genotype and phenotype of the HIV virus.
Date: 16-Jan-2009    Time: 16:00:00    Location: 336


Kernel methods for the prioritization of candidate genes

Yves Moreau

Katholieke Universiteit Leuven

Abstract—Hunting disease genes is a problem of primary importance in biomedical research. Biologists usually approach this problem in two steps: first a set of candidate genes is identified using traditional positional cloning or high-throughput genomics techniques; second, these genes are further investigated and validated in the wet lab, one by one. To speed up discovery and limit the number of costly wet lab experiments, biologists must test the candidate genes starting with the most probable candidates. So far, biologists have relied on literature studies, extensive queries to multiple databases and hunches about expected properties of the disease gene to determine such an ordering. Recently, the data mining tool ENDEAVOUR has been introduced, which performs this task automatically by relying on different genome-wide data sources, such as Gene Ontology, literature, microarray, sequence and more. A novel kernel method that operates in the same setting is presented: based on a number of different views on a set of training genes, a prioritization of test genes is obtained. A thorough theoretical analysis of the guaranteed performance of the method will also be presented. Finally, the application of the method to the disease data sets on which ENDEAVOUR has been benchmarked, will be reported, showing that a considerable improvement in empirical performance has been obtained.
Date: 19-Dec-2008    Time: 11:00:00    Location: 336


Biochemical neutral solutions using S-system models

Marco Vilela

University of Texas M.D.Anderson Cancer Center

Abstract—One of the major difficulties of modeling biological systems from time series is the identification of a parameter set that gives the model the same dynamical behavior as the data. A more ambitious goal is the identification of the biochemical interactions of the system's components from the model parameters. In this talk, we present a method for identifying the S-system parameter space from biological time series, based on a Monte Carlo process and a parameter optimization algorithm. The proposed methodology was applied to real time-series data from the glycolytic pathway of the bacterium Lactococcus lactis, and ensembles of models with different network topologies were generated. The parameter optimization algorithm was also successfully applied to the same dynamical data while imposing a pre-specified network topology from previous knowledge, foreseeing the method as an exploration tool for hypothesis testing and the design of new experiments.
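An S-system models each species as a power-law production term minus a power-law degradation term, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. A toy simulation with invented parameters (a hypothetical two-species cascade, not the Lactococcus lactis model from the talk):

```python
import numpy as np

def s_system_step(x, alpha, G, beta, H, dt):
    # Forward-Euler step of dX_i/dt = alpha_i*prod_j X_j^g_ij
    #                               - beta_i *prod_j X_j^h_ij
    prod = lambda M: np.prod(x ** M, axis=1)
    return x + dt * (alpha * prod(G) - beta * prod(H))

# Illustrative parameters: X1 is produced at a constant rate and
# degraded linearly; X2 is produced from X1 and degraded linearly.
alpha = np.array([2.0, 1.0])
beta  = np.array([1.0, 1.0])
G = np.array([[0.0, 0.0],    # kinetic orders of the production terms
              [0.5, 0.0]])
H = np.array([[1.0, 0.0],    # kinetic orders of the degradation terms
              [0.0, 1.0]])

x = np.array([1.0, 1.0])
for _ in range(20000):       # integrate to t = 20
    x = s_system_step(x, alpha, G, beta, H, 1e-3)
# Expected steady state: X1 = 2 (2 - X1 = 0), X2 = sqrt(2) (X1^0.5 - X2 = 0)
```

The fitting problem the abstract describes is the inverse of this simulation: given the trajectory of x, recover alpha, beta and the kinetic-order matrices G and H, whose zero/nonzero pattern encodes the network topology.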
Date: 03-Dec-2008    Time: 15:00:00    Location: 04


Charge transport at the surface of organic semiconductors for molecular electronics

Helena Alves


Abstract—The investigation of material systems in which new electronic phenomena arise from the interactions of molecules is an active topic of research. In particular, electrical transport at the surface of organic materials is a key issue in molecular electronics. Field-effect transistors (FETs) are not only a powerful tool to measure charge transport at the interface level, but also an essential element in modern electronics. In this talk, a general overview of molecular electronics will be given, with particular emphasis on materials, some device applications and, in more detail, organic field-effect transistors (OFETs). Single-crystal OFET measurements performed on three different systems, TMTSF, PDIF-CN2 and TTF/TCNQ, will be presented, and some of the key topics in OFETs will be discussed. TMTSF devices show clear signatures of intrinsic transport (high mobility, increasing with lowering temperature) and p-type behaviour. PDIF-CN2 presents n-type transport and very good device characteristics, with room-temperature electron mobility as high as 6 cm2/Vs in vacuum and 3 cm2/Vs in air, the best n-type mobility reported in an OFET to date. Finally, a new electronic system created at the interface of two different organic crystals will be introduced. Although the two organic crystals (TTF and TCNQ) are large-gap semiconductors and therefore essentially insulating, their interface turns out to exhibit metallic character, with very high conductivity that becomes larger as the temperature is lowered. As the interface assembly process is simple and can be applied to crystals of virtually any conjugated molecule, the combination of molecules with different electronic properties will enable the assembly of molecular interfacial systems possessing properties that have no analogue in bulk molecular materials.
Date: 26-Nov-2008    Time: 12:00:00    Location: Auditório do INESC-Avila, Av. Duque de Ávila, 23


Probabilistic Control and Applications to Gene Regulatory Networks

Miguel José Simões Barão


Abstract—Probabilistic control aims to find a probabilistic decision rule in order to control a stochastic dynamic system. In this formulation, both the system model and the decision rules are described by conditional probability functions that are iterated together in a closed loop. This kind of formulation fits well in problems where a large number of agents act on the same system simultaneously, one possible application being gene regulatory networks. This talk addresses the formulation of the probabilistic control problem, the optimization of probability functions, and an illustration of its application to gene regulatory networks. It is shown that although the original problem is high-dimensional, its solution can be computed very efficiently.
Date: 13-Nov-2008    Time: 15:00:00    Location: 336


Introduction to GPUs Programming using CUDA

Carlos Coelho

Cadence Research Laboratories

Abstract—Graphical Processing Units (GPUs) boast an impressive amount of computational power and memory bandwidth at commodity prices, driven low by volume and competition in the gaming consumer market. With the introduction of high-level APIs such as NVIDIA's Compute Unified Device Architecture (CUDA), harvesting the computational power of the GPU for general computing applications has become straightforward. In this talk we present an overview of NVIDIA's current GPU architecture, a brief introduction to CUDA, and a discussion of performance issues and optimization techniques.

Carlos Pinto Coelho received his Ph.D. in Electrical Engineering and Computer Science from MIT in September 2007. In 2001 he joined the startup company AltraBroadband, where he worked on the development of the Nexxim circuit simulator. In 1999 and 2001 he received his engineering and master's degrees in Computer and Electrical Engineering from the Instituto Superior Técnico, in Lisbon. His interests include simulation and modeling of physical systems in general and biological systems in particular, artificial intelligence, parallel programming hardware, algorithms, mathematics and physics. Since 2007 he has been a researcher at Berkeley Cadence Research Laboratories.

Seminar organized by the ALGOS group.
Date: 03-Nov-2008    Time: 11:30:00    Location: 336


Faithful modeling of transient expression and its application to elucidating negative feedback regulation

Ron Pinter


Abstract—Modeling and analysis of genetic regulatory networks is essential both for better understanding their dynamic behavior and for elucidating and refining open issues. We hereby present a discrete computational model that effectively describes the transient and sequential expression of a network of genes in a representative developmental pathway. Our model system is a transcriptional cascade that includes positive and negative feedback loops directing the initiation and progression through meiosis in budding yeast. The computational model allows qualitative analysis of the transcription of early meiosis-specific genes, specifically, Ime2 and their master activator, Ime1. The simulations demonstrate a robust transcriptional behavior with respect to the initial levels of Ime1 and Ime2. The computational results were verified experimentally by deleting various genes and by changing initial conditions. The model has a strong predictive aspect, and it provides insights into how to distinguish among and reason about alternative hypotheses concerning the mode by which negative regulation through Ime1 and Ime2 is accomplished. Some predictions were validated experimentally, for instance, showing that the decline in the transcription of IME1 depends on Rpd3, which is recruited by Ime1 to its promoter. Finally, this general model promotes the analysis of systems that are devoid of consistent quantitative data, as is often the case, and it can be easily adapted to other developmental pathways.
Date: 30-Oct-2008    Time: 15:00:00    Location: 336


Local Properties of Biological Networks

Ron Pinter


Abstract—The study of biological networks has led to the development of a variety of measures for characterizing network properties at different levels. Global analysis provides summary measures such as diameter, clustering coefficients, and degree distribution that describe the network as a whole, whereas local properties, such as the occurrences of motifs and graphlets, allow us to focus on specific phenomena within the network. Local characteristics are suitable for studying networks that are incompletely explored; in particular, they faithfully capture the neighborhoods of those parts of the network that are better studied. In this talk I will describe several methods to analyze both protein-protein interaction networks (which are undirected graphs) and regulation networks (which are directed), along with the biological consequences that they have yielded.
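As a concrete example of a local property, the feed-forward loop (a regulates b, b regulates c, and a also regulates c directly) is the best-known three-node motif in directed regulation networks. A brute-force counter on a toy graph, illustrative only and far from the efficient enumeration methods such talks cover:

```python
from itertools import permutations

def count_ffl(edges):
    # Count feed-forward loops: ordered triples (a, b, c) with
    # edges a->b, b->c and a->c all present.
    es = set(edges)
    nodes = {v for e in edges for v in e}
    return sum(1 for a, b, c in permutations(nodes, 3)
               if (a, b) in es and (b, c) in es and (a, c) in es)

# Hypothetical regulation network: one FFL (X->Y, Y->Z, X->Z)
# plus a stray edge Z->W that belongs to no motif.
ffl_count = count_ffl([("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")])
```

Motif analysis then compares such counts against randomized networks with the same degree sequence to decide whether the pattern is over-represented.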
Date: 29-Oct-2008    Time: 11:00:00    Location: 336


Solving Techniques and Heuristics for Max-SAT and Partial Max-SAT

Josep Argelich


Abstract—Max-SAT is the optimization version of the well-known Satisfiability problem. In this talk we will introduce the Max-SAT problem and the solving techniques used by the most successful state-of-the-art Max-SAT solvers, such as the branch-and-bound scheme, techniques for improving the lower bound using underestimation and inference, and variable selection heuristics. Next, we will introduce Weighted and Partial Max-SAT, and we will see new techniques for these formalisms as well as adaptations of the techniques used in Max-SAT. Finally, we will present and discuss some results of the last Max-SAT evaluation.
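A minimal branch-and-bound Max-SAT sketch illustrating the lower-bounding idea (clauses as lists of signed integers in DIMACS style). This toy uses only the count of clauses already falsified by the partial assignment as the lower bound, far simpler than the underestimation and inference techniques of real solvers:

```python
def violated(clauses, assign):
    # Number of clauses every literal of which is falsified by the
    # (possibly partial) assignment {var: bool}.
    def falsified(lit):
        val = assign.get(abs(lit))
        return val is not None and val != (lit > 0)
    return sum(all(falsified(l) for l in c) for c in clauses)

def max_sat(clauses, variables, assign=None, best=None):
    # Branch and bound: falsified clauses can never be repaired by
    # extending the assignment, so prune when the count reaches the
    # cost of the best complete assignment found so far.
    assign = assign or {}
    if best is None:
        best = len(clauses) + 1
    lower_bound = violated(clauses, assign)
    if lower_bound >= best:
        return best                      # prune this branch
    free = [v for v in variables if v not in assign]
    if not free:
        return lower_bound               # complete assignment
    var = free[0]                        # naive variable selection
    for value in (True, False):
        best = min(best,
                   max_sat(clauses, variables, {**assign, var: value}, best))
    return best

# (x1) & (!x1) & (x1 | x2): any assignment falsifies at least one clause.
cost = max_sat([[1], [-1], [1, 2]], [1, 2])
```

Partial Max-SAT extends this by splitting clauses into hard (must be satisfied) and soft (counted in the cost); the same search applies with hard-clause violations pruned outright.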
Date: 21-Oct-2008    Time: 14:30:00    Location: 04



Amitav Das

Microsoft Research

Abstract—Signal processing researchers are usually happy with their various ways of slicing and dicing signals to explore their different aspects, while pattern recognition people are busy looking at various recognition/classification algorithms using whatever "features" of the signal are "given" to them. These two groups of researchers usually each go their own way. But for many applications it is important to consider feature selection and the classification method together, which is typically not done. For example, MFCC is used in speech recognition as a feature that is supposed to be "speaker-independent" and represent what you are saying, yet the same feature is used by people working on speaker identification as well! In my talk, I will give a brief overview of popular and emerging signal processing applications and then pick one of my research areas, namely user identification, and show how judicious feature selection helps keep the classification part simple and allows one to develop systems that provide high performance at very low complexity.
Date: 21-Oct-2008    Time: 10:30:00    Location: 336


New architectures for the final scaling of the CMOS world

Luigi Carro

Universidade Federal do Rio Grande do Sul

Abstract—As technology scaling reaches the physical limits of silicon, several new problems must be addressed, from the design of low-power but high-performance circuits to the reliability of weak transistors and mixed technologies (nanowires, SETs, etc.). These technological problems will impact several layers of the current abstraction stack that covers computer and software production. New architectural solutions that explore parallelism at different granularities must be sought, not only for performance/energy trade-offs, but also as a means to assure reliability, fault tolerance and yield, thanks to regularity. This talk presents some ideas in this direction, covering future processor architectures and quaternary logic circuits, and discussing technologies that can deal with this multivariable problem.

Luigi Carro was born in Porto Alegre, Brazil, in 1962. He received the Electrical Engineering and MSc degrees from Universidade Federal do Rio Grande do Sul (UFRGS), Brazil, in 1985 and 1989, respectively. From 1989 to 1991 he worked at ST-Microelectronics, Agrate, Italy, in the R&D group. In 1996 he received the Dr. degree in Computer Science from UFRGS. He is presently a professor at the Applied Informatics Department of the Informatics Institute of UFRGS, in charge of Computer Architecture and Organization disciplines at the undergraduate level. He is also a member of the Graduate Program in Computer Science at UFRGS, where he is co-responsible for courses on Embedded Systems, Digital Signal Processing, and VLSI Design. His primary research interests include embedded systems design, validation, automation and test, fault tolerance for future technologies, and rapid system prototyping.
He has published more than 150 technical papers on these topics and is the author of the book Digital Systems Design and Prototyping (2001, in Portuguese) and co-author of Fault-Tolerance Techniques for SRAM-based FPGAs (Springer, 2006).

Seminar organized by the ALGOS group.
Date: 16-Oct-2008    Time: 11:00:00    Location: 336


From Electrical Engineering to the Theory of Fuzzy Sets and Systems

Rudolf Seising

Ludwig-Maximilians-Universität München

Abstract—In 1965, Lotfi Zadeh, a professor of electrical engineering at the University of California, Berkeley, published the first papers on Fuzzy Set Theory. Since the 1980s, this mathematical theory of "unsharp amounts" has been applied with great success in many different fields. Thanks not least to extensive advertising campaigns for fuzzy-controlled household appliances and to their prominent presence in the media, first in Japan and then in other countries, the word "fuzzy" has also become very well known among non-scientists. On the other hand, the story of how Fuzzy Set Theory and its earliest applications originated has remained largely unknown. In this lecture, the history of Fuzzy Set Theory and the ways it was first used are incorporated into the history of 20th-century science and technology. Influences from system theory and cybernetics stemming from the earliest part of the 20th century are considered alongside those of communication and control theory from mid-century.
Date: 15-Oct-2008    Time: 17:00:00    Location: Instituto Superior Técnico - FA2


Modeling and Verification of Integrated Circuits

Nick van der Meijs

Delft University of Technology

Abstract—Integrated circuits contain millions of electronic switches connected by kilometers of interconnect, on an area of about 1 square cm. The electrical behavior of such circuits strongly depends on the capacitive, resistive and even inductive properties of the interconnect network and the substrate. Since these parasitic properties can only be approximately accounted for during design, it is necessary for verification purposes to translate the layout (physical design) of an integrated circuit back into an electrical netlist. This process is called parasitics extraction. This presentation will first explain the background and challenges of parasitics extraction, followed by a review of some recent results on the modeling of interconnect and substrate. Specific topics include model order reduction and manufacturing variability. The presentation will conclude with a brief overview of open problems.

Nick van der Meijs (NL, 1959) received his PhD from Delft University of Technology in 1992, where he is currently an associate professor. His teaching responsibilities include circuit theory, VLSI design, and electronic design automation. He is also Director of Studies for the EE program. His research interests center on the physical/electrical aspects of deep-submicron integrated circuits, including ultra-deep-submicron design, modeling and extraction of physical/electrical effects in large integrated circuits, and efficient (practical) algorithms for electronic design automation in general. He leads a research group on Physical Modelling and Verification of parasitic effects in integrated circuits and is a principal developer of the SPACE layout-to-circuit extractor. He has served on various program committees of international conferences, chairs the IEEE Benelux Circuits and Systems chapter, and previously chaired the ProRISC micro-electronics workshop in the Netherlands. He is a recipient of a personal ~0.9M Euro "pioneer" research grant in the Netherlands.
Date: 30-Sep-2008    Time: 11:00:00    Location: 336


Pattern Matching in Protein-Protein Interaction Graphs

Stéphane Vialette

Université Paris-Est Marne-la-Vallée

Abstract—In the context of the comparative analysis of protein-protein interaction graphs, we use a graph-based formalism to detect the preservation of a given protein complex (pattern graph) in the protein-protein interaction graph (target graph) of another species with respect to (w.r.t.) orthologous proteins. We give an efficient exponential-time randomized algorithm for the case in which the occurrence of the pattern graph in the target graph is required to be exact. For approximate occurrences, we prove a tight inapproximability result and give four approximation algorithms that deal with bounded-degree graphs, small ortholog numbers, linear forests, and very simple hard instances, respectively.
Date: 18-Sep-2008    Time: 14:00:00    Location: 336


Embedded Systems for IT Security Applications - Properties and Design Considerations

Sorin A. Huss

Technische Universität Darmstadt

Abstract—Talk on: Embedded Systems for IT Security Applications - Properties and Design Considerations
Date: 15-Sep-2008    Time: 15:00:00    Location: V1.08 (Civil building at IST)


Insights on using mathematical models in systems biology

Albert Sorribas

Universitat de Lleida

Abstract—Mathematical models play a fundamental role in systems biology. Models based on approximate representations are an important route to obtaining useful models. In this seminar I will present some results on various formalisms and discuss their utility. Furthermore, I shall present preliminary results on their performance in fitting experimental dynamic data. Finally, I will briefly discuss the instances in which each of the proposed formalisms can be more appropriate.
Date: 11-Sep-2008    Time: 14:30:00    Location: 336


DPA resistant design flow and its applications

Francesco Regazzoni

University of Lugano

Abstract—Side-channel analysis is a technique for attacking implementations of cryptographic algorithms that exploits information leaked while secret data is being processed. In this talk, after an introduction to power analysis attacks, a design flow will be presented that makes it possible to prove robustness against power analysis attacks without the need to manufacture and test the chip. Contrary to past approaches on this subject, which have argued robustness qualitatively or have required hardware manufacturing to prove it, the robustness is evaluated with real attacks, on traces generated at SPICE level in a reasonable amount of time. The talk will conclude with the presentation of two real examples where the proposed design and simulation flow was used: the exploration of MOS Current Mode Logic as a possible countermeasure, and the evaluation of the effects of fault attack protections on DPA resistance.
Date: 25-Jul-2008    Time: 10:30:00    Location: 336



Margarida Gama-Carvalho

Faculdade de Medicina da Universidade de Lisboa

Abstract—RNA binding proteins (RBPs) are emerging as multifunctional entities that act on the mRNA biogenesis pathway from transcription initiation through translation and decay. Association of RBPs with mRNAs through untranslated sequence elements has been proposed to constitute a mechanism that allows for the coordination of gene expression at the post-transcriptional level, defining post-transcriptional operons (Keene, 2002). We have recently characterized the mRNA interactome of two human mRNA binding proteins (Gama-Carvalho, 2006). Classification of the target mRNAs into Gene Ontology (GO) groups suggests that each protein associates with functionally coherent mRNA populations, supporting a coordinating role in gene expression. To understand whether these RNA populations contain distinctive sequence elements, we have performed sequence motif searches for consensus binding sites in the whole transcript, coding sequence and UTRs, and compared them to a non-associated mRNA population. The results support the model of differential interaction between functionally related mRNA populations and specific regulatory RNA binding proteins through the presence of untranslated sequence elements for regulation (USER) codes. Identification of potential gene networks in the population of target mRNAs using the Ingenuity Pathways KnowledgeBase suggests that these proteins may be involved in the coordination of key cellular functions and signaling pathways, with potential antagonistic effects. We have obtained preliminary evidence for regulatory functions of both proteins on their target mRNAs, and we now aim to model these RNA-protein interaction networks and their effects on gene expression, as well as to develop methods to identify USER codes involved in the post-transcriptional coordination of gene expression.
Date: 24-Jul-2008    Time: 16:30:00    Location: 336


Modeling and analysis of dynamic systems with the symbolic manipulation tool Pansym

Karl Thomaseth


Abstract—Concise graphical and/or textual notations are ubiquitously used to represent the physical structure of dynamical systems. Each different notation can normally be translated directly into mathematical expressions of system equations that allow further application of general system-theoretical results and techniques for the analysis and exploitation of the studied systems, e.g. stability, controllability, sensitivity with respect to parameters, identifiability, optimization, open- and closed-loop control, etc. The most frequently applied symbolic manipulation is differentiation, which is readily available in modern simulation programs, such as Matlab/Simulink, which, however, seldom provide the analytic expressions of the simulated systems. With the aim of overcoming this limitation, i.e. to make available in symbolic form the dynamic model equations starting from a structural representation of a system, the presented software tool PANSYM has been developed over the years as a marginal project to support modeling research on biomedical systems. Although mostly still in prototype form, the software may become useful also to others because of some interesting and unique features regarding automatic generation of: (1) ODE model equations from multidisciplinary system representations (compartmental, electrical, biochemical, bond graphs), with DAEs transformed into ODEs by computer algebra; (2) sensitivity differential equations for dependable calculations instead of numerical differentiation; (3) adjoint differential equations arising from optimal control problems; (4) source code for: (a) numerical simulation exploiting multiprocessor architectures (Fortran); (b) documentation (LaTeX); (c) model simulation and identification (R). The ad hoc coding of new formatting routines for other applications, e.g. extended Kalman filtering, and programming languages, e.g. Matlab, C++, is feasible and not very time-consuming.
Latest developments include: (i) identification of population models using advanced statistical approaches, such as Nonlinear Mixed Effects Models and Bayesian inference based on Markov Chain Monte Carlo; (ii) analysis of metabolic maps with generation of dynamic equations for selected pathways. Future plans include modeling of isotopomer dynamics for studying substrate recycling in intermediate metabolism.
Date: 24-Jul-2008    Time: 10:00:00    Location: 336


Procedural modelling - Towards infinite game worlds

Rafael Bidarra

Delft University of Technology

Abstract—Manual game content creation is an increasingly laborious task; with each advance in graphics hardware, a higher level of fidelity and detail is achievable and, therefore, expected. However, virtual worlds are still designed almost entirely by hand at a low level of abstraction (e.g. manual creation and placement of geometry). Although numerous automatic (e.g. procedural) content generation algorithms and techniques have been developed over the years, their application in both games and simulations is not yet widespread. In this talk we briefly discuss a variety of procedural techniques and analyze their potential to alleviate current problems in next-gen game design teams.

Rafael Bidarra is assistant professor of Game Technology at the Faculty of Electrical Engineering, Mathematics and Computer Science of Delft University of Technology, The Netherlands. He graduated in electronics engineering at the University of Coimbra, Portugal, in 1987, and received his PhD in computer science from Delft University of Technology in 1999. He teaches and supervises several courses and projects on video games within the Computer Science programme "Media and Knowledge Engineering", and leads the research work on game technology at the Computer Graphics and CAD/CAM Group. His current research interests in this area include the development and application of procedural and semantic modelling techniques in the specification and generation of both virtual worlds and game play. He has published many papers in international journals, books and conference proceedings, and has served as a member of several program committees.
Date: 30-Jun-2008    Time: 09:30:00    Location: 336


Speech recognition for less-represented languages

Thomas Pellegrini


Abstract—The last decade has seen growing interest in developing speech and language technologies for a wider range of languages. State-of-the-art speech recognizers are typically trained on huge amounts of data, both transcribed speech and texts. My thesis work focused on speech recognition for languages for which only small amounts of data are available: the "less-represented languages". These languages often suffer from poor representation on the Web, which is the main collecting source. Very high out-of-vocabulary (OOV) rates and poor language model estimation are common for these languages. In this presentation, I will briefly describe the difficulties posed by building new ASR systems with little data. Then I will present our attempt to improve performance by using sub-word units in the recognition lexicon. We enhanced a data-driven word decompounding algorithm in order to address the problem of increased phonetic confusability arising from word decompounding. Experiments carried out on two distinct languages, Amharic and Turkish, achieved small but significant improvements, around 5% relative in word error rate, with 30% to 50% relative OOV reductions. The algorithm is relatively language-independent and requires minimal adaptation to be applied to other languages.
Date: 18-Jun-2008    Time: 14:00:00    Location: 336


Negotiation Among Autonomous Computational Agents

Fernando Lopes


Abstract—Automated negotiation systems with software agents representing individuals or organizations and capable of reaching mutually beneficial agreements are becoming increasingly important and pervasive. Examples, to mention a few, include the business trend toward agent-based supply chain management, the industrial trend toward virtual enterprises, and the pivotal role that electronic commerce is increasingly assuming in many organizations.
Date: 30-May-2008    Time: 15:00:00    Location: 336


The CHAMELEON RF Project - a quick digest

L. Miguel Silveira


Abstract—CHAMELEON RF stands for Comprehensive High-Accuracy Modelling of Electromagnetic Effects in Complete Nanoscale RF blocks. CHAMELEON RF was a Specific Targeted Research Project (STREP) funded under Framework Programme (FP) 6 in the area of Information Society Technologies (IST). The project ran for 30 months and involved several European partners (Philips Research Eindhoven, later NXP, NL; Austriamicrosystems, AT; MAGWEL, BE; Interuniversity Micro Electronics Centre (IMEC), BE; INESC-ID, PT; Polytechnic University of Bucharest, RO; and Delft University of Technology, NL). The aim of the CHAMELEON RF project was to develop methodologies and prototype tools for a comprehensive and highly accurate analysis of complete next-generation nanoscale functional IC blocks that will operate at RF frequencies of up to 60 GHz. In this talk I will briefly review the project organization and its main goals, discuss some of the achievements, and provide an overall view of the results achieved.
Date: 30-May-2008    Time: 14:00:00    Location: 336


Around the genome: particularities of the circular mitochondrial DNA

Luísa Pereira

IPATIMUP - Instituto de Patologia e Imunologia Molecular da Universidade do Porto

Abstract—Mitochondrial DNA (mtDNA) studies have always been instrumental in phylogenetic inference, namely in the field of human population genetics, allowing the root of the human tree to be placed in East Africa, about 200,000 years ago. The success is such that sequencing efforts have already led to the publication of more than 4,300 complete human mtDNAs (each ~16,600bp) and more than 7,600 complete eukaryote mtDNA genomes on GenBank. But while some of its characteristics, such as the absence of recombination, make analyses easier and more straightforward, others demand conceptual modifications. One of these is the circularity of the molecule, which renders common alignment algorithms and informatics tools, developed for linear molecules, unsuitable for the automatic alignment of many circular mtDNA genomes. Furthermore, the high heterogeneity in mutation rates between the two main regions of mtDNA (the control and coding regions), and even between positions within regions (namely between first/second and third codon positions), questions the validity of attributing the same substitution weights and the same gap penalties all over the molecule. The development and implementation of efficient cyclic algorithms is now crucial to retrieve the most information from the huge amount of sequencing data being accumulated for mtDNA.
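For intuition only, here is a minimal sketch of the cyclic-comparison idea on toy sequences: find the rotation of a circular sequence that best matches a linear reference. This uses equal lengths and Hamming distance; the efficient cyclic algorithms the talk calls for must also handle indels and position-dependent weights.

```python
def best_rotation(circ, ref):
    """Brute-force cyclic comparison: rotation of the circular sequence
    `circ` minimizing Hamming distance to the linear reference `ref`.
    Assumes equal lengths, substitutions only (no indels)."""
    n = len(circ)
    doubled = circ + circ  # every rotation is a window of the doubled string
    dist = lambda i: sum(a != b for a, b in zip(doubled[i:i + n], ref))
    best = min(range(n), key=dist)
    return best, dist(best)

print(best_rotation("GATTACA", "TACAGAT"))  # rotation 3 matches exactly: (3, 0)
```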
Date: 29-May-2008    Time: 16:30:00    Location: 336


Multiresolution by Reversing Subdivision

Faramarz Famil Samavati

University of Calgary

Abstract—Multiresolution representations (MR) for curves and surfaces have been extensively used in many areas of computer graphics. Conventional MR is usually defined based on wavelets. However, in this lecture, I present a simple reverse subdivision approach to MR, and derive multiresolution filters for quadratic and cubic B-spline curves. I also show how to use and extend these filters for tensor-product surfaces and 2D/3D images. Finally, some example applications in model synthesis and contextual void patching for digital elevation models are discussed. <p> <b>About the speaker:</b> Professor Faramarz Samavati works on various aspects of computer graphics. In general terms, his research areas are geometric modeling, sketch-based modeling, visualization and non-photorealistic rendering. More specifically, his research topics include surface modeling, volume modeling, subdivision surfaces, flexible projection, least squares, NURBS, multiresolution and wavelets. Faramarz Samavati currently supervises a group of very good graduate students. He also collaborates with several other researchers (Richard Bartels, Przemyslaw Prusinkiewicz, Mario Costa Sousa, Brian Wyvill, Marina Gavrilova, Sheelagh Carpendale and Joaquim Jorge). He has over 50 technical papers in peer-reviewed journals and conferences. He is a member of ACM and EG. Currently he is on the program committees of IMMERSCOM 2007, ICIAR 2007, SBIM 2007 and SMI 2008.
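As a hedged illustration of the kind of refinement these multiresolution filters invert, here is one step of Chaikin subdivision for quadratic B-spline curves (the coarse polyline is invented; the talk's reverse-subdivision filters recover coarse points from fine ones, which this forward sketch does not implement):

```python
def chaikin(points):
    """One Chaikin (quadratic B-spline) subdivision step on an open 2D
    polyline: each edge contributes two new points at 1/4 and 3/4."""
    fine = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        fine.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
        fine.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
    return fine

coarse = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(chaikin(coarse))  # four points cutting the two corners
```

Repeating the step converges to the quadratic B-spline curve; reverse subdivision solves the corresponding least-squares problem to go the other way.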
Date: 15-May-2008    Time: 09:30:00    Location: Sala de V/C IST da Alameda e TagusPark


Fully Compressed Suffix Trees

Luís M. S. Russo

Faculdade de Ciências e Tecnologia da UNL

Abstract—Suffix trees are by far the most important data structure in stringology, with myriad applications in fields like bioinformatics and information retrieval. Classical representations of suffix trees require O(n log n) bits of space for a string of size n. This is considerably more than the n log_2 sigma bits needed for the string itself, where sigma is the alphabet size. The size of suffix trees has been a barrier to their wider adoption in practice. Recent compressed suffix tree representations require just the space of the compressed string plus Theta(n) extra bits. This is already spectacular, but still unsatisfactory when sigma is small, as in DNA sequences. In this talk we introduce the first compressed suffix tree representation that breaks this linear-space barrier. Our representation requires sublinear extra space and supports a large set of navigational operations in logarithmic time. An essential ingredient of our representation is the lowest common ancestor (LCA) query. We reveal important connections between LCA queries and suffix tree navigation.
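To make the LCA ingredient concrete, here is the classical reduction of LCA to a range-minimum query over the depths of an Euler tour, shown on a toy tree with a naive linear scan (compressed suffix trees answer the same query in compact space with sublogarithmic structures; the tree here is invented):

```python
def euler_tour(tree, root):
    """DFS Euler tour: node sequence, depth sequence, first occurrence."""
    tour, depth, first = [], [], {}
    def dfs(v, d):
        first.setdefault(v, len(tour))
        tour.append(v); depth.append(d)
        for c in tree.get(v, []):
            dfs(c, d + 1)
            tour.append(v); depth.append(d)  # return to v after each child
    dfs(root, 0)
    return tour, depth, first

def lca(tour, depth, first, u, v):
    """LCA(u, v) = node of minimum depth between their first occurrences."""
    i, j = sorted((first[u], first[v]))
    k = min(range(i, j + 1), key=depth.__getitem__)
    return tour[k]

tree = {"r": ["a", "b"], "a": ["x", "y"]}
t, d, f = euler_tour(tree, "r")
print(lca(t, d, f, "x", "y"))  # a
print(lca(t, d, f, "x", "b"))  # r
```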
Date: 08-May-2008    Time: 17:00:00    Location: 04


Low Power Microarchitecture with Instruction Reuse

Frederico Pratas


Abstract—Power consumption has become a very important metric and a challenging research topic in the design of microprocessors in recent years. This work improves the power efficiency of superscalar processors through instruction reuse at the execution stage. A new method for reusing the instructions that compose small loops is proposed: instructions are first buffered in the Reorder Buffer and reused afterwards, without the need to dynamically unroll the loop as traditional instruction reuse techniques commonly do. In order to evaluate the proposed method we modified the sim-outorder tool from SimpleScalar and the Wattch power-performance simulator. Several different configurations and benchmarks were used during the simulations. The obtained results show that by implementing this new method in a superscalar microarchitecture, power efficiency can be improved without significantly affecting either performance or cost.
Date: 29-Apr-2008    Time: 11:00:00    Location: 336


Visual style representations for illustrative visualization

Mário Costa Sousa

University of Calgary

Abstract—I will focus on visual style representations for illustrative visualization. As different rendering styles are an effective means of accentuating features and directing the viewer's attention, an interactive illustrative visualization system needs to provide an easy-to-use yet powerful interface for changing these styles. The lecture will review existing approaches to stylized rendering and discuss practical considerations in the choice of an appropriate representation for visual styles. <p> I will also review the state of the art in sketch-based interfaces and modeling (SBIM) for scientific visualization, including different aspects and inspiration factors brought from traditional medical/scientific illustration principles, methods and practices. I will describe unique techniques and problems, including a presentation of systems, algorithms and implementation techniques focusing on interactive SBIM for illustrative botanical modeling and volume graphics. <p> <b>About the Speaker:</b><br> Mário Costa Sousa is an Associate Professor of Computer Science at the University of Calgary and a member of the Computer Graphics Lab at the University of Calgary. He holds an M.Sc. (PUC-Rio, Brazil) and a Ph.D. (University of Alberta), both in Computer Science. His current focus is on research and development of techniques to capture the enhancement and expressive capability of traditional illustrations. This work involves research on illustrative scientific visualization, non-photorealistic rendering, sketch-based interfaces and modeling, visual perception, volume rendering, interactive simulations and real-time rendering. Dr. Sousa has been very active in the graphics community, teaching courses, presenting papers and serving on many conference program committees.
Sousa has active collaborations with illustrative visualization research groups, medical centers, and scientific institutes and with illustrators/studios affiliated with the Association of Medical Illustrators and the Guild of Natural Science Illustrators.
Date: 29-Apr-2008    Time: 09:00:00    Location: Sala de V/C IST da Alameda e TagusPark


Improving mesh-style clock distribution architectures

Gustavo Wilke

Universidade Federal do Rio Grande do Sul

Abstract—Clock network design is a crucial task in the design of high-performance integrated circuits. Besides meeting ever more demanding performance requirements, designers must also keep the power consumption of the clock network under control. This talk will discuss how mesh-style clock distributions can be improved to reduce their power consumption and increase clock synchronism (i.e. reduce clock skew). A new clock-mesh buffer design is proposed to reduce the short-circuit power consumption between the different buffers connected to the mesh. To further improve the synchronism and power consumption of clock meshes, a statistical clock-buffer sizing algorithm is also proposed. Experimental results show that the new clock buffer design can reduce both power consumption and clock skew by more than 50%.<p> <b>Gustavo Wilke</b> is a PhD student at the Federal University of Rio Grande do Sul, Brazil. He began working as a research assistant in the microelectronics laboratory in 2001. In 2004 he did a 6-month internship at Fujitsu Laboratories of America, Inc., and in the same year completed his undergraduate degree in computer engineering. In 2005 he started his PhD in microelectronics. From July 2006 to June 2007 he did a second internship at Fujitsu Laboratories of America, Inc. His research topic is high-performance clock distribution.
Date: 17-Apr-2008    Time: 11:00:00    Location: 336


Mapper: An Efficient Data Transformation Operator

Paulo Jorge Fernandes Carreira

Faculdade de Ciências da Universidade de Lisboa

Abstract—Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require the transformation of input tuples into output tuples. Traditional approaches to implementing these data transformations include solutions such as Persistent Stored Modules (PSM) executed by an RDBMS, or transformation code developed with a commercial ETL tool. Neither of these is easily maintainable or optimizable. A third approach consists of combining SQL queries with external code written in a programming language. However, this solution is not expressive enough to specify an important class of data transformations: those that produce several output tuples for a single input tuple. In my PhD thesis, I propose the data mapper operator as an extension to relational algebra to address this class of data transformations. Furthermore, the thesis discusses a set of algebraic rewriting rules for optimizing expressions that combine standard relational operators with mappers. Experimental results confirm the benefits brought by some of the proposed semantic optimizations.
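A minimal sketch of the mapper's semantics in Python (the relation and mapper function below are invented for illustration): the operator applies a tuple-level function that may emit zero, one, or several output tuples per input tuple, which a plain relational projection cannot express.

```python
def mapper(relation, fn):
    """Relational mapper: apply fn to each input tuple and emit every
    output tuple it produces (possibly none, possibly several)."""
    for t in relation:
        yield from fn(t)

# Example: normalize a legacy record holding two phone fields per row
# into one (name, phone) tuple per non-empty number.
legacy = [("ana", "217000000", "912000000"), ("rui", "218000001", None)]

def split_phones(t):
    name, *phones = t
    return [(name, p) for p in phones if p]

print(list(mapper(legacy, split_phones)))
# [('ana', '217000000'), ('ana', '912000000'), ('rui', '218000001')]
```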
Date: 16-Apr-2008    Time: 15:00:00    Location: N7.1


Cross-media knowledge management

Steffen Staab

ISWeb - University of Koblenz

Abstract—Current knowledge management challenges require integrating sources in various media, such as text, images, noise signals, video or raw data. Such a requirement poses new challenges for managing the diverse data, requiring cross-media information extraction as well as knowledge representation capable of treating the uncertainty and provenance of facts, and an ontology that may represent arbitrary connections between media and facts. In the EU project X-media we investigate several cross-media knowledge management scenarios from the engineering sector. We have defined a core ontology of multimedia and new querying capabilities for RDF, extending the semantic web query language SPARQL with capabilities for certainty and provenance management.
Date: 11-Apr-2008    Time: 09:30:00    Location: Anfiteatro PA2 do campus Alameda


Folded and Error Tolerant Architectures for Digital Signal Processing: Design and Implementation

Ivan Milentijević, Vladimir Ćirić

University of Nis

Abstract—The lecture will be organized in three parts. The first part will be devoted to the University of Niš and the Computer Science Department at the Faculty of Electronic Engineering. We will briefly explain the organizational structure of the University, commenting on research and teaching activities at the Computer Science Department. A review of our research in the design and application of folded architectures will be given in the second part. We will present a few design examples emphasizing specific design goals; as the main design example we will present the synthesis of configurable bit-plane processing arrays for FIR filtering. The H.264/AVC application of the configurable folded array, as a low-gate-count deblocking filter implementation, will be presented, along with results of implementing the architecture as a real-time deblocking accelerator for mobile embedded computing platforms. The third part of the presentation addresses our latest results in error tolerant architecture design. We will present a method for trading computational correctness for the additional chip area involved in implementing fault tolerance. The method will be described in a formal way and demonstrated for the BP array. A mathematical path based on transitive closure that generates an error significance map for the BP array will be explained. The design tradeoff will be demonstrated through an FPGA implementation.
Date: 10-Apr-2008    Time: 11:00:00    Location: 336


Dynamic Energy Budget Theory: A General Mathematical Theory in Biology, Empirically Tested for the Major Groups of Organisms

Tiago Domingos

Instituto Superior Técnico

Abstract—Dynamic Energy Budget (DEB) theory, developed by Bas Kooijman at the Department of Theoretical Biology of the Free University of Amsterdam, is the first general biological theory at the organism level since the theory of evolution. It is a mathematical theory, comprising all taxonomic groups, with extensive empirical testing and already several practical applications, namely in toxicology (where its use is recommended by ISO and OECD), environmental engineering and biological engineering. It is based on simple mechanistic rules for the uptake of energy and nutrients and the consequences of these rules for physiological organization along the life cycles of organisms. The broad generality of DEB theory opens very significant research opportunities, in particular the use of data mining and text mining techniques to obtain parameters for the widest possible set of organisms, and its use as a macroscopic theory to establish a framework for systems biology approaches.
Date: 03-Apr-2008    Time: 16:00:00    Location: 336


Speech related research at the Computer Vision Laboratory (ISR)

Giampiero Salvi

Instituto de Sistemas e Robótica

Abstract—In this talk I will present the activity at the Computer Vision Laboratory that may be of interest to the speech community. I will start by briefly introducing myself and my previous work at the Speech, Music and Hearing department, KTH, Stockholm. Then I will touch on a number of topics that are addressed mainly within the CONTACT European project. These include sound localization, modelling speech development, imitation, and word grounding. I will also describe a database, recorded in cooperation with the CONTACT consortium, containing ultrasound and 3D articulograph measurements of the tongue in isolated words.
Date: 28-Mar-2008    Time: 15:00:00    Location: 336


Transistor Level Automatic Generation of Radiation-Hardened Circuits

Cristiano Lazzari


Abstract—Deep submicron (DSM) technologies have increased the challenges in circuit design. Radiation effects are more significant because particles with low energy, which had no effect in older technologies, may cause system failures in DSM circuits. The goal of this talk is to present a methodology for increasing the dependability of circuits exposed to energetic particles. The methodology is based on changing the width of the transistors that belong to critical paths. <p> <b>Cristiano Lazzari</b> works as a researcher in the <a href="">ALGOS group</a> at INESC-ID. He obtained his Ph.D. in December 2007 from the Federal University of Rio Grande do Sul (<a href="">UFRGS</a>), Brazil and from the Institut National Polytechnique de Grenoble (<a href="">INPG</a>), France. The main research areas of his Ph.D. were algorithms and techniques for automatic layout generation and radiation-hardened circuit generation. In 2007, Lazzari worked at <a href="">CEITEC</a>, Brazil as a backend engineer, where he was responsible for the logic and physical synthesis of digital circuits.
Date: 17-Mar-2008    Time: 11:00:00    Location: 336


Knowledge discovery in environmental microbiology and physiology: problems, tools and protocols

Andreas Bohn

Instituto de Tecnologia Química e Biológica

Abstract—The present talk deals with dynamical processes observed at the organismal level in conditions close to real-world environments. The relatively small amount of data and replicates available in such experiments poses specific challenges to the design, deployment and application of integrated computational tools for data management and analysis. These challenges are exemplified by microcosm studies of phototrophic biofilms and in-vivo circadian rhythms of body temperature in mammals. On the basis of these experiences, I will discuss potential alterations to common protocols of interdisciplinary collaboration, which might be useful in enhancing the efficiency of computational tools in knowledge discovery.
Date: 13-Mar-2008    Time: 16:00:00    Location: 336


Unsatisfiability-Based Algorithms for Maximum Satisfiability

João Paulo Marques-Silva

University of Southampton

Abstract—The problem of Maximum Satisfiability (MaxSAT) and some of its variants find an increasing number of practical applications in a wide range of areas. Examples include optimization problems in digital system design and verification, and in bioinformatics. However, most practical instances of MaxSAT are too hard for existing branch and bound algorithms. One recent alternative to branch and bound MaxSAT algorithms is based on unsatisfiable subformula identification. This talk provides an overview of recent algorithms for MaxSAT based on unsatisfiable subformula identification.
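For intuition on the problem itself (not the unsatisfiable-subformula algorithm, which instead makes repeated SAT solver calls and relaxes one core per iteration), here is a brute-force MaxSAT evaluator over a tiny invented CNF:

```python
from itertools import product

def max_sat(clauses, n_vars):
    """Brute-force MaxSAT: maximum number of simultaneously satisfiable
    clauses. Clauses are lists of nonzero ints: positive literal i means
    variable i is true, negative means false. Exponential; demo only."""
    best = 0
    for bits in product([False, True], repeat=n_vars):
        sat = sum(any((lit > 0) == bits[abs(lit) - 1] for lit in cl)
                  for cl in clauses)
        best = max(best, sat)
    return best

# (x1) and (not x1) form an unsatisfiable core; (x1 or x2) is easy.
print(max_sat([[1], [-1], [1, 2]], 2))  # 2: one clause of the core must fail
```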
Date: 07-Mar-2008    Time: 11:00:00    Location: 336


The Development of Microelectronics in Brazil

João Baptista Martins

Universidade Federal de Santa Maria

Abstract—This talk presents the current state of microelectronics in Brazil and its advances and prospects for the medium and long term: government support and incentives for the area, the R&D groups, and the creation of Design Houses, in particular CEITEC (Centro de Excelência em Tecnologia Electrónica Avançada).
Date: 03-Mar-2008    Time: 14:30:00    Location: IST (Alameda) Sala VA-1


Portuguese phonological system used in Beira Interior

Sara Candeias

Instituto de Telecomunicações, Department of Electrical and Computer Engineering, University of Coimbra

Abstract—This study proposes a model for a phonological description of the speech patterns attested in the variety of Portuguese spoken in Fundão - Beira Interior. The research is based on analytic work within functionalist theory, the perception of phonetic features, and data derived from statistical analyses. A phoneme database comprising 142,020 examples was built for this purpose, the realizations of which are described and analysed according to their syllabic context. The phonemic database was constructed in order to establish the pertinent feature set in the referred variety. This set regulates the dynamic nature of linguistic subsystems, taking into account both the variety of realizations and the optimization of uses. The description of these uses is based on statistical analyses, which are presented in relative and absolute values. It is suggested that these phonological phenomena maps may have their correlates in the Verb and Personal Pronoun syntactic-semantic categories.
Date: 29-Feb-2008    Time: 15:00:00    Location: 336


What are functional modules in biological networks?

José Pereira-Leal

Instituto Gulbenkian de Ciência

Abstract—Modularity has in recent years become a widely accepted feature of biological networks. However, it seems to mean different things in different networks, and even within the same type of network. This poses a challenge to the development of methods to partition networks into functionally meaningful entities. In my talk I will discuss modularity in the context of protein interaction networks, from method development to evolutionary studies.
Date: 28-Feb-2008    Time: 16:00:00    Location: 336


Embedded Cryptography: flexibility and security through reconfigurability

Daniel Mesquita


Abstract—This work addresses the security aspects of embedded systems in different scenarios. Mobile communications, secure identification cards, credit cards and in-vehicle communications are application fields where data security is a major issue. In this context, the main tool to provide data integrity, confidentiality, authentication and non-repudiation remains cryptography. Nevertheless, embedded cryptographic systems may leak side-channel information, enabling certain attacks. This presentation aims to discuss side-channel attacks and countermeasures, introducing our work in progress on reconfigurable approaches to embedded cryptography.
Date: 21-Feb-2008    Time: 15:30:00    Location: IST, Taguspark, Anfiteatro A3


Test Generation Based on Finite State Machines

Prof. Adenilso da Silva Simão

Universidade de São Paulo

Abstract—Finite State Machines (FSMs) are formal models that have been used to describe a wide variety of systems, from protocols to hardware components, including software classes and interaction models. In software and hardware testing, FSMs have been widely used for test case generation because they allow the faults that will be uncovered to be precisely quantified. This seminar will present the main concepts of FSM-based testing, starting with the theoretical foundations and illustrating the classical results. It will then survey the current state of the area, with recent results, and present the open problems. Finally, the points of intersection between FSM-based testing and the CNPq/Grices project will be discussed. [BIO: Adenilso da Silva Simão is a professor at the University of São Paulo (USP), Brazil. He has been a member of the Department of Computer Systems at ICMC (Instituto de Ciências Matemáticas e da Computação) of USP in São Carlos since 2004. His main interests are in software testing and formal models.]
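As a hedged illustration of FSM-based test generation (a strong simplification of the methods surveyed in such seminars; the machine below is invented and assumed deterministic with all states reachable), here is transition coverage: each test drives the machine from the initial state to a transition's source state and then fires that transition.

```python
from collections import deque

def shortest_inputs(fsm, start, target):
    """BFS over states; returns an input sequence driving start -> target.
    fsm maps (state, input) -> next_state."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        s, path = queue.popleft()
        if s == target:
            return path
        for (src, inp), dst in fsm.items():
            if src == s and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [inp]))
    return None  # unreachable (assumed not to happen here)

def transition_tests(fsm, initial):
    """One test per transition: preamble to its source, then the input."""
    return [shortest_inputs(fsm, initial, src) + [inp] for (src, inp) in fsm]

fsm = {("s0", "a"): "s1", ("s0", "b"): "s0", ("s1", "a"): "s0"}
print(transition_tests(fsm, "s0"))  # [['a'], ['b'], ['a', 'a']]
```

Classical methods such as the W-method extend each test with distinguishing sequences to also check the reached state, which is what makes fault quantification possible.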
Date: 21-Feb-2008    Time: 14:30:00    Location: IST, Taguspark, Anfiteatro A3


From Topological Navigation to Power Line Inspection

Alberto Vale

ALBATROZ Engenharia S.A.

Abstract—This presentation compiles several projects in mobile robotics. The navigation story begins with the teleoperation of mobile robots over the Internet, at the end of the last century. That navigation amounted to commands sent by the user, whose perception was based on video images and other sensor data. The next challenge was to make the platform autonomous, for example navigating a mobile robot through mazes. This navigation is divided into two levels: avoiding obstacles and following trajectories, and, at a higher level, discovering and solving the maze. This higher-level navigation was the first step toward a new approach in mobile robotics: topological navigation. The main motivation comes from human beings, who do not make their daily journeys with odometers, pedometers, compasses or GPS receivers, but instead navigate efficiently by salient landmarks in the surrounding scenery. Topological navigation demanded deep research into the processing of the most varied sensor data, with special emphasis on laser scanner data, which came to serve a very real problem: power line inspection, which also poses new challenges for aerial robotics. [BIO: Alberto Vale holds a degree in Electrical and Computer Engineering, Control and Robotics branch, from IST (1999) and a PhD from the same university in Mobile Robotics (2005), supervised by Professor Maria Isabel Ribeiro. He was a researcher at the ISR laboratory (Instituto de Sistemas e Robótica) between 1999 and 2005, and taught in the Algebra and Analysis department at IST from 2000 to 2002. He is a certified trainer in Electrotechnics, Informatics and Educational Technologies, and has been co-founder and head of the R&D team of Albatroz Engenharia S.A. since 2006, with several scientific publications and presentations at national and international level.
He practises various sports, including diving, swimming and canoeing, and enjoys music, photography, drawing and web design.]
Date: 21-Feb-2008    Time: 10:30:00    Location: IST, Taguspark, Anfiteatro A3


Applications of Reconfigurable Computing in Mobile Robot Design

Prof. Eduardo Marques

Universidade de São Paulo

Abstract—This seminar will present applications of reconfigurable computing in mobile robotics. Architectures based on the SoC (System-on-a-Chip) concept are used to accelerate the application. The SoC is implemented on latest-generation FPGA (Field Programmable Gate Array) reprogrammable devices from Altera. The target architecture consists of an Altera NIOS II softcore processor together with several reconfigurable processing units (RPUs) developed specifically for mobile robotics. This methodology allows researchers and designers in mobile robotics to test their algorithms on high-performance systems and thereby explore new solutions for real-time use, an increasingly common requirement in embedded mobile robotics. Validation tests of the resulting system are carried out with a Pioneer 3DX robot. Current work focuses on a hardware/software co-design environment to ease the development of mobile robotics applications on FPGAs. The project started in April 2005 through the CNPq/Grices agreement, involving the University of São Paulo and the University of the Algarve (with INESC-ID/IST currently as the Portuguese partner). [BIO: Eduardo Marques is an Associate Professor at the University of São Paulo (USP), Brazil. He has been a member of ICMC (Instituto de Ciências Matemáticas e da Computação) of USP in São Carlos since 1986. His main interests are in reconfigurable computing applied to mobile robotics.]
Date: 21-Feb-2008    Time: 09:30:00    Location: IST/Taguspark, Anfiteatro A3


Identification of Transcription Factor Binding Sites in Promoter Regions by Modularity Analysis of the Motif Co-Occurrence Graph

A. P. Francisco


Abstract—Many algorithms have been proposed to date for the problem of finding biologically significant motifs in promoter regions. They can be classified into two large families: combinatorial methods and probabilistic methods. Probabilistic methods have been used more extensively, since they require less input from the user, and their output is easier to interpret. Combinatorial methods have the potential to identify hard to detect motifs, but their output is much harder to interpret, since it may consist of hundreds or thousands of motifs. In this work, we propose a method that processes the output of combinatorial motif finders in order to find groups of motifs that represent variations of the same motif, thus reducing the output to a manageable size. This processing is done by building a graph that represents the co-occurrences of motifs, and finding communities in this graph. We show that this innovative approach leads to a method that is as easy to use as a probabilistic motif finder, and as sensitive to low quorum motifs as a combinatorial motif finder. The method was integrated with two combinatorial motif finders, and made available on the Web, integrated in an application that can be used to analyze promoter regions in S. cerevisiae. Experiments performed using this system show that the method is effective in the identification of relevant binding sites.
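A minimal sketch of the grouping step (motif and sequence names are invented, and connected components stand in for the community-detection algorithm actually used): motifs become graph nodes, co-occurrence in the same promoter sequence adds an edge, and graph components group variants of the same motif.

```python
from collections import defaultdict
from itertools import combinations

def motif_groups(occurrences):
    """occurrences: dict mapping sequence -> set of motifs found in it.
    Returns groups of motifs connected through co-occurrence."""
    adj = defaultdict(set)
    for motifs in occurrences.values():
        for m in motifs:
            adj[m]  # ensure isolated motifs still get a node
        for a, b in combinations(sorted(motifs), 2):
            adj[a].add(b)
            adj[b].add(a)
    groups, seen = [], set()
    for m in adj:                      # connected components by DFS
        if m in seen:
            continue
        stack, comp = [m], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            seen.add(v)
            stack.extend(adj[v] - comp)
        groups.append(comp)
    return groups

occ = {"seq1": {"TGACTC", "TGACTA"}, "seq2": {"TGACTC", "TGACTA"}, "seq3": {"CCAAT"}}
print(motif_groups(occ))  # two groups: the TGACT* variants, and CCAAT alone
```

Real community detection (modularity optimization) additionally weighs edge density against a random-graph expectation, so weakly linked groups are still separated.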
Date: 31-Jan-2008    Time: 16:30:00    Location: 336


Aggressive Loop Pipelining for Reconfigurable Architectures

Ricardo Menotti

Universidade Tecnológica Federal do Paraná

Abstract—This seminar addresses the application of new loop pipelining techniques to reconfigurable architectures. First, the characteristics of reconfigurable computing are described and compared with traditional computing approaches. Next, the main loop pipelining techniques in use and the hardware-generation tools found in the literature are described. Finally, a method for performing aggressive loop pipelining is presented, together with the impact of this technique when applied to reconfigurable architectures.
Date: 10-Jan-2008    Time: 17:00:00    Location: IST, Taguspark, A4


Using sequence and expression data to predict microRNA targets in animals

Hélio Pais

University of East Anglia

Abstract—MicroRNAs are short (20-22 nucleotide) non-coding RNAs involved in the post-transcriptional regulation of gene expression. One of the essential requirements for understanding the function of a microRNA is to know the genes it regulates, the so-called targets of the microRNA. It is believed that in plants most target sites present a near-perfect complementarity to the sequence of the microRNA. In animals, however, most target sites present a lower number of complementary nucleotides. It is therefore difficult to design target prediction methods that simultaneously have high specificity and high sensitivity. It has recently been discovered that in animals microRNAs not only suppress translation but also induce the destabilization of mRNA transcripts. This discovery has opened up the possibility of using data from microarray assays, performed on cells where the expression of a microRNA is modified, to predict its targets. In the first part of the talk we will review current sequence-based microRNA target prediction tools. In the second part we will show how microarray data can be used to improve microRNA target prediction.
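As a toy illustration of the sequence-based approach (the 3'UTR below is invented, and real predictors add conservation, site context and free-energy criteria on top), seed matching scans a 3'UTR for sites complementary to the microRNA seed region:

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_sites(mirna, utr):
    """Positions in the 3'UTR matching the reverse complement of the
    microRNA seed (nucleotides 2-7), the core of most sequence-based
    target predictors."""
    seed = mirna[1:7]
    target = "".join(COMPLEMENT[b] for b in reversed(seed))
    return [i for i in range(len(utr) - 5) if utr[i:i + 6] == target]

mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # a let-7 family sequence
utr = "AAACUACCUCAAAA"             # invented 3'UTR with one seed site
print(seed_sites(mirna, utr))      # [4]
```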
Date: 20-Dec-2007    Time: 17:00:00    Location: 336


Location Proteomics Using Machine Learning Techniques

Luis Coelho

Carnegie Mellon University

Abstract—Fluorescent microscopy is a method by which a labeled protein can be imaged inside a cell. Such images can be used to determine the subcellular location of the protein. I will show how machine learning techniques have been used to automate this task. I will present the basic methods used as well as some more recent work.
Date: 18-Dec-2007    Time: 17:00:00    Location: 04


Neuroprosthetic devices: the interplay between electronic and biological systems

Eduardo Fernandez

Universidade de Alicante

Abstract—The interplay between electronic and biological systems is an area of intense interest. The development of neuroprosthetic devices can have a high potential impact on brain research and brain-based industry, and is one of the central problems to be addressed in the coming decades. This talk will review and summarize the most important physiological principles underlying any neuroprosthetic approach, and present a survey of the present state of developments concerning the feasibility of a visual neuroprosthesis, as a means through which a limited but useful visual sense could be restored to profoundly blind people.
Date: 14-Dec-2007    Time: 10:30:00    Location: 336


Broadband radio access network challenges

Prof. Hamid Aghvami

King's College London

Abstract—The talk will first describe three emerging technologies for next-generation broadband radio access networks. It will then discuss the challenges in supporting end-to-end networking. It will also address how to ensure the establishment, maintenance and termination of network edge-to-end QoS and security in broadband radio access networks. As an example, the design of a wireless access network in the context of end-to-end networking will then be given. Finally, it will discuss possible applications and services for future broadband radio access networks.
Date: 10-Dec-2007    Time: 14:30:00    Location: Anfiteatro A2


Speculations on a New Approach to Modeling Biological Systems

Eberhard Voit

Georgia Institute of Technology

Abstract—Computational systems biology complements experimental biology in unique ways that are hoped to reveal insights and a depth of understanding not achievable without systems approaches. A major challenge of systems biology continues to be the determination of parameter values for mathematical models. While some models can be analyzed in symbolic form, these are few and far between, and the lack of parameter values is a true obstacle for most computational analyses of realistic biological phenomena. As a consequence, computational modelers tend to take on a problem only if there is a relatively solid database for parameter estimation. Interestingly, biologists very often have very detailed mental models of the phenomenon they are investigating and are not really interested in absolutely precise numerical results, as long as they can test relevant, semi-quantitative hypotheses. However, neither they nor their modeling colleagues have the means of translating the mental models into numerical mathematical structures that would allow advanced diagnosis and testing. I will speculate in this presentation on a possible way to bridge the cleft between mental and numerical models, using modern methods from Biochemical Systems Theory. The envisioned technique is tentatively called "concept map modeling" and seems quite reasonable, but I do not have proof yet that it will actually work in real-world applications.
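Biochemical Systems Theory, mentioned at the end of the abstract, conventionally casts models as S-systems: each state variable is driven by one power-law production term and one power-law degradation term. A minimal numerical sketch under that standard form; the interface and parameter layout are illustrative assumptions, not taken from the talk:

```python
def s_system_step(x, alpha, g, beta, h, dt=0.01):
    """One explicit Euler step of an S-system:
    dX_i/dt = alpha_i * prod_j X_j**g[i][j] - beta_i * prod_j X_j**h[i][j]."""
    n = len(x)

    def power_law(rate, kin):
        value = rate
        for j in range(n):
            value *= x[j] ** kin[j]
        return value

    return [x[i] + dt * (power_law(alpha[i], g[i]) - power_law(beta[i], h[i]))
            for i in range(n)]
```

For example, a single variable with constant production (alpha=1, g=0) and first-order degradation (beta=1, h=1) sits still at its steady state X=1 and moves toward it from below.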
Date: 29-Nov-2007    Time: 17:00:00    Location: 336


Media Asset Management - The SIC Audiovisual Archive

Ana Fanqueira, José Lopes


Abstract—Digital technologies, notably in television, are closely tied to the change in communication processes at companies in this sector, of which SIC is part. A television archive is no longer used primarily by the television channel itself, but also by the other channels that distribute information to the public, whether online on the Internet or via mobile phone. It is in this context that the digitization of the SIC Archive takes place. When a news story breaks, it must be distributed through the channel most appropriate to each person, the goal of the Impresa group being to reach everyone. It is for this philosophy that the Archive must be prepared, and it is this that a Digital Content Management and Archiving system serves. Text, graphics, and images in different digital formats give users full flexibility in using the archived content in any production and/or distribution environment.
Date: 26-Nov-2007    Time: 11:00:00    Location: Anfiteatro FA1


Mining Queries

Ricardo Baeza-Yates

Yahoo Research, Barcelona

Abstract—User queries in search engines and Websites give valuable information on the interests of people. In addition, clicks after queries relate those interests to actual content. Even queries without clicks or answers imply important missing synonyms or content. In this talk we show several examples of how to use this information to improve the performance of search engines, to recommend better queries, to improve the information scent of the content of a Website, and ultimately to capture knowledge, as Web queries are the largest wisdom of crowds on the Internet.
Date: 06-Nov-2007    Time: 16:00:00    Location: 336


Using and dealing with immense quantities of data

Davi Reis

Google Brasil

Abstract—On Monday, 5 November, at 11:30, in room 0.26 of the Taguspark campus, there will be a talk by engineer Davi Reis, a Google Engineer at Google Brazil's research and development laboratories. The talk will cover some of the technologies currently used by Google and is entitled 'Using and dealing with immense quantities of data'. It will take place after the previously announced presentation by Prof. Alberto Laender.
Date: 05-Nov-2007    Time: 11:30:00    Location: 0.26 Taguspark


A Study of the Profile of Scientific Production in Computer Science

Alberto H. F. Laender

Universidade Federal de Minas Gerais

Abstract—This talk will give a brief description of the Perfil-CC project under way at the Computer Science Department of UFMG, whose goal is to study the profile of scientific production in the field of Computer Science. The study surveys the scientific output of 22 of the most important graduate programs in Computer Science in North America and Europe, and of the 8 most important programs in Brazil, using as data sources DBLP - the Digital Bibliography & Library Project, hosted at the University of Trier, Germany - and Qualis, the journal and conference proceedings classification system of CAPES, the Brazilian Ministry of Education's Coordination for the Improvement of Higher Education Personnel.
Date: 05-Nov-2007    Time: 10:00:00    Location: 0.26, Taguspark


Tackling the Acoustic Front-end for Distant-Talking Automatic Speech Recognition

Walter Kellermann

University of Erlangen-Nuremberg

Abstract—With the ever-growing interest in 'natural' hands-free acoustic human/machine interfaces, the need for corresponding distant-talking automatic speech recognition (ASR) systems increases. Considering interactive TV as a challenging exemplary application scenario, we investigate the structural problems presented by noisy and reverberant multi-source environments with unpredictable interference and acoustic echoes of loudspeaker signals, and discuss current acoustic signal processing techniques to enhance the input to the actual ASR system. Special attention is paid to reverberation, which affects speech recognizers much more than human listeners, and a recently published method incorporating a reverberation model at the feature level of ASR is discussed.
Date: 02-Oct-2007    Time: 15:30:00    Location: 336


Acoustic Signal Processing for Next-Generation Multichannel Human/Machine Interfaces

Walter Kellermann

University of Erlangen-Nuremberg

Abstract—The acoustic interface for future multimedia and communication terminals should be hands-free and as natural as possible, which implies that the user should be free to move and should not need to wear any devices. For digital signal processing this poses major challenges both for signal acquisition and reproduction, which reach far beyond the current state of the technology. For ideal acquisition of an acoustic source signal in noisy and reverberant environments, we need to compensate acoustic echoes, suppress noise and interference, and we would like to dereverberate the desired source signal. On the other hand, for a perfect reproduction of real or virtual acoustic scenes we need to create the desired sound signals at the listener's ears, while at the same time removing undesired reverberation and suppressing local noise. In this talk we will briefly analyze the fundamental problems for signal processing in the framework of MIMO (multiple input - multiple output) systems and discuss current solutions. In accordance with ongoing research, we emphasize nonlinear and multichannel acoustic echo cancellation, as well as microphone array signal processing for beamforming, interference suppression, blind source separation, and source localization.
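One of the building blocks discussed, acoustic echo cancellation, is commonly realized with a normalized LMS (NLMS) adaptive filter that learns the echo path from the loudspeaker signal and subtracts the predicted echo from the microphone signal. A single-channel, linear sketch for intuition only; the talk addresses the much harder nonlinear and multichannel cases, and all names and parameters here are illustrative:

```python
def lms_echo_cancel(far_end, mic, taps=8, mu=0.5):
    """Sketch of an NLMS echo canceller: adapt an FIR estimate of the
    echo path and output the echo-cancelled residual signal."""
    w = [0.0] * taps                  # adaptive filter coefficients
    buf = [0.0] * taps                # most recent far-end samples
    out = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_est = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - echo_est              # residual = echo-cancelled output
        norm = sum(b * b for b in buf) + 1e-9
        # NLMS update: step size normalized by input energy
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

With a stationary echo path the residual decays toward zero as the filter converges.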
Date: 01-Oct-2007    Time: 14:00:00    Location: Room C11, IST


Compressing Web Graphs as Texts

Gonzalo Navarro

Universidade do Chile

Abstract—The need to run different kinds of algorithms over large Web graphs motivates the search for compressed graph representations that can be accessed without decompressing them. At this point there exist a few such compression proposals, some of them very effective in practice. In this talk we introduce a novel approach to graph compression, based on regarding the graph as a text and using existing techniques for text compression/indexing. This permits accessing the graph efficiently without decompressing it, and in addition brings in new functionalities over the compressed graph. Our experimental results show that our technique has the potential to be competitive with the best alternative techniques, yet it is not fully satisfactory. We then introduce a second approach, where we go back to pure compression. By far the best current result is the technique by Boldi and Vigna, which takes advantage of several particular properties of Web graphs. We show that the same properties can be exploited with a different and elegant technique, built on Re-Pair compression, which achieves about the same space but much faster navigation of the graph. Moreover, the technique has the potential to adapt well to secondary memory. Finally, we comment on ongoing work to combine the two approaches. The successful scheme can be enriched with succinct data structures so as to permit further graph traversal operations.
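Re-Pair, the compressor underlying the second approach, repeatedly replaces the most frequent pair of adjacent symbols with a fresh symbol, recording one grammar rule per replacement, until no pair occurs twice. A naive sketch of that core loop (quadratic, unlike the linear-time original; the interface is an illustrative assumption):

```python
from collections import Counter

def repair(seq):
    """Re-Pair grammar compression: returns the reduced sequence and the
    dictionary of rules new_symbol -> (left, right)."""
    seq = list(seq)
    rules = {}
    next_sym = 256  # fresh symbols start above byte values (assumption)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break                    # no pair repeats: done
        rules[next_sym] = pair
        out, i = [], 0
        while i < len(seq):          # left-to-right replacement pass
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_sym += 1
    return seq, rules
```

Decompression expands rules recursively; navigation schemes like the one in the talk operate directly on the compressed form instead.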
Date: 28-Sep-2007    Time: 16:30:00    Location: 336


Low Power Microarchitecture With Instruction Reuse

Frederico Pratas


Abstract—Power consumption has become a very important metric and research topic in microprocessor design in recent years. In this talk we propose a new method that reuses the instructions forming small loops: the loop's instructions are first buffered in the Reorder Buffer and reused afterwards. The proposed method is implemented by introducing two new structures into a typical superscalar microarchitecture. To evaluate the proposed method, it was implemented and its operation simulated with the SimpleScalar tools. Several different configurations and benchmarks have been used, and the final conclusion is that implementing the proposed method in a superscalar microarchitecture improves power efficiency without significantly affecting performance.
Date: 17-Sep-2007    Time: 16:00:00    Location: 336



Bruno Rodrigues de Araújo

Instituto Superior Técnico

Abstract—Implicit surfaces are a popular mathematical model used in Computer Graphics to represent shapes for modeling, animation, scientific simulation, and visualization. Implicit surfaces provide a smooth and compact model requiring few high-level primitives to describe free-form surfaces, making them a suitable alternative for representing 3D data gathered by 3D scanners for reverse engineering, or medical data from MRI or CT scans for scientific visualization. However, they are hard to display: to take advantage of the current graphics pipeline, which relies on triangle rasterization, they need to be converted from their continuous mathematical definition to a piecewise polygonal representation. In this work we survey the techniques for the visualization of implicit surfaces. Starting from the identification of the different types of implicit surfaces used in Computer Graphics, we identify the main classes of algorithms for their visualization and the trade-offs between them. We then focus on polygonization methods, since they are the most popular and the best adapted to today's graphics hardware. Since polygonization is a discretization of implicit surfaces, we present the state of the art on the important issues involved in generating a mesh that approximates a continuous model. These issues concern topological correctness, fidelity to sharp and smooth features, and the visualization or conversion quality of the resulting polygonal approximation. This allows us to classify and compare existing visualization approaches using comparison criteria extracted from the several concerns handled by current research in this area. The analysis of existing techniques enables us to identify the best strategies for high-quality visualization of implicit surfaces and the most adequate solutions to the issues raised by their polygonization.
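A basic step shared by the polygonization methods surveyed is locating where the surface crosses a grid edge whose endpoints lie on opposite sides of the implicit function's zero set. A small sketch of that step by bisection, using a sphere as the implicit surface; the function names are illustrative assumptions, not from the survey:

```python
def sphere(x, y, z, r=1.0):
    """Implicit surface f = 0: points inside give f < 0, outside f > 0."""
    return x * x + y * y + z * z - r * r

def edge_crossing(f, p0, p1, eps=1e-6):
    """Bisect along a grid edge whose endpoints straddle the surface
    (f changes sign) to locate the point where f = 0; the vertex-placement
    step used by polygonizers in the Marching Cubes family."""
    f0 = f(*p0)
    if f0 * f(*p1) > 0:
        return None                  # no sign change: edge misses surface
    a, b = p0, p1                    # invariant: sign(f(a)) == sign(f0)
    while max(abs(ai - bi) for ai, bi in zip(a, b)) > eps:
        mid = tuple((ai + bi) / 2 for ai, bi in zip(a, b))
        if f0 * f(*mid) <= 0:
            b = mid
        else:
            a = mid
    return b
```

Full polygonizers run this per cell edge and connect the resulting vertices into triangles; the survey's concerns (topology, sharp features) arise from how those vertices and connections are chosen.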
Date: 13-Sep-2007    Time: 16:00:00    Location: Auditorio Omega, 9º Andar INESC


Symmetry Breaking Ordering Constraints

Zeynep Kiziltan

Università di Bologna

Abstract—Many problems in business, industry, and academia can be modelled as constraint programs consisting of matrices of decision variables. Such 'matrix models' often have symmetry. In particular, they often have row and column symmetry, as the rows and columns can freely be permuted without affecting the satisfiability of assignments. Row and column symmetries can be very problematic in a systematic search, as they grow super-exponentially and create a significant amount of redundancy in the search space. This talk is an overview of my PhD dissertation and describes some of the first work on dealing with row and column symmetries efficiently and effectively. Row and column symmetry has been recognised by many other researchers as being critical in a wide range of application domains. It is now one of the most active areas of research in symmetry in CSPs. The ordering constraints and the propagators proposed in this dissertation are central to some of the mechanisms proposed for dealing with row and column symmetries. Zeynep Kiziltan is an assistant professor at the Department of Computer Science of the University of Bologna in Italy. She received her PhD degree in 2004 from the University of Uppsala in Sweden, where she was later appointed associate professor. Her PhD thesis won the 2004 best thesis award of the European Coordinating Committee for Artificial Intelligence.
Date: 07-Sep-2007    Time: 11:00:00    Location: 336


Transactional Boosting: A Methodology for Highly-Concurrent Transactional Objects

Maurice Herlihy

Brown University

Abstract—We describe a methodology for transforming a large class of highly-concurrent linearizable objects into highly-concurrent transactional objects. As long as the linearizable implementation satisfies certain regularity properties (informally, that every method has an inverse), we define a simple wrapper for the linearizable implementation that guarantees that concurrent transactions without inherent conflicts can synchronize at the same granularity as the original linearizable implementation. Joint work with Eric Koskinen. Maurice Herlihy is Professor of Computer Science at Brown University. His research centers on practical and theoretical aspects of multiprocessor synchronization, with a focus on wait-free and lock-free synchronization. His 1991 paper "Wait-Free Synchronization" won the 2003 Dijkstra Prize in Distributed Computing, and he shared the 2004 Gödel Prize for his 1999 paper "The Topological Structure of Asynchronous Computation".
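The wrapper idea can be illustrated with a toy boosted counter: each transactional call runs the underlying linearizable operation and logs its inverse, so an aborted transaction rolls back by replaying inverses in reverse order. A Python sketch under those assumptions, not the authors' implementation:

```python
import threading

class BoostedCounter:
    """Toy transactional boosting over a linearizable counter: every
    method logs its inverse so an abort can undo the transaction."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, undo_log):
        with self._lock:                       # underlying linearizable op
            self._value += 1
        undo_log.append(self._decrement_raw)   # log the inverse operation

    def _decrement_raw(self):
        with self._lock:
            self._value -= 1

    def abort(self, undo_log):
        while undo_log:
            undo_log.pop()()                   # apply inverses in reverse order

    def value(self):
        with self._lock:
            return self._value
```

A real system would also add transaction-scoped locking so non-commuting operations of concurrent transactions conflict; commuting ones (e.g. two increments) proceed at the fine granularity of the underlying object, which is the point of the methodology.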
Date: 05-Sep-2007    Time: 17:00:00    Location: Alfa 9th Floor


Survey on Data Network Congestion Prediction using Data Mining Techniques

Luís Ribeiro

Siemens Portugal (Alfragide)

Abstract—Telecommunication providers often face a very complex problem: how to maximize their Return On Investment (ROI) and keep customers happy by consistently providing them with good QoS levels. To keep costs low, operators often accept more customers than their network resources could theoretically accommodate. Normally this is not a problem, because most customers have a very sparse usage pattern. Even so, in some circumstances congestion will occur. There are already some well-known congestion avoidance techniques that minimize (up to a certain level) congestion in the network. The most widely used protocol is TCP. In the network core there are active queue management protocols (the Random Early Detection - RED - family [2, 3]) or simple Drop from Tail (DT), which operate on the queues of the Network Elements (NE). These protocols perform well, but they still have a small problem: there must be at least some packet loss to trigger protocol actions (TCP) or, instead, the protocol action itself causes packet loss (RED or DT), which means QoS degradation. Several approaches have been taken to overcome this problem. One of them is to predict congestion instead of reacting to it. This study provides a snapshot of the current research on congestion prediction in data networks using Data Mining techniques.
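For concreteness, the RED family mentioned above drops incoming packets with a probability that grows with an exponentially weighted average of the queue length. A minimal sketch with illustrative threshold values; the real algorithm adds refinements such as counting packets between drops:

```python
import random

def update_avg(avg, queue_len, w=0.2):
    """Exponentially weighted moving average of the queue length."""
    return (1 - w) * avg + w * queue_len

def red_drop(avg, min_th=5, max_th=15, max_p=0.1):
    """Random Early Detection: decide whether to drop an arriving
    packet based on the average queue length (simplified sketch)."""
    if avg < min_th:
        return False                 # queue short: never drop
    if avg >= max_th:
        return True                  # queue long: always drop
    # in between: drop probability grows linearly from 0 to max_p
    p = max_p * (avg - min_th) / (max_th - min_th)
    return random.random() < p
```

The QoS cost the abstract points out is visible here: the only signal RED can send is a dropped packet, which is exactly what congestion prediction tries to avoid.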
Date: 31-Jul-2007    Time: 14:00:00    Location: 336



Instituto Superior Técnico

Abstract—Discourse information is significant for several natural language processing tasks. From interpretation to generation, benefits arise when processing goes beyond sentence boundaries. In this survey, we address the main computational theories of discourse, explore how the most pertinent issues are solved and which were the most relevant contributions, examine representative discourse processing systems, and review the main evaluation methodologies.
Date: 27-Jul-2007    Time: 14:00:00    Location: INESC - Sala 336



Instituto Superior Técnico

Abstract—Content Management Systems (CMS) are software systems used to publish content in web applications (for example, corporate sites and portals), addressing the need to ease the organization, control, and publication of a large volume of documents or other content. Electronic commerce on the Internet consists of the distribution, purchase, sale, marketing, and supply of products or services over the Internet, and has quickly emerged as one of the main business requirements of organizations. In this context, and considering the different technological approaches to supporting electronic commerce on the Internet, this research work focuses on support through e-commerce management systems and e-commerce platforms that extend content managers. The state of the art of CMS with e-commerce support is analyzed, starting from a reference model, which is then used to analyze and compare the following systems: Commerce Starter Kit, osCommerce, VirtueMart, and CATALOOK.netStore.
Date: 18-Jul-2007    Time: 17:00:00    Location: Sala 336 do INESC-ID



Instituto Superior Técnico

Abstract—The problems related to understanding Argumentation, as well as its role in human reasoning, its formalization, and its applications, have been studied in several fields, namely Philosophy, Logic, and above all AI. The general idea of Argumentation is that an argument is accepted if it can successfully resist its counter-arguments. The beliefs of a rational agent are characterized by the relations between the arguments on which those beliefs are based and the external arguments that contradict them. Argumentation is therefore, in a sense, based on a stability with the outside that makes the proposed arguments accepted. Beyond the study of the concept of Argumentation in general, this work covered the fundamental concepts of argumentation based on classical (propositional) logic, and the development of argumentation based on non-monotonic logic, using Reiter's default logic.
Date: 11-Jul-2007    Time: 11:00:00    Location: IST - Pavilhão de Eng. Cilvil, r/c - sala V001



Instituto Superior Técnico

Abstract—In the 1990s, Mark Weiser presented a vision in which computers would be progressively integrated into the objects of our daily lives until they became ubiquitous and transparent. He called it Ubiquitous Computing. Users would come to interact with several mobile devices embedded in the objects around them. For the various computing systems that users carry with them, or that reside in the environment, to serve them as well as possible, it is essential that these systems can act based on context. This activity is easy for humans, since we make intensive use of context to communicate. From the computers' perspective, context can be obtained directly from sensors (i.e., 'low level') or by applying some kind of transformation or inference to these data (i.e., 'high level'). This document aims to present a broad view of Context-Aware Computing. To do so, it draws on work carried out in this research and development area. We present classic applications that pioneered the approach to the challenges this topic raises. We then explore aspects such as the definition of generic infrastructures for collecting, interpreting, and distributing context information, as well as the topic of privacy and trust. We chose to analyze more than one work for each topic in order to show different approaches to the same problem.
Date: 09-Jul-2007    Time: 17:30:00    Location: sala 336 do INESC ID



Instituto Superior Técnico

Abstract—Software as a Service (SaaS) has the potential to transform the way information technology (IT) departments relate to, and even think about, their role as providers of computing services to the rest of the enterprise. The term started to circulate in 2000/2001, associated with firms such as Citrix Systems. The emergence of SaaS as an effective software-delivery mechanism creates an opportunity for IT departments to change their focus from deploying and supporting applications to managing the services that those applications provide. A successful service-centric IT, in turn, directly produces more value for the business by providing services that draw from both internal and external sources and align closely with business goals.
Date: 09-Jul-2007    Time: 10:00:00    Location: Sala 0.23 - Taguspark



Instituto Superior Técnico

Abstract—This document gives a general presentation of the Information Technology (IT) incident management process of the ITIL framework and its applicability to the pharmaceutical industry, specifically to pharmacovigilance (drug surveillance). Applied to IT, the mission of this process is to restore service operation as quickly as possible, causing the least possible impact on the organization. Applied to pharmacovigilance, the mission is to process reports of adverse drug reactions and to forward the information to teams of specialists from the competent authorities in the area of the reaction, so that they can carry out the most appropriate procedures to resolve the incident (e.g., recalling drugs).
Date: 09-Jul-2007    Time: 09:00:00    Location: Sala 0.23 - Taguspark


Data Format Description Framework – A Descriptive Approach to Data Standardization

Xiaoshu Wang


Abstract—Data standardization is fundamentally prescriptive, because no information system can solve the data integration issue without enforcing certain rules. The question is, therefore, where the rules should be prescribed. Most existing data standards prescribe the rules over the data itself. However, excessive use of such an approach can easily lead to inefficient data representation. An alternative approach enforces the conforming rules over the description of the data. Under such data standardization, data producers are free to choose the representation of their data, but should describe that representation in a standard manner. By developing software libraries that can understand the data description, this descriptive approach gives maximal flexibility in data representation while still ensuring data interoperability.
Date: 05-Jul-2007    Time: 16:00:00    Location: 336


EA, BPM and ORM: Towards Convergence

Instituto Superior Técnico

Abstract—Most organizations are facing three emerging concerns: Operational Risk Management, Business Process Management and Enterprise Architecture. Although these three disciplines are strongly related to each other, their communities tend to reside in different domains inside an organization, with different vocabularies and without communication. The purpose of this article is therefore to demonstrate a possible approach to the convergence of EA, BPM and ORM within an integrated effort, with real value to the organization, using concepts from different theories, ranging from engineering to the social sciences.
Date: 28-Jun-2007    Time: 17:00:00    Location: 9º andar, Edifício INESC, Rua Alves Redol 9, Lisboa


Qualitative Simulation of the Carbon Starvation Response in Escherichia coli

Delphine Ropers

INRIA Rhône-Alpes

Abstract—The adaptation of living organisms to their environment is controlled at the molecular level by large and complex networks of genes, mRNAs, proteins, metabolites, and their mutual interactions. In order to understand the overall behavior of an organism, we must complement molecular biology with the dynamic analysis of cellular interaction networks, by constructing mathematical models derived from experimental data, and using simulation tools to predict the behavior of the system under a variety of conditions. Following this methodology, we have started the analysis of the network of global transcription regulators controlling the adaptation of the bacterium Escherichia coli to environmental stress conditions. Even though E. coli is one of the best studied organisms, it is currently little understood how a stress signal is sensed and propagated throughout the network of global regulators, so as to enable the cell to respond in an adequate way. Using a qualitative method that is able to overcome the current lack of quantitative data on kinetic parameters and molecular concentrations, we have modeled the carbon starvation response network and simulated the response of E. coli cells to carbon deprivation. This has allowed us to identify essential features of the transition between exponential and stationary phase and to make new predictions on the qualitative system behavior following a carbon upshift. The model predictions have been tested experimentally by means of gene reporter systems.
Date: 30-May-2007    Time: 17:30:00    Location: Anfiteatro QA1.1 - Torre de Química/IST


GPLAB - A Genetic Programming Toolbox for MATLAB

Sara Silva

FCT, Universidade de Coimbra

Abstract—Genetic Programming (GP) is the automated learning of computer programs. Basically a search process, it is capable of solving complex problems by evolving populations of computer programs, using Darwinian evolution and Mendelian genetics as inspiration. GPLAB is a Genetic Programming toolbox for MATLAB. Besides most of the traditional functionalities used in GP, it also implements two additional features: (1) a method for automatically adapting the genetic operator probabilities at runtime, allowing the use of the toolbox as a test bench for new genetic operators; (2) several of the best state-of-the-art techniques for controlling the well-known bloat problem, including some that automatically resize the population at runtime to save computational resources. Combining a highly modular and adaptable structure with a concern for the automatic setting of most parameters, GPLAB suits all kinds of users, from the layman who wants to use it as a 'black box' to the advanced researcher who intends to build and test new functionalities. The toolbox and its documentation are freely available for download. The latest version ensures minimal compatibility with Octave.
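The core loop a GP system like GPLAB implements, evolving a population of program trees by fitness-based selection and variation, can be caricatured in a few lines. This sketch is not GPLAB (which is a MATLAB toolbox), omits crossover and the bloat control the abstract highlights, and all names are illustrative:

```python
import random

# Minimal tree-based GP: evolve arithmetic expressions to fit sample data,
# with fitness = sum of absolute errors (lower is better).
OPS = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}
TERMS = ['x', 1.0]

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMS)
    return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree, cases):
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)

def mutate(_tree):
    # headless-chicken mutation: replace the individual with a new subtree
    return random_tree(2)

def evolve(cases, pop_size=60, gens=30):
    pop = [random_tree() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda t: fitness(t, cases))   # truncation selection
        survivors = pop[:pop_size // 2]
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda t: fitness(t, cases))
```

A real GP system replaces the truncation selection and crude mutation with tournament selection, subtree crossover, and (in GPLAB's case) adaptive operator probabilities.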
Date: 24-May-2007    Time: 16:30:00    Location: 336


Affective Embodied Conversational Characters

Catherine Pelachaud

University of Paris 8

Abstract—Catherine Pelachaud has prominent work on Embodied Conversational Agents (ECA). Her talk will focus on several aspects regarding the creation of successful ECA, such as: 1) Nonverbal behavior: facial expression, gesture, gaze; 2) Emotion: model of expressive and emotional behavior; 3) Feedback: model of feedback behavior, of listener; 4) Audio-visual speech: lip movement, coarticulation model.
Date: 24-May-2007    Time: 14:00:00    Location: 2.8 no Taguspark


Techniques for Translating Java Bytecode to C Code

Instituto Superior Técnico

Abstract—FERNANDO MANUEL FERREIRA MIRANDA - Whenever a program written in Java is executed, there is a cost due to the interpretation or Just-in-Time (JIT) compilation of the intermediate code (bytecodes) generated when compiling with the javac tool. The goal of this project is to analyze techniques for translating Java bytecodes to C code. Although C is portable, the goal is not to port the C code, but rather to perform AOT (Ahead-Of-Time) compilation, which means compiling before the program is executed, at the site where the program will run. A study of Java code (source and bytecodes) was carried out, along with the identification of the decompilation techniques needed to translate the intermediate code into the C language. The existing approaches and techniques for translating source code and bytecodes to C were also studied.
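The translation idea can be illustrated on a toy stack bytecode (a stand-in, not real JVM opcodes): the translator simulates the operand stack at compile time and emits one C statement per operation, which is the basic move behind bytecode-to-C AOT compilers. A hypothetical sketch:

```python
def bytecode_to_c(code):
    """Toy AOT translation: map a tiny stack bytecode to C statements by
    simulating the operand stack at translation time, so no runtime stack
    (and no interpreter dispatch loop) is needed in the generated C."""
    stack, lines, tmp = [], [], 0
    for op, *args in code:
        if op == 'push':
            name = f"t{tmp}"; tmp += 1
            lines.append(f"int {name} = {args[0]};")
            stack.append(name)
        elif op == 'add':
            b, a = stack.pop(), stack.pop()
            name = f"t{tmp}"; tmp += 1
            lines.append(f"int {name} = {a} + {b};")
            stack.append(name)
        elif op == 'ret':
            lines.append(f"return {stack.pop()};")
    return "\n".join(lines)
```

Real Java bytecode adds typed values, branches (which require reconciling stack states at join points), objects, and exceptions, which is where the decompilation techniques studied in the project come in.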
Date: 10-May-2007    Time: 10:30:00    Location: sala 0.16 no Tagus Park


Microbial typing methods: databases and quantitative correspondence between typing methods results

João Carriço


Abstract—Typing methods are major tools for the epidemiological characterization of bacterial pathogens, allowing the determination of the clonal relationships between isolates based on their genotypic or phenotypic characteristics. Recent technological advances have resulted in a shift from classical phenotypic typing methods, such as serotyping, biotyping and antibiotic resistance typing, to molecular methods such as restriction fragment length polymorphism (RFLP) analysis, pulsed-field gel electrophoresis (PFGE), and PCR serotyping. With the availability of affordable sequencing, another shift occurred towards sequence-based typing methods such as multilocus sequence typing (MLST) and emm sequence typing. Sequence-based methods have a large appeal since they provide unambiguous data and are intrinsically portable, allowing the creation of databases that, if publicly available through the internet, enable the comparison of local data with that of previous studies in different geographical locations. Ideally, an analysis of each typing method, in terms of discriminatory power, reproducibility, typeability, feasibility, and other characteristics, should be performed to determine which method is appropriate in a given setting. Several molecular epidemiology studies of clinically relevant microorganisms characterize isolates using different typing methods. Frequently these studies focus on a comparison between the assigned types of the different typing methods from a qualitative point of view, i.e., indicating correspondences between the types of the different methods. Although this may be useful for comparing the genetic backgrounds of the particular set of isolates under study, it does not allow a broader view of how the results of the different typing methods are related.
In this seminar we present recent work on an online database for a new sequence-based typing method for Staphylococcus aureus, and an online tool that implements a framework of measures allowing the quantitative assessment of the congruence between the results of different typing methods.
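The quantitative congruence between two typing methods can be illustrated with a minimal sketch of one commonly used measure, the Wallace coefficient; the `mlst` and `pfge` dictionaries below are hypothetical example data, and this is not the tool's actual implementation:

```python
from itertools import combinations

def wallace(partition_a, partition_b):
    """Wallace coefficient W(A->B): the probability that a pair of isolates
    assigned the same type by method A is also assigned the same type by
    method B. Each partition maps an isolate id to its type label."""
    same_a = 0      # pairs typed together by A
    same_both = 0   # pairs typed together by both A and B
    for x, y in combinations(partition_a, 2):
        if partition_a[x] == partition_a[y]:
            same_a += 1
            if partition_b[x] == partition_b[y]:
                same_both += 1
    return same_both / same_a if same_a else float("nan")

# Two hypothetical typing results for five isolates
mlst = {"i1": "ST1", "i2": "ST1", "i3": "ST2", "i4": "ST2", "i5": "ST3"}
pfge = {"i1": "A", "i2": "A", "i3": "A", "i4": "B", "i5": "C"}
print(wallace(mlst, pfge))  # -> 0.5: of the 2 pairs MLST types together, PFGE agrees on 1
```

Note that the coefficient is directional: W(MLST->PFGE) and W(PFGE->MLST) generally differ, which is exactly what makes it informative about how one method's types predict another's.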
Date: 12-Apr-2007    Time: 16:30:00    Location: 336


Rationality and Fault-Tolerance

Jean-Philippe Martin

Microsoft Research

Abstract—When there is no central administrator to control the actions of nodes in a distributed system, users may deviate for personal gain. How, then, can we design protocols that give any useful guarantee? In this talk I present research done at the University of Texas at Austin. I show how the BAR model accurately describes these environments and, through an example, show how to apply this model to build protocols that provide guarantees despite rational and Byzantine nodes.
Date: 26-Mar-2007    Time: 09:30:00    Location: Room 918 (Auditório Alfa, INESC-ID, Rua Alves Redol)


Birrell's Distributed Reference Listing Revisited

Richard Elliot Jones

University of Kent

Abstract—The Java RMI collector is arguably the most widely used distributed garbage collector. Its distributed reference listing algorithm was introduced by Birrell in the context of Network Objects, where the description was informal and heavily biased toward implementation. In this paper, we formalise this algorithm in an implementation-independent manner, which allows us to clarify weaknesses of the initial presentation. In particular, we discover cases critical to the correctness of the algorithm that are not accounted for by Birrell. We use our formalisation to derive an invariant-based proof of correctness of the algorithm that avoids notoriously difficult temporal reasoning. Furthermore, we offer a novel graphical representation of the state transition diagram, which we use to provide intuitive explanations of the algorithm and to investigate its tolerance to faults in a systematic manner. Finally, we examine how the algorithm may be optimised, either by placing constraints on message channels or by tightening the coupling between application program and distributed garbage collector. Reference: Luc Moreau, Peter Dickman, and Richard Jones, "Birrell's Distributed Reference Listing Revisited", ACM Transactions on Programming Languages and Systems (TOPLAS), 27(6):1-52, November 2005.
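The core idea of reference listing, as opposed to plain reference counting, can be conveyed in a few lines: the server records *which* clients hold a reference, not merely how many, which makes duplicated messages harmless. This is a toy illustration only; the actual algorithm also handles in-flight messages, lost messages, and client failures:

```python
class RemoteObject:
    """Toy model of Birrell-style reference listing: per object, the server
    keeps the set of client names holding a reference (not just a count)."""
    def __init__(self, name):
        self.name = name
        self.holders = set()

    def dirty(self, client):       # client acquires a reference
        self.holders.add(client)   # idempotent: a resent dirty call is harmless

    def clean(self, client):       # client drops its reference
        self.holders.discard(client)

    def collectable(self):
        return not self.holders

obj = RemoteObject("cache")
obj.dirty("clientA"); obj.dirty("clientB")
obj.dirty("clientA")               # duplicate dirty message: no double-counting
obj.clean("clientA")
assert not obj.collectable()       # clientB still holds a reference
obj.clean("clientB")
assert obj.collectable()           # holder set empty: safe to reclaim
```

With a plain counter, the duplicated `dirty` call above would leak the object forever; the set representation absorbs it, which is the robustness property the formalisation in the talk makes precise.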
Date: 19-Mar-2007    Time: 14:00:00    Location: Auditório Alfa, Sala 918, INESC-ID


Enriching Speech Recognition by Recovering Punctuation and Performing Capitalization

Fernando Batista


Abstract—This presentation describes my work on inserting punctuation marks into, and capitalizing, the output of an Automatic Speech Recognition (ASR) system. The output of an ASR system is often raw text, usually in lower-case format, without any punctuation marks. This work aims to provide transcriptions that are more usable both for humans and for machines. Different experiments were performed: using transducers; the SRILM toolkit; and maximum entropy models. The presentation will describe the advantages and major difficulties of applying each of these methodologies. Results of experiments conducted both on written newspaper corpora and on speech output will be presented. As this work is not yet concluded, I will also outline future work on this matter.
Date: 09-Mar-2007    Time: 14:30:00    Location: 336


BiGGEsTS - Biclustering Gene Expression Time-Series

Joana Gonçalves

Universidade da Beira Interior

Abstract—The BiGGEsTS tool (Biclustering Gene Expression Time-Series) aims to integrate biclustering algorithms for the analysis of gene expression time series. These algorithms address the biclustering problem on gene expression time-series data directly, that is, they identify biclusters formed by a set of genes with coherent expression over a contiguous subset of the time points under analysis. The identified biclusters can then be visualized and studied within the application, along several dimensions of analysis, in order to single out those that are relevant from a biological point of view and can later help in the identification of regulatory modules. Although tools already exist that integrate biclustering algorithms for gene expression data in general, the development of a tool for the specific case of time series is novel, given the particular nature of the integrated biclustering algorithms and of the results obtained. This seminar will present the current version of the tool and discuss directions for future work.
Date: 01-Mar-2007    Time: 16:00:00    Location: 336


Variation-Aware Timing Analysis

Luis Guerra e Silva


Abstract—With IC technology steadily progressing into nanometer dimensions, precise control over all aspects of the fabrication process becomes an area of increasing concern. The impact of process parameter variations on circuit performance is becoming quite significant, particularly where timing is concerned. In this new context, traditional timing verification methodologies are starting to fail. Improving this situation requires tools that are better suited to handling realistic process variations and the complex inter-relations that exist between those variations. In this talk we propose a variation-aware methodology for timing analysis of ICs. We present new delay modeling techniques, where cell and interconnect delays are modeled by affine functions of the process parameters, rather than by fixed numeric values. Additionally, we present a variation-aware timing analysis methodology that, using parametric delay models, extends traditional corner-based signoff techniques. Both contributions can be easily integrated into existing timing engines, producing insightful information for effectively guiding manual or automated circuit optimization in a variation-aware fashion.
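The affine delay modelling mentioned above can be sketched as follows; this is a minimal illustration assuming each process parameter is normalized to [-1, 1], with a hypothetical class and numbers, not the talk's actual tool:

```python
class AffineDelay:
    """Delay as an affine function of process parameters:
    d = nominal + sum(coeffs[i] * p[i]), with each parameter p[i] in [-1, 1]."""
    def __init__(self, nominal, coeffs):
        self.nominal = nominal
        self.coeffs = coeffs            # sensitivity to each process parameter

    def __add__(self, other):           # delays in series add term by term
        return AffineDelay(self.nominal + other.nominal,
                           [a + b for a, b in zip(self.coeffs, other.coeffs)])

    def worst_case(self):               # classical corner: each parameter at its worst
        return self.nominal + sum(abs(c) for c in self.coeffs)

cell = AffineDelay(10.0, [1.0, -0.5])   # sensitive to two parameters
wire = AffineDelay(2.0, [0.25, 0.5])
path = cell + wire
assert path.nominal == 12.0
assert path.coeffs == [1.25, 0.0]       # opposite sensitivities cancel along the path
assert path.worst_case() == 13.25
```

The cancellation in the second coefficient illustrates why parametric models can be less pessimistic than summing per-stage worst cases: `cell.worst_case() + wire.worst_case()` would give 14.25 for this path.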
Date: 01-Mar-2007    Time: 14:00:00    Location: 336


A multi-microphone approach to speech processing in a smart-room environment

Alberto Abad Gareta


Abstract—Recent advances in computer technology, speech and language processing, and image processing have made new modes of person-machine communication and computer assistance to human activities feasible. In particular, interest in developing challenging applications for indoor environments equipped with multiple sensors, also known as smart-rooms, has grown considerably in recent years. The UPC has been participating in the EU-funded CHIL project (Computers in the Human Interaction Loop), whose main aim was to develop intelligent services capable of assisting and complementing human activities while requiring minimal awareness from the users. Consequently, there was a need for perceptual user interfaces that are multimodal, robust, and based on unobtrusive sensors. My most recent work relates precisely to the acoustic research activities carried out at the UPC in the context of the CHIL project. In particular, I have been investigating multi-microphone approaches to speech processing as a possible solution to the problems that arise when deploying hands-free speech applications in real room environments. First, I will describe some of the work carried out on ASR with microphone arrays. Then, I will briefly comment on my work on speaker tracking and head-orientation estimation.
Date: 23-Feb-2007    Time: 14:30:00    Location: 336


A conceptual model for continuous organizational auditing with real-time analysis

Carlos Alberto Lourenço dos Santos

Instituto Superior Técnico

Abstract—The growing importance of organizational information systems in the operation of organizations, in particular in their capacity for continuous adaptation to new challenges with real-time response, demands that particular attention be paid to their evaluation and validation, on the assumption that an evaluated and validated information system gives stakeholders assurance that the organization deserves their trust. Evaluating and validating is an auditing process which, to be conducted according to good practice, must verify and assess whether the specific control objectives associated with the various control points are achieved with reasonable certainty. This thesis therefore proposes a conceptual model for continuous organizational auditing with real-time analysis, resting on five pillars: use of organizational engineering theory for business modelling, using the CEO framework and the UML modelling language; use of internal control theory in conformity with the "Enterprise Risk Management - Integrated Framework" published by COSO (Committee Of Sponsoring Organizations of the Treadway Commission); micro-level analysis of business processes, at the level of organizational transactions, identifying and assessing the risk associated with them; matching of control mechanisms against business processes; and use of the SPIN model checker for formal validation of business-process models with embedded control, as a guarantee of their consistency. The main scientific contributions of this thesis are: a consistent and coherent design of an internal control system; formal verification of business processes with embedded control against the specific control objectives; and a conceptual model to support continuous organizational auditing with real-time analysis.
Keywords: organizational information system; organizational engineering; organizational transaction; internal control system; formal verification; organizational auditing.
Date: 22-Feb-2007    Time: 10:00:00    Location: Anfiteatro do Complexo do IST


Ab Initio Protein Structure Prediction using Conformational Search and Information from Known Protein Structures

Miguel Bugalho


Abstract—Most protein folding methods use information from known proteins to predict protein structure. Homology and fold-recognition methods use this information directly and can obtain good results if a sufficiently similar protein with known structure is found. However, if no such protein is available, or for large unmatched regions, ab initio methods can be of great help (especially for small proteins). Our method uses a fragment library and a search technique to create candidate structures, from which a high-scoring set can then be analysed. The search alternates between testing for possible fragments and stochastically choosing one of them using a score based on current and previous search information. Backtracking is performed if no fragments are available. When a structure is completed, a score is calculated using contact and buried-state frequencies derived from known proteins. The score information is saved for use in subsequent structure searches, and a new point in the search tree is stochastically chosen to construct a new structure. The algorithm chooses points in previously constructed structures that had lower scores, trying to improve those structures.
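The search strategy described (stochastic, score-weighted fragment selection with backtracking) can be roughly sketched as follows; the fragment library, compatibility rule and scoring here are toy stand-ins, not the seminar's actual method:

```python
import random

def assemble(n, candidates, score, tries=50, max_steps=500, seed=1):
    """Toy stochastic fragment assembly with backtracking. candidates[i]
    lists the fragments usable at position i; score(partial, f) > 0 means
    f is compatible with the partial structure, larger meaning preferred."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(tries):
        partial, steps = [], 0
        while len(partial) < n and steps < max_steps:
            steps += 1
            ok = [f for f in candidates[len(partial)] if score(partial, f) > 0]
            if not ok:                       # dead end: backtrack one fragment
                if not partial:
                    break
                partial.pop()
                continue
            weights = [score(partial, f) for f in ok]
            partial.append(rng.choices(ok, weights)[0])   # stochastic choice
        if len(partial) == n:                # completed structure: keep the best
            total = sum(score(partial[:i], partial[i]) for i in range(n))
            if total > best_score:
                best, best_score = partial, total
    return best

# Toy library: adjacent fragments must differ by at most 1; larger scores better.
candidates = [[1, 2], [1, 2, 3], [2, 3]]
def score(partial, f):
    return 0 if partial and abs(partial[-1] - f) > 1 else f

print(assemble(3, candidates, score))
```

Each completed structure's score would, in the real method, feed back into the weights of later searches; here the repeated tries with weighted sampling merely hint at that restart-and-improve loop.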
Date: 15-Feb-2007    Time: 16:00:00    Location: 425



Luís Oliveira


Abstract—Nowadays, the demand for mobile and portable equipment has led to a large increase in wireless communication applications. Modern transceiver architectures, to achieve full integration and low cost, require quadrature oscillators with a very accurate quadrature relationship, since quadrature errors strongly affect the overall performance of the RF (Radio Frequency) front-end. Relaxation oscillators are known for their poor phase-noise performance. Here we show that, by cross-coupling two oscillators, the phase-noise performance is improved. Moreover, strong coupling reduces the effect of mismatches and other disturbances: these are attenuated by the feedback, becoming second-order effects, and this guarantees very accurate quadrature. LC oscillators are known for their good phase-noise performance when compared with relaxation oscillators. In a cross-coupled LC oscillator, coupling is necessary for accurate quadrature in the presence of mismatches; however, this degrades the oscillator phase-noise, due to the degradation of the quality factor.
Date: 15-Feb-2007    Time: 14:00:00    Location: Room 425, 4th floor, INESC-ID


CARMA - A Reconfigurable MPSoC Architecture for Cryptographic Applications

Daniel Mesquita


Abstract—The CARMA architecture was designed to provide cryptographic resources with high performance and security against side-channel attacks. Cryptographic circuits leak information such as power consumption, computation time and electromagnetic emissions, among others. This side information can be exploited to attack cryptographic systems (side-channel attacks). The CARMA architecture combines dynamic reconfiguration techniques with arithmetic based on the RNS representation to thwart power-analysis attacks. Prototyping results show the security gain brought by this technique when used together with CARMA. The architecture can also be framed within the ARTEMIS platform, since it is generic enough to run other kinds of applications, such as data compression, error detection/correction (fault tolerance) and image processing.
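The RNS (Residue Number System) arithmetic that CARMA relies on can be illustrated in a few lines; the moduli and values below are arbitrary examples, and the actual countermeasure (e.g. how the representation is randomized against power analysis) is not modelled here:

```python
from math import prod

MODULI = (3, 5, 7)   # pairwise coprime; dynamic range = 3 * 5 * 7 = 105

def to_rns(x, moduli=MODULI):
    """RNS representation: a number becomes its residues modulo each base
    element, so additions and multiplications act channel-wise, with no
    carries propagating between channels."""
    return tuple(x % m for m in moduli)

def from_rns(r, moduli=MODULI):
    """Chinese Remainder Theorem reconstruction back to a plain integer."""
    M = prod(moduli)
    total = 0
    for ri, mi in zip(r, moduli):
        Mi = M // mi
        total += ri * Mi * pow(Mi, -1, mi)   # pow(..., -1, m) = modular inverse
    return total % M

a, b = 17, 4
ra, rb = to_rns(a), to_rns(b)
# Multiply channel by channel, independently in each residue
product_rns = tuple((x * y) % m for x, y, m in zip(ra, rb, MODULI))
assert from_rns(product_rns) == (a * b) % prod(MODULI)   # 68
```

The absence of inter-channel carries is what makes RNS attractive in this context: each channel can be computed by an independent unit, breaking the data-dependent carry chains that power-analysis attacks often exploit.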
Date: 14-Feb-2007    Time: 14:30:00    Location: IST, TagusPark, room 0.26


Mining Protein Structure Data

José Carlos Almeida Santos

Imperial College London

Abstract—This presentation will show the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. The main problem we address is determining whether an amino acid is exposed or buried in a protein, for five exposure levels: 2%, 10%, 20%, 25% and 30%. First we introduce the baseline classifier for this problem which, although very simple (it takes only the amino acid type into account), already achieves good prediction results. Then we explain how, by building a local PDB database and retrieving DSSP and SCOP data, we construct our classifier to improve on the baseline prediction. Finally we test and compare several classifiers (Neural Networks, C5.0, CART and CHAID) and parameters that might influence prediction accuracy, namely the level of information per amino acid, the SCOP class of the protein and the neighbourhood of the current amino acid (i.e. the sliding-window size). Keywords: Amino Acid Relative Solvent Accessibility, Protein Structure Prediction, Data Mining, Bioinformatics, Artificial Intelligence
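The sliding-window neighbourhood mentioned at the end can be sketched as follows; the function name and the `"X"` padding symbol are illustrative assumptions, not the talk's actual feature encoding:

```python
def window_features(sequence, i, size=5, pad="X"):
    """Sliding-window encoding: the residues around position i (here
    size=5, i.e. two neighbours on each side, padded at the ends) become
    the feature vector for predicting whether residue i is buried."""
    half = size // 2
    padded = pad * half + sequence + pad * half
    return list(padded[i : i + size])

assert window_features("MKVLA", 0) == ["X", "X", "M", "K", "V"]
assert window_features("MKVLA", 2) == ["M", "K", "V", "L", "A"]
```

Varying `size` is exactly the window-size parameter whose influence on prediction accuracy the talk compares across classifiers.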
Date: 01-Feb-2007    Time: 16:00:00    Location: 336


Statistics Meets Molecular Biology

Lisete Sousa

Dept Estatística e Investigação Operacional - Fac Ciências/Univ Lisboa

Abstract—The combination of the explosion of molecular genetics data with technological advances in computing currently challenges statisticians to improve existing methods and to develop more efficient inference methods capable of handling data of such complexity. The aim is to show how statistics is fundamental in approaching systems as diverse as protein structure and microarrays. Concrete studies in Molecular Biology, with very specific goals, serve as the basis for presenting the statistical methodologies most widely applied in this area. The importance of the available software is also highlighted, namely several R packages and programs with web interfaces.
Date: 25-Jan-2007    Time: 14:00:00    Location: 336


Introduction to the ISO TC/211 and the Open Geospatial Consortium

Miguel A. Bernabé

Instituto Superior Técnico

Abstract—The ISO/TC 211 (Geographic Information / Geomatics) is responsible for the ISO geographic information series of standards. This work aims to establish a structured set of standards for information concerning objects or phenomena that are directly or indirectly associated with a location relative to the Earth. These standards may specify methods, tools and services for data management: acquiring, processing, analyzing, accessing, presenting and transferring data. The Open Geospatial Consortium is an international industry consortium of more than 335 entities that develops publicly available interface specifications. OpenGIS Specifications support interoperable solutions that geo-enable the Web, wireless and location-based services, and mainstream IT.
Date: 25-Jan-2007    Time: 14:00:00    Location: Room FA3 (IST - Alameda)


Networks-on-Chip: a new paradigm for intra-chip communication

Mário Pereira Véstias

Instituto Superior de Engenharia de Lisboa (ISEL)

Abstract—The Network-on-Chip (NoC) is a new approach to the design of systems on a single chip in cases where communication is the main design challenge. NoCs are organized as packet-switched networks. This approach borrows many concepts from computer networks and parallel computing, but different design constraints and trade-offs must be taken into account, hence its specificity. The seminar introduces the concepts associated with networks-on-chip and the motivations behind their use. Some approaches to the design of this kind of system will also be presented, in particular in the reconfigurable domain.
Date: 22-Jan-2007    Time: 14:30:00    Location: room 0.16, IST, TagusPark


Next Generation of MicroBioelectronic Systems

Moises Simões Piedade


Abstract—This presentation will review recent research projects aimed at the development of advanced micro bioelectronic systems. Interdisciplinary research projects with objectives oriented to the realization of Lab-on-Pocket, Lab-on-Chip and Lab-on-Cell systems will receive special attention. The state of research and the technological achievements (and difficulties) of the ongoing INESC Biochip Project will be discussed. Finally, the new INESC research project BIOMAGCMOS will be presented and explained.
Date: 18-Jan-2007    Time: 14:00:00    Location: 336


Applications of Reconfigurable Computing in Mobile Robot Design

Eduardo Marques

Universidade de São Paulo

Abstract—This seminar presents applications of reconfigurable computing in mobile robotics. Architectures based on the SoC (System-on-a-Chip) concept are used to accelerate the application. The SoC is implemented on state-of-the-art reprogrammable FPGA (Field Programmable Gate Array) devices from Altera. The target architecture consists of an Altera NIOS II softcore processor coupled with several reconfigurable processing units (RPUs) developed specifically for mobile robotics. This methodology lets researchers and designers in mobile robotics test their algorithms on high-performance systems and thus explore new solutions for real-time use, a requirement increasingly present in embedded mobile robotics. Validation tests of the generated system are carried out on a Pioneer 3DX robot. Current work focuses on a hardware/software co-design environment to ease the development of mobile robotics applications on FPGAs. The project started in April 2005 under the CNPq/Grices agreement between the Universidade de São Paulo and the Universidade do Algarve (with INESC-ID/IST currently as the Portuguese partner).
Date: 14-Dec-2006    Time: 14:30:00    Location: IST, TagusPark, room 0.26


Architecture and Performance of Dynamic Offloader for Cluster Network

Keiichi Aoki

University of Tsukuba

Abstract—With improved network hardware technology, techniques that migrate part of the computation performed by the host CPU into the network hardware have attracted the attention of researchers in the high-performance network architecture field, because recent network hardware includes high-performance processors for controlling network protocols. Several research efforts and products that offload communication protocols to the network device have succeeded in increasing communication performance. However, with these techniques the host processor still needs to access the network hardware frequently, since the unit of offloading is small. This presentation will show the design of a software environment for offloading user-defined software modules to the Maestro2 cluster network, called the Maestro Dynamic Offloading mechanism (MDO). To address this problem, MDO allows an application program to offload the major part of its communication procedure. Experimental results of the performance evaluation of MDO will be shown by offloading collective communication patterns.
Date: 07-Dec-2006    Time: 16:00:00    Location: 336


Extracting MUCs from Constraint Networks

Lakhdar Saïs

Université d'Artois

Abstract—We address the problem of extracting Minimal Unsatisfiable Cores (MUCs) from constraint networks. This computationally hard problem has a practical interest in many application domains such as configuration, planning, diagnosis, etc. Indeed, identifying one or several disjoint MUCs can help circumscribe different sources of inconsistency in order to repair a system. In this paper, we propose an original approach that involves performing successive runs of a complete backtracking search, using constraint weighting, in order to surround an inconsistent part of a network, before identifying all transition constraints belonging to a MUC using a dichotomic process. We show the effectiveness of this approach, both theoretically and experimentally.
Date: 28-Nov-2006    Time: 16:00:00    Location: 336


Modelling Services in Information Systems Architectures

Anacleto Correia

Instituto Superior Técnico

Abstract—Twenty years ago, Zachman proposed a framework, the Information Systems Architecture, that was certainly one of the main contributions to the Enterprise Architecture research area. More recently, the concept of service was proposed and largely adopted, introducing another, fundamental perspective on how organizations not only operate internally but also relate to stakeholders. In this paper we propose an extension to the Zachman framework that incorporates the concept of service.
Date: 28-Nov-2006    Time: 12:00:00    Location: Room 0.9, Taguspark


Functional organization of chromosomes in the mammalian cell nucleus

Ana Pombo

Imperial College London

Abstract—Chromosomes are not randomly folded in a spaghetti-like state in the mammalian cell nucleus, as initially thought, but occupy distinct territories. Recent studies show that these chromosome territories have preferential arrangements in different cell types, which correlate with the kinds of chromosome rearrangements that occur preferentially in each cell type. Evidence for a growing number of long-range interactions between DNA segments in the same or different chromosomes has raised the possibility of a three-dimensional network of genome interactions. As the long-range interactions described so far correlate with gene activity states, they are likely to influence and be influenced by the transcriptome of each cell type. We propose that this interchromosomal network of interactions contains epigenetic information and determines cell-type specific chromosome conformations and re-arrangements.
Date: 27-Nov-2006    Time: 14:00:00    Location: Anfiteatro FA1 - IST


Computing Maximal Satisfiable Subsets (MSS) and Minimally Unsatisfiable Subformulas (MUS) of CNF Formulas

Cédric Piette

Université d'Artois

Abstract—In this presentation, a new complete technique to compute Maximal Satisfiable Subsets (MSS) and Minimally Unsatisfiable Subformulas (MUS) of sets of Boolean clauses is introduced. The approach improves on the currently most efficient complete technique in several ways. It makes use of the powerful concept of critical clause and of a computationally inexpensive local search oracle to boost an exhaustive algorithm proposed by Liffiton and Sakallah. These features can yield exponential efficiency gains. Accordingly, experimental studies show that this new approach outperforms the best existing exhaustive ones.
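A much simpler relative of the techniques discussed, deletion-based MUS extraction with a brute-force satisfiability oracle, conveys what a minimally unsatisfiable subformula is; this is a sketch only, not the Liffiton-Sakallah-style algorithm the talk improves upon:

```python
from itertools import product

def satisfiable(clauses, n_vars):
    """Brute-force SAT check (fine for tiny formulas). A clause is a list
    of non-zero ints, negative meaning a negated literal, as in DIMACS."""
    for assign in product([False, True], repeat=n_vars):
        if all(any(assign[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def mus(clauses, n_vars):
    """Deletion-based MUS extraction: try dropping each clause in turn;
    if the rest stays unsatisfiable, the clause was not needed."""
    core = list(clauses)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if not satisfiable(trial, n_vars):
            core = trial          # clause i was redundant for unsatisfiability
        else:
            i += 1                # clause i is necessary: keep it
    return core

# (x1) & (~x1 | x2) & (~x2) & (x1 | x3): the last clause is outside the MUS
f = [[1], [-1, 2], [-2], [1, 3]]
print(mus(f, 3))  # -> [[1], [-1, 2], [-2]]
```

The result is minimal in the sense that dropping any remaining clause makes the formula satisfiable; the exhaustive enumeration of *all* MUSes/MSSes that the talk addresses is a substantially harder problem than extracting this single core.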
Date: 27-Nov-2006    Time: 14:00:00    Location: 336


Regular Expression Matching for Reconfigurable Packet Inspection

João Bispo


Abstract—Recent intrusion detection systems (IDS) use regular expressions instead of static patterns as a more efficient way to represent hazardous packet payload contents. This presentation focuses on regular expression pattern matching engines implemented in reconfigurable hardware. We present a Nondeterministic Finite Automata (NFA) based implementation, which takes advantage of new basic building blocks to support more complex regular expressions than previous approaches. Our methodology is supported by a tool that automatically generates the circuitry for the given regular expressions, outputting VHDL representations ready for logic synthesis. Furthermore, we include techniques to reduce the area cost of our designs and maximize performance when targeting FPGAs. Experimental results show that our tool is able to generate a regular expression engine to match more than 500 IDS regular expressions (from the Snort ruleset) using only 25K logic cells, achieving 2 Gbps throughput on a Virtex2 device and 2.9 Gbps on a Virtex4 device. Concerning the throughput per area required per matching non-meta character, our design is 3.4 and 10 times more efficient than previous ASIC and FPGA approaches, respectively.
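The nondeterministic flavour of NFA-based matching can be conveyed by a tiny software matcher for a small regex subset (literals, `.`, and `*`); this is an illustration only, and the hardware engine in the talk supports far richer expressions and works very differently:

```python
def match(pattern, text):
    """Full-string matcher for a regex subset. The `or` in the star case
    is the nondeterministic choice an NFA explores in parallel: either
    skip the starred element, or consume one input character and stay."""
    if not pattern:
        return not text
    first = bool(text) and pattern[0] in (text[0], ".")
    if len(pattern) >= 2 and pattern[1] == "*":
        return match(pattern[2:], text) or (first and match(pattern, text[1:]))
    return first and match(pattern[1:], text[1:])

assert match("ab*c", "abbbc")
assert match("a.c", "axc")
assert not match("ab*c", "abd")
```

In software this nondeterminism costs backtracking time; the point of the FPGA implementation is that all NFA branches advance simultaneously in hardware, one input character per clock cycle.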
Date: 25-Nov-2006    Time: 14:30:00    Location: room 1.65, IST, TagusPark


Who spoke when

Janez Žibert

University of Ljubljana

Abstract—The thesis addresses the problem of structuring the audio data in terms of speakers, i.e., finding the regions in the audio streams that belong to one speaker and joining each region of the same speaker together. The task of organizing the audio data in this way is known as speaker diarization and was first introduced in the NIST project of Rich Transcription in "Who spoke when" evaluations. The speaker-diarization problem is composed of several tasks. This thesis addresses three of them: speech/non-speech segmentation, speaker- and background-change detection, and speaker clustering. The main objectives in our research were to develop new representations of audio data that were more suitable for each task and to improve the accuracy and increase the robustness of standard approaches under various acoustic and environmental conditions. The motivation for the improvement of the existing methods and the development of new procedures for speaker-diarization tasks is the design of a system for the speaker-based audio indexing of broadcast news shows.
Date: 23-Nov-2006    Time: 15:00:00    Location: INESC ID, 4th floor meeting room


Linearity improvement techniques for high-speed ADCs

Pedro Figueiredo

ChipIdea Microelectrónica

Abstract—Analog-to-Digital Converters (ADCs) are the link between the "real" analog world and the tremendous processing and memorization possibilities of digital circuits. There are several architectures, each suited to a certain sampling-frequency/resolution range. This talk will focus on high-speed medium-resolution ADCs implemented in CMOS technologies, which are a fundamental building block of many video and communication systems, as well as of optical and magnetic data storage frontends. The linearity that these converters can achieve is mainly limited by the offset voltages of their constituent blocks. This talk addresses state-of-the-art offset reduction techniques, and presents results from integrated ADC prototypes.
Date: 23-Nov-2006    Time: 14:00:00    Location: 336


Text-Independent Cross-Language Voice Conversion for Speech-to-Speech Translation

David Sündermann

Universitat Politècnica de Catalunya

Abstract—For applications like multi-user speech-to-speech translation, it is helpful to individualize the output voice to make voices distinguishable. Ideally, this should be done by applying the input speaker's voice characteristics to the output speech. In general, a speech-to-speech translation system consists of three main modules: speech recognition, text translation, and speech synthesis. Since the latter, the speech synthesis module, is normally based on a large speech corpus of a professional speaker, manually corrected and carefully tuned, the output voice characteristics are static. This is overcome by a fourth module, the voice conversion unit, which processes the synthesizer's speech according to the input voice characteristics. Due to the nature of speech-to-speech translation, input and output voices use different languages, leading to the following two challenges: (i) as opposed to state-of-the-art voice conversion, whose statistical parameter training is based on parallel utterances of both involved speakers (the text-dependent approach), here we have to rely on text-independent parameter training, since there is no way to produce parallel utterances in different languages; (ii) most voice conversion techniques estimate conversion functions that depend on the phonetic class, either explicitly (e.g. using CART) or implicitly (e.g. using GMM); however, when considering different languages, we face different phoneme sets, which makes it hard to find conversion functions for phonetic units not covered by the other phoneme set. In this talk, I present text-independent voice conversion techniques that are cross-language portable and aim at solving these challenges.
In this context, I will (i) introduce a speech alignment technique based on unit selection that deals with non-parallel speech, and (ii) show that vocal tract length normalization, which is applied to convert the source voice towards the target, can be applied directly to the time frames without the detour through the frequency domain. The techniques' performance is assessed on several multilingual corpora in the framework of subjective evaluations. In addition to the evaluation results, speech samples will be used to illustrate the discussed techniques' effectiveness.
Date: 17-Nov-2006    Time: 15:30:00    Location: 336


Motion Tracking on Manifolds

Jorge Silva

Instituto Superior de Engenharia de Lisboa (ISEL)

Abstract—There has been growing interest in algorithms capable of learning models from large volumes of multidimensional data, using statistical, geometrical and dynamical information. Such algorithms have many domains of application, e.g. exploratory data analysis, computer vision, system identification, control, computer graphics and multimedia databases. While the linear case can be solved by the well-known Principal Component Analysis technique, the non-linear case is more complex. Recently, there have been advances in algorithms that approximate the data through manifold learning. The present work fits this framework, with emphasis on the problem of motion tracking, particularly in video sequences, assuming that the data occupy not the whole observation space but rather a manifold embedded in that space. This thesis proposes a manifold learning algorithm named Gaussian Process Tangent Bundle Approximation (GP-TBA). The algorithm can deal with arbitrary manifold topology by decomposing the manifold into multiple local models, while also providing a probabilistic description of the data based on Gaussian process regression. The model provided by GP-TBA is also used to simplify the motion tracking problem, for which a multiple-filter architecture, using e.g. Kalman or particle filtering, is described. The GP-TBA algorithm and the filter-bank framework are illustrated with experimental results using real video sequences.
Date: 16-Nov-2006    Time: 16:00:00    Location: 336


CMOS Technology Sub-1 V Supply Voltage References Based on Asymmetric Gain Stage Architecture

Igor Filanovsky

University of Alberta

Abstract—Voltage references are used in various fields of application as in digital-to-analog converters, automotive industry, battery-operated DRAMs and others. Widely known bandgap references (BGR) are not able to operate when the supply voltage drops below 0.9 V. There is a need (at least, in the future) for voltage references operating with low power supply (say, 0.6 V). Non-bandgap references (NBGR) are promising circuits for low-voltage supplies, yet they are not sufficiently investigated.
Date: 16-Nov-2006    Time: 15:30:00    Location: IST (Taguspark) Anfiteatro 3


Network Inference From Co-occurrences

Mário A. T. Figueiredo

Instituto de Telecomunicações (IT)

Abstract—We consider the problem of inferring the structure of a network from co-occurrence data: observations that indicate which nodes occur in a signaling pathway but do not directly reveal node order within the pathway. This problem is motivated by network inference problems arising in computational biology and communication systems, in which it is difficult or impossible to obtain precise time ordering information. Without order information, every permutation of the activated nodes leads to a different feasible solution, resulting in combinatorial explosion of the feasible set. However, physical principles underlying most networked systems suggest that not all feasible solutions are equally likely. Intuitively, nodes which co-occur more frequently are probably more closely connected. Building on this intuition, we model path co-activations as randomly shuffled samples of a random walk on the network. We derive a computationally efficient network inference algorithm and, via novel concentration inequalities for importance sampling estimators, prove that a polynomial complexity Monte Carlo version of the algorithm converges with high probability.
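To make the observation model concrete, here is a small illustrative sketch (the graph, walk length and seed are arbitrary assumptions, not from the talk): a random walk is simulated on a toy network, and the order of the visited nodes is then discarded, yielding exactly the kind of co-occurrence sample the abstract describes.

```python
import random

# Toy sketch of the observation model (not the authors' algorithm):
# co-occurrence observations are modeled as randomly shuffled samples
# of a random walk on the network. The graph, walk length and seed
# below are illustrative assumptions.

GRAPH = {                     # adjacency list of a small hypothetical network
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}

def co_occurrence_sample(graph, source, length, rng):
    """Simulate a random walk from source, then hide the order:
    the observer only learns which nodes were activated."""
    node, path = source, [source]
    for _ in range(length):
        node = rng.choice(graph[node])
        path.append(node)
    activated = list(dict.fromkeys(path))  # unique activated nodes
    rng.shuffle(activated)                 # order information is destroyed
    return set(activated)

rng = random.Random(0)
sample = co_occurrence_sample(GRAPH, "a", 3, rng)
assert "a" in sample and sample <= set(GRAPH)
```

The inference task is then to recover the edges of GRAPH from many such unordered samples.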
Date: 16-Nov-2006    Time: 14:00:00    Location: 336



J. M. Lemos


Abstract—The use of control techniques to drive biomedical systems is a subject that receives increasing attention. Suitable sensors and actuators, on one side, and progress made in control theory, from which suitable algorithms are derived, made feedback control possible in several situations. The seminar will address the problem of controlling neuromuscular blockade and the level of counsciousness in patients subject to general anesthesia. Adaptive control algorithms, including switched multiple model control and multiple model predictive adaptive control, together with the embedding of sensor fault tolerance will be described, as a means to tackle the uncertainty in the systems to control. Clinical cases obtained at Hospital geral de Santo António (Porto) by the Departamento de Matemática Aplicada (FCUP) that illustrate these algorithms will be presented.
Date: 09-Nov-2006    Time: 14:00:00    Location: 336



Carlos Ferreira

Instituto Superior Técnico

Abstract—Problems happen every day in every area, from the car tyre that goes flat on the way to a meeting with the boss, to the widespread problem of representing the year in software with only two digits, which gave rise to the thousands of euros spent on the transition to the year 2000. All these problems end up resulting, in one way or another, in worry and in losses of time and money! We should therefore try to avoid problems, that is, to avoid or control risks. Nowadays most organizations have understood the importance of controlling the risks in their projects; however, there is little knowledge of how this should be done. This lack of knowledge can lead to the choice of inadequate risk management techniques. In this work, although not all existing methodologies and techniques are covered, we aim to shed light on current risk management processes, highlighting, through a comparative study, some strengths and weaknesses of each one.
Date: 08-Nov-2006    Time: 18:30:00    Location: 336


Extractive summarization of broadcast news

Ricardo Ribeiro


Abstract—We present early results from our work on extractive summarization of broadcast news. The feature-based summarizer receives as input the automatic transcription of the news, already divided into stories, and produces as output a summary for each story. The main problems dealt with were sentence segmentation and scoring. Since summary evaluation requires hand-made summaries and/or human grading of the produced summaries, it is left as future work.
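As a rough illustration of feature-based sentence scoring (a minimal sketch under simplifying assumptions, not the summarizer described in the talk), one can score each sentence by the average corpus frequency of its words and keep the top-ranked ones:

```python
from collections import Counter

# Minimal illustrative sketch of feature-based extractive summarization
# (not the system described in the talk): score each sentence by the
# average frequency of its words across the story, keep the best ones.

def summarize(sentences, n_keep=1):
    words = [w.lower() for s in sentences for w in s.split()]
    freq = Counter(words)
    def score(sentence):
        toks = sentence.lower().split()
        return sum(freq[t] for t in toks) / max(len(toks), 1)
    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:n_keep]

story = [
    "The storm hit the coast early on Monday",
    "Forecasters had warned about the storm for days",
    "Local bakeries reported record sales",
]
summary = summarize(story, n_keep=1)
assert len(summary) == 1 and summary[0] in story
```

A real broadcast-news summarizer would combine many such features and, as the abstract notes, must first segment the automatic transcription into sentences.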
Date: 03-Nov-2006    Time: 15:30:00    Location: 336


Knowledge Discovery in Genomics and BioIntelligence Research

A. Fazel Famili

Institute for Information Technology (IIT) - National Research Council of Canada

Abstract—Knowledge discovery is the process of developing strategies to discover useful and ideally all previously unknown knowledge from historical or real-time data. Applied to high throughput genomics applications, knowledge discovery processes will help in various research and development activities, such as (i) studying data quality for possible anomalous or questionable expressions of certain genes or experiments, (ii) identifying relationships between genes and their functions based on time-series or other high throughput genomics profiles, (iii) investigating gene responses to treatments under various conditions such as in-vitro or in-vivo studies, and (iv) discovering models for clinical diagnosis/classifications based on expression profiles among two or more classes. This presentation consists of three parts. In part one, we provide an overview of knowledge discovery in genomics and the BioMine project. In part two of this talk we describe some of our case studies using the BioMiner data mining software that we have built in this project. These are all cases in which real genomics data sets (obtained from public or private sources) have been used for tasks such as gene function identification and gene response analysis. We will describe a few examples explaining complexities and challenges in dealing with real data. In the last part of this talk, we share our experiences gained over the last 6 years and describe our current activities and future plans in BioIntelligence research direction.
Date: 30-Oct-2006    Time: 10:00:00    Location: 336


A Domain Knowledge Advisor for Dialogue Systems

Porfírio Pena Filipe


Abstract—This paper describes ongoing research aimed at enhancing our Domain Knowledge Manager (DKM), a module of a multi-purpose Spoken Dialogue System (SDS) architecture. The application domain is materialized as an arbitrary set of devices, such as household appliances, providing useful tasks to the SDS users. Our main contribution is a DKM advisor service, which suggests the best task-device pairs to satisfy a request. Additionally, we also propose a DKM recognizer service to identify the domain's concepts in a natural language request. These services use a domain model as their knowledge source to obtain knowledge about devices and the tasks they provide. The implementation of these services allows the DKM to offer a high-level and easy-to-use small interface, instead of a conventional service interface with several remote procedures/methods. These services have been tested in a domain simulator. Our contributions address SDS domain portability issues.
Date: 20-Oct-2006    Time: 15:30:00    Location: 336


PET Positron Emission Tomography - Front-End Electronics

Edgar Francisco Monteiro Albuquerque


Abstract—Instrumentation for medicine is a relatively new and very fast-growing field of engineering, driven by the advances of the last decade in microelectronics and particularly in solid-state sensors. In order to take advantage of these new developments, as well as of the competences, knowledge and other synergies existing in national institutions, the PET – Mammography Consortium was created in December 2002, led by LIP, the Laboratório de Física de Partículas, and formed by seven national institutions specialized in the areas of nuclear medicine, radiation detector physics, biophysics, medical engineering, electronics, computing and mechanical engineering, among them INESC-ID/INOV. INESC-ID/INOV was responsible for developing the necessary electronic systems and, in particular, its Analog and Mixed-Signal Circuits Group for developing the integrated circuit that processes the signals coming from the sensors, a circuit known as the Front-End ASIC. This seminar presents the requirements for the Front-End ASIC and the proposed architecture. It also presents the blocks that make up the system, namely: amplifier, high-precision comparator, analog memories, analog multiplexers and the system controller. The fabricated prototypes are presented, together with the respective laboratory test results. The difficulties encountered in integrating a high-performance mixed analog/digital system of this kind, containing high-precision analog blocks and digital blocks operating at high frequencies, are also discussed. Finally, the current status of the project is analysed, the remaining tasks are enumerated, and the work planned for the near future is outlined, together with the lessons learned during this project.
Date: 12-Oct-2006    Time: 14:00:00    Location: 336


Dynamic Entropy-Compressed Sequences and Applications

Gonzalo Navarro

Departamento de Ciencias de Computación (DCC), Universidad de Chile

Abstract—Data structures are called succinct when they take little space (meaning usually of lower order) compared to the data they give access to. A more ambitious challenge is that of compressed data structures, which aim at operating within space proportional to that of the compressed data they give access to. Designing compressed data structures goes beyond compression in the sense that the data must be manageable in compressed form without first decompressing it. This is a trend that has gained much attention in recent years. In this talk we will introduce a simple data structure for managing bit sequences, so that the space required is essentially that of the zero-order entropy of the sequence, and the operations of inserting/deleting bits, accessing a bit position, and computing rank/select over the sequence, can all be done in logarithmic time. Rank operation gives the number of 1 (or 0) bits up to a given position, whereas select gives the position of the j-th 1 (or 0) bit in the sequence. This basic result has a surprising number of consequences. We show how it permits obtaining novel solutions to the dynamic partial sums with indels problem, dynamic wavelet trees, and dynamic compressed full-text indexes.
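As a concrete illustration of the rank and select operations defined above, here are naive linear-time reference implementations (the structure presented in the talk supports these, plus insertion and deletion, in logarithmic time within zero-order entropy space):

```python
# Naive O(n) reference implementations of rank and select on a bit
# sequence, only to make the operations concrete; the talk's dynamic
# structure answers them in logarithmic time.

def rank(bits, i, b=1):
    """Number of bits equal to b among positions 0..i (inclusive)."""
    return sum(1 for bit in bits[: i + 1] if bit == b)

def select(bits, j, b=1):
    """Position of the j-th (1-based) bit equal to b, or -1 if none."""
    seen = 0
    for pos, bit in enumerate(bits):
        if bit == b:
            seen += 1
            if seen == j:
                return pos
    return -1

B = [1, 0, 1, 1, 0, 0, 1]
assert rank(B, 3) == 3          # bits 0..3 contain three 1s
assert select(B, 3) == 3        # the third 1 is at position 3
assert select(B, 2, b=0) == 4   # the second 0 is at position 4
```

Note that rank and select are inverses in the sense that rank(B, select(B, j)) == j whenever the j-th bit exists.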
Date: 09-Oct-2006    Time: 16:00:00    Location: 336


Discriminative Modeling in NLP/SLP

Christian Weiss


Abstract—For a long time, generative models such as HMMs have been the state of the art in NLP/SLP, and they are successful in various domains: speech recognition, PoS tagging, G2P, TTS, etc. However, HMMs have some limitations that recent statistical modeling approaches overcome. These statistical learning algorithms can be grouped under discriminative modeling or, as in speech recognition, under discriminative training. One of these algorithms is the Conditional Random Field. The talk gives an overview of what a Conditional Random Field is and of the differences between generative and discriminative models.
Date: 06-Oct-2006    Time: 15:30:00    Location: 4th floor meeting room


Spatial issues in the design of reserve networks for biodiversity protection

Jorge Orestes

ISA - Instituto Superior de Agronomia

Abstract—When designing networks of protected areas, besides guaranteeing the representation of biodiversity, requirements concerning the spatial configuration of the reserves must be taken into account. One of these requirements is to ensure a certain degree of connectivity or contiguity between the selected parcels. I will present and discuss three problems related to connectivity in the design of networks for biodiversity protection.
Date: 03-Oct-2006    Time: 16:00:00    Location: 336


Punctuation and capitalization in speech transcripts

Fernando Batista


Abstract—I will present some experiments carried out over the last two months on inserting punctuation and restoring correct capitalization in texts produced by a speech recognizer. The goal of this work is to evaluate the performance of automatic methods on these tasks and to understand how they can be optimized. So far, experiments have been performed using the SRILM toolkit and transducers. The work is not yet finished, so the presentation will focus on describing the methodology being employed, the conditions under which it is applied, and the various obstacles that have arisen.
Date: 29-Sep-2006    Time: 15:30:00    Location: 336


Applications of Rewriting-Logic in Reconfigurable Hardware Design Space Exploration

Carlos Morra

University of Karlsruhe

Abstract—Reconfigurable architectures are increasingly being used for digital signal processing applications. The typical development process for DSP applications starts with a set of mathematical equations which are manipulated and interpreted by the developer, and then manually translated into a lower abstraction level. The developer must consider many different implementation approaches and parameters in order to obtain the best trade-offs for the given application on the target architecture. The exploration of different approaches and implementation alternatives is a very complex, time-consuming and error-prone process which requires a lot of expertise from the developer. To address this problem, a novel tool flow based on rewriting logic is being developed. The talk presents the tool flow and some application examples.
Date: 27-Sep-2006    Time: 16:00:00    Location: 336


Formats and services for data and algorithm interoperation in Bioinformatics

Jonas S Almeida

University of Texas M.D.Anderson Cancer Center

Abstract—Data integration in the life sciences is presently at a conundrum. On the one hand, the diversity of data is increasing as explosively as its volume; on the other hand, the value of individual data sets can only be appreciated when enough of those distinct pieces of the systemic puzzle are put together. Consequently, it is just as imperative to have agreeable standard formats as it is that they not be enforced so strictly as to become an obstacle to reporting the very novel data that brings value to systemic integration. In this presentation, the emerging use of semantic web technologies is highlighted with regard to its practical implications for experimental biology and translational biomedical research. The new integrative technologies create tremendous opportunities for wider participation by both individual and national initiatives in large-scale international research efforts. They also create the challenge of locally developing fluid multidisciplinary capabilities, which are still not the norm in the life sciences. A prototypic integrative infrastructure, freely downloadable as open source, will be demonstrated to illustrate the obstacles and potential of ontology-driven data processing. References: Wang X, R Gorlitsky, and JS Almeida (2005) From XML to RDF: How Semantic Web Technologies Will Change the Design of Omic Standards. Nature Biotechnology 23(9):1099-1103. Almeida JS et al. (2006) Data Integration Gets 'Sloppy'. Nature Biotechnology 24(9):6-7.
Date: 25-Sep-2006    Time: 14:00:00    Location: 336


Generating ranking functions using Genetic Programming

Marcos André Gonçalves

Universidade Federal de Minas Gerais

Abstract—The effectiveness of an information retrieval system depends fundamentally on the quality of the document ranking function. To date, literally thousands of alternative ranking functions have been empirically studied. It is also known that the behaviour of functions considered standard, such as TF-IDF and BM25, can vary according to the context (collection and queries) to which they are applied. For this reason, approaches that can learn specific characteristics of this context in order to generate a more specific ranking function have achieved more effective results than the standard functions. One of these approaches is Genetic Programming (GP). Several works use statistical evidence from the collection, the documents and the queries as features of the individuals. Unlike those, this work uses more meaningful evidence in place of statistical information. This evidence was extracted from well-known ranking functions (CCA) and from probabilities (PROB) of occurrence of terms and documents in a collection. The best results obtained with this evidence on the TREC-8 collection showed gains of about 41% in mean average precision (MAP) over BM25 and of almost 18% over an approach that uses GP with statistical evidence.
Date: 13-Sep-2006    Time: 15:00:00    Location: Taguspark, anfiteatro A5




Departamento de Engenharia Informática

Abstract—In this article, we begin by describing the issue of fraud in the telecommunications area, explaining its main characteristics and problems. To better contextualize the problem, the main known fraud actions and forms are described, with some notes on their specific patterns and concrete examples, followed by a comparative summary table. Next, the generic architecture of a fraud control system is explained, with its main modules, information sources and analysis periodicity, culminating in a comparative summary of the main commercial systems currently available on the market. Finally, an analysis and discussion of the state of the art in the area is carried out, a set of opinions and criticisms about what exists and has already been done is put forward, and the reasonableness of distinguishing normal, even if atypical, situations for the application of intelligent and automatic analysis and detection methods is discussed.
Date: 28-Jul-2006    Time: 15:00:00    Location: TAGUSPARK – PISO 0 Anfiteatro A3



Énio Manuel Dória Pereira

Departamento de Engenharia Informática

Abstract—Affective Computing is a fairly recent research area that studies the influence and use of emotions in computational systems. Among the several topics covered by this discipline is the synthesis of emotions, and it is within this scope that the present work fits. In this work we intend to establish a connection between a system that simulates a virtual world populated by agents and another system whose purpose is to generate emotions. The latter is based on the OCC theory of emotion. This connection amounts to providing the agents with an evaluation function that, for each situation experienced by the agents, evaluates it and potentially assigns values to the so-called appraisal variables defined in the OCC theory, with the aim of triggering emotions.
Date: 28-Jul-2006    Time: 10:00:00    Location: ANFITEATRO PA-3 (PISO – 1 DO PAV. DE MATEMÁTICA) DO IST




Departamento de Engenharia Informática

Abstract—With the evolution of computer networks, new challenges arise every day. With the appearance of ever more network applications that are highly sensitive to the type of traffic they require in order to work correctly, the field of quality of service becomes increasingly central and important. In parallel, one must consider not only the emergence of new technologies, but also the evolution of existing ones. The Transmission Control Protocol, a fundamental element of the network protocol stack and the protocol of choice for the reliable transport of data, has to coexist and adapt (while keeping its good operating characteristics) with all these changes. In this work we focus on TCP from a quality-of-service perspective. Our goal is to critically analyse the limitations and potential of the protocol in a network environment constrained by quality-of-service requirements.
Date: 27-Jul-2006    Time: 15:00:00    Location: 336



Jorge Valadas

Departamento de Engenharia Informática

Abstract—Applications have become interwoven with our everyday lives, and we now feel, more than ever, the need to be able to access them wherever we are. In order to allow users to behave in this way, mobility must be supported. This article analyses two possible levels at which mobility can be supported. We first explain the reasons that lead to the need for mobility at each of these levels, and follow with an analysis of several of the available mobility architectures, both at the network and the application layer. A qualitative comparison of the solutions is then made, and conclusions are drawn regarding the situations in which each should be used.
Date: 25-Jul-2006    Time: 16:00:00    Location: INESC – 9º PISO Auditório Alfa (Sala 918)



Laércio Junior

Departamento de Engenharia Informática

Abstract—This paper studies the evolution of policy-based management and examines some of the proposed architectures for the management of networks and services, evaluating their advantages and disadvantages. A comparison is made among three representative models on general architecture attributes, policy-related attributes, and general evaluation metrics for scalability, reliability and performance. This analysis leads to the observation that there are important shortcomings in all models, among them the limited scope of the policies used. This is a major issue for provision of end-to-end quality of service (QoS) in heterogeneous, multi-provider networks. Using the information collected, and considering the necessities of modern Internet applications, a model for a policy-based management architecture is derived. It is scalable, technology independent and capable of providing end-to-end QoS.
Date: 25-Jul-2006    Time: 15:00:00    Location: INESC – 9º PISO Auditório Alfa (Sala 918)



David Sardinha Andrade de Aveiro

Departamento de Engenharia Informática

Abstract—This thesis has the main purpose of assessing the usefulness of modeling organizational functions in the context of organizational engineering. To achieve this, we present an extensive analysis of current insights found in the literature of diverse fields of knowledge, like management, engineering, biology and philosophy, to bridge the gap between the several different perspectives and clarify what organizational functions in fact are and what should be considered artifacts of the functional dimension of an organization. Based on current findings and on previous work done in the field of organizational engineering, an ontology is proposed for the purpose of modeling the functional concern of an enterprise architecture in a coherent manner. Namely, in organizations, representing a function means specifying, for a certain process X, its interdependencies with other parts of the organization which contribute to its self-maintenance, namely: (1) a norm (goal value) for a certain state variable of the process; (2) which other process (or processes) depend on this norm in order to remain functional; (3) the set of business rules - embedded in the process itself, or in other process(es) - that work as resilience mechanisms for expected exceptions and try to reestablish the norm of the process functioning; (4) the set of specialized and accumulated knowledge related to process X's domain, used for the treatment of unexpected exceptions in a special dynamics called microgenesis. We further propose an extension to an existing modeling framework that argues that the multidimensional aspects of the enterprise should be organized into five architectural components: Organization, Business, Information, Application and Technological architectures. The extension we propose is an additional architectural view: the Function Architecture.
This architecture allows the modeling of organizational functions while separating their inherent concerns of operation, monitoring, resilience and microgenesis, and maintaining coherence in their components and interconnections. Several benefits seem to arise from this proposal, such as: simplification of organizational models thanks to the separation of concerns; increased traceability between fundamental entities of organizations and reuse of model elements; detection of vital processes and of gaps in the organization's self-maintenance mechanisms; among others. The usefulness of these possible benefits will be assessed in the final stage of this thesis, which aims at a practical experiment on modeling functions in real organizations with the proposed architecture, in the context of at least two case studies already planned for execution. Keywords: Function Modeling, Organizational Engineering, Organizational Modeling, Organizational Function
Date: 24-Jul-2006    Time: 10:00:00    Location: INESC – Auditório Alfa 9º piso


Ambrósio as an assistant in the execution of culinary tasks

Filipe Miguel Fonseca Martins, Pedro Jorge Fino da Silva Arez


Abstract—Ambrósio as an assistant in the execution of culinary tasks.
Date: 21-Jul-2006    Time: 15:30:00    Location: 336



Vasco Manquinho

Departamento de Engenharia Informática

Abstract—New algorithms for Pseudo-Boolean Optimization have been motivated by the recent advances in Propositional Satisfiability (SAT) algorithms. The new techniques developed in SAT algorithms are powerful mechanisms for manipulating problem constraints, but they are not effective in dealing with information from the cost function in Pseudo-Boolean Optimization problem instances. In this dissertation we propose a new algorithmic framework for solving the Pseudo-Boolean Optimization problem. We start by introducing a new algorithm that integrates SAT-based techniques such as non-chronological backtracking, Boolean constraint propagation and constraint learning with classical branch-and-bound techniques, namely the use of lower-bound estimation procedures on the value of the cost function. Moreover, we provide conditions for using several lower-bound estimation procedures with SAT-based techniques and introduce the notion of bound conflict learning. Finally, we also propose the use of cutting-plane techniques, commonly used in Integer Linear Programming, within a SAT-based framework. Experimental results show that our algorithm is more effective in solving several Pseudo-Boolean Optimization problem instances and provides a significant contribution to this area.
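To make the problem being solved concrete, here is a hypothetical toy Pseudo-Boolean Optimization instance with a brute-force reference solver (for illustration only; the instance is invented, and the dissertation's algorithm uses branch and bound with SAT-based techniques rather than enumeration):

```python
from itertools import product

# Hypothetical toy PBO instance: minimize a linear cost function over
# 0/1 variables subject to pseudo-Boolean (linear >=) constraints.
# Brute-force enumeration is a reference only, exponential in variables.

COST = {"x1": 4, "x2": 2, "x3": 1}            # minimize 4*x1 + 2*x2 + 1*x3
CONSTRAINTS = [                                # each entry: (coeffs, bound)
    ({"x1": 3, "x2": 2, "x3": 2}, 4),          # 3*x1 + 2*x2 + 2*x3 >= 4
    ({"x2": 1, "x3": 1}, 1),                   # x2 + x3 >= 1
]

def solve(cost, constraints):
    best, best_val = None, None
    names = sorted(cost)
    for bits in product([0, 1], repeat=len(names)):
        a = dict(zip(names, bits))
        feasible = all(sum(c * a[v] for v, c in lhs.items()) >= b
                       for lhs, b in constraints)
        if feasible:
            val = sum(c * a[v] for v, c in cost.items())
            if best_val is None or val < best_val:
                best, best_val = a, val
    return best, best_val

assignment, value = solve(COST, CONSTRAINTS)
assert value == 3                              # optimum: x1=0, x2=1, x3=1
```

The SAT-based framework described in the abstract prunes this search space using constraint learning and lower bounds on the cost function instead of enumerating all assignments.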
Date: 20-Jul-2006    Time: 14:00:00    Location: ANFITEATRO DO COMPLEXO I DO IST


Representing Policies for Quantified Boolean Formulas

Daniel le Berre

CRIL - Faculté Jean Perrin, Université d'Artois

Abstract—The practical use of Quantified Boolean Formulas (QBFs) often calls for more than solving the validity problem QBF. For this reason we investigate the corresponding function problems, whose expected outputs are policies. QBFs which do not evaluate to true do not have any solution policy, but can nevertheless be of interest; for handling them, we introduce a notion of partial policy. We focus on the representation of policies, considering QBFs of the form \forall X \exists Y \phi. Because the explicit representation of policies for such QBFs can be of exponential size, descriptions as compact as possible must be sought. To address this issue, two approaches, based respectively on the decomposition and on the compilation of \phi, are presented.
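As a small illustration of what a solution policy is (the formula below is an arbitrary example, not one from the talk), the following sketch builds the explicit policy for a tiny QBF of the form forall X exists Y phi, mapping each assignment of X to a witness assignment of Y:

```python
from itertools import product

# Toy illustration of a solution policy for forall X exists Y phi.
# phi below is an invented example formula over x1, x2 and y.

def phi(x1, x2, y):
    # phi = (x1 or x2 or y) and (not x1 or not x2 or not y)
    return (x1 or x2 or y) and (not x1 or not x2 or not y)

def policy(formula, n_x, n_y):
    """Explicit policy: for every X assignment, pick some Y making phi
    true. Returns None if the QBF forall X exists Y phi is false."""
    pol = {}
    for xs in product([False, True], repeat=n_x):
        for ys in product([False, True], repeat=n_y):
            if formula(*xs, *ys):
                pol[xs] = ys
                break
        else:
            return None   # no witness for this X: the QBF is false
    return pol

pol = policy(phi, 2, 1)
assert pol is not None                    # the example QBF is valid
assert pol[(False, False)] == (True,)     # X=(F,F) forces y = True
assert pol[(True, True)] == (False,)      # X=(T,T) forces y = False
```

This explicit table has one entry per X assignment, i.e. exponential size in |X|, which is precisely why the talk looks for compact policy representations.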
Date: 20-Jul-2006    Time: 10:00:00    Location: 336



João Saraiva

Departamento de Engenharia Informática

Abstract—Ever since the introduction of computers into society, researchers have constantly been trying to raise the abstraction level at which we write software programs; the first computer programming languages, structured languages and object-oriented languages are examples of this. We are currently adopting a new abstraction level based on models instead of source code: Model-Driven Engineering (MDE). This new abstraction level is the driving force behind some recent modeling approaches, such as OMG's Unified Modeling Language (UML) or Domain-Specific Modeling (DSM). But MDE and all its approaches are founded on metamodeling: the definition of a language representing a problem domain and the subsequent usage of that language to create models. A key factor for the success of an approach is appropriate tool support; this has been the case with UML and DSM. However, it was only recently that tool creators started treating metamodeling as a first-class citizen in their list of greatest concerns and priorities. In this paper, we evaluate a set of MDE tools from the perspective of the metamodeling activity. This evaluation focuses on both architectural and practical aspects of modeling and on how the metamodeling activity is supported. Then, using the results of this evaluation, we discuss the current status of MDE tools and the direction that tool creators seem to be taking.
Date: 18-Jul-2006    Time: 15:00:00    Location: INESC – 9º Piso Auditório Alfa (sala 918)



Ana Ramalho

Departamento de Engenharia Informática

Abstract—The amount of information available about biological systems has increased dramatically in recent years, making the mechanism regulating gene expression the key problem of this era. The biclustering technique, applied to gene expression data, has been widely used to gain a better understanding of this mechanism. With this technique, the aim is to identify biological processes in which the genes of the biclusters are involved, and to try to decipher regulatory networks by analysing the set of biclusters obtained. However, this problem remains open, since this methodology has not yet allowed a complete and correct understanding of the regulation mechanism. This work describes a new methodology that uses transcriptional regulation data to identify groups of genes that are regulated by a common set of transcription factors. To this end, an algorithm was developed that identifies constant biclusters in a binary regulation matrix. This methodology was applied to regulation data from the organism Saccharomyces cerevisiae. The results obtained for the documented regulations generally revealed groups with important biological significance, since genes taking part in the same biological processes were grouped together. The results obtained for the potential regulations are more complex to analyse, but open the door to the identification of gene functions and to the understanding of the biological processes in which these genes are involved.
Date: 18-Jul-2006    Time: 14:00:00    Location: ANFITEATRO PA-3 (PISO – 1 DO PAV. DE MATEMÁTICA) DO IST


Recent work by Jean-Luc Rouas

Jean-Luc Rouas


Abstract—Jean-Luc Rouas will talk about two recent research trends: detection of audio events for surveillance in public transportation, and identification of dialects using prosodic cues.
Date: 11-Jul-2006    Time: 15:30:00    Location: 336



Luis Coelho

Departamento de Engenharia Informática

Abstract—The problem we address is text indexing for approximate matching. We consider that we are given a text T which undergoes some preprocessing to generate an index. We can later query this index to identify the places where a string occurs up to a certain number of allowed errors k, where by error we mean the substitution, insertion or deletion of one character (edit or Levenshtein distance). We present a structure for indexing which occupies space O(n log^k n) in the average case, independent of alphabet size, n being the text size. This structure can be used to report the existence of a match with k errors in O(3^k m^{k+1}) time and to report the occurrences in O(3^k m^{k+1} + ed) time, where m is the length of the pattern and ed is the number of matching edit scripts. These bounds are independent of alphabet size. The construction of the structure takes time bounded by O(k N |S|), where N is the number of nodes in the index and |S| is the alphabet size.
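For reference, the edit (Levenshtein) distance used as the error measure above can be computed with the classic dynamic program; this is only the plain distance computation, not the proposed index, which answers such queries without scanning T:

```python
# Classic O(|a|*|b|) dynamic program for the edit (Levenshtein)
# distance: minimum number of substitutions, insertions and deletions
# turning string a into string b. Shown only to pin down the error
# model used by the index.

def edit_distance(a, b):
    prev = list(range(len(b) + 1))          # distances from "" to b[:j]
    for i, ca in enumerate(a, 1):
        cur = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute/match
        prev = cur
    return prev[-1]

assert edit_distance("kitten", "sitting") == 3
assert edit_distance("abc", "abc") == 0
```

An approximate-matching query with threshold k then asks for the positions in T where some substring lies within edit distance k of the pattern.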
Date: 06-Jul-2006    Time: 16:30:00    Location: 336



Fausto Jorge Morgado Pereira de Almeida

Departamento de Engenharia Informática

Abstract—I propose an Artificial Intelligence iterative-improvement approach to crew duty scheduling that adopts ideas from Operations Research. In this approach, plans are improved according to well-defined improvement objectives, in an abstract space of meta-operators. Unlike conventional operators or macro-operators, meta-operators allow large jumps in the state space and avoid getting trapped in local minima. Their use is an innovation in the iterative-improvement approach, advantageously replacing the other kinds of operators. Each meta-operator solves a sub-problem smaller than the original problem, using a suitable constructive solver. To define a sub-problem, a set of duties to improve is selected according to a given objective, and their activities are used. The solver must find a different way of combining these activities into new duties, one that is closer to the improvement objectives of the global search. With this method, plans can be repaired or optimised effectively following a white-box approach. The resulting system, SMI, was tested on several problems supplied by a European railway company, and the results were compared with those obtained by its planners and by a state-of-the-art industrial system.
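The improvement loop can be sketched generically; the merge meta-operator and the toy duty representation below are hypothetical stand-ins, not the SMI system's:

```python
import random

def iterative_improvement(plan, cost, meta_operator, rounds=50, seed=0):
    # Generic loop: a meta-operator re-solves a small sub-problem and
    # the resulting plan is kept only when it improves the objective.
    rng = random.Random(seed)
    best = plan
    for _ in range(rounds):
        candidate = meta_operator(best, rng)
        if cost(candidate) < cost(best):
            best = candidate
    return best

# Toy instance: a duty is a list of activities; cost = number of duties.
# The (hypothetical) meta-operator merges two duties if the result is
# short enough, i.e. it re-combines their activities into fewer duties.
def merge_operator(plan, rng, max_len=3):
    plan = [d[:] for d in plan]
    if len(plan) >= 2:
        i, j = rng.sample(range(len(plan)), 2)
        if len(plan[i]) + len(plan[j]) <= max_len:
            plan[i] += plan[j]
            del plan[j]
    return plan

duties = [["a"], ["b"], ["c"], ["d"]]
print(iterative_improvement(duties, cost=len, meta_operator=merge_operator))
```

Four one-activity duties always collapse into two duties here, since no duty may hold more than three activities; a real meta-operator would call a constructive solver on the selected sub-problem instead of a simple merge.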
Date: 05-Jul-2006    Time: 14:00:00    Location: Anfiteatro do Complexo I, IST


The Cost of Search?

Toby Walsh

National ICT Australia and University of New South Wales

Abstract—Whilst waiting for a search procedure like a TSP or SAT solver to finish, you might ask yourself a number of questions. Is the search procedure just about to come back with an answer, or has it taken a wrong turn? Should I go for coffee and expect to find the answer on my return? Is it worth leaving this to run overnight, or should I just quit as this search is unlikely ever to finish? To help answer such questions, we propose some new online methods for estimating the size of a backtracking search tree.
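One classical online estimator in this spirit is Knuth's random-probing method; the talk's methods may well differ, but the sketch below shows the basic idea of estimating a search tree's size from a few cheap random descents:

```python
import random

def knuth_estimate(root, children, probes=1000, rng=random):
    # Knuth's estimator: walk one random root-to-leaf path, multiplying
    # the branching factors seen along the way; the accumulated sum is
    # an unbiased estimate of the number of tree nodes, so averaging
    # over many probes estimates the total cost of the search.
    total = 0.0
    for _ in range(probes):
        node, weight, estimate = root, 1.0, 1.0
        while True:
            kids = children(node)
            if not kids:
                break
            weight *= len(kids)
            estimate += weight
            node = rng.choice(kids)
        total += estimate
    return total / probes

# A complete binary tree of depth 3 has 15 nodes; the estimate is exact
# here because every probe sees the same branching factors.
D = 3
def binary_children(depth):
    return [depth + 1, depth + 1] if depth < D else []

print(knuth_estimate(0, binary_children, probes=10))  # → 15.0
```

On irregular trees the variance can be large, which is precisely why refined online estimators are worth studying.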
Date: 29-Jun-2006    Time: 10:30:00    Location: 336


Leak Resistant Architecture: Statements and Perspectives

Daniel Mesquita

Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (Jean-Claude Bajard)

Abstract—Hardware implementations of cryptographic algorithms may leak information such as computing time, electromagnetic emissions, and power consumption. Based on this information, attacks can be performed to recover cryptographic keys. This presentation shows two approaches to thwarting some Side Channel Attacks (SCA). The first is an analog hardware countermeasure that counteracts SCA without requiring any modification to the cryptographic algorithm, the messages, or the keys. The second combines reconfigurable techniques with the recently proposed Leak Resistant Arithmetic (LRA) to thwart SCA based on power analysis. The main aim of this approach is to perform modular multiplication and exponentiation, the most significant cryptographic operations, while randomly changing the intermediate results of the computation, so that power-analysis SCA are no longer effective. This approach resulted in a Leak Resistant Reconfigurable Architecture (LR²A). Both methods were simulated and synthesized for a 0.18 µm CMOS technology. A short version of the LR²A was prototyped on an FPGA, and an SCA was performed to show the efficiency of the new architecture.
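The LRA itself relies on randomised Residue Number System bases, which the abstract does not detail; as a hedged stand-in, the sketch below illustrates the same principle of randomising the intermediate values of a modular exponentiation without changing its result, using classic exponent blinding:

```python
import random

def blinded_pow(base, exponent, modulus, group_order):
    # Toy countermeasure (illustrative only, not the talk's LRA):
    # adding a random multiple of the group order to the exponent
    # changes every intermediate value of square-and-multiply from
    # run to run, yet leaves the final result intact.
    r = random.randrange(1, 1 << 16)
    return pow(base, exponent + r * group_order, modulus)

# RSA-style example with toy primes p = 61, q = 53
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
print(blinded_pow(42, 17, n, phi) == pow(42, 17, n))  # → True
```

The equality holds by Euler's theorem whenever the base is coprime with the modulus; the side-channel benefit is that the power trace of each run differs.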
Date: 23-Jun-2006    Time: 14:00:00    Location: 336


Parsing Conversational Speech

Mari Ostendorf

University of Washington

Abstract—With recent advances in automatic speech recognition (ASR), there are increasing opportunities for natural language processing of speech, including applications such as speech understanding, summarization and translation. Parsing can play an important role here, but much of current parsing technology has been developed on written text. Spontaneous speech differs substantially from written text, posing challenges for parsing that include the absence of punctuation and the presence of disfluencies and ASR errors. Prosodic cues can help fill in this gap, and there is a long history of linguistic research indicating that prosodic cues in speech can provide disambiguating context beyond that available from punctuation. However, leveraging prosodic cues can be challenging, because of the many roles prosody serves in speech communication. This talk looks at means of leveraging prosody combined with lexical cues and ASR uncertainty models to improve parsing (and recognition) of spontaneous speech. The talk will begin with an overview of studies of prosody and syntax, both perceptual and computational. The focus of the talk will be on our work with a state-of-the-art statistical parser, discussing the issues of sentence segmentation, disfluencies, sub-sentence prosodic constituents, and ASR uncertainty. In addition, we show how these issues impact the use of parsing language models in ASR. We conclude by highlighting challenges in speech processing that impact parsing, including tighter integration of ASR and parsing, as well as portability to new domains.
Date: 21-Jun-2006    Time: 11:00:00    Location: IST, Torre Norte, Anfiteatro Ea3


Speaker Characterization with MLSFs

Hugo Cordeiro

Instituto Superior de Engenharia de Lisboa (ISEL)

Abstract—The work described in this paper concerns the analysis of an alternative feature for speaker characterization, in the context of speaker recognition: Line Spectrum Frequencies (LSF) derived from mel-filter bank energies. This new feature, which we call mel-LSFs (MLSFs), shows performance similar to MFCCs, one of the most common features in speaker recognition, for male speakers, but performs better than MFCCs for female speakers. When combined with mel-LSF deltas, the MLSF feature outperforms MFCCs for both male and female speakers, even with temporal deltas (ΔMFCCs) included. Performance is measured in the context of speaker verification, using the EER and the minimum HTER. Detection error tradeoff (DET) curves are also presented, as well as HTER curves. The main objective of this study is to compare the performance of different features within a common framework, for which a standard support vector machine recogniser was developed. Tests are based on the cellular component of the “2002 NIST Speaker Recognition Evaluation Corpus”.
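The EER reported above can be approximated from raw verification scores by sweeping a decision threshold; this is a minimal sketch of the metric, not the paper's evaluation code (the score values below are invented):

```python
def equal_error_rate(genuine, impostor):
    # Sweep the threshold over all observed scores and take the point
    # where the false-acceptance rate (impostors accepted) and the
    # false-rejection rate (genuine speakers rejected) are closest.
    best = None
    for t in sorted(genuine + impostor):
        far = sum(s >= t for s in impostor) / len(impostor)
        frr = sum(s < t for s in genuine) / len(genuine)
        if best is None or abs(far - frr) < abs(best[0] - best[1]):
            best = (far, frr)
    return (best[0] + best[1]) / 2

print(equal_error_rate([0.9, 0.8, 0.7], [0.2, 0.3, 0.4]))  # → 0.0
```

A DET curve plots the same FAR/FRR pairs over all thresholds; the HTER is their average at one chosen operating point.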
Date: 16-Jun-2006    Time: 15:30:00    Location: 336


“Quality of Service in Web Services”

Ricardo Manuel Ferreira Seabra Gomes

Departamento de Engenharia Informática

Abstract—Web Services emerged as a new attempt to make the various services available in a distributed architecture interoperable, using technologies previously standardised for communication among the parties involved. Although the technology is new, several systems on the market already use Web Services. As these systems enter production, the need arises to introduce parameters for assessing the quality of the services provided. However, the specification as developed offers no way to define quality-of-service parameters in the description of the services provided. This has led several vendors, such as IBM, HP, and Microsoft, to make proposals with that very goal. This work surveys the state of the art in Web Services with quality-of-service support, both at the service-description level and at the architecture level. It also gives a perspective on how end-to-end quality of service can be guaranteed, by mapping application-level quality-of-service support onto that offered by the network.
Date: 07-Jun-2006    Time: 09:00:00    Location: INESC, Auditório Alfa (room 918), 9th floor


Design for Testability and 0 PPM Strategies: Industrial Experience

Anton Chichkov


Abstract—Cost-effective test of semiconductor products in DSM (Deep Sub-Micron) technologies is a challenging problem. High-quality test, leading to extremely low defect levels, or escape rates (on the order of a few ppm (parts per million)), requires a unified approach to the intelligent management of different test strategies. Digital test is difficult; analog and mixed-signal design and test is even more demanding. The author works in a world-leading company and will address, from an industrial point of view, the following topics: the need for test; the cost of test and of DFT (Design for Testability); the link between yield, coverage, and PPMs; challenges for 0 PPM strategies; some real RMA cases; some possible directions for DFT, including addressing bridging faults, addressing open faults, and analogue BIST; and an overview of the state of the art in the industry for test development and coverage.
Date: 01-Jun-2006    Time: 11:00:00    Location: IST, Torre Norte, Anfiteatro EA4


An Introduction to Grid Computing

Tiago Manuel da Cruz Luís, João Rui Mariano Leal


Abstract—We present our work on the development and installation of a grid system in the Spoken Language Systems Lab. We will give a brief notion of what grid computing is and of the platforms used in our work, namely Condor and Globus. Finally, we present examples of how a grid system can be used and what its benefits are.
Date: 26-May-2006    Time: 15:30:00    Location: 336



Maria Alexandra Rentroia Bonito

Departamento de Engenharia Informática

Abstract—The doctoral work outlined in this document is part of an ongoing research program conducted at Instituto Superior Técnico (IST) by GELO (Group for E-Learning in Organizations). Its main objective is to analyse the relationship between motivation-to-learn, usability of learning systems, and learning outcomes, using an integrated usability evaluation framework. The main research question is: do e-learning programs designed with usability and motivation-to-e-learn aspects in mind positively enhance learning outcomes? If so, which design variables are most relevant to motivating which groups to engage in e-learning? We propose a conceptual framework and its embodiment in an e-learning system. To assess its effectiveness, we developed a usability evaluation method to drive empirical tests. This method combines quantitative and qualitative measures and fosters design-oriented user feedback along the learning process. The characteristics of the proposed holistic evaluation method address identified weaknesses in usability evaluation studies, such as the lack of: (a) an integrated conceptual approach to evaluating the usability of e-learning systems in the context of use, and (b) a design tool to bridge the designer-user communication gap in a cost-effective manner. Preliminary results, gathered over the last two years with a small subject population, suggested that the proposed method allows a structured and design-oriented assessment of the usability of e-learning systems that takes learners' motivation to e-learn into account. The proposed research will test and validate these results with a larger population, to yield a workable approach to predicting e-learning outcomes. The main expected contribution of this research is an empirically tested usability evaluation method that allows development teams to anticipate the impact of learners' motivation and of e-learning system usability on outcomes. This will help support development teams' decision-making when designing e-learning experiences, focusing their effort on the items online learners value in specific learning situations.
Date: 25-May-2006    Time: 14:30:00    Location: Sala de Reuniões do DEI


CABL: Audiovisual Content for Broadband

Inês Oliveira

Universidade Lusófona de Humanidades e Tecnologias

Abstract—Most multimedia applications require access to the content itself; consider, for example, interactive TV, personalised news, or video on demand. It is therefore essential to access audiovisual information in terms of content, otherwise the sheer volume of video becomes an obstacle to its retrieval. Content-based access allows audio, video, and images to be retrieved automatically, though it is a very complex challenge. First, because audiovisual information is characterised by its large data volume and its heterogeneity. Second, because content-based representations describe audiovisual information in terms of visual or acoustic properties (colours, textures, motion, frequency, etc.) rather than semantic properties. The automatic or semi-automatic production of summaries of multimedia information has thus become one of the approaches adopted to deal with the large amount of data to be retrieved. The CABL platform, in particular, has as its main objective the creation of a service, and associated applications, for delivering audiovisual content in Portuguese over broadband. The project includes a management application that, once complete, will allow the semi-automatic generation of audiovisual summaries based on semantic categories and user profiles.
Date: 19-May-2006    Time: 15:30:00    Location: 336


Finding good unsatisfiable sub-clause-sets

Oliver Kullmann

University of Wales Swansea

Abstract—I want to present some joint work with Joao Marques-Silva and Ines Lynce on the problem of finding "good" unsatisfiable sub-clause-sets of some given unsatisfiable clause-set. This problem has applications in consistency checking of order specifications as well as in model checking. Our approach is based on a fine-grained analysis of the clause-set, exploiting proof-theoretical and semantical properties. Especially the analysis of minimally unsatisfiable sub-clause-sets and generalisations to "lean" clause-sets (i.e., "autarky-free" clause-sets) will play a role here.
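A deletion-based extraction of a minimally unsatisfiable sub-clause-set (one standard technique in this area; the talk's finer-grained autarky-based analysis is not reproduced here) can be sketched with clauses as lists of DIMACS-style integer literals and a brute-force satisfiability check standing in for a real SAT solver:

```python
from itertools import product

def satisfiable(clauses, variables):
    # brute-force check over all assignments -- a stand-in for a real
    # SAT solver, adequate only for tiny formulas
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses):
            return True
    return False

def minimal_unsat_subset(clauses):
    # deletion-based extraction: try to drop each clause in turn and
    # keep the deletion whenever the remainder stays unsatisfiable
    variables = sorted({abs(l) for c in clauses for l in c})
    core = list(clauses)
    for clause in list(core):
        rest = [c for c in core if c != clause]
        if not satisfiable(rest, variables):
            core = rest
    return core

# (x), (not x or y), (not y) are jointly unsatisfiable; (z or x) is extra
cnf = [[1], [-1, 2], [-2], [3, 1]]
print(minimal_unsat_subset(cnf))  # → [[1], [-1, 2], [-2]]
```

Each deletion test costs one (un)satisfiability call, which is why the proof-theoretic and semantic pruning mentioned in the abstract matters in practice.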
Date: 16-May-2006    Time: 14:00:00    Location: 217



Ana Rita Silva Marques Amado Fernandes

Departamento de Engenharia Informática

Abstract—Despite the many initiatives and information systems created to support business processes, people still invest a great deal of their working time in selecting and obtaining the information needed to perform their activities within the organization. As regards collaborative work, we also find countless systems that provide only partial support to business actors, since they focus on a particular goal and do not take individuals' multi-tasking capabilities into account. From an operational point of view, to properly support intellectual tasks in the organization, the required information must be provided proactively and in a timely fashion, according to human information-processing patterns. Incorporating those patterns into technologies and tools requires a different perspective on the organization and new organizational concepts. The actors involved in business processes, especially human actors, are complex entities capable of exhibiting multiple behaviours depending on the task and on the role played in its execution. This research develops the concept of “Interaction Context”, presenting it as the key element for modelling actors and the interactions that occur during the execution of the tasks that make up business processes. To this end, a method suited to the qualitative nature of the information was defined and applied, illustrating the concepts with a practical case carried out in a real organizational environment. Keywords: Organizational Modelling; Business Process and Business Actor Modelling; Role-Based Modelling; Action Context and Interaction Context; Collaborative Work; Speech Act Theory.
Date: 05-May-2006    Time: 15:00:00    Location: Anfiteatro PA-3, IST Postgraduate Building


Predicting transient error rates due to radiation for processor-based digital architectures

Dr. Raóul Velazco

Lab, Institut National Polytechnique de Grenoble

Abstract—Microelectronic circuits operating in radiation environments can be affected by the so-called Single Event Upset (SEU) phenomenon. SEUs, also referred to as “upsets”, “soft errors”, or “bit flips”, are mainly responsible for transient (non-destructive) changes in the information stored in memory cells within integrated circuits. The cause of SEUs is the creation of a spurious current pulse in sensitive areas of the circuit. This current pulse appears as a consequence of the ionization produced by the interaction of energetic particles with the silicon substrate. For the last 20 years, SEUs have been a major concern for space applications due to the presence of charged particles (heavy ions, protons) in the space environment. The constant improvements in microelectronics manufacturing technology make today's integrated circuits operating in the Earth's atmosphere potentially sensitive to SEUs. Indeed, upsets observed in aircraft equipment, and even in systems operating at ground level, have been explained by the interaction of neutrons present in the atmosphere. Notice that in this case the incident particle has no charge; the ionization is provoked by the daughter particles resulting from the interaction between the neutron and atoms in the silicon substrate. Perturbations provoked by SEUs increase as transistor feature sizes shrink. This talk will present a strategy for estimating SEU error rates based on limited radiation ground testing (performed in particle accelerators) combined with fault-injection results. A flexible and versatile test platform, well suited to implementing such a strategy, will be described. Experimental results obtained for different processors will illustrate the accuracy of the error-rate predictions produced by the proposed strategy.
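One plausible reading of such a prediction strategy (not necessarily the talk's exact model) combines the static cross section measured in the accelerator with the error fraction measured by fault injection; all numbers below are hypothetical:

```python
def predicted_error_rate(static_cross_section_cm2, particle_flux,
                         faults_injected, application_errors):
    # Accelerator testing yields the static per-device cross section;
    # fault injection yields the fraction of bit flips that actually
    # corrupt the application; the target environment supplies the flux.
    error_fraction = application_errors / faults_injected
    return static_cross_section_cm2 * particle_flux * error_fraction

# hypothetical inputs: cross section in cm^2, flux in particles/cm^2/s
rate = predicted_error_rate(1e-8, 1e5,
                            faults_injected=10000, application_errors=250)
print(rate)  # predicted application-level upsets per second
```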
Date: 05-May-2006    Time: 11:00:00    Location: 336


Adaptive Main Memory Compression

Thomas Gross


Abstract—Irina Chihaia Tuduce and Thomas Gross, Departement Informatik, ETH Zurich.
Applications that use large data sets frequently exhibit poor performance because the size of their working set exceeds the available physical memory. As a result, these applications suffer from excess page faults and ultimately exhibit thrashing behavior. For some applications, compression offers a way to reduce the number of page faults that must be serviced from disk. We describe here a system that can be implemented with a small number of kernel changes.
The key idea in exploiting the benefits of memory compression is to adapt the allocation of real (physical) memory between uncompressed and compressed pages without user involvement. The system manages its resources dynamically on the basis of the varying demands of each application and of situational, data-dependent requirements. The technique used to localize page fragments in the compressed area allows the system to reclaim or add space easily when it is advisable to shrink or grow the compressed area.
The design is implemented in Linux, runs on both 32-bit and 64-bit architectures, and has been demonstrated to work in practice under complex workload conditions and memory pressure. The benefits of our approach depend on the relationship between the size of the compressed area, the application's compression ratio, and the application's access pattern. For a range of benchmarks and applications, the system shows an increase in performance by a factor of 1.3 to 55.
Short CV: Thomas R. Gross is a Professor of Computer Science at ETH Zurich, Switzerland. He is the head of the Computer Systems Institute; from 1999 to 2004 he was the deputy director of the NCCR on "Mobile Information and Communication Systems", a research center funded by the Swiss National Science Foundation. He is also an Adjunct Professor in the School of Computer Science at Carnegie Mellon University.
Thomas Gross joined CMU in 1984 after receiving a Ph.D. in Electrical Engineering from Stanford University. In 2000, he became a Full Professor at ETH Zurich. He is interested in tools, techniques, and abstractions for software construction and has worked on many aspects of the design and implementation of programs. To add some realism to his research, he has focused on compilers for uni-processors and parallel systems and has contributed to many areas of compilation (code generation, optimization, debugging, partitioning of computations, data parallelism, and task parallelism). Compilers are also interesting systems that illustrate the use of many concepts to structure programs (frameworks, patterns, components). Compilers require a good cost model of the target environment (e.g., to make space-time tradeoffs), but recent systems have become so complex that simple models no longer suffice. In his current research, Thomas Gross and his colleagues investigate network- and system-aware programs, i.e., programs that can adjust their resource demands in response to resource availability.
In addition to working on compilers, Thomas Gross has been involved in several projects that straddle the boundary between applications and compilers. And since many programs are eventually executed on real computers, he has also participated in the development of several machines: the Stanford MIPS processor, the Warp systolic array, and the iWarp parallel systems. His current work in computer systems concentrates on networks.
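The trade-off the system exploits can be illustrated with a toy user-space sketch (not the kernel implementation): a small compressed area holds many evicted pages when the data compresses well, and few or none when it does not:

```python
import zlib

PAGE = 4096  # bytes per page

def compressed_area_gain(pages, area_pages):
    # Dedicate `area_pages` physical pages to a compressed area and
    # count how many evicted pages it can hold once compressed; this
    # mirrors the ratio-dependent benefit described in the abstract.
    capacity = area_pages * PAGE
    used, held = 0, 0
    for page in pages:
        blob = zlib.compress(page)
        if used + len(blob) > capacity:
            break
        used += len(blob)
        held += 1
    return held

# zero-filled pages compress extremely well: one physical page suffices
print(compressed_area_gain([bytes(PAGE)] * 100, area_pages=1))  # → 100
```

The adaptive part of the real system is deciding, at run time, how many physical pages the compressed area deserves; this sketch only shows why the answer depends on the compression ratio.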
Date: 03-May-2006    Time: 10:00:00    Location: Sala 905 (Sala Omega do POSI)


Simple discrete-time models of gene regulation circuits

Ricardo Coutinho

Instituto Superior Técnico

Abstract—We describe the modelling of gene regulatory networks by means of piecewise-affine dynamical systems with discrete time. We present the re