| CS-2016-01 | |||||
|---|---|---|---|---|---|
| Title | Providing Serializability for Pregel-like Graph Processing Systems | ||||
| Authors | Minyang Han and Khuzaima Daudjee | ||||
| Abstract | We apply recent work on referring expression types to the issue of identification in conceptual modelling. In particular, we consider how such types yield a separation of concerns in a setting where an information system based on a conceptual schema is to be mapped to a relational schema plus SQL queries. We start from a simple object-centered representation (as in semantic data models), where naming is not an issue because everything is self-identified (possibly using surrogates). We then allow the analyst to attach to every class a preferred "referring expression type", and to specify uniqueness constraints in the form of generalized functional dependencies. We show (1) how a number of well-formedness conditions concerning an assignment of referring expressions can be efficiently diagnosed, and (2) how a concrete relational schema and SQL queries over this schema are derived from a combination of the conceptual schema and queries over it, once identification issues have been separately resolved as above. | ||||
| Date | February 1, 2016 | ||||
| Report | Providing Serializability for Pregel-like Graph Processing Systems (PDF) | ||||
| CS-2016-02 | ||||
|---|---|---|---|---|
| Title | Distributed Data Deduplication | |||
| Authors | Xu Chu, Ihab Ilyas and Paraschos Koutris | |||
| Abstract | Data deduplication refers to the process of identifying tuples in a relation that refer to the same real-world entity. The complexity of the problem is inherently quadratic with respect to the number of tuples, since a similarity value must be computed for every pair of tuples. In order to avoid comparing tuple pairs that are obviously non-duplicates, matching algorithms use blocking techniques that divide the tuples into blocks and compare only tuples within the same block. However, even with the use of blocking, data deduplication remains a costly problem for large datasets. In this paper, we show how to further speed up data deduplication by leveraging parallelism in a shared-nothing computing environment. Our main contribution is a distribution strategy, called Dis-Dedup, that minimizes the maximum workload across all worker nodes and provides strong theoretical guarantees. We demonstrate the effectiveness of our proposed strategy by performing extensive experiments on synthetic datasets with varying block size distributions as well as real-world datasets. | |||
| Date | February 1, 2016 | |||
| Report | Distributed Data Deduplication (PDF) | |||
| CS-2016-03 | ||||
|---|---|---|---|---|
| Title | On Referring Expressions in Information Systems derived from Conceptual Models | |||
| Authors | Alexander Borgida, David Toman and Grant Weddell | |||
| Abstract | We apply recent work on referring expression types to the issue of identification in conceptual modelling. In particular, we consider how such types yield a separation of concerns in a setting where an information system based on a conceptual schema is to be mapped to a relational schema plus SQL queries. We start from a simple object-centered representation (as in semantic data models), where naming is not an issue because everything is self-identified (possibly using surrogates). We then allow the analyst to attach to every class a preferred "referring expression type", and to specify uniqueness constraints in the form of generalized functional dependencies. We show (1) how a number of well-formedness conditions concerning an assignment of referring expressions can be efficiently diagnosed, and (2) how a concrete relational schema and SQL queries over this schema are derived from a combination of the conceptual schema and queries over it, once identification issues have been separately resolved as above. | |||
| Date | April 28, 2016 | |||
| Report | On Referring Expressions in Information Systems derived from Conceptual Models (PDF) | |||
| CS-2016-04 | ||||
|---|---|---|---|---|
| Title | Feature-Oriented Modelling in BIP: A Case Study | |||
| Authors | Cecylia Bocovich and Joanne Atlee | |||
| Abstract | In this paper, we investigate the usage of Behaviour-Interaction-Priority version 2 (BIP2), a component-based modelling framework, for specifying feature-oriented systems. We evaluate BIP2 in the context of the Feature Interaction Problem and quantify the amount of work needed to add features to an existing system (i.e., in terms of rework to existing features, and work to identify and specify interactions). We present the results of a case study on a telephony system with five optional features where we found that the amount of work depends heavily on how features are interconnected. We identify a number of different strategies for interconnecting features, and propose one that reduces the amount of work and rework needed to add new features to an existing system. | |||
| Date | September 20, 2016 | |||
| Report | Feature-Oriented Modelling in BIP: A Case Study (PDF) | |||
| CS-2016-05 | ||||
|---|---|---|---|---|
| Title | Improving Time-of-Use Electricity Pricing in Ontario | |||
| Authors | Adedamola Adepetu and Srinivasan Keshav | |||
| Abstract | Time-of-Use (ToU) electricity pricing is a pricing scheme where consumers are charged at a rate that depends on the time of electricity consumption. This pricing scheme is often implemented to match the cost of generating and supplying electricity, and to make consumers defer appliance usage; deferral reduces the daily electricity consumption peak, which can lower both the cost of generation and carbon footprints. We first critique the current ToU scheme in Ontario and make recommendations to improve it. Subsequently, we create an Agent-Based Model (ABM) to study ToU pricing and its effectiveness in reducing peak loads, which allows us to evaluate the benefit of our recommendations. We find that while ToU is effective in incentivizing load deferral, improvements can be made in the Ontario ToU scheme. Keywords: demand response, agent-based model, electricity pricing | |||
| Date | September 20, 2016 | |||
| Report | Improving Time-of-Use Electricity Pricing in Ontario (PDF) | |||